Do We Need Specialized Graph Databases? Benchmarking Real-Time Social Networking Applications
rcanz...@...
Has everyone seen this article out of the University of Waterloo, which concludes TinkerPop 3 to be not ready for prime time?
Do We Need Specialized Graph Databases? Benchmarking Real-Time Social Networking Applications
Anil Pacaci, Alice Zhou, Jimmy Lin, and M. Tamer Özsu
10.1145/3078447.3078459
https://event.cwi.nl/grades/2017/12-Apaci.pdf
Interested to know what other folks think of this testing setup and set of conclusions.
Jason Plurad <plu...@...>
This blew up on Twitter last month: https://twitter.com/adriancolyer/status/883226836561518594
The test setup was less than ideal for Titan. Cassandra isn't really meant for a single-node install.
The paper picked on Gremlin Server, but it didn't disclose anything about the server configuration. Some of the latency for the Gremlin Server-based runs could have been because they weren't using parameterized script bindings. Using Gremlin Server is not a requirement for using Titan at all, and I'm aware of projects that don't even use it.
There's a team in my company that is trying to reproduce the results in that paper. Then we'll be able to see what improvements can be made to the benchmark itself or within TinkerPop and JanusGraph.
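To make the point about parameterized script bindings concrete, here is a minimal sketch of why they can matter for latency. It is a toy model, not Gremlin Server's actual code: `ToyScriptServer`, its `submit` method, and the `ages` data are all hypothetical, and plain Python expressions stand in for Gremlin scripts. The only assumption carried over from the real system is that a script server caches compiled scripts keyed by the script text.

```python
# Toy model of a script server with a compilation cache.
# Hypothetical names throughout; Python expressions stand in for Gremlin scripts.
class ToyScriptServer:
    def __init__(self):
        self._cache = {}        # script text -> compiled callable
        self.compilations = 0   # counts the expensive compile step

    def _compile(self, script):
        self.compilations += 1
        code = compile(script, "<script>", "eval")
        # Bindings are supplied at call time, so one compiled script can be reused.
        return lambda bindings: eval(code, {"__builtins__": {}}, dict(bindings))

    def submit(self, script, bindings=None):
        compiled = self._cache.get(script)
        if compiled is None:    # cache miss: pay the compilation cost
            compiled = self._compile(script)
            self._cache[script] = compiled
        return compiled(bindings or {})


ages = {1: 29, 2: 27, 3: 32}

# Literal ids inlined into the script: each request is a new script text.
inlined = ToyScriptServer()
for vid in (1, 2, 3):
    inlined.submit(f"ages[{vid}]", {"ages": ages})

# Same script text every time; the id travels as a binding.
parameterized = ToyScriptServer()
for vid in (1, 2, 3):
    parameterized.submit("ages[vid]", {"ages": ages, "vid": vid})

print(inlined.compilations, parameterized.compilations)
```

In this model the inlined variant compiles once per distinct id, while the parameterized variant compiles once in total. Whether that effect explains any of the latency reported in the paper is not something this sketch can show.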
On Thursday, August 3, 2017 at 2:05:23 PM UTC-4, Raymond Canzanese wrote:
> Has everyone seen this article out of the University of Waterloo, which concludes TinkerPop 3 to be not ready for prime time?
> Do We Need Specialized Graph Databases? Benchmarking Real-Time Social Networking Applications
> Anil Pacaci, Alice Zhou, Jimmy Lin, and M. Tamer Özsu
> 10.1145/3078447.3078459
> Interested to know what other folks think of this testing setup and set of conclusions.
Stephen Mallette <spmal...@...>
It did use parameters. They basically forked Jonathan Ellithorpe's work and converted all the embedded Gremlin to strings.
Not sure how much they modified the Gremlin statements from the Ellithorpe repo. I stopped digging into it once I saw that no vertex-centric indices were defined, along with other data modelling choices I probably wouldn't have made. LDBC is "complex" in the sense that it takes time to dig into, and it hasn't really been a priority for me.
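For readers unfamiliar with vertex-centric indices, the sketch below illustrates the general idea with a toy data structure. It is not JanusGraph's implementation or API (real index definitions go through JanusGraph's management API and are not shown here); `ToyVertex`, its methods, and the integer `date` property are hypothetical. The assumption is simply that a vertex-centric index keeps one vertex's incident edges ordered by an edge property, so a filter on that property does not have to touch every edge.

```python
import bisect


class ToyVertex:
    """One vertex whose incident edges carry a sortable 'date' property."""

    def __init__(self):
        self.edges = []      # (date, neighbor) in insertion order
        self._by_date = []   # the same edges kept sorted by date: the toy "index"

    def add_edge(self, date, neighbor):
        self.edges.append((date, neighbor))
        bisect.insort(self._by_date, (date, neighbor))

    def neighbors_since_scan(self, date):
        # Without an index: every incident edge is inspected.
        return sorted(n for d, n in self.edges if d >= date)

    def neighbors_since_indexed(self, date):
        # With the index: binary search to the first qualifying edge, then slice.
        # (date,) sorts before any (date, neighbor) tuple with the same date.
        start = bisect.bisect_left(self._by_date, (date,))
        return sorted(n for _, n in self._by_date[start:])


# A high-degree vertex with 1000 incident edges spread over 50 dates.
hub = ToyVertex()
for i in range(1000):
    hub.add_edge(date=i % 50, neighbor=i)

recent_scan = hub.neighbors_since_scan(48)
recent_indexed = hub.neighbors_since_indexed(48)
print(len(recent_scan), recent_scan == recent_indexed)
```

Both methods return the same neighbors, but the scan inspects all 1000 edges while the indexed lookup only reads the matching slice. On high-degree vertices that gap is the kind of cost a missing index can add to date-filtered traversals.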
I'm not sure why Gremlin Server got smacked around so badly in what they did. I couldn't find anything about how it was set up at all. They used TinkerPop 3.2.3 for their work - there have been a lot of enhancements since then in relation to memory management, so perhaps newer versions would have fared better in their tests. Again, hard to say what could/would have happened without spending a decent amount of time on it.
> then we'll be able to see what improvements can be made to the benchmark itself or within TinkerPop and JanusGraph
Very cool, Jason. Glad your colleagues could spend some time on that. It would be nice to hear what they find.
On Mon, Aug 7, 2017 at 1:05 PM, Jason Plurad <plu...@...> wrote:
> This blew up on Twitter last month: https://twitter.com/adriancolyer/status/883226836561518594
> The test setup was less than ideal for Titan. Cassandra isn't really meant for a single-node install.
> The paper picked on Gremlin Server, but it didn't disclose anything about the server configuration. Some of the latency for the Gremlin Server-based runs could have been because they weren't using parameterized script bindings. Using Gremlin Server is not a requirement for using Titan at all, and I'm aware of projects that don't even use it.
> There's a team in my company that is trying to reproduce the results in that paper. Then we'll be able to see what improvements can be made to the benchmark itself or within TinkerPop and JanusGraph.
--
You received this message because you are subscribed to the Google Groups "JanusGraph users list" group.
To unsubscribe from this group and stop receiving emails from it, send an email to janusgraph-users+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Raymond Canzanese <r...@...>
Looking forward to reading about your colleagues' findings, Jason. Not using indices would certainly at least partially explain the poor performance, given the types of queries they were making.
On Monday, August 7, 2017 at 1:44:33 PM UTC-4, Stephen Mallette wrote:
> It did use parameters. They basically forked Jonathan Ellithorpe's work and converted all the embedded Gremlin to strings.
> Not sure how much they modified the Gremlin statements from the Ellithorpe repo. I stopped digging into it once I saw that no vertex-centric indices were defined, along with other data modelling choices I probably wouldn't have made. LDBC is "complex" in the sense that it takes time to dig into, and it hasn't really been a priority for me.
> I'm not sure why Gremlin Server got smacked around so badly in what they did. I couldn't find anything about how it was set up at all. They used TinkerPop 3.2.3 for their work - there have been a lot of enhancements since then in relation to memory management, so perhaps newer versions would have fared better in their tests. Again, hard to say what could/would have happened without spending a decent amount of time on it.
>> then we'll be able to see what improvements can be made to the benchmark itself or within TinkerPop and JanusGraph
> Very cool, Jason. Glad your colleagues could spend some time on that. It would be nice to hear what they find.