Low throughput on Janus vs Neo4j (Tuning issues?)


Carlos <512.qua...@...>
 

So I've been evaluating JanusGraph on a single machine that is also hosting a Cassandra instance. It seems that I am unable to achieve the same throughput that other users here seem to have. Currently we are using Neo4j as we are able to achieve a much higher throughput, but at the cost of being able to scale out.

I ran some queries against both Janus and Neo4j through a WebSocket connection and timed the requests. Neo4j consistently performed much better than Janus. In our actual test setup we were able to get Neo4j to push 600 calls/second while Janus could only manage at most 50 calls/second.

With Janus I did set up indexing and expected improvements, but I did not see any. Attached above are the Janus configuration files I am using as well as the timed queries from Janus and Neo4j.

I am using Janus 0.1.1 on a machine with 24 GB of RAM and a platter hard drive. Additionally, thinking that my platter drive was the bottleneck, I tried moving Cassandra's data store to a RAM disk, to no avail.
What exactly is going on with my setup that is causing this issue?
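
For scale, the two throughput figures above can be turned into an average time per call. This is only a rough sketch: it assumes the calls were issued strictly one after another over a single connection, which the post does not state, and the helper name `mean_latency_ms` is made up for illustration.

```python
def mean_latency_ms(calls_per_second: float) -> float:
    """Average time per call in milliseconds, assuming strictly sequential calls."""
    return 1000.0 / calls_per_second


neo4j_ms = mean_latency_ms(600)  # throughput reported for Neo4j
janus_ms = mean_latency_ms(50)   # throughput reported for JanusGraph

print(f"Neo4j:      {neo4j_ms:.2f} ms per call")
print(f"JanusGraph: {janus_ms:.2f} ms per call")
print(f"JanusGraph is about {janus_ms / neo4j_ms:.0f}x slower per call")
```

Under that assumption the gap is roughly 1.7 ms versus 20 ms per call. A fixed per-request cost of that size would point toward round trips or script handling rather than disk speed, but that is a guess the measurements would need to confirm.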


Robert Dale <rob...@...>
 

For Neo4j tests, what version of TinkerPop and Neo4j were used?
Were the queries submitted as scripts or remote traversals?
What were the memory settings for both gremlin servers?
What does the gremlin-server.yaml look like for the Neo4j server?
Were the tests done from a cold startup on both? If not, what was the warm up procedure?
Was the same client code used for both tests?  What does it look like?
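
To make the cold-start and warm-up questions concrete, here is a minimal sketch of a timing harness that discards a number of warm-up calls before measuring. It is not the client code from this thread: `time_calls`, `summarize`, and `fake_submit` are hypothetical names, and `submit` stands in for whatever function actually sends the script over the WebSocket.

```python
import time
from typing import Callable, Dict, List


def time_calls(submit: Callable[[str], object], query: str,
               warmup: int = 50, measured: int = 200) -> List[float]:
    """Send `query` through `submit`; return per-call latency in seconds
    for the `measured` calls made after `warmup` untimed calls."""
    for _ in range(warmup):
        submit(query)  # untimed: lets JIT compilation, caches, and connections settle
    latencies = []
    for _ in range(measured):
        start = time.perf_counter()
        submit(query)
        latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies: List[float]) -> Dict[str, float]:
    """Reduce a list of latencies (seconds) to a few headline numbers."""
    ordered = sorted(latencies)
    total = sum(ordered)
    return {
        "calls_per_second": len(ordered) / total if total > 0 else float("inf"),
        "median_ms": ordered[len(ordered) // 2] * 1000.0,  # upper median for even counts
        "max_ms": ordered[-1] * 1000.0,
    }


if __name__ == "__main__":
    # Stand-in for a real client call; replace with the actual submit function.
    def fake_submit(q: str) -> None:
        time.sleep(0.001)

    print(summarize(time_calls(fake_submit, "g.V().count()", warmup=5, measured=20)))
```

Running the same harness with `warmup=0` and again with a non-zero warm-up against both servers would show how much of the difference comes from cold-start effects.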


Robert Dale


--
You received this message because you are subscribed to the Google Groups "JanusGraph users list" group.
To unsubscribe from this group and stop receiving emails from it, send an email to janusgraph-users+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Carlos <512.qua...@...>
 

Hey Robert, 

1) For Neo4j I ended up using your patch (I believe) to get Neo4j 3.0.3 working with TinkerPop 3.2.3.

2) The queries were submitted as scripts. The connection to the server was through the session operator.

3) The memory settings for both servers were "-Xms1024m -Xmx8192m -XX:+UseG1GC"

4) Attached is the yaml file for Neo4j.

5) Both tests were done cold.

6) I rely on this client: https://github.com/davebshow/gremlinclient which has been slightly modified to better support sessions. 
    a) I generate the traversals through this file: http://tinkerpop.apache.org/docs/3.2.1/resources/gremlin-python.py. This file has been modified to fix some issues I found along the way. 

Regards,
Carlos

