Re: SimplePath query is slower in 6 node vs 3 node Cassandra cluster
Varun Ganesh <operatio...@...>
Hi Boxuan,
Thank you for getting back to me. Please find my responses below:
> Did you check the hardware differences?
Yes, I can confirm that the two clusters are identical except for the number of nodes.
> the data involved in your query is probably distributed across nodes
This was our initial guess as well. However, if that were the case, we would expect to see this slowness in every query we try. Instead, we only observe it for "path" queries.
For instance, here's an example of another traversal query where we observe the SAME latency across the 3 and 6 node clusters:
g.V().hasLabel('label_B').has('some_id', 123).has('data.name', 1234567).both('sample_edge').valueMap('data.field1', 'data.field2').next(10)
> Then there would be fewer round-trips happening within the 3-node cluster
I also want to point out that we are not running JanusGraph in embedded mode (where it is colocated with Cassandra); instead, it runs separately on its own server nodes.
> Of course with larger cluster you can achieve higher throughput
Interestingly, we are not observing any difference in throughput (i.e., the maximum queries per second that can be handled without seeing timeouts) between the two clusters.
I would appreciate any input on where or how we could investigate further.
On Thursday, November 26, 2020 at 11:19:32 AM UTC-5 li...@... wrote: