Why are index queries so much slower than vertex lookups?
In a small experiment, index queries were up to roughly 50x slower than direct vertex lookups.
I want to know whether something is wrong in my setup or whether this is expected.
The experiment uses JanusGraph (backed by Cassandra), and its goal is to evaluate whether JanusGraph can serve as the storage layer of my application.
Specifically, I measure how JanusGraph performs when querying many nodes at once through an index, using the Gremlin predicate P.within.
The query I’m doing is the following:
final Long result = graph.traversal().V()
Here, vertexIds is an array of randomly chosen valid values of internal_id, and I vary its size between runs. Beforehand, I configured a vertex property index on the internal_id field. I can also see that this index is used when I run the query (see the Profile Query Result below).
The results I’m seeing are the following:
100 nodes - 214 ms
1 000 nodes - 1 636 ms
10 000 nodes - 36 604 ms
100 000 nodes - 281 551 ms (almost 5 min)
I was not expecting such bad performance, so I ran another experiment to see whether the problem is my Cassandra setup. This time, instead of querying the indexed property, I query by vertex ID directly. To do that, I store the mapping between my internal IDs and the vertex IDs in RocksDB beforehand.
With this, my previous query simplifies to:
final Long result = graph.traversal()
Here, the vertices are already identified by JanusGraph's own vertex IDs.
The results I got were much better, even when I include the time needed to fetch the vertex IDs from RocksDB:
100 nodes - 12 ms
1 000 nodes - 58 ms
10 000 nodes - 668 ms
100 000 nodes - 63 222 ms
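As a rough sanity check on the "up to 50x" figure, here is a minimal Java sketch that divides each index-query time by the matching vertex-lookup time. It only uses the timings listed above; the class and method names are illustrative and not part of my application.

```java
import java.util.Locale;

// Illustrative helper (hypothetical names): compares the two sets of timings.
final class SlowdownRatios {
    // Batch sizes and timings in ms, copied from the two result lists.
    static final int[] BATCH_SIZES = {100, 1_000, 10_000, 100_000};
    static final long[] INDEX_QUERY_MS = {214, 1_636, 36_604, 281_551};
    static final long[] ID_LOOKUP_MS = {12, 58, 668, 63_222};

    // How many times slower the index query is than the direct ID lookup.
    static double slowdown(long indexMs, long lookupMs) {
        return (double) indexMs / lookupMs;
    }

    public static void main(String[] args) {
        for (int i = 0; i < BATCH_SIZES.length; i++) {
            double ratio = slowdown(INDEX_QUERY_MS[i], ID_LOOKUP_MS[i]);
            // Locale.ROOT keeps the decimal separator a dot on every machine.
            System.out.printf(Locale.ROOT, "%d nodes: %.1fx slower%n",
                    BATCH_SIZES[i], ratio);
        }
    }
}
```

The ratios come out to roughly 18x, 28x, 55x, and 4.5x, so the gap peaks around 10 000 nodes and narrows again at 100 000.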
Plotting this difference makes the problem much clearer (the x-axis is the number of nodes I'm querying; the y-axis is the time in ms the system takes to return results):
My question is: why are these results so different?
Am I missing some configuration, or is there something I should tune for this case?
I tried enabling the query-batch property when opening the graph:
… (other configuration options)
However, I'm not completely sure that JanusGraph is actually using this property, since I saw no improvement from enabling it (tips on how to check this are highly appreciated).
My setup: JanusGraph embedded, with 3 Cassandra nodes (each one on a separate machine).
Profile Query Result:
Step                                              Count  Traversers  Time (ms)   % Dur
JanusGraphStep(,[internal_id.within([1234, 21...     12          12    226.852  100.00
    \_condition=((internal_id = 1234 OR internal_id = 212 OR internal_id = 989 OR internal_id = 199))
  backend-query                                      12                 48.379
>TOTAL                                                -           -    226.852       -