[Performance Issue] Large partitions in the janusgraph_ids table cause read performance issues (throughput drops to 1/3 of the original)


Hi all,

We are using JanusGraph at Zeotap at a very large scale (~70B vertices and 50B edges), backed by Scylla.

Right now I am facing an issue with the janusgraph_ids table: large partitions are being created in Scylla, and this is causing severe read performance problems. The queries hitting the janusgraph_ids table are range queries, and with large partitions these reads become very slow.

I would like to know whether anyone else has observed a similar issue, and whether there is a set of configurations I should check or anything else you would suggest.

On the Scylla Grafana dashboard, the issue shows up as a high number of foreground read tasks.



I learned from your other thread that you use many Spark executors, each with its own JanusGraph instance. I remember using a similar scheme many years ago with janusgraph-0.1.1. At that time I simply stored the JanusGraph ids at load time in a partitioned file on HDFS, so that I could later use them for analytics queries with Spark.

Could you elaborate on what you call the janusgraph_ids table and how you distribute vertex ids to your Spark executors? According to the JanusGraph data model there is no separate id table; instead, vertex ids are encoded in the row key.

Best wishes,
Marc