[Performance Optimization] Optimization around the `system_properties` table interaction


sauverma
 

Hi all

- Interaction with the underlying KV store through the JanusGraph client hits the `system_properties` table with a range query on the key `configuration`.

- We observe that the JanusGraph client stores all configuration entries (static and dynamic) under the single key `configuration`.

- When we run a job on Spark executors, each using JanusGraph in embedded mode, every executor creates its own executor-level (dynamic) entries under the same key `configuration`.

- Thus, as the number of executors increases, the partition with the key `configuration` grows into a large partition, and queries with key=`configuration` become range queries that scan this large partition, as seen in the graphs below (taken from the Scylla Monitoring Grafana dashboard).

- I would like to know whether this can be avoided. We at Zeotap use JanusGraph at tremendous scale (single graphs with 70 billion vertices and 50 billion edges) and have identified a couple of possible solutions to fix this.
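
To make the growth pattern concrete, here is a minimal toy sketch in Python. It is not JanusGraph or Scylla code: the `table` dict, the function names, the column names and the count of 5 static settings are all made up for illustration. It only models the behaviour described above, where every executor writes under the one key `configuration`.

```python
from collections import defaultdict

# Toy model of a wide-column table: partition key -> {column name: value}.
# NOT JanusGraph or Scylla code; every name below is illustrative only.
table = defaultdict(dict)

STATIC_SETTINGS = 5  # hypothetical number of static settings


def bootstrap_cluster():
    """Write the static settings once, all under the `configuration` key."""
    for i in range(STATIC_SETTINGS):
        table["configuration"][f"static.setting-{i}"] = "value"


def open_embedded_instance(executor_id):
    """Each executor adds its own dynamic entry under the same key."""
    table["configuration"][f"instance.{executor_id}"] = "registered"


def read_configuration():
    """A read with key=`configuration` walks every column in the partition."""
    return sorted(table["configuration"].items())


bootstrap_cluster()
for n in range(1000):
    open_embedded_instance(f"executor-{n}")

# One partition holds everything, and its size grows with the executor count.
print(len(table), len(read_configuration()))
```

In this model the number of partitions stays at one while the number of columns read per lookup grows linearly with the number of executors, which is the pattern we believe we are seeing.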



Thanks
Saurabh Verma
+91 7976984604
