Re: [Performance Issue] Large partitions formed on janusgraph_ids table leading to read perf issues (throughput reduces to 1/3rd of original)
I learned from your other thread that you use many Spark executors, each with its own JanusGraph instance. I remember using a similar scheme many years ago with janusgraph-0.1.1. At that time I simply stored the JanusGraph vertex IDs at load time in a partitioned file on HDFS, so that I could later use them for analytics queries with Spark.
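To make the idea concrete, here is a minimal sketch of storing IDs in partitioned files. It is only an illustration under assumptions, not the code used at the time: the function names `write_partitioned_ids` and `read_partition`, the `part-NNNNN` file naming, the modulo partitioning, and the local temporary directory standing in for HDFS are all hypothetical.

```python
import os
import tempfile


def write_partitioned_ids(vertex_ids, out_dir, num_partitions):
    """Write each vertex ID to one of num_partitions text files.

    The target file is chosen by vid % num_partitions (an assumed scheme).
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = [os.path.join(out_dir, f"part-{p:05d}") for p in range(num_partitions)]
    files = [open(path, "w") for path in paths]
    try:
        for vid in vertex_ids:
            files[vid % num_partitions].write(f"{vid}\n")
    finally:
        for f in files:
            f.close()
    return paths


def read_partition(path):
    """Read back the vertex IDs stored in a single partition file."""
    with open(path) as f:
        return [int(line) for line in f if line.strip()]


# Usage: a local temp directory stands in for an HDFS path here.
with tempfile.TemporaryDirectory() as tmp:
    part_paths = write_partitioned_ids([4, 8, 15, 16, 23, 42], tmp, 3)
    for path in part_paths:
        print(os.path.basename(path), read_partition(path))
```

Each analytics worker would then read only its own partition file and look up those vertices, so no shared ID table has to be queried at run time.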
Could you elaborate on what you call the janusgraph_ids table and how you distribute vertex IDs to your Spark executors? According to the JanusGraph data model there is no separate ID table; vertex IDs are encoded in the row key.
Best wishes,
Marc