Re: Janusgraph Hadoop Spark standalone cluster - Janusgraph job always creates constant number 513 of Spark tasks
toggle quoted message Show quoted text
The number 513 is probably the number of Cassandra partitions. You can inspect the number of partitions in the tables of the Cassandra cluster with:
$ nodetool tablestats <your_keyspace>
Involving SparkGraphComputer only helps for a large number of vertices (100.000+) because there is a lot of one-off overhead for instantiating the JVM's for the Spark executors. Even then, the 25 minutes you mention is excessive. Are you sure your k8s spark cluster was used? The janusgraph default is to use spark local inside your janusgraph container, see the docs for how to configure JanusGraph for a Spark standalone cluster.
Op vrijdag 18 oktober 2019 16:19:19 UTC+2 schreef dim...@...: