GryoSerializer throws an exception for RDDs with more than 2000 partitions


Anjani Singh <anjani...@...>
 

Hi All,

We are using JanusGraph with Cassandra. Our RDD is huge (around 400 GB), and by default SparkGraphComputer was creating 905 partitions.
To increase parallelism I raised the partition count to 2000, and that works fine. But as soon as I go above 2000 partitions (even 2001), jobs fail with the exception below:

Job aborted due to stage failure: Exception while getting task result: org.apache.tinkerpop.shaded.kryo.KryoException: java.io.IOException: org.apache.tinkerpop.shaded.kryo.KryoException: Buffer underflow.
Job aborted due to stage failure: Exception while getting task result: org.apache.tinkerpop.shaded.kryo.KryoException: java.io.IOException: I failed to find one of the right cookies.




We are using the following configuration:

gremlin.graph: org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader: org.janusgraph.hadoop.formats.cql.CqlInputFormat
gremlin.hadoop.graphWriter: org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat
gremlin.hadoop.defaultGraphComputer: org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer
spark.serializer: org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerializer
spark.kryo.registrator: org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoRegistrator
cassandra.input.partitioner.class: org.apache.cassandra.dht.Murmur3Partitioner
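
For completeness, here is roughly how we launch the job. This is a minimal sketch, not our exact code: the properties file name is a placeholder, and workers(n) is simply the mechanism we use to control the partition count (SparkGraphComputer repartitions the loaded graph RDD to match it).

import org.apache.tinkerpop.gremlin.process.computer.Computer;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer;
import org.apache.tinkerpop.gremlin.structure.Graph;
import org.apache.tinkerpop.gremlin.structure.util.GraphFactory;

public class PartitionRepro {
    public static void main(String[] args) {
        // Opens a HadoopGraph using the properties shown above
        // ("read-cassandra.properties" is a placeholder file name).
        Graph graph = GraphFactory.open("read-cassandra.properties");

        // workers(n) makes SparkGraphComputer repartition the loaded graph
        // RDD to n partitions. workers(2000) completes; anything above
        // 2000 (even 2001) fails with the KryoException pasted above.
        GraphTraversalSource g = graph.traversal().withComputer(
                Computer.compute(SparkGraphComputer.class).workers(2001));

        System.out.println(g.V().count().next());
    }
}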


It looks like GryoSerializer does not handle more than 2000 partitions. Does GryoSerializer have a limitation on the number of partitions in an RDD? Possibly relevant: 2000 is the default threshold above which Spark switches from CompressedMapStatus to HighlyCompressedMapStatus for shuffle map output, so perhaps the serializer cannot handle the compressed variant. That is only a guess on our part.
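
In case it matters, the one variant we are considering trying next, purely our own guess based on the threshold above and not something we have verified, is swapping GryoSerializer for Spark's own Kryo serializer while keeping GryoRegistrator:

# Hypothetical variant, untested on our side:
spark.serializer: org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator: org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoRegistrator

Would that be a sensible thing to test, or is the 2000-partition limit expected behavior with GryoSerializer?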


Thanks,
Anjani