Hi everyone,
I downloaded a fresh Spark binary release (spark-2.4.0-hadoop2.7) and set the master to spark://127.0.0.1:7077. I then started all services via $SPARK_HOME/sbin/start-all.sh.
I verified that Spark works by running the provided example programs.
I am also using the janusgraph-0.4.0-hadoop2 binary distribution.
I then configured read-cassandra-3.properties as follows:
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.janusgraph.hadoop.formats.cassandra.Cassandra3InputFormat
gremlin.hadoop.graphWriter=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat
gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.inputLocation=none
gremlin.hadoop.outputLocation=output
gremlin.spark.persistContext=true
janusgraphmr.ioformat.conf.storage.backend=cassandra
janusgraphmr.ioformat.conf.storage.hostname=127.0.0.1
janusgraphmr.ioformat.conf.storage.port=9160
janusgraphmr.ioformat.conf.storage.cassandra.keyspace=janusgraph
cassandra.input.partitioner.class=org.apache.cassandra.dht.Murmur3Partitioner
spark.master=spark://127.0.0.1:7077
spark.executor.memory=8g
spark.executor.extraClassPath=/home/janusgraph-0.4.0-hadoop2/lib/*
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoRegistrator
Here the JanusGraph libraries are stored in /home/janusgraph-0.4.0-hadoop2/lib/.
In my Java application I then tried:
Graph graph = GraphFactory.open("...");
GraphTraversalSource g = graph.traversal().withComputer(SparkGraphComputer.class);
and then called g.V().count().next();
This fails with the following error:
ERROR org.apache.spark.scheduler.TaskSetManager - Task 3 in stage 0.0 failed 4 times; aborting job
Exception in thread "main" java.lang.IllegalStateException: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 0.0 failed 4 times, most recent failure: Lost task 3.3 in stage 0.0 (TID 15, 192.168.178.32, executor 0): java.io.InvalidClassException: org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal; local class incompatible: stream classdesc serialVersionUID = -3191185630641472442, local class serialVersionUID = 6523257080464450267
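As far as I understand, an InvalidClassException with two different serialVersionUID values typically means that the side that serialized the object (here presumably my driver application) and the side that deserialized it (the executor) loaded different builds of the same class. To narrow this down, a small check like the sketch below could be run once with my application's classpath and once with the executor classpath. The class name SerialVersionCheck and the helper describe are my own hypothetical names; only ObjectStreamClass and CodeSource come from the JDK.

```java
import java.io.ObjectStreamClass;
import java.security.CodeSource;

public class SerialVersionCheck {

    // Builds a one-line report: class name, its serialVersionUID, and where it was loaded from.
    static String describe(String className) throws ClassNotFoundException {
        Class<?> cls = Class.forName(className);

        // lookup() returns null when the class does not implement Serializable.
        ObjectStreamClass desc = ObjectStreamClass.lookup(cls);
        String uid = (desc == null)
                ? "n/a (not Serializable)"
                : Long.toString(desc.getSerialVersionUID());

        // Classes that ship with the JDK itself have no code source.
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        String location = (src == null || src.getLocation() == null)
                ? "JDK/bootstrap"
                : src.getLocation().toString();

        return className + " serialVersionUID=" + uid + " loadedFrom=" + location;
    }

    public static void main(String[] args) {
        // Defaults to the class named in the error message; pass another name as the first argument.
        String name = args.length > 0
                ? args[0]
                : "org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal";
        try {
            System.out.println(describe(name));
        } catch (ClassNotFoundException e) {
            System.out.println(name + " is not on this classpath");
        }
    }
}
```

If the two runs print different serialVersionUID values (the error above shows -3191185630641472442 in the stream and 6523257080464450267 for the local class), the loadedFrom part should show which jar each side picks up.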
Any ideas as to what might be the problem?
Thanks!
Lilly