I'm attempting to transition from Titan to JanusGraph 0.1.0 and am having problems getting OLAP queries to work via Spark. I've loaded a graph with about 2 million vertices and tried to execute a simple count:
gremlin> graph = GraphFactory.open('janusgraph-olap.properties')
gremlin> g = graph.traversal(computer(SparkGraphComputer))
gremlin> g.V().count()
The job soon fails with "java.lang.ClassNotFoundException: org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerializer", which I know is in spark-gremlin-3.2.3.jar. This appears to happen before the Spark executor has a chance to start. I tried adding this jar to spark.executor.extraClassPath, but it didn't help. Does HADOOP_GREMLIN_LIBS come into play? I've tried fiddling with it but to no avail.
I'm using HBase 1.1.2.2.5.3.0-37 and Spark 1.6 on HDP 2.5.3.0.