Re: Janusgraph spark on yarn error
hadoopmarc@...
The path of the BulkLoaderVertexProgram might be doable, but I cannot help you on that one. In the stack trace above, the yarn appmaster from spark-yarn apparently tries to communicate with HBase but finds that various libraries do not match. This failure arises because the JanusGraph distribution does not include spark-yarn and thus is not handcrafted to work with spark-yarn.
For the path without BulkLoaderVertexProgram you inevitably need a JVM language (Java, Scala, Groovy). In this case, a Spark executor is unaware of any other executors running and is simply passed a callable (function) to execute (through RDD.mapPartitions() or through a spark-sql UDF). This callable can be part of a class that establishes its own JanusGraph instances in the OLTP way. Now you only have to deal with the executor CLASSPATH, which does not need spark-yarn; the libs from the JanusGraph distribution suffice. Some example code can be found at: https://nitinpoddar.medium.com/bulk-loading-data-into-janusgraph-part-2-ca946db26582
Best wishes, Marc
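P.S. To make the second path a bit more concrete, here is a minimal sketch of the per-partition callable. It is not taken from the linked article and it deliberately avoids Spark and JanusGraph imports so it runs on its own. GraphSession, PartitionLoader, RecordingSession, the "person" label and the batch size are all hypothetical names chosen for illustration; GraphSession only stands in for whatever a real JanusGraph instance and its transaction would provide.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical stand-in for the few graph operations the loader needs. */
interface GraphSession extends AutoCloseable {
    void addVertex(String label, String name);
    void commit();
    @Override
    void close();
}

final class PartitionLoader {
    private PartitionLoader() {}

    /**
     * Loads one partition of input records. Every call opens its own session,
     * so an executor never needs to know about the other executors.
     */
    static int loadPartition(Iterator<String> names,
                             Supplier<GraphSession> open,
                             int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        int loaded = 0;
        // The session is opened on the executor, inside the callable.
        try (GraphSession g = open.get()) {
            while (names.hasNext()) {
                g.addVertex("person", names.next());
                loaded++;
                if (loaded % batchSize == 0) {
                    g.commit(); // commit a full batch
                }
            }
            if (loaded % batchSize != 0) {
                g.commit(); // flush the last partial batch
            }
        }
        return loaded;
    }
}

/** In-memory session, used only to exercise the sketch without a cluster. */
final class RecordingSession implements GraphSession {
    final List<String> events = new ArrayList<>();

    @Override
    public void addVertex(String label, String name) {
        events.add("add:" + label + ":" + name);
    }

    @Override
    public void commit() {
        events.add("commit");
    }

    @Override
    public void close() {
        events.add("close");
    }
}
```

In an actual job, the body of loadPartition would sit inside the function you pass to RDD.mapPartitions(), and the supplier would open a real graph on the executor, presumably with JanusGraphFactory.open() pointed at your properties file. How you ship that configuration to the executors and what batch size works is something you would have to test against your own HBase setup.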