Re: Janusgraph spark on yarn error


hadoopmarc@...
 

The path of the BulkLoaderVertexProgram might be doable, but I cannot help you on that one. In the stack trace above, the YARN application master from spark-yarn apparently tries to communicate with HBase but finds that various libraries do not match. This failure arises because the JanusGraph distribution does not include spark-yarn, so its bundled libraries have not been aligned to work with spark-yarn.

For the path without BulkLoaderVertexProgram you will need a JVM language (Java, Scala, Groovy). In this case, a Spark executor is unaware of any other executors running and is simply passed a callable (function) to execute (through RDD.mapPartitions() or through a Spark SQL UDF). This callable can be part of a class that establishes its own JanusGraph instances in the OLTP way. Now you only have to deal with the executor CLASSPATH, which does not need spark-yarn; the libs from the JanusGraph distribution suffice.
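
To make the shape of such a callable concrete, here is a minimal sketch in plain Java of what could run on one partition: open a private graph session, write the rows, commit in batches, close. It is only an illustration under assumptions. GraphSession, RecordingSession, PartitionLoader, the "person" label and the batch size are hypothetical names of mine, not JanusGraph or Spark APIs; the comments only indicate where the real OLTP calls would presumably go.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical stand-in for a per-executor graph handle; NOT a JanusGraph API. */
interface GraphSession extends AutoCloseable {
    void addVertex(String label, String name); // real code: an OLTP write on the graph
    void commit();                             // real code: commit the open transaction
    @Override
    void close();                              // real code: close the graph instance
}

/** In-memory session, used only to exercise the sketch without a cluster. */
final class RecordingSession implements GraphSession {
    final List<String> vertices = new ArrayList<>();
    int commits = 0;
    boolean closed = false;

    @Override
    public void addVertex(String label, String name) { vertices.add(label + ":" + name); }

    @Override
    public void commit() { commits++; }

    @Override
    public void close() { closed = true; }
}

final class PartitionLoader {
    private PartitionLoader() {}

    /**
     * Work for ONE partition, independent of all other executors:
     * opens its own session, writes every row, commits after each
     * full batch and once more at the end, then closes the session.
     * Returns a single-element iterator holding the number of rows written,
     * which is the iterator-in, iterator-out shape a mapPartitions callable has.
     */
    static Iterator<Integer> loadPartition(Iterator<String> rows,
                                           Supplier<GraphSession> open,
                                           int batchSize) {
        int written = 0;
        try (GraphSession session = open.get()) {
            while (rows.hasNext()) {
                session.addVertex("person", rows.next());
                written++;
                if (written % batchSize == 0) {
                    session.commit(); // keep transactions small
                }
            }
            session.commit(); // flush the last, possibly partial (or empty) batch
        }
        return Collections.singletonList(written).iterator();
    }
}
```

In a real job, I would expect the supplier to open JanusGraph from a properties file available on the executor (for example via JanusGraphFactory.open(...)), and the method to be wrapped in whatever serializable function type your Spark API requires for RDD.mapPartitions(). Opening the graph inside the callable, rather than on the driver, is what keeps everything on the executor CLASSPATH.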

Some example code can be found at:
https://nitinpoddar.medium.com/bulk-loading-data-into-janusgraph-part-2-ca946db26582

Best wishes,    Marc
