Re: Janus Graph performing OLAP with Spark/Yarn

HadoopMarc <m.c.d...@...>

Hi John,

I have plans to try this, too, so question seconded. I have TinkerPop-3.1.1 OLAP working on Spark/Yarn (Hortonworks), but the JanusGraph HBase or Cassandra dependencies will make version conflicts harder to handle.

Basically, you need:
 - your cluster configs on your application or console classpath
 - resolve version conflicts: where two copies of a jar differ only by a minor version, get rid of the lower one. Report to this list if the clashing versions differ by a major version number. I believe the lib folder of the current JanusGraph distribution already contains a few duplicate jars with minor version differences (sorry, I have not had time to report this). You will hate spark-assembly, because it is not easy to remove the lower versions of the dependencies bundled inside it... Spark does have some config options to load user jars first, though. I still wonder whether some Maven guru could spare us this manual work by adding the entire cluster as a dependency of the JG project, so that version conflicts show up at build time instead of at runtime.
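
To make the jar cleanup in the second point less tedious, something like the following could list the candidates. This is only a minimal sketch: it assumes jars are named <artifact>-<version>.jar, the function names (parse_jar, find_conflicts, scan_lib) are made up for illustration, and the jar names in the demo are hypothetical examples, not the actual contents of the JanusGraph lib folder.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumption: jars follow the usual <artifact>-<version>.jar naming,
# where the version part starts with a digit.
JAR_PATTERN = re.compile(r"^(?P<name>.+?)-(?P<version>\d[\w.\-]*)\.jar$")


def parse_jar(filename):
    """Split a jar file name into (artifact, version); None if it does not fit."""
    match = JAR_PATTERN.match(filename)
    if match is None:
        return None
    return match.group("name"), match.group("version")


def version_key(version):
    """Numeric sort key built from the leading integer of each dot-separated part."""
    key = []
    for part in version.split("."):
        digits = re.match(r"\d+", part)
        key.append(int(digits.group()) if digits else 0)
    return tuple(key)


def find_conflicts(jar_names):
    """Map each artifact present in several versions to (kind, sorted versions).

    kind is "major" when the first version number differs, otherwise "minor".
    """
    versions = defaultdict(set)
    for filename in jar_names:
        parsed = parse_jar(filename)
        if parsed is not None:
            versions[parsed[0]].add(parsed[1])

    conflicts = {}
    for name, found in versions.items():
        if len(found) < 2:
            continue
        ordered = sorted(found, key=version_key)
        majors = {version_key(v)[0] for v in ordered}
        conflicts[name] = ("major" if len(majors) > 1 else "minor", ordered)
    return conflicts


def scan_lib(lib_dir):
    """Run find_conflicts over every *.jar directly inside lib_dir."""
    return find_conflicts(p.name for p in Path(lib_dir).glob("*.jar"))


# Demo with hypothetical jar names.
demo = [
    "guava-14.0.1.jar",
    "guava-18.0.jar",
    "jackson-core-2.4.4.jar",
    "jackson-core-2.6.6.jar",
    "slf4j-api-1.7.12.jar",
]
for artifact, (kind, found) in sorted(find_conflicts(demo).items()):
    print(f"{artifact}: {kind} version difference, versions {found}")
```

Pointing scan_lib at the lib folder of the distribution would give the list of "minor" candidates for removing the lower version and the "major" ones worth reporting here. It does not look inside fat jars such as spark-assembly, so those conflicts would still have to be found by hand.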

Also, I might be mistaken in the above, and simple configs might solve the problem. So, the original question still stands (has anyone ....)

Cheers, Marc

On Wednesday, 31 May 2017 19:36:01 UTC+2, Joseph Obernberger wrote:

Hi John - I'm also very interested in how to do this.  We recently built a graph stored in HBase, and when we ran g.E().count() from the gremlin shell, it took some 5+ hours to complete (79 million edges).  Is there any 'how to' or getting started guide on how to use Spark+YARN with this?

Thank you!


On 5/31/2017 1:06 PM, 'John Helmsen' via JanusGraph users list wrote:
Gentlemen and Ladies,

Currently our group is trying to stand up an instance of JanusGraph/Titan that performs OLAP operations using SparkGraphComputer in TinkerPop.  To do OLAP, we wish to use Spark with Yarn.  So far, however, we have not been able to successfully launch any distributed queries, such as count(), using this approach.  While we can post stack traces, etc., I'd like to ask a different question first.

Has anyone gotten the system to perform Spark operations using YARN?
If so, how?