Re: Calling a SparkGraphComputer from within Spark

HadoopMarc <m.c.d...@...>

Hi Rob,

Documentation is not abundant on this, I agree. So I read through the spark-gremlin source code and saw that SparkGraphComputer can reuse an existing SparkContext when the gremlin.spark.persistContext property is set to true. If you set this property and obtain your SparkContext through one of gremlin-spark's static Spark.create() methods, I would expect your own jobs and the SparkGraphComputer jobs to run within the same SparkContext.
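A minimal sketch of the properties I have in mind (property names as they appear in spark-gremlin; the input/output formats and Spark settings below are placeholders you would adapt to your own Cassandra-backed cluster):

```
# Keep the SparkContext alive so SparkGraphComputer reuses it
# instead of creating and stopping its own per job
gremlin.spark.persistContext=true

# Usual SparkGraphComputer setup (placeholder values)
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphWriter=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat
spark.master=yarn
spark.serializer=org.apache.spark.serializer.KryoSerializer
```

In your own program you would then, as far as I can tell, register the context you already have via Spark.create(...) before running the OLAP traversal, so SparkGraphComputer picks up the persisted context rather than starting a new one.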

HTH,    Marc

On Friday, March 17, 2017 at 12:50:31 PM UTC+1, Rob Keevil wrote:


I have a Spark-based program that writes vertices and edges to a JanusGraph cluster (Cassandra backend). Once the write is complete, I would like to execute an OLAP traversal over all vertices using a SparkGraphComputer.

However, in all the examples I can find, Spark is called externally from the Groovy console, using a config file with the Spark cluster's address. I would like to keep all the process coordination in my Spark program; is there a way to achieve this? Or should my first Spark program trigger a second Spark program in the same cluster?

Thanks for any help,
