Re: Calling a SparkGraphComputer from within Spark


One last battle before this is all done: I need to extract the output without collecting results to the driver and blowing up its memory.

Gremlin has a documentation page on how to retrieve the result as a persisted RDD. However, the example there uses a vertex program, which can name the result using memoryKey('clusterCount'). A regular traversal doesn't seem to have this option, and Spark logs that it removes the RDD after the traversal completes. Do you know of any way to access this RDD?

(I've set the required property).
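For context, the persisted-RDD setup I'm working from looks roughly like this (a sketch based on the class and property names in the TinkerPop reference docs; the input/output RDD names are placeholders):

```properties
# hadoop-graph.properties (sketch)
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph

# Read from / write to named RDDs instead of HDFS files
gremlin.hadoop.graphReader=org.apache.tinkerpop.gremlin.spark.structure.io.PersistedInputRDD
gremlin.hadoop.graphWriter=org.apache.tinkerpop.gremlin.spark.structure.io.PersistedOutputRDD
gremlin.hadoop.inputLocation=myInputRDD
gremlin.hadoop.outputLocation=myOutputRDD

# Keep the SparkContext (and its persisted RDDs) alive after the job finishes
gremlin.spark.persistContext=true
```

The question above is about the last step: with a plain traversal (no vertex program and hence no memoryKey), the output RDD seems to be unpersisted once the traversal ends.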
