Re: Calling a SparkGraphComputer from within Spark

HadoopMarc <m.c.d...@...>

Hi Rob,

It sounds like your battling skills are OK!  I have never used the PersistedOutputRDD option myself, but if you are stuck you could also try the class and the output location. This just writes the query output to HDFS and at least keeps you going.
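In case it helps, a minimal sketch of what I mean (property names as in the TinkerPop hadoop-gremlin docs; the output location is a placeholder):

```
# Hypothetical graph properties: write the query output to HDFS
# instead of keeping it in the Spark context.
gremlin.hadoop.graphWriter=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat
gremlin.hadoop.outputLocation=output
```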

Btw, I assumed you did not miss the part of the reference section you linked to. Did you also try the option with the PersistedOutputRDD from the Gremlin Console, or only from your Scala program?

Cheers,    Marc

On Sunday, 19 March 2017 at 19:49:41 UTC+1, Rob Keevil wrote:

Last battle before I think this is all done: I need to extract the output without collecting the results to the driver and exploding the memory there.

Gremlin has a page on how to retrieve the result as a persisted RDD. However, their calculation uses a vertex program, which can name the output using memoryKey('clusterCount').  A regular traversal doesn't seem to have this option, and Spark logs that it removes the RDD after the traversal.  Do you know of any way to access this RDD?
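For context, here is roughly what I'm trying; if the RDD did survive in the Spark context, I would expect to be able to find it through the SparkContext's registry of persisted RDDs. This is only a sketch: `sc` is my SparkContext, and "myGraphRDD" is a placeholder for whatever name the RDD ends up with (I believe it should match the output location, but I'm not certain):

```scala
// Sketch: scan the context's persisted RDDs for the one TinkerPop
// should have left behind after the traversal.
val maybeRdd = sc.getPersistentRDDs.values
  .find(_.name == "myGraphRDD") // assumed name, from the output location

maybeRdd match {
  case Some(rdd) => println(s"found ${rdd.name} with ${rdd.partitions.length} partitions")
  case None      => println("RDD was already unpersisted") // what I actually observe
}
```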

(I've set the required property).
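Concretely, my configuration looks roughly like this (assuming gremlin.spark.persistContext is the required property in question, and that the RDD name is taken from the output location):

```
# Sketch of the persisted-RDD setup I'm using.
gremlin.hadoop.graphWriter=org.apache.tinkerpop.gremlin.spark.structure.io.PersistedOutputRDD
gremlin.hadoop.outputLocation=myGraphRDD
gremlin.spark.persistContext=true
```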
