Re: Edge details dropped from vertex object in sparkGraphComputer


"anj...@gmail.com" <anjani...@...>
 

Hi All,

Thanks for all your inputs. After some more analysis, I found that in SparkGraphComputer (in the TinkerPop library) the vertex object still has its edge details in the RDD up to the point where the result is added to memory:

mapReduce.addResultToMemory(finalMemory, outputRDD.writeMemoryRDD(graphComputerConfiguration, mapReduce.getMemoryKey(), reduceRDD));

writeMemoryRDD uses SequenceFileOutputFormat.class as its output format, which in turn calls SequenceFile.class. I see that the vertex object still has its edge details up to SequenceFile.class. Up to this point the vertex object is of type ComputerVertex.

But the ComputerResult object does not have edge details in its vertex objects. I see that in ComputerResult the vertex object type has changed to DetachedVertex:

return new DefaultComputerResult(InputOutputHelper.getOutputGraph(graphComputerConfiguration, this.resultGraph, this.persist), finalMemory.asImmutable());

I think the edges are getting dropped while deserialising and converting the object to DetachedVertex, but I was not able to figure out where it is converted to a DetachedVertex object.

Below are the configs I am using:

gremlin.graph: org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader: org.janusgraph.hadoop.formats.cql.CqlInputFormat
gremlin.hadoop.graphWriter: org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat
gremlin.hadoop.defaultGraphComputer: org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer

spark.serializer: org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerializer
spark.kryo.registrator: org.janusgraph.hadoop.serialize.JanusGraphKryoRegistrator

I would appreciate any suggestions or pointers for debugging the issue.

Thanks & Regards,

Anjani



On Monday, 14 September 2020 at 12:04:31 UTC+5:30 anj...@... wrote:
Thanks, Marc, for sharing the details.

Regards,
Anjani

On Saturday, 12 September 2020 at 17:40:35 UTC+5:30 HadoopMarc wrote:
Hi Anjani,

What I tried to convey is not to use CloneVertexProgram instead of the ConnectedVertexProgram, but rather to chain these two VertexPrograms. The related JIRA issue I referred to, including an example, is:


HTH,    Marc


On Friday, 11 September 2020 at 15:33:23 UTC+2 anj...@... wrote:
Hi Marc, 

I want to fetch connected vertices with vertex properties and edge details.
CloneVertexProgram will provide the complete data, but I think it will not provide it as connected components. Please correct me if my understanding is wrong.

Thanks,
Anjani

On Friday, 11 September 2020 at 12:09:03 UTC+5:30 anj...@... wrote:
Hi Marc,
Thanks for the response.
I will check the TinkerPop JIRA for details.

Thanks,
Anjani 

On Thursday, 10 September 2020 at 20:41:26 UTC+5:30 HadoopMarc wrote:
Hi Anjani,

No time to look for this now myself, but I remember a similar issue in the TinkerPop JIRA. I also remember there was a workaround from Daniel Kuppitz by applying the CloneVertexProgram.

HTH,     Marc

On Thursday, 10 September 2020 at 16:26:09 UTC+2 yevg...@... wrote:

Hi Anjani,

Which version of JanusGraph are you using?
Can you share some code and configuration to reproduce the issue?

Best regards,
Evgenii Ignatev.

On 9/10/2020 2:11 PM, anj...@... wrote:
Hi All,

I am trying to get the complete data from the graph (vertex details and edge details).
We are using connectedVertexProgram with Spark 2.4. We are able to get the vertex details but not the edge details.

I see that in SparkGraphComputer, edges are dropped from the VertexWritable object before the reducer loop:
vertexWritable.get().dropEdges(Direction.BOTH);

After removing the above line, I still do not get the edge details. While debugging, I found that the edges are present in the vertex object up to the combine stage:
final JavaPairRDD combineRDD = mapReduce.doStage(MapReduce.Stage.COMBINE) ? SparkExecutor.executeCombine(mapRDD, newApacheConfiguration) : mapRDD;

But they are dropped in the reduce stage:
final JavaPairRDD reduceRDD = mapReduce.doStage(MapReduce.Stage.REDUCE) ? SparkExecutor.executeReduce(combineRDD, mapReduce, newApacheConfiguration) : combineRDD;

I see that the vertex object passed to the executeReduce() method has edge details. I noticed that the edge information is dropped from the vertex object during the groupBy in the executeReduce() method.

I would appreciate any pointers or suggestions to fix it.

Thanks,
Anjani
-- 
Best regards,
Evgeniy Ignatiev.
