Edge details dropped from vertex object in SparkGraphComputer


"anj...@gmail.com" <anjani...@...>
 

Hi All,

I am trying to get the complete data from a graph (vertex details and edge details).
We are using connectedVertexProgram with Spark 2.4 and are able to get vertex details but not edge details.

I see that in SparkGraphComputer, edges are dropped from the VertexWritable object before the reducer loop:
vertexWritable.get().dropEdges(Direction.BOTH);

After removing the above line, I am still not getting edge details. While debugging, I found that the edges are present in the vertex object until the combine stage:
final JavaPairRDD combineRDD = mapReduce.doStage(MapReduce.Stage.COMBINE) ? SparkExecutor.executeCombine(mapRDD, newApacheConfiguration) : mapRDD;

But they are dropped in the reduce stage:
final JavaPairRDD reduceRDD = mapReduce.doStage(MapReduce.Stage.REDUCE) ? SparkExecutor.executeReduce(combineRDD, mapReduce, newApacheConfiguration) : combineRDD;

I see that the vertex object passed to the executeReduce() method has edge details. I noticed that the edge information is dropped from the vertex object during the groupBy in the executeReduce() method.

I would appreciate any pointers or suggestions to fix it.

Thanks,
Anjani


Evgeniy Ignatiev <yevgeniy...@...>
 

Hi Anjani,

What is the version of JanusGraph you are using?
Can you share some code and configuration to reproduce the issue?

Best regards,
Evgenii Ignatev.

--
You received this message because you are subscribed to the Google Groups "JanusGraph users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to janusgra...@....
To view this discussion on the web visit https://groups.google.com/d/msgid/janusgraph-users/27f1da00-043c-4c07-8e69-f2dbeaddf14bn%40googlegroups.com.


HadoopMarc <bi...@...>
 

Hi Anjani,

I have no time to look into this myself right now, but I remember a similar issue in the TinkerPop JIRA. I also remember that Daniel Kuppitz gave a workaround there that applies the CloneVertexProgram.

HTH,     Marc


"anj...@gmail.com" <anjani...@...>
 

Hi Evgeniy,

Thanks for the response. We are using JanusGraph 0.4.
I have not written any custom code; I am just using the available connectedVertexProgram and the TinkerPop library.
I only commented out the line below in SparkGraphComputer in TinkerPop, because it was dropping edges, but it still does not work:
vertexWritable.get().dropEdges(Direction.BOTH);

Please let me know if you need more details.

Thank you.
Anjani


"anj...@gmail.com" <anjani...@...>
 

Hi Marc,
Thanks for the response.
I will check the TinkerPop JIRA for details.

Thanks,
Anjani 


"anj...@gmail.com" <anjani...@...>
 

Hi Marc, 

I want to fetch connected vertices with their vertex properties and edge details.
CloneVertexProgram will provide the complete data, but I think it will not provide it as connected components. Please correct me if my understanding is wrong.

Thanks,
Anjani


Abhay Pandit <abha...@...>
 

Yes, you are right, Anjani: CloneVertexProgram will provide the complete data but not the connected components.

Thanks,
Abhay


HadoopMarc <bi...@...>
 

Hi Anjani,

What I tried to convey is not to use CloneVertexProgram instead of the ConnectedVertexProgram, but rather to chain these two VertexPrograms. The related JIRA issue I referred to, including an example, is:

https://issues.apache.org/jira/browse/TINKERPOP-1306?jql=project%20%3D%20TINKERPOP%20AND%20text%20~%20%22CloneVertexProgram%22

HTH,    Marc


"anj...@gmail.com" <anjani...@...>
 

Thanks, Marc, for sharing the details.

Regards,
Anjani


"anj...@gmail.com" <anjani...@...>
 

Hi All,

Thanks for all your input. After some more analysis, I found that in SparkGraphComputer (in the TinkerPop library) the vertex object has edge details in the RDD until the result is added to memory:
mapReduce.addResultToMemory(finalMemory, outputRDD.writeMemoryRDD(graphComputerConfiguration, mapReduce.getMemoryKey(), reduceRDD));

writeMemoryRDD uses "SequenceFileOutputFormat.class" as the output format, which calls SequenceFile.class. I see that the vertex object has edge details up to SequenceFile.class. Up to this point the vertex object is of type ComputerVertex.

But the ComputerResult object does not have edge details in the vertex object. I see that in ComputerResult the vertex object type has changed to DetachedVertex:

return new DefaultComputerResult(InputOutputHelper.getOutputGraph(graphComputerConfiguration, this.resultGraph, this.persist), finalMemory.asImmutable());

I think the edges are dropped while deserializing and converting the object to DetachedVertex, but I was not able to figure out where it is converted to a DetachedVertex object.

These are the configs I am using:

gremlin.graph: org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader: org.janusgraph.hadoop.formats.cql.CqlInputFormat
gremlin.hadoop.graphWriter: org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat
gremlin.hadoop.defaultGraphComputer: org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer

spark.serializer: org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerializer
spark.kryo.registrator: org.janusgraph.hadoop.serialize.JanusGraphKryoRegistrator

I would appreciate any suggestions or pointers for debugging the issue.

Thanks & Regards,

Anjani



HadoopMarc <bi...@...>
 

Hi Anjani,

Your original post started with:
" I am trying get complete data from graph (vertex details and edge details). 
We are using connectedVertexProgram with Spark2.4, able to get vertex details but not edge details. "

The connectedVertexProgram only adds a component property to the vertices; what does this have to do with "edge details"? Can you give an example of the output you want, in terms of the TinkerPop modern graph, like in the example in the docs:
https://tinkerpop.apache.org/docs/current/reference/#connectedcomponent-step

I feel we might be looking in the wrong direction.
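
To illustrate the point about the component property, here is a minimal, library-free Java sketch of connected-component labelling. It is not TinkerPop code: the class ComponentLabelSketch and its label method are hypothetical names, and the sketch assumes that every edge endpoint is also listed as a vertex. It shows that the result of such a computation is just one component label per vertex; edges are only used to propagate labels and do not appear in the output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; none of these names come from TinkerPop.
class ComponentLabelSketch {

    /**
     * Labels every vertex with the smallest vertex id reachable from it.
     * Edges are {from, to} pairs and are treated as undirected.
     * Assumption: both endpoints of every edge are contained in vertices.
     */
    static Map<String, String> label(List<String> vertices, List<String[]> edges) {
        Map<String, String> component = new TreeMap<>();
        for (String v : vertices) {
            component.put(v, v); // initially every vertex is its own component
        }
        boolean changed = true;
        while (changed) { // repeat until no label changes any more
            changed = false;
            for (String[] e : edges) {
                String a = component.get(e[0]);
                String b = component.get(e[1]);
                String min = a.compareTo(b) <= 0 ? a : b;
                if (!a.equals(min)) { component.put(e[0], min); changed = true; }
                if (!b.equals(min)) { component.put(e[1], min); changed = true; }
            }
        }
        return component; // labels only: no edge data is part of the result
    }

    public static void main(String[] args) {
        List<String[]> edges = new ArrayList<>();
        edges.add(new String[] {"A", "B"});
        edges.add(new String[] {"C", "D"});
        System.out.println(label(Arrays.asList("A", "B", "C", "D"), edges));
    }
}
```

For the edges A-B and C-D, vertices A and B end up with label A and vertices C and D with label C, so any edge details have to come from a separate step.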

Best wishes,    Marc


"anj...@gmail.com" <anjani...@...>
 

Hi Marc,

Thanks for the response. I want to fetch all connected components with their edge details from the graph. For example, if node A and node B are connected by an edge E1, then I want to create output like:
{
   {node A attributes},
   {node B attributes},
   {edge E1 details}
}
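
The output described above can be sketched in plain Java as a join of two inputs: a vertex-to-component mapping (what a connected-component run yields) and a full edge list (what a complete copy of the graph yields). This is only a sketch under those assumptions, not TinkerPop or JanusGraph code; ComponentExportSketch, Component and group are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; these types are not TinkerPop classes.
class ComponentExportSketch {

    /** One connected component: its vertex ids and its edges as "from-label->to" strings. */
    static class Component {
        final List<String> vertices = new ArrayList<>();
        final List<String> edges = new ArrayList<>();
    }

    /**
     * componentOf maps a vertex id to its component label.
     * edges holds {from, edgeLabel, to} triples from the complete graph data.
     */
    static Map<String, Component> group(Map<String, String> componentOf, List<String[]> edges) {
        Map<String, Component> result = new TreeMap<>();
        for (Map.Entry<String, String> e : componentOf.entrySet()) {
            result.computeIfAbsent(e.getValue(), k -> new Component()).vertices.add(e.getKey());
        }
        for (String[] edge : edges) {
            // Both endpoints lie in the same component, so the source vertex decides.
            String component = componentOf.get(edge[0]);
            if (component == null) {
                continue; // edge refers to a vertex without a component label
            }
            result.computeIfAbsent(component, k -> new Component())
                  .edges.add(edge[0] + "-" + edge[1] + "->" + edge[2]);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> componentOf = new TreeMap<>();
        componentOf.put("A", "A");
        componentOf.put("B", "A");
        List<String[]> edges = new ArrayList<>();
        edges.add(new String[] {"A", "E1", "B"});
        Component c = group(componentOf, edges).get("A");
        System.out.println(c.vertices + " " + c.edges);
    }
}
```

Keeping the component labels and the edge data as separate inputs mirrors the suggestion in this thread to chain two programs rather than expect one program to return both.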

I am using connectedVertexProgram and am able to get node details in the ComputerResult, but not edge details. In the map-reduce stage in the SparkGraphComputer class, mapRDD, combineRDD and reduceRDD are created. I read reduceRDD and saw that the vertex object was of type ComputerVertex and had both vertex and edge details in it.
Finally, reduceRDD is written to memory:
mapReduce.addResultToMemory(finalMemory, outputRDD.writeMemoryRDD(graphComputerConfiguration, mapReduce.getMemoryKey(), reduceRDD));

writeMemoryRDD uses "SequenceFileOutputFormat.class" as the output format, which calls SequenceFile.class. I see that the vertex object has edge details up to SequenceFile.class. Up to this point the vertex object is of type ComputerVertex.

After that, the ComputerResult is returned:
return new DefaultComputerResult(InputOutputHelper.getOutputGraph(graphComputerConfiguration, this.resultGraph, this.persist), finalMemory.asImmutable());

But the ComputerResult object does not have edge details in the vertex object. I see that in ComputerResult the vertex object type has changed to DetachedVertex.

I think the edges are dropped while deserializing and converting the object to DetachedVertex, but I was not able to figure out where it is converted to a DetachedVertex object.

Thanks,
Anjani



HadoopMarc <bi...@...>
 

Hi Anjani,

OK, more explicitly, is this what you are looking for:

gremlin> g = TinkerFactory.createModern().traversal().withComputer()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], graphcomputer]
gremlin> g.V().
    connectedComponent().
        with(ConnectedComponent.propertyName, 'component').
    group().
        by('component').
        by(project('vertex', 'outedges').
               by(valueMap(true)).
               by(outE().valueMap(true).fold()).fold())
           
==>[1:[
    [vertex:[id:2,label:person,component:[1],name:[vadas],age:[27]],outedges:[]],
    [vertex:[id:1,label:person,component:[1],name:[marko],age:[29]],outedges:[[id:9,label:created,weight:0.4],[id:7,label:knows,weight:0.5],[id:8,label:knows,weight:1.0]]],
    [vertex:[id:3,label:software,component:[1],name:[lop],lang:[java]],outedges:[]],
    [vertex:[id:4,label:person,component:[1],name:[josh],age:[32]],outedges:[[id:10,label:created,weight:1.0],[id:11,label:created,weight:0.4]]],
    [vertex:[id:5,label:software,component:[1],name:[ripple],lang:[java]],outedges:[]],
    [vertex:[id:6,label:person,component:[1],name:[peter],age:[35]],outedges:[[id:12,label:created,weight:0.2]]]
]]
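
For comparison, the same grouping can be sketched outside of Gremlin. The following is a minimal plain-Java illustration, not TinkerPop or SparkGraphComputer code: it assumes the vertex ids and edges have already been read into memory, labels components with a union-find, and then attaches every edge to the component of its out-vertex. The names ComponentsWithEdges, Edge, Component and group are hypothetical, and vertices 100 and 101 with edge 200 are made up to show a second component; ids 1 to 6 and edges 7 to 12 follow the TinkerPop modern toy graph.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, for illustration only (not part of TinkerPop).
final class ComponentsWithEdges {

    /** Minimal edge: its id, label, and the ids of its out- and in-vertex. */
    static final class Edge {
        final int id; final String label; final int outV; final int inV;
        Edge(int id, String label, int outV, int inV) {
            this.id = id; this.label = label; this.outV = outV; this.inV = inV;
        }
        @Override public String toString() {
            return "e[" + id + "][" + outV + "-" + label + "->" + inV + "]";
        }
    }

    /** Vertex ids and edges that belong to one connected component. */
    static final class Component {
        final List<Integer> vertices = new ArrayList<>();
        final List<Edge> edges = new ArrayList<>();
    }

    // Union-find lookup with path halving; returns the root of v's set.
    static int find(Map<Integer, Integer> parent, int v) {
        int p = parent.get(v);
        while (p != v) {
            int gp = parent.get(p);
            parent.put(v, gp);
            v = gp;
            p = parent.get(v);
        }
        return v;
    }

    /**
     * Groups vertices and edges by connected component, ignoring edge direction.
     * Assumes every edge endpoint is listed in vertexIds.
     * The key of each component is the smallest vertex id it contains.
     */
    static Map<Integer, Component> group(List<Integer> vertexIds, List<Edge> edges) {
        Map<Integer, Integer> parent = new HashMap<>();
        for (int v : vertexIds) parent.put(v, v);
        for (Edge e : edges) {
            int a = find(parent, e.outV);
            int b = find(parent, e.inV);
            if (a != b) parent.put(Math.max(a, b), Math.min(a, b)); // smaller id stays root
        }
        Map<Integer, Component> result = new TreeMap<>();
        for (int v : vertexIds) {
            result.computeIfAbsent(find(parent, v), k -> new Component()).vertices.add(v);
        }
        for (Edge e : edges) {
            result.get(find(parent, e.outV)).edges.add(e); // each edge stored once
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> vertexIds = List.of(1, 2, 3, 4, 5, 6, 100, 101);
        List<Edge> edges = List.of(
                new Edge(7, "knows", 1, 2), new Edge(8, "knows", 1, 4),
                new Edge(9, "created", 1, 3), new Edge(10, "created", 4, 5),
                new Edge(11, "created", 4, 3), new Edge(12, "created", 6, 3),
                new Edge(200, "knows", 100, 101)); // made-up second component
        group(vertexIds, edges).forEach((key, c) ->
                System.out.println(key + ": vertices=" + c.vertices + " edges=" + c.edges));
    }
}
```

Each edge is stored once, under the component of its out-vertex, which mirrors the outE() projection in the traversal above.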

Best wishes,    Marc


