"anj...@gmail.com" <anjani...@...>
Hi All,
I am trying to get complete data from the graph (vertex details and edge details). We are using connectedVertexProgram with Spark 2.4 and are able to get vertex details but not edge details.
I see that in SparkGraphComputer, edges are dropped from the VertexWritable object before the reducer loop:
vertexWritable.get().dropEdges(Direction.BOTH);
After removing the above line, I am still not getting edge details. While debugging I found that edges are present in the vertex object up to the combine stage:
final JavaPairRDD combineRDD = mapReduce.doStage(MapReduce.Stage.COMBINE) ? SparkExecutor.executeCombine(mapRDD, newApacheConfiguration) : mapRDD;
But they are dropped in the reduce stage:
final JavaPairRDD reduceRDD = mapReduce.doStage(MapReduce.Stage.REDUCE) ? SparkExecutor.executeReduce(combineRDD, mapReduce, newApacheConfiguration) : combineRDD;
I see that the vertex object passed to the executeReduce() method has edge details. I noticed that the edge information is dropped from the vertex object during the groupBy in executeReduce().
I would appreciate any pointers or suggestions to fix it.
Thanks, Anjani
|
|
Evgeniy Ignatiev <yevgeniy...@...>
Hi Anjani,
What is the version of JanusGraph you are using?
Can you share some code and configuration to reproduce the issue?
Best regards,
Evgenii Ignatev.
--
You received this message because you are subscribed to the Google
Groups "JanusGraph users" group.
To unsubscribe from this group and stop receiving emails from it,
send an email to janusgra...@....
To view this discussion on the web visit https://groups.google.com/d/msgid/janusgraph-users/27f1da00-043c-4c07-8e69-f2dbeaddf14bn%40googlegroups.com.
|
|
Hi Anjani,
No time to look for this now myself, but I remember a similar issue in the TinkerPop JIRA. I also remember there was a workaround from Daniel Kuppitz by applying the CloneVertexProgram.
HTH, Marc
|
|
"anj...@gmail.com" <anjani...@...>
Hi Evgeniy, Thanks for the response. We are using JanusGraph 0.4.
I have not written any custom code; I am just using the available connectedVertexProgram and the TinkerPop library. I only commented out the line below in SparkGraphComputer in TinkerPop because it was dropping edges, but it still does not work: vertexWritable.get().dropEdges(Direction.BOTH);
Please let me know if you need more details.
Thank you. Anjani
|
|
"anj...@gmail.com" <anjani...@...>
Hi Marc, Thanks for response. I will check TinkerPop Jira to get details.
Thanks, Anjani
|
|
"anj...@gmail.com" <anjani...@...>
Hi Marc, I want to fetch connected vertices with vertex properties and edge details. CloneVertexProgram will provide the complete data, but I think it will not provide it as connected components. Please correct me if my understanding is wrong.
Thanks, Anjani
|
|
Abhay Pandit <abha...@...>
Yes, you are right, Anjani: CloneVertexProgram will provide the complete data but not the connected components.
Thanks, Abhay
|
|
Hi Anjani,
What I tried to convey is not to use CloneVertexProgram instead of the ConnectedVertexProgram, but rather to chain these two VertexPrograms. The related JIRA issue I referred to, including an example, is:
https://issues.apache.org/jira/browse/TINKERPOP-1306?jql=project%20%3D%20TINKERPOP%20AND%20text%20~%20%22CloneVertexProgram%22
HTH, Marc
|
|
"anj...@gmail.com" <anjani...@...>
Thanks Marc for sharing detail.
Regards, Anjani
|
|
"anj...@gmail.com" <anjani...@...>
Hi All,
Thanks for all your inputs. After some more analysis I found that in SparkGraphComputer (in the TinkerPop library), the vertex object has edge details in the RDD until we add the result to memory:
mapReduce.addResultToMemory(finalMemory, outputRDD.writeMemoryRDD(graphComputerConfiguration, mapReduce.getMemoryKey(), reduceRDD));
writeMemoryRDD uses the output format "SequenceFileOutputFormat.class", which calls SequenceFile.class. I see that the vertex object has edge details up to SequenceFile.class. Up to this point the vertex object is of type ComputerVertex.
But the computerResult object does not have edge details in the vertex object. I see that in ComputerResult the vertex object type is changed to DetachedVertex:
return new DefaultComputerResult(InputOutputHelper.getOutputGraph(graphComputerConfiguration, this.resultGraph, this.persist), finalMemory.asImmutable());
I think edges are getting dropped while deserialising and converting the object to DetachedVertex, but I was not able to figure out where it gets converted to a DetachedVertex object.
Below are the configs I am using:
gremlin.graph: org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader: org.janusgraph.hadoop.formats.cql.CqlInputFormat
gremlin.hadoop.graphWriter: org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat
gremlin.hadoop.defaultGraphComputer: org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer
spark.serializer: org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerializer
spark.kryo.registrator: org.janusgraph.hadoop.serialize.JanusGraphKryoRegistrator
I would appreciate any suggestion or pointer to debug the issue.
Thanks & Regards, Anjani
|
|
Hi Anjani,
Your original post started with: "I am trying get complete data from graph (vertex details and edge details). We are using connectedVertexProgram with Spark2.4, able to get vertex details but not edge details."
The connectedVertexProgram only adds a component property to the vertices; what does this have to do with "edge details"? Can you give an example of the output you want, in terms of the TinkerPop modern graph, like in the example in the docs: https://tinkerpop.apache.org/docs/current/reference/#connectedcomponent-step
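To make the point about the component property concrete, here is a minimal plain-Java sketch. It is a simplified model (repeated minimum-label propagation over an undirected view of the edges) and is not TinkerPop's actual implementation. The class name ComponentLabelSketch, the method label, and the isolated vertex 7 are assumptions made for illustration only; the six edges are those of the TinkerPop modern graph.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Simplified model of component labelling; NOT TinkerPop's implementation.
class ComponentLabelSketch {

    /**
     * Returns vertexId -> component label, where the label is the smallest
     * vertex id in the component. Every edge endpoint must be in 'vertices'.
     */
    static Map<Integer, Integer> label(Set<Integer> vertices, int[][] edges) {
        // Build an undirected adjacency list.
        Map<Integer, List<Integer>> adjacency = new HashMap<>();
        for (int v : vertices) {
            adjacency.put(v, new ArrayList<>());
        }
        for (int[] e : edges) {
            adjacency.get(e[0]).add(e[1]);
            adjacency.get(e[1]).add(e[0]);
        }

        // Each vertex starts in its own component.
        Map<Integer, Integer> component = new TreeMap<>();
        for (int v : vertices) {
            component.put(v, v);
        }

        // Repeatedly adopt the smallest label among neighbours until stable.
        boolean changed = true;
        while (changed) {
            changed = false;
            for (int v : vertices) {
                int best = component.get(v);
                for (int n : adjacency.get(v)) {
                    best = Math.min(best, component.get(n));
                }
                if (best < component.get(v)) {
                    component.put(v, best);
                    changed = true;
                }
            }
        }
        return component;
    }

    public static void main(String[] args) {
        // Vertices 1..6 as in the modern graph; 7 is a hypothetical isolated vertex.
        Set<Integer> vertices = new TreeSet<>(Arrays.asList(1, 2, 3, 4, 5, 6, 7));
        int[][] edges = { {1, 2}, {1, 4}, {1, 3}, {4, 5}, {4, 3}, {6, 3} };
        // prints {1=1, 2=1, 3=1, 4=1, 5=1, 6=1, 7=7}
        System.out.println(label(vertices, edges));
    }
}
```

The result is only a label per vertex. The edges are read during the computation but are not part of what comes out, which is why edge details have to be fetched in a separate step.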
I feel we might be looking in the wrong direction.
Best wishes, Marc
|
|
"anj...@gmail.com" <anjani...@...>
Hi Marc,
Thanks for the response. I want to fetch all connected components with their edge details from the graph. For example, if node A and node B are connected by an edge E1, then I want to create output like: { {node A attributes}, {node B attributes}, {edge E1 details} }
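The output shape described above can be sketched in plain Java, independent of TinkerPop and Spark. The sketch assumes a component label per vertex is already available (as a connected-component computation would produce) and groups vertices and edges by that label. All names here (ComponentExportSketch, Edge, Component, groupByComponent) are hypothetical, vertex attributes are left out for brevity, and vertex 7 is an invented isolated vertex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration only; none of these types come from TinkerPop.
class ComponentExportSketch {

    /** Hypothetical edge record: id, label, out-vertex id, in-vertex id. */
    static final class Edge {
        final int id;
        final String label;
        final int outV;
        final int inV;

        Edge(int id, String label, int outV, int inV) {
            this.id = id;
            this.label = label;
            this.outV = outV;
            this.inV = inV;
        }

        @Override
        public String toString() {
            return outV + "-" + label + "[" + id + "]->" + inV;
        }
    }

    /** Hypothetical container: the vertices and edges of one component. */
    static final class Component {
        final List<Integer> vertices = new ArrayList<>();
        final List<Edge> edges = new ArrayList<>();
    }

    /** Groups vertices and edges by the component label of each vertex. */
    static Map<Integer, Component> groupByComponent(Map<Integer, Integer> componentOf,
                                                    List<Edge> edges) {
        Map<Integer, Component> result = new TreeMap<>();
        for (Map.Entry<Integer, Integer> entry : componentOf.entrySet()) {
            result.computeIfAbsent(entry.getValue(), k -> new Component())
                  .vertices.add(entry.getKey());
        }
        for (Edge edge : edges) {
            // Both endpoints of an edge share a component, so the out-vertex decides.
            Integer component = componentOf.get(edge.outV);
            if (component != null) {
                result.get(component).edges.add(edge);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> componentOf = new TreeMap<>();
        for (int v = 1; v <= 6; v++) {
            componentOf.put(v, 1);
        }
        componentOf.put(7, 7); // hypothetical isolated vertex

        List<Edge> edges = new ArrayList<>();
        edges.add(new Edge(7, "knows", 1, 2));
        edges.add(new Edge(8, "knows", 1, 4));
        edges.add(new Edge(9, "created", 1, 3));
        edges.add(new Edge(10, "created", 4, 5));
        edges.add(new Edge(11, "created", 4, 3));
        edges.add(new Edge(12, "created", 6, 3));

        groupByComponent(componentOf, edges).forEach((id, c) ->
            System.out.println(id + ": vertices=" + c.vertices + " edges=" + c.edges));
    }
}
```

Keeping the component labelling and the edge lookup as two separate steps means the vertex program only has to produce a label, and the edge details can be joined in afterwards from whatever source still holds them.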
I am using connectedVertexProgram and am able to get node details in computerResult but not edge details. In the map-reduce stage in the SparkGraphComputer class, we create mapRDD, combineRDD and reduceRDD. I tried to read reduceRDD and saw that the vertex object was of type ComputerVertex and had vertex and edge details in it. Finally reduceRDD is written to memory:
mapReduce.addResultToMemory(finalMemory, outputRDD.writeMemoryRDD(graphComputerConfiguration, mapReduce.getMemoryKey(), reduceRDD));
writeMemoryRDD uses the output format "SequenceFileOutputFormat.class", which calls SequenceFile.class. I see that the vertex object has edge details up to SequenceFile.class. Up to this point the vertex object is of type ComputerVertex.
After that computerResult is returned:
return new DefaultComputerResult(InputOutputHelper.getOutputGraph(graphComputerConfiguration, this.resultGraph, this.persist), finalMemory.asImmutable());
But the computerResult object does not have edge details in the vertex object. I see that in ComputerResult the vertex object type is changed to DetachedVertex.
I think edges are getting dropped while deserialising and converting the object to DetachedVertex, but I was not able to figure out where it gets converted to a DetachedVertex object.
Thanks, Anjani
|
|
Hi Anjani,
OK, more explicitly, is this what you are looking for:
gremlin> g = TinkerFactory.createModern().traversal().withComputer()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], graphcomputer]
gremlin> g.V().
           connectedComponent().
             with(ConnectedComponent.propertyName, 'component').
           group().
             by('component').
             by(project('vertex', 'outedges').
                  by(valueMap(true)).
                  by(outE().valueMap(true).fold()).fold())
==>[1:[
     [vertex:[id:2,label:person,component:[1],name:[vadas],age:[27]],outedges:[]],
     [vertex:[id:1,label:person,component:[1],name:[marko],age:[29]],outedges:[[id:9,label:created,weight:0.4],[id:7,label:knows,weight:0.5],[id:8,label:knows,weight:1.0]]],
     [vertex:[id:3,label:software,component:[1],name:[lop],lang:[java]],outedges:[]],
     [vertex:[id:4,label:person,component:[1],name:[josh],age:[32]],outedges:[[id:10,label:created,weight:1.0],[id:11,label:created,weight:0.4]]],
     [vertex:[id:5,label:software,component:[1],name:[ripple],lang:[java]],outedges:[]],
     [vertex:[id:6,label:person,component:[1],name:[peter],age:[35]],outedges:[[id:12,label:created,weight:0.2]]]
   ]]
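To make the shape of that grouped result concrete without TinkerPop on the classpath, here is a rough plain-Java sketch. It is only an illustration: it swaps in a simple union-find instead of the vertex program that TinkerPop actually runs, the names ComponentGrouping, Edge, groupByComponent and modernEdges are made up for this example, and it assumes the component id is the smallest vertex id in the component (which matches the component value 1 in the output above). The edge endpoints are those of the TinkerPop modern toy graph.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of the grouped result shape; no TinkerPop classes are used.
class ComponentGrouping {

    // Hypothetical edge holder: edge id, label, out (source) vertex id, in (target) vertex id.
    static final class Edge {
        final int id; final String label; final int out; final int in;
        Edge(int id, String label, int out, int in) {
            this.id = id; this.label = label; this.out = out; this.in = in;
        }
        @Override public String toString() { return "e[" + id + "][" + out + "-" + label + "->" + in + "]"; }
    }

    // Union-find root lookup with path halving.
    static int find(Map<Integer, Integer> parent, int v) {
        while (parent.get(v) != v) {
            parent.put(v, parent.get(parent.get(v)));
            v = parent.get(v);
        }
        return v;
    }

    // Returns component id -> (vertex id -> outgoing edges of that vertex).
    // Assumptions: every edge endpoint is listed in `vertices`, and the
    // component id is the smallest vertex id in the component.
    static Map<Integer, Map<Integer, List<Edge>>> groupByComponent(List<Integer> vertices, List<Edge> edges) {
        Map<Integer, Integer> parent = new TreeMap<>();
        for (int v : vertices) parent.put(v, v);
        for (Edge e : edges) {
            int a = find(parent, e.out);
            int b = find(parent, e.in);
            // Keep the smaller id as root so the root is always the component's minimum id.
            if (a != b) parent.put(Math.max(a, b), Math.min(a, b));
        }
        Map<Integer, Map<Integer, List<Edge>>> result = new TreeMap<>();
        for (int v : vertices) {
            result.computeIfAbsent(find(parent, v), k -> new TreeMap<>()).put(v, new ArrayList<>());
        }
        // Attach each edge to its out-vertex, mirroring the 'outedges' projection.
        for (Edge e : edges) {
            result.get(find(parent, e.out)).get(e.out).add(e);
        }
        return result;
    }

    // Edge ids, labels and endpoints of the TinkerPop "modern" toy graph.
    static List<Edge> modernEdges() {
        return List.of(
            new Edge(7, "knows", 1, 2), new Edge(8, "knows", 1, 4), new Edge(9, "created", 1, 3),
            new Edge(10, "created", 4, 5), new Edge(11, "created", 4, 3), new Edge(12, "created", 6, 3));
    }

    public static void main(String[] args) {
        Map<Integer, Map<Integer, List<Edge>>> grouped =
            groupByComponent(List.of(1, 2, 3, 4, 5, 6), modernEdges());
        grouped.forEach((component, members) ->
            members.forEach((vertex, outEdges) ->
                System.out.println("component=" + component + " vertex=" + vertex + " outedges=" + outEdges)));
    }
}
```

Because every edge is attached to its out-vertex only, each edge shows up exactly once per component, which is the same reason the traversal above projects outE() rather than bothE().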
Best wishes, Marc
On Friday, 18 September 2020 at 08:43:52 UTC+2 anj...@... wrote:
Hi Marc,
Thanks for the response. I want to fetch all connected components with their edge details from the graph. For example, if node A and node B are connected by an edge E1, then I want to create output like: { {node A attributes}, {node B attributes}, {edge E1 details} }
I am using connectedVertexProgram and am able to get node details in the computerResult but not edge details. In the map-reduce stage of the SparkGraphComputer class, mapRDD, combineRDD and reduceRDD are created. I read reduceRDD and saw that the vertex object was of type ComputerVertex and had vertex and edge details in it. Finally reduceRDD is written to memory: mapReduce.addResultToMemory(finalMemory, outputRDD.writeMemoryRDD(graphComputerConfiguration, mapReduce.getMemoryKey(), reduceRDD));
writeMemoryRDD uses "SequenceFileOutputFormat.class" as the output format, which calls SequenceFile.class. The vertex object still has edge details up to SequenceFile.class, and up to this point it is of type ComputerVertex.
After that the computerResult is returned: return new DefaultComputerResult(InputOutputHelper.getOutputGraph(graphComputerConfiguration, this.resultGraph, this.persist), finalMemory.asImmutable()); But the computerResult object does not have edge details in the vertex object. In the ComputerResult the vertex object type has changed to DetachedVertex.
I think the edges are dropped while deserializing and converting the object to DetachedVertex, but I was not able to figure out where it is converted to a DetachedVertex object.
Thanks, Anjani
On Friday, 18 September 2020 at 11:49:51 UTC+5:30 HadoopMarc wrote:
Hi Anjani,
Your original post started with: " I am trying get complete data from graph (vertex details and edge details). We are using connectedVertexProgram with Spark2.4, able to get vertex details but not edge details. "
The connectedVertexProgram only adds a component property to the vertices; what does this have to do with "edge details"? Can you give an example of the output you want, in terms of the TinkerPop modern graph, like in the example in the docs:
I feel we might be looking in the wrong direction.
Best wishes, Marc
On Thursday, 17 September 2020 at 17:24:22 UTC+2 anj...@... wrote: Hi All,
Thanks for all your inputs. After some more analysis I found that in SparkGraphComputer (in the TinkerPop library), the vertex object has edge details in the RDD until the result is added to memory: mapReduce.addResultToMemory(finalMemory, outputRDD.writeMemoryRDD(graphComputerConfiguration, mapReduce.getMemoryKey(), reduceRDD)); writeMemoryRDD uses "SequenceFileOutputFormat.class" as the output format, which calls SequenceFile.class. The vertex object still has edge details up to SequenceFile.class, and up to this point it is of type ComputerVertex.
But the computerResult object does not have edge details in the vertex object. In the ComputerResult the vertex object type has changed to DetachedVertex:
return new DefaultComputerResult(InputOutputHelper.getOutputGraph(graphComputerConfiguration, this.resultGraph, this.persist), finalMemory.asImmutable());
I think the edges are dropped while deserializing and converting the object to DetachedVertex, but I was not able to figure out where it is converted to a DetachedVertex object. Below are the configs I am using:
gremlin.graph: org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader: org.janusgraph.hadoop.formats.cql.CqlInputFormat
gremlin.hadoop.graphWriter: org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat
gremlin.hadoop.defaultGraphComputer: org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer
spark.serializer: org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerializer
spark.kryo.registrator: org.janusgraph.hadoop.serialize.JanusGraphKryoRegistrator
Appreciate any suggestion/pointer to debug the issue. Thanks & Regards, Anjani
|
|