Re: JanusGraph meetup topic discussion - graph OLAP & algorithms
hadoopmarc@...
Hi Ted,
Saw these two interesting threads on the dev list the other day:
https://lists.lfaidata.foundation/g/janusgraph-dev/topic/performance_optimization/80653320
https://lists.lfaidata.foundation/g/janusgraph-dev/topic/performance_issue_large/80821002
Apparently, the people at Zeotap run analytics on JanusGraph at massive scale by having many Spark executors connect to JanusGraph individually (skipping SparkGraphComputer/HadoopGraph). It would be interesting to have them at the meetup and hear what kind of analytic queries they run, in particular:
- how do they access the table with JanusGraph ids?
- how do they aggregate the results of individual Spark partitions into the end result of the Gremlin query?
- how do they retrieve vertex data for steps 2, 3, ... of the traversal (Spark shuffle vs. each executor retrieving additional vertex data from JanusGraph)?
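
To make the first two questions concrete, here is a minimal sketch of the general pattern in plain Python, without Spark or JanusGraph. It is not their implementation, which is exactly what I would like to hear about. Everything in it is an assumption for illustration: TOY_GRAPH is a hypothetical in-memory stand-in for a JanusGraph instance, and split_ids, process_partition and run are made-up helper names. Each "partition" handles its own slice of vertex ids, computes a partial result for a one-hop traversal, and the partial results are then merged into the final answer.

```python
from collections import Counter
from functools import reduce

# Hypothetical stand-in for JanusGraph: vertex id -> ids of out-neighbours.
# In a real job, each executor would open its own connection instead.
TOY_GRAPH = {1: [2, 3], 2: [3], 3: [1], 4: [1, 3]}


def split_ids(vertex_ids, num_partitions):
    """Distribute vertex ids round-robin over the partitions."""
    return [vertex_ids[i::num_partitions] for i in range(num_partitions)]


def process_partition(ids, graph):
    """Work of one executor: one-hop traversal from its own ids only.

    Returns a partial count of how often each neighbour was reached.
    """
    partial = Counter()
    for vid in ids:
        for neighbour in graph.get(vid, []):
            partial[neighbour] += 1
    return partial


def run(graph, num_partitions=2):
    """Run all partitions and merge their partial counts into one result."""
    partitions = split_ids(sorted(graph), num_partitions)
    partials = [process_partition(p, graph) for p in partitions]
    # Counter addition is associative and commutative, so the merge order
    # (and hence the partitioning) does not change the result.
    return reduce(lambda a, b: a + b, partials, Counter())


if __name__ == "__main__":
    print(dict(run(TOY_GRAPH)))
```

This only works this simply because a count can be merged in any order. The third question is what happens once a second hop is needed: the neighbours found by one partition are generally not in that partition's own slice of ids.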