Re: Query Optimisation

Vinayak Bali

Hi Marc, 

After changing as() to aggregate() and select() to project(), this query takes 18 seconds to run. Still, 99% of the time is spent computing the union step. There is no memory issue; memory is already set to 8g.

g.inject(1).union(V().has('property1', 'vertex1').aggregate('v1').union(outE().has('property1', 'edge1').aggregate('e').inV().has('property1', 'vertex1'),outE().has('property1', 'edge2').aggregate('e').inV().has('property1', 'vertex2')).aggregate('v2'),V().has('property1', 'vertex3').aggregate('v1').union(outE().has('property1', 'edge3').aggregate('e').inV().has('property1', 'vertex2'),outE().has('property1', 'Component_Of').aggregate('e').inV().has('property1', 'vertex1')).aggregate('v2')).limit(100).project('v1','e','v2').by(valueMap().by(unfold()))

Also, splitting the inner union step into separate ones has the same effect.

Thanks & Regards,

On Mon, May 10, 2021 at 11:45 AM <hadoopmarc@...> wrote:
Hi Vinayak,

Your last remark explains it well: it seems that in JanusGraph a union of multiple clauses can take much longer than the sum of the individual clauses. There are still two things that we have not ruled out:

  • the repetition of as('v1') is unusual. Can you try what happens if you use the aggregate('v1')...cap('v1', 'e', 'v2') mechanism instead? Or, simpler, what happens if you use neither the as() nor the aggregate() steps, omitting the formatting of the output?
  • are you sure there are no memory constraints, even though this seems unlikely given the limit(100) steps applied? You can check by increasing the memory for the Gremlin Console:
    export JAVA_OPTIONS="-Xmx4g"
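
A minimal sketch of the two variants from the first point, reduced to a single branch to keep it short. The property names and values are copied from the query in this thread. The trailing profile() in the second variant is an assumption about how the timings are measured, and neither traversal has been run against the actual graph:

```groovy
// Variant 1: collect side effects with aggregate() and read them back with cap().
// cap() with several keys emits one map holding the 'v1', 'e' and 'v2' collections.
g.V().has('property1', 'vertex1').aggregate('v1').
  outE().has('property1', 'edge1').aggregate('e').
  inV().has('property1', 'vertex1').aggregate('v2').
  limit(100).
  cap('v1', 'e', 'v2')

// Variant 2: no as()/aggregate() and no output formatting, only the bare union.
// profile() is added here (assumption) to show where the time goes per step.
g.V().has('property1', 'vertex1').
  union(
    outE().has('property1', 'edge1').inV().has('property1', 'vertex1'),
    outE().has('property1', 'edge2').inV().has('property1', 'vertex2')).
  limit(100).
  profile()
```

If the second variant is fast, the cost probably sits in the side-effect or formatting steps rather than in union() itself.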
Best wishes,    Marc
