Re: Count Query Optimization


Vinayak Bali
 

Hi Amiya,

With dedup:
g.V().has('property1', 'A').
   outE().has('property1','E').
       where(inV().has('property1', 'B')). fold().
   project('edgeCount', 'vertexCount').
            by(count(local)).
            by(unfold().bothV().dedup().count())
Output: ==>[edgeCount:200166,vertexCount:34693]

Without dedup:
g.V().has('property1', 'A').
   outE().has('property1','E').
       where(inV().has('property1', 'B')). fold().
   project('edgeCount', 'vertexCount').
            by(count(local)).
            by(unfold().bothV().count())
Output: ==>[edgeCount:200166,vertexCount:400332]

Both queries take approximately 3 seconds to run.

Query:
g.V().has('property1', 'A').aggregate('v').
   outE().has('property1','E').aggregate('e').
   inV().has('property1', 'B').aggregate('v').
   select('v').dedup().as('vetexCount').
   select('e').dedup().as('edgeCount').
   select('vetexCount','edgeCount').
            by(unfold().count())
Output: ==>[vetexCount:383633,edgeCount:200166]
Time: 3.5 mins

The edge count is the same for all three queries, but the vertex count differs. Which one is the right vertex count?
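
One observation on the numbers above: the count without dedup, 400332, is exactly 2 x 200166. That is what bothV() would be expected to give, since it emits both endpoints of every edge. The following is a minimal Python sketch on a made-up four-edge graph (the vertex names and edge list are hypothetical, not taken from the real data) showing how the two counts diverge:

```python
# Toy model: each edge is an (out_vertex, in_vertex) pair that has already
# passed the A -[E]-> B filter. The vertex names are invented for illustration.
edges = [
    ("a1", "b1"),
    ("a1", "b2"),
    ("a2", "b1"),
    ("a2", "b2"),
]

edge_count = len(edges)

# bothV() emits the out-vertex and the in-vertex of every edge,
# so a vertex shared by several edges appears several times.
both_v = [vertex for edge in edges for vertex in edge]

count_without_dedup = len(both_v)     # always 2 * edge_count
count_with_dedup = len(set(both_v))   # each distinct vertex counted once

print(edge_count, count_without_dedup, count_with_dedup)
```

If the goal is the number of distinct vertices touched by the matching edges, the dedup variant is the one that counts each vertex only once.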

Thanks & Regards,
Vinayak


On Thu, Mar 18, 2021 at 11:18 AM AMIYA KUMAR SAHOO <amiyakr.sahoo91@...> wrote:
Hi Vinayak,

Maybe try the query below.

g.V().has('property1', 'A').
   outE().has('property1','E').
       where(inV().has('property1', 'B')). fold().
   project('edgeCount', 'vertexCount').
            by(count(local)).
            by(unfold().bothV().dedup().count())    // I do not think dedup is required for your use case; you can try both with and without it

Regards, Amiya
