Gremlin Query to return count for nodes and edges


Vinayak Bali
 

Hi All,

I want to return the count of nodes and edges returned by a query. I tried a few queries, but they are not working. Can someone please share a single query that returns both counts?

Thanks & Regards,
Vinayak  



hadoopmarc@...
 

Hi Vinayak,

Try:

g.V().project('v', 'vcount', 'ecount').by(identity()).by(count()).by(bothE().count())

Best wishes,    Marc


Vinayak Bali
 

Hi Marc,

I am using the following query to return the results:
g.V().hasLabel('A').as('a').outE().as('e').inV().as('b').select('a','e','b').by(valueMap().by(unfold()))
I want the count of unique nodes in a and b together, and the count of e, i.e. the number of edges.
Please modify this query to get the required output.

Thanks & Regards,
Vinayak

On Tue, Feb 23, 2021 at 1:08 PM <hadoopmarc@...> wrote:


Graham Wallis
 

Hi Vinayak

You could do this:

g.V().hasLabel('A').as('a').outE().as('e').inV().as('b').select('a','e','b').by(count())

That should produce something like:

==>{a=1, e=1, b=1}

Best regards,
 Graham

Graham Wallis
IBM Open Software
Internet: graham_wallis@...    
IBM, Hursley Park, Hursley, Hampshire SO21 2JN







From:        "Vinayak Bali" <vinayakbali16@...>
To:        janusgraph-users@...
Date:        23/02/2021 09:11
Subject:        [EXTERNAL] Re: [janusgraph-users] Gremlin Query to return count for nodes and edges
Sent by:        janusgraph-users@...









Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU


Vinayak Bali
 

Hi Graham,

I tried it; the output is as follows:
[{"v1":1,"e":1,"v2":1},{"v1":1,"e":1,"v2":1},{"v1":1,"e":1,"v2":1}, ...
(the same {"v1":1,"e":1,"v2":1} entry repeats once per result row)

I want a count like {v1: 20, e: 60, v2: 10} or {v: 30, e: 60}.

Thanks & Regards,
Vinayak
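
A plain-Python sketch (not Gremlin) of why a per-row count comes back as all 1s, and what aggregated counts look like instead. The rows are made-up (a, e, b) triples standing in for the select('a','e','b') results; the ids are hypothetical and do not come from the thread.

```python
# Hypothetical (a, e, b) rows: source vertex id, edge id, target vertex id.
rows = [
    ("v1", "e1", "v2"),
    ("v1", "e2", "v3"),
    ("v2", "e3", "v3"),
]

# Counting inside each row sees exactly one a, one e and one b,
# so every row reports 1 for each key.
per_row = [{"a": len([a]), "e": len([e]), "b": len([b])} for a, e, b in rows]

# Collecting ids across all rows first, then counting, gives the totals.
unique_nodes = {a for a, _, _ in rows} | {b for _, _, b in rows}
unique_edges = {e for _, e, _ in rows}
totals = {"v": len(unique_nodes), "e": len(unique_edges)}

print(per_row)
print(totals)
```

The point of the sketch is only the order of operations: the ids have to be gathered over the whole result before counting, which is what a query-side aggregation would need to do.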


On Tue, Feb 23, 2021 at 3:00 PM Graham Wallis <graham_wallis@...> wrote:


hadoopmarc@...
 

Hi Vinayak,

A new attempt:

g = TinkerFactory.createModern().traversal()   
g.withSideEffect('vs', new HashSet()).withSideEffect('es', new HashSet()).
    V(1,2).aggregate('vs').outE().aggregate('es').inV().aggregate('vs').cap('vs', 'es').
    project('vs', 'es').
    by(select('vs').unfold().count()).
    by(select('es').unfold().count())
==>[vs:4,es:3]
This still looks clunky to me, so I challenge other readers to get rid of the project().by(select()) construct.

Best wishes,    Marc
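
The set-based counting in the query above can be modelled in plain Python. This is a sketch, not Gremlin: it hard-codes the edges of TinkerPop's "modern" toy graph, and the names MODERN_EDGES and count_vertices_and_edges are made up for illustration.

```python
# Edges of TinkerPop's "modern" toy graph: edge id -> (out vertex id, in vertex id).
MODERN_EDGES = {
    7: (1, 2), 8: (1, 4), 9: (1, 3),
    10: (4, 5), 11: (4, 3), 12: (6, 3),
}


def count_vertices_and_edges(start_ids):
    """Mimic V(ids).aggregate('vs').outE().aggregate('es').inV().aggregate('vs')."""
    vs = set(start_ids)          # start vertices go into 'vs'
    es = set()
    for edge_id, (out_v, in_v) in MODERN_EDGES.items():
        if out_v in start_ids:   # outE() of a start vertex
            es.add(edge_id)
            vs.add(in_v)         # inV() of that edge also goes into 'vs'
    return {"vs": len(vs), "es": len(es)}


result = count_vertices_and_edges({1, 2})
print(result)  # {'vs': 4, 'es': 3}
```

Because both collections are sets, a vertex reached more than once (vertex 2 here, which is both a start vertex and a target) is counted only once, which matches the [vs:4,es:3] result shown above.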


cmilowka
 

This may work as well, to count the totals of all in and out edges for the 'A' label:

g.V().hasLabel('A').union( __.count(), __. outE().count(), __.inV().count() )

 


Graham Wallis
 

Good query from @hadoopmarc and I like @cmilowka's suggestion, although I needed to modify it very slightly as follows:

g.V().hasLabel('A').union( __.count(), __.outE().count(), __.outE().inV().count() )

That has to be the shortest and neatest solution. Certainly far better than my rather basic effort below, which surely gets the prize for the longest solution :-)

g.V().hasLabel('A').aggregate('a').outE().aggregate('e').inV().aggregate('b').select('a').dedup().as('as').select('e').dedup().as('es').select('b').dedup().as('bs').select('as','es','bs').by(unfold().count())


Best regards,
 Graham

Graham Wallis
IBM Open Software
Internet: graham_wallis@...    
IBM, Hursley Park, Hursley, Hampshire SO21 2JN
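
One thing to keep in mind with the union() form is that its third branch counts one target vertex per edge, so a vertex reached by several edges is counted several times. A plain-Python sketch of that difference, using TinkerPop's "modern" toy graph with the 'person' label standing in for 'A' (an assumption for illustration; all variable names are made up):

```python
# "Modern" toy graph: 'person' vertices and edge id -> (out vertex id, in vertex id).
PERSONS = {1, 2, 4, 6}
EDGES = {
    7: (1, 2), 8: (1, 4), 9: (1, 3),
    10: (4, 5), 11: (4, 3), 12: (6, 3),
}

# Out-edges of the person vertices, as (edge id, target vertex id) pairs.
out_edges = [(eid, in_v) for eid, (out_v, in_v) in EDGES.items() if out_v in PERSONS]

# What union(count(), outE().count(), outE().inV().count()) tallies:
# start vertices, out-edges, and one target per out-edge (duplicates included).
union_counts = [len(PERSONS), len(out_edges), len([in_v for _, in_v in out_edges])]

# Distinct counts: unique vertices over sources and targets together, unique edges.
distinct_counts = {
    "v": len(PERSONS | {in_v for _, in_v in out_edges}),
    "e": len({eid for eid, _ in out_edges}),
}

print(union_counts)     # [4, 6, 6]
print(distinct_counts)  # {'v': 6, 'e': 6}
```

Here the six out-edges land on only four distinct targets, so the third union() number (6) is an edge-end count rather than a unique-vertex count.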







From:        "cmilowka" <cmilowka@...>
To:        janusgraph-users@...
Date:        23/02/2021 22:49
Subject:        [EXTERNAL] Re: [janusgraph-users] Gremlin Query to return count for nodes and edges
Sent by:        janusgraph-users@...









Vinayak Bali
 

Hi All,

The query shared by HadoopMarc works. The query I executed returns a count of 752650 nodes and 297302 edges. It takes around 1 minute. Is there any way to optimize it further?
Thank You, Marc, and all others for your help. 

Thanks & Regards,
Vinayak

On Wed, Feb 24, 2021 at 2:32 PM Graham Wallis <graham_wallis@...> wrote:


hadoopmarc@...
 

Hi Vinayak,

Speeding up your query depends on your setup. 15,000 vertices/second is already fast. Is this the JanusGraph inmemory backend? Or ScyllaDB?

In a perfect world (we are not there yet), your query would profit from parallelization (OLAP). JanusGraph supports both the withComputer() and withComputer(SparkGraphComputer) start steps, but the former is undocumented and the performance gains of the latter are often disappointing.

Best wishes,    Marc
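
For reference, the throughput estimate can be reproduced from the numbers reported earlier in the thread. This is a rough sketch that assumes "around 1min" means exactly 60 seconds; the variable names are made up.

```python
# Counts reported in the thread.
vertices = 752_650
edges = 297_302

# Assumption: "around 1min" is taken as exactly 60 seconds.
seconds = 60

vertices_per_second = vertices / seconds
elements_per_second = (vertices + edges) / seconds

print(round(vertices_per_second))  # 12544
print(round(elements_per_second))  # 17499
```

Both figures are in the same range as the roughly 15,000 per second quoted above.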


Vinayak Bali
 

Hi Marc,

The backend used is Cassandra. I was wondering whether we can load the data from Cassandra into the in-memory backend to speed up the process.
I tried OLAP by configuring Hadoop and Spark, following the references in the documentation. A simple query to retrieve 1 node from the graph took around 5 minutes.
Based on your experience, could you please share the steps to follow to solve this issue?

Thanks & Regards,
Vinayak

On Wed, Feb 24, 2021 at 9:32 PM <hadoopmarc@...> wrote: