Hi All,
I need to select multiple nodes and edges and display the content in v1 - e - v2 format. The query generated is as follows:
g.V().union(has('title', 'V1').as('v1').outE().hasLabel('E1').as('e').inV().has('title', 'V2').as('v2'),has('title', 'V2').as('v1').union(outE().hasLabel('E2').as('e').inV().has('title', 'V2'),outE().hasLabel('E3').as('e').inV().has('title', 'V3')).as('v2')).select('v1','e','v2').by(valueMap().by(unfold()))
It throws the warning: 05:20:21 WARN org.janusgraph.graphdb.transaction.StandardJanusGraphTx - Query requires iterating over all vertices [()]. For better performance, use indexes
How can we optimize the query may be without a union step?
Thanks & Regards, Vinayak
|
|

cmilowka
I guess, building composite index for 'title' property will do the job of accessing title(V1) and title(V2) fast, without full scan of DB as currently does.
cheers, CM
|
|
Hi, cmilowka,
The property title has a composite index created on it. Further modified the query as follows:
g.V().has('title',within('V1','V2')).union(has('title', 'V1').as('v1').outE().hasLabel('E1').as('e').inV().has('title', 'V2').as('v2'),has('title', 'V2').as('v1').union(outE().hasLabel('E2').as('e').inV().has('title', 'V2'),outE().hasLabel('E3').as('e').inV().has('title', 'V3')).as('v2')).select('v1','e','v2').by(valueMap().by(unfold()))
the only change is adding has('title',within('V1','V2')) at start of query. The warning is gone now and performance is also improved. Earlier the time taken was around 3.5 mins now it's 55 sec to return only 44 records. The problem is my source changes, need to consider it. For example: v1 - e1 -v2 v3 -e2 -v4 Want the want in a single query. now the query for this will be as follows:
g.V().has('title',within('V1','V3')).union(has('title','V1').as('v1').outE().has('title','E1').as('e').inV().has('title','V2').as('v2'),has('title','V3').as('v1').outE().has('title','E2').as('e').inV().has('title','V4').as('v2')).select('v1','e','v2').by(valueMap().by(unfold()))
Request all of you to provide your feedback to improve it further.
Thanks & Regards, Vinayak
toggle quoted message
Show quoted text
On Thu, Apr 22, 2021 at 5:14 AM cmilowka < cmilowka@...> wrote: I guess, building composite index for 'title' property will do the job of accessing title(V1) and title(V2) fast, without full scan of DB as currently does.
cheers, CM
|
|
Hi Vinayak,
You can try below query, it can use index and combine as many traversals you want.
g.inject(1). union ( V().has('title', 'V1'). outE().hasLabel('E1').inV().has('title', 'V2'), V().has('title', 'V3'). outE().hasLabel('E3').inV().has('title', 'V4'))....
Regards, Amiya
toggle quoted message
Show quoted text
Hi, cmilowka,
The property title has a composite index created on it. Further modified the query as follows:
g.V().has('title',within('V1','V2')).union(has('title', 'V1').as('v1').outE().hasLabel('E1').as('e').inV().has('title', 'V2').as('v2'),has('title', 'V2').as('v1').union(outE().hasLabel('E2').as('e').inV().has('title', 'V2'),outE().hasLabel('E3').as('e').inV().has('title', 'V3')).as('v2')).select('v1','e','v2').by(valueMap().by(unfold()))
the only change is adding has('title',within('V1','V2')) at start of query. The warning is gone now and performance is also improved. Earlier the time taken was around 3.5 mins now it's 55 sec to return only 44 records. The problem is my source changes, need to consider it. For example: v1 - e1 -v2 v3 -e2 -v4 Want the want in a single query. now the query for this will be as follows:
g.V().has('title',within('V1','V3')).union(has('title','V1').as('v1').outE().has('title','E1').as('e').inV().has('title','V2').as('v2'),has('title','V3').as('v1').outE().has('title','E2').as('e').inV().has('title','V4').as('v2')).select('v1','e','v2').by(valueMap().by(unfold()))
Request all of you to provide your feedback to improve it further.
Thanks & Regards, Vinayak
On Thu, Apr 22, 2021 at 5:14 AM cmilowka < cmilowka@...> wrote: I guess, building composite index for 'title' property will do the job of accessing title(V1) and title(V2) fast, without full scan of DB as currently does.
cheers, CM
|
|
Hi Amiya,
Thank you for the query. It also increased the performance. But, it's still 35 seconds. Is there any other way to optimize it further, there are only 10 records returned by the query. Counts are as follows: V1: 187K V2:40 V3: 50 V4: 447K
Thanks & Regards, Vinayak
toggle quoted message
Show quoted text
Hi Vinayak,
You can try below query, it can use index and combine as many traversals you want.
g.inject(1). union ( V().has('title', 'V1'). outE().hasLabel('E1').inV().has('title', 'V2'), V().has('title', 'V3'). outE().hasLabel('E3').inV().has('title', 'V4'))....
Regards, Amiya
Hi, cmilowka,
The property title has a composite index created on it. Further modified the query as follows:
g.V().has('title',within('V1','V2')).union(has('title', 'V1').as('v1').outE().hasLabel('E1').as('e').inV().has('title', 'V2').as('v2'),has('title', 'V2').as('v1').union(outE().hasLabel('E2').as('e').inV().has('title', 'V2'),outE().hasLabel('E3').as('e').inV().has('title', 'V3')).as('v2')).select('v1','e','v2').by(valueMap().by(unfold()))
the only change is adding has('title',within('V1','V2')) at start of query. The warning is gone now and performance is also improved. Earlier the time taken was around 3.5 mins now it's 55 sec to return only 44 records. The problem is my source changes, need to consider it. For example: v1 - e1 -v2 v3 -e2 -v4 Want the want in a single query. now the query for this will be as follows:
g.V().has('title',within('V1','V3')).union(has('title','V1').as('v1').outE().has('title','E1').as('e').inV().has('title','V2').as('v2'),has('title','V3').as('v1').outE().has('title','E2').as('e').inV().has('title','V4').as('v2')).select('v1','e','v2').by(valueMap().by(unfold()))
Request all of you to provide your feedback to improve it further.
Thanks & Regards, Vinayak
On Thu, Apr 22, 2021 at 5:14 AM cmilowka < cmilowka@...> wrote: I guess, building composite index for 'title' property will do the job of accessing title(V1) and title(V2) fast, without full scan of DB as currently does.
cheers, CM
|
|
Hi Vinayak,
I am not sure how to improve this query further through gremlin.
Query can be faster through data model. VCI will be helpful, if you are applying any other filter along with hasLabel and your edge selectivity is low compare to the total degree of those vertex.
If this query is very frequent and there's a need to improve it further, you can make inV title property be part of edge and VCI can be enabled on that edge property.
Other than that not sure if any configuration can be done to further improve. Someone else might comment on this front.
Regards, Amiya
toggle quoted message
Show quoted text
Hi Amiya,
Thank you for the query. It also increased the performance. But, it's still 35 seconds. Is there any other way to optimize it further, there are only 10 records returned by the query. Counts are as follows: V1: 187K V2:40 V3: 50 V4: 447K
Thanks & Regards, Vinayak
Hi Vinayak,
You can try below query, it can use index and combine as many traversals you want.
g.inject(1). union ( V().has('title', 'V1'). outE().hasLabel('E1').inV().has('title', 'V2'), V().has('title', 'V3'). outE().hasLabel('E3').inV().has('title', 'V4'))....
Regards, Amiya
Hi, cmilowka,
The property title has a composite index created on it. Further modified the query as follows:
g.V().has('title',within('V1','V2')).union(has('title', 'V1').as('v1').outE().hasLabel('E1').as('e').inV().has('title', 'V2').as('v2'),has('title', 'V2').as('v1').union(outE().hasLabel('E2').as('e').inV().has('title', 'V2'),outE().hasLabel('E3').as('e').inV().has('title', 'V3')).as('v2')).select('v1','e','v2').by(valueMap().by(unfold()))
the only change is adding has('title',within('V1','V2')) at start of query. The warning is gone now and performance is also improved. Earlier the time taken was around 3.5 mins now it's 55 sec to return only 44 records. The problem is my source changes, need to consider it. For example: v1 - e1 -v2 v3 -e2 -v4 Want the want in a single query. now the query for this will be as follows:
g.V().has('title',within('V1','V3')).union(has('title','V1').as('v1').outE().has('title','E1').as('e').inV().has('title','V2').as('v2'),has('title','V3').as('v1').outE().has('title','E2').as('e').inV().has('title','V4').as('v2')).select('v1','e','v2').by(valueMap().by(unfold()))
Request all of you to provide your feedback to improve it further.
Thanks & Regards, Vinayak
On Thu, Apr 22, 2021 at 5:14 AM cmilowka < cmilowka@...> wrote: I guess, building composite index for 'title' property will do the job of accessing title(V1) and title(V2) fast, without full scan of DB as currently does.
cheers, CM
|
|