Re: Union Query Optimization


AMIYA KUMAR SAHOO
 

Hi Vinayak,

I am not sure how to improve this query further through gremlin.

Query can be faster through data model.  VCI will be helpful, if you are applying any other filter along with hasLabel and  your edge selectivity is low compare  to the total degree of those vertex. 

If this query is very frequent and there's a need to improve it further, you can make inV title property be part of edge and VCI can be enabled on that edge property.

Other than that not sure if any configuration can be done to further improve. Someone else might comment on this front.


Regards,
Amiya


On Thu, 22 Apr 2021, 19:08 Vinayak Bali, <vinayakbali16@...> wrote:
Hi Amiya, 

Thank you for the query. It also increased the performance. But, it's still 35 seconds. Is there any other way to optimize it further, there are only 10 records returned by the query. 
Counts are as follows: 
V1: 187K V2:40 V3: 50 V4: 447K 

Thanks & Regards,
Vinayak

On Thu, Apr 22, 2021 at 12:55 PM AMIYA KUMAR SAHOO <amiyakr.sahoo91@...> wrote:
Hi Vinayak,

You can try below query, it can use index and combine as many traversals you want.

g.inject(1).
   union ( 
      V().has('title', 'V1'). outE().hasLabel('E1').inV().has('title', 'V2'),
    
       V().has('title', 'V3'). outE().hasLabel('E3').inV().has('title', 'V4'))....

Regards,
Amiya



On Thu, 22 Apr 2021, 10:36 Vinayak Bali, <vinayakbali16@...> wrote:
Hi, cmilowka,

The property title has a composite index created on it. Further modified the query as follows:

g.V().has('title',within('V1','V2')).union(has('title', 'V1').as('v1').outE().hasLabel('E1').as('e').inV().has('title', 'V2').as('v2'),has('title', 'V2').as('v1').union(outE().hasLabel('E2').as('e').inV().has('title', 'V2'),outE().hasLabel('E3').as('e').inV().has('title', 'V3')).as('v2')).select('v1','e','v2').by(valueMap().by(unfold()))

the only change is adding has('title',within('V1','V2')) at start of query. The warning is gone now and performance is also improved. 
Earlier the time taken was around 3.5 mins now it's 55 sec to return only 44 records.
The problem is my source changes, need to consider it. For example: 
v1 - e1 -v2
v3 -e2 -v4
Want the want in a single query. now the query for this will be as follows:

g.V().has('title',within('V1','V3')).union(has('title','V1').as('v1').outE().has('title','E1').as('e').inV().has('title','V2').as('v2'),has('title','V3').as('v1').outE().has('title','E2').as('e').inV().has('title','V4').as('v2')).select('v1','e','v2').by(valueMap().by(unfold()))

Request all of you to provide your feedback to improve it further. 

Thanks & Regards,
Vinayak

On Thu, Apr 22, 2021 at 5:14 AM cmilowka <cmilowka@...> wrote:

I guess, building composite index for 'title' property will do the job of accessing title(V1) and title(V2) fast, without full scan of DB as currently does.

cheers, CM

Join janusgraph-users@lists.lfaidata.foundation to automatically receive all group messages.