Selection of indexCandidates is empty when using a disjunctive top-level query


Sylvain Julmy <sylvai...@...>
 

Dear all,

In our project, we found that queries of the following form

Or(
    And(has('field1','value1'), has('~label','label1')),
    And(has('field2','value2'), has('~label','label1')),
...
)

with composite indexes on 'field1' and 'field2', do not use the indexes for the And sub-queries.

It seems that the condition at GraphCentricQueryBuilder.java:261 filters out the Or condition. Is there any reason for that?

Sylvain Julmy


HadoopMarc <bi...@...>
 

Hi Sylvain,

Could you please add your findings to:

https://github.com/JanusGraph/janusgraph/issues/1868

Maybe the Gremlin union() step can offer a workaround?

Best wishes,    Marc


In reply to Sylvain Julmy's message of Friday, 18 September 2020, 14:31:39 UTC+2.


BO XUAN LI <libo...@...>
 

Hi Sylvain,

It looks like the other Or condition of your query does not use an index and needs a full scan. In that case, JanusGraph does not bother firing index queries for your given Or condition.

Best regards,
Boxuan

In reply to HadoopMarc's message of Sep 18, 2020, 9:32 PM.

-- 
You received this message because you are subscribed to the Google Groups "JanusGraph users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to janusgra...@....
To view this discussion on the web visit https://groups.google.com/d/msgid/janusgraph-users/8750b910-fd50-4247-b3d0-57e86d74d508n%40googlegroups.com.


BO XUAN LI <libo...@...>
 

Sorry, I was wrong. JanusGraph should still fire index queries for your given Or query, even if one or more of the other Or conditions require a full scan.

Sylvain, how do you know that JanusGraph did not use indexes for your And sub-query? Does it use indexes when you only have this And condition?

Best regards,
Boxuan


In reply to BO XUAN LI's message of Sep 18, 2020, 9:33 PM.



Sylvain Julmy <sylvai...@...>
 

Hi HadoopMarc,

We are using the Transaction API, which defines no and() or union() step. We do the work with the or() and has() steps only.

Like the following (it's in Scala, but I don't think it matters):

val queryBuilder = transaction.query().asInstanceOf[GraphCentricQueryBuilder]
val subQuery1 = transaction.query().asInstanceOf[GraphCentricQueryBuilder].has("field1",v1).has("~label", label)
val subQuery2 = transaction.query().asInstanceOf[GraphCentricQueryBuilder].has("field2",v2).has("~label", label)
queryBuilder.or(subQuery1).or(subQuery2)

and, internally, the query is transformed into Or(And(...),And(...))

Best wishes,
Sylvain

In reply to HadoopMarc's message of Friday, 18 September 2020, 15:32:47 UTC+2.


Sylvain Julmy <sylvai...@...>
 

Hi Boxuan,

We know that JanusGraph does not use indexes because it logs a warning that it will do a full scan. Also, with the debugger, you can look at the precise part of the code I pointed to: when index candidates are selected, nothing is selected if the top-level condition is an Or.

When we only have the And() condition, indexes are selected correctly.

Best wishes,
Sylvain

In reply to BO XUAN LI's message of Friday, 18 September 2020, 15:46:59 UTC+2.



BO XUAN LI <libo...@...>
 

Hi Sylvain,

> well we know that JanusGraph does not use indexes because it logs a warning message that it would do a full scan

This is not completely correct. If you see this warning message, it means JanusGraph does not use indexes for at least one condition in your query. It could have used indexes for other conditions.

If I understand correctly, you have two "And" conditions, each of which, when used independently, is satisfied by some index. However, when they are combined using an "Or" clause, indexes are not used. If true, this looks like a bug to me, but I cannot reproduce it on 0.5.2. Which version are you using? Can you provide a minimal example that showcases it?

> you can look at the precise part of the code I give, when selecting index candidates, if the top level condition is an Or, nothing is selected

It does not work the way you presume. You could set a breakpoint at that line and observe that it is invoked multiple times. JanusGraph first tries to pick a single mixed index that can cover both conditions in the "Or" (as you described, nothing is selected), and then picks indexes for each condition in the "Or" clause separately, so that it can merge the results later. If one condition uses an index while another does not, a full scan is still needed and you would still see the full-scan warning message.
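
The per-branch behaviour described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: the class `OrIndexMergeSketch`, its `Result` type, and the map-based "indexes" are hypothetical stand-ins written for this example and are not JanusGraph classes. It also simplifies the fallback: as soon as one branch has no index, the toy scans every vertex.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: none of these types are JanusGraph classes.
final class OrIndexMergeSketch {

    /** Outcome of evaluating an Or: matching vertex ids, and whether a full scan was needed. */
    static final class Result {
        final Set<Integer> ids;
        final boolean fullScan;

        Result(Set<Integer> ids, boolean fullScan) {
            this.ids = ids;
            this.fullScan = fullScan;
        }
    }

    /** Builds a toy "composite index" for one property key: value -> vertex ids. */
    static Map<String, Set<Integer>> buildIndex(Map<Integer, Map<String, String>> vertices, String key) {
        Map<String, Set<Integer>> index = new HashMap<>();
        for (Map.Entry<Integer, Map<String, String>> v : vertices.entrySet()) {
            String value = v.getValue().get(key);
            if (value != null) {
                index.computeIfAbsent(value, k -> new TreeSet<>()).add(v.getKey());
            }
        }
        return index;
    }

    /** Each branch is {key, value}, meaning has(key, value); the branches are combined with Or. */
    static Result evaluateOr(Map<Integer, Map<String, String>> vertices,
                             Map<String, Map<String, Set<Integer>>> indexes,
                             List<String[]> branches) {
        Set<Integer> merged = new TreeSet<>();
        boolean everyBranchIndexed = true;
        for (String[] branch : branches) {
            Map<String, Set<Integer>> index = indexes.get(branch[0]);
            if (index == null) {
                everyBranchIndexed = false; // this branch cannot be answered from an index
                break;
            }
            // One index lookup per branch; the per-branch results are merged.
            merged.addAll(index.getOrDefault(branch[1], Collections.emptySet()));
        }
        if (everyBranchIndexed) {
            return new Result(merged, false);
        }
        // Simplification: one unindexed branch makes the toy model scan every vertex.
        Set<Integer> scanned = new TreeSet<>();
        for (Map.Entry<Integer, Map<String, String>> v : vertices.entrySet()) {
            for (String[] branch : branches) {
                if (branch[1].equals(v.getValue().get(branch[0]))) {
                    scanned.add(v.getKey());
                    break;
                }
            }
        }
        return new Result(scanned, true);
    }

    /** Twenty vertices with prop1 = "p1_i" and prop2 = "p2_i". */
    static Map<Integer, Map<String, String>> sampleVertices() {
        Map<Integer, Map<String, String>> vertices = new HashMap<>();
        for (int i = 0; i < 20; i++) {
            Map<String, String> props = new HashMap<>();
            props.put("prop1", "p1_" + i);
            props.put("prop2", "p2_" + i);
            vertices.put(i, props);
        }
        return vertices;
    }

    public static void main(String[] args) {
        Map<Integer, Map<String, String>> vertices = sampleVertices();
        List<String[]> branches = Arrays.asList(
                new String[] {"prop1", "p1_9"}, new String[] {"prop2", "p2_3"});

        Map<String, Map<String, Set<Integer>>> both = new HashMap<>();
        both.put("prop1", buildIndex(vertices, "prop1"));
        both.put("prop2", buildIndex(vertices, "prop2"));
        Result indexed = evaluateOr(vertices, both, branches);
        System.out.println(indexed.ids + " fullScan=" + indexed.fullScan);

        Map<String, Map<String, Set<Integer>>> onlyProp1 = new HashMap<>();
        onlyProp1.put("prop1", both.get("prop1"));
        Result scanned = evaluateOr(vertices, onlyProp1, branches);
        System.out.println(scanned.ids + " fullScan=" + scanned.fullScan);
    }
}
```

In this toy, both calls return the same vertex ids; only the `fullScan` flag differs, which mirrors the point that a full-scan warning does not by itself tell you whether any index was consulted.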

Hope this helps,
Boxuan

In reply to Sylvain Julmy's message of Sep 21, 2020, 12:41 PM.


BO XUAN LI <libo...@...>
 

If you are using a version older than 0.3.0, then it would make sense to me, because it seems index support for the "Or" clause was added in 0.3.0. See https://github.com/JanusGraph/janusgraph/pull/927

In reply to BO XUAN LI's message of Sep 21, 2020, 9:17 PM.



Sylvain Julmy <sylvai...@...>
 

Hi Boxuan,

I have put an example of a query that we are trying to get working with indexes at the end of this message. It is a test case I wrote in the QueryTest.java file.

> It does not work in the way you presume.

from GraphCentricQueryBuilder.java:261
indexType -> indexType.getElement() == resultType && !(conditions instanceof Or && (indexType.isCompositeIndex() || !serializer.features((MixedIndexType) indexType).supportNotQueryNormalForm()))));

Maybe I am just stupid and I don't see it, but conditions instanceof Or is always true (if the top-level query is an Or, which is the case for our queries) and we only have composite indexes, so the indexType would never be picked into the indexCandidates Set, right?
Therefore every indexType would be filtered out of the collection and no index would be used for the query.
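
The reading above can be checked with a small stand-alone sketch of the predicate's boolean shape. Everything here is an assumption made for illustration: `IndexCandidateFilterSketch` and `ToyIndex` are hypothetical names, not JanusGraph types, and the element-type check (`indexType.getElement() == resultType`) is left out because it does not affect the Or case.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Toy stand-ins for JanusGraph internals; only the shape of the filter is taken from the thread.
final class IndexCandidateFilterSketch {

    /** Hypothetical, simplified index descriptor (not JanusGraph's IndexType). */
    static final class ToyIndex {
        final String name;
        final boolean composite;
        final boolean supportsNotQueryNormalForm; // only meaningful for mixed indexes

        ToyIndex(String name, boolean composite, boolean supportsNotQueryNormalForm) {
            this.name = name;
            this.composite = composite;
            this.supportsNotQueryNormalForm = supportsNotQueryNormalForm;
        }
    }

    /** Mirrors: !(conditions instanceof Or && (isCompositeIndex() || !supportNotQueryNormalForm())). */
    static boolean isCandidate(ToyIndex index, boolean topLevelIsOr) {
        return !(topLevelIsOr && (index.composite || !index.supportsNotQueryNormalForm));
    }

    /** Names of the indexes that survive the filter. */
    static List<String> candidates(List<ToyIndex> indexes, boolean topLevelIsOr) {
        return indexes.stream()
                .filter(index -> isCandidate(index, topLevelIsOr))
                .map(index -> index.name)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<ToyIndex> indexes = Arrays.asList(
                new ToyIndex("prop1_idx", true, false),
                new ToyIndex("prop2_idx", true, false));
        System.out.println(candidates(indexes, false)); // top-level And: both composite indexes remain
        System.out.println(candidates(indexes, true));  // top-level Or: every composite index is dropped
    }
}
```

With only composite indexes defined, the candidate list is empty whenever the top-level condition is an Or, which matches the observation in the debugger.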

We are using JanusGraph 0.5.2 (and are impatient to move to 0.6 :) !)

Thanks for your time, and best wishes!
Sylvain

--------------------

@Test
public void testTopLevelOrUseIndexesForSubQuery() {
    // Schema: two String properties, each covered by its own composite index.
    JanusGraphManagement mgmt = graph.openManagement();
    PropertyKey prop1Key = mgmt.makePropertyKey("prop1").dataType(String.class).make();
    PropertyKey prop2Key = mgmt.makePropertyKey("prop2").dataType(String.class).make();

    mgmt.buildIndex("prop1_idx", Vertex.class).addKey(prop1Key).buildCompositeIndex();
    mgmt.buildIndex("prop2_idx", Vertex.class).addKey(prop2Key).buildCompositeIndex();

    mgmt.commit();

    for (int i = 0; i < 20; i++) {
        tx.addVertex("file").property("prop1", "p1_" + i).element().property("prop2", "p2_" + i);
    }

    // Case 1: a single And query.
    GraphCentricQueryBuilder andQueryBuilder = (GraphCentricQueryBuilder) tx.query();
    andQueryBuilder.has("prop1", "p1_9").has("~label", "file");

    // this is good: andQuery.indexQuery.backendQuery.queries contains one JointIndexQuery and uses the prop1_idx:multiKSQ[1]@2005 index
    GraphCentricQuery andQuery = andQueryBuilder.constructQuery(ElementCategory.VERTEX);

    Iterable<JanusGraphVertex> resultAnd = andQueryBuilder.vertices();

    // Case 2: a top-level Or over two And sub-queries.
    GraphCentricQueryBuilder orQueryBuilder = (GraphCentricQueryBuilder) tx.query();

    GraphCentricQueryBuilder subQuery1 = (GraphCentricQueryBuilder) tx.query();
    GraphCentricQueryBuilder subQuery2 = (GraphCentricQueryBuilder) tx.query();

    subQuery1.has("prop1", "p1_9").has("~label", "file");
    subQuery2.has("prop2", "p2_9").has("~label", "file");

    orQueryBuilder.or(subQuery1).or(subQuery2);

    // this is not good: orQuery.indexQuery.backendQuery.queries contains nothing
    GraphCentricQuery orQuery = orQueryBuilder.constructQuery(ElementCategory.VERTEX);

    Iterable<JanusGraphVertex> resultOr = orQueryBuilder.vertices();
}

In reply to BO XUAN LI's message of Monday, 21 September 2020, 15:17:52 UTC+2.


BO XUAN LI <libo...@...>
 

Hi Sylvain,

I think I got where your confusion came from.

Your understanding of GraphCentricQueryBuilder.java:261 is absolutely correct (and not stupid!). The problem is with the way you create your query.

Rather than building a GraphCentricQuery yourself (which is not recommended because it is an internal interface), you should use a Gremlin query:

g.V().hasLabel("file").or(__.has("prop1", "p1_9"), __.has("prop2", "p2_9")).toList();

With the query above, JanusGraph should be able to use indexes.

FYI, the magic is in JanusGraphStep (see the usage of hasLocalContainers), where each condition in the "Or" clause fires an index query separately. This does not take effect if you are not using a Gremlin query (which explains why you got confused by my words! :P).

Btw, the following query seems to trigger a full scan:

g.V().or(__.and(__.hasLabel("file"), __.has("prop1", "p1_9")), __.and(__.hasLabel("file"), __.has("prop2", "p2_9"))).toList();

which is worth investigating. But anyway, you could use the first Gremlin query, which hopefully works as expected.
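
The two Gremlin queries differ only in where the label test sits: the first is label AND (prop1 OR prop2), the second is (label AND prop1) OR (label AND prop2). A minimal sketch, using plain Java predicates over a made-up in-memory list, shows that both shapes select the same elements. `OrFactoringSketch`, `ToyVertex`, and the "folder" label are hypothetical names introduced for this example; the sketch says nothing about which indexes JanusGraph uses, only that the results agree.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Toy model only: ToyVertex is a hypothetical stand-in, not a JanusGraph type.
final class OrFactoringSketch {

    static final class ToyVertex {
        final int id;
        final String label;
        final String prop1;
        final String prop2;

        ToyVertex(int id, String label, String prop1, String prop2) {
            this.id = id;
            this.label = label;
            this.prop1 = prop1;
            this.prop2 = prop2;
        }
    }

    /** label AND (prop1 OR prop2): the shape of hasLabel(..).or(has(..), has(..)). */
    static Predicate<ToyVertex> factored(String label, String p1, String p2) {
        return v -> label.equals(v.label) && (p1.equals(v.prop1) || p2.equals(v.prop2));
    }

    /** (label AND prop1) OR (label AND prop2): the shape of or(and(..), and(..)). */
    static Predicate<ToyVertex> expanded(String label, String p1, String p2) {
        return v -> (label.equals(v.label) && p1.equals(v.prop1))
                || (label.equals(v.label) && p2.equals(v.prop2));
    }

    /** Ids of the vertices accepted by the filter, in list order. */
    static List<Integer> matchingIds(List<ToyVertex> vertices, Predicate<ToyVertex> filter) {
        return vertices.stream().filter(filter).map(v -> v.id).collect(Collectors.toList());
    }

    /** Twenty vertices; even ids are labelled "file", odd ids carry a made-up "folder" label. */
    static List<ToyVertex> sampleVertices() {
        List<ToyVertex> vertices = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            String label = (i % 2 == 0) ? "file" : "folder";
            vertices.add(new ToyVertex(i, label, "p1_" + i, "p2_" + i));
        }
        return vertices;
    }

    public static void main(String[] args) {
        List<ToyVertex> vertices = sampleVertices();
        List<Integer> a = matchingIds(vertices, factored("file", "p1_8", "p2_9"));
        List<Integer> b = matchingIds(vertices, expanded("file", "p1_8", "p2_9"));
        System.out.println(a + " " + b + " equal=" + a.equals(b));
    }
}
```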

Hope this helps,
Boxuan


In reply to Sylvain Julmy's message of Sep 21, 2020, 10:21 PM.


Sylvain Julmy <sylvai...@...>
 

Hi Boxuan,

Thank you very much for the clarification :) !

I've applied the fix (moving from the transaction API to the Gremlin one) and it worked perfectly.
I don't know why we used the transaction API instead of the Gremlin one; I'll have to ask my teammates...

Thanks for the time you spent on this, kind regards!

Sylvain

Le lundi 21 septembre 2020 à 18:06:04 UTC+2, li...@... a écrit :
Hi Sylvain,

I think I got where your confusion came from.

Your understanding of GraphCentricQueryBuilder.java:261 is absolutely correct (and not stupid!). The problem is with the way you create your query.

Rather than building a GraphCentricQuery by yourself (which is not recommended because it is an internal interface), you should do a gremlin query:

g.V().hasLabel("file").or(__.has("prop1", "p1_9"), __.has("prop2", "p2_9")).toList();

By using the query above, JanusGraph should be able to use indexes.

FYI, The magic is at JanusGraphStep (see the usage of hasLocalContainers), where each condition in the “Or” clause will fire a index query separately. This will not be effective if you are not using a gremlin query (which explains why you got confused by my words! :P).

Btw, the following query seems to trigger a full scan:

g.V().or(__.and(__.hasLabel("file"), __.has("prop1", "p1_9")), __.and(__.hasLabel("file"), __.has("prop2", "p2_9"))).toList();

This is worth investigating. In any case, you can use the first Gremlin query, which hopefully works as expected.

Hope this helps,
Boxuan


On Sep 21, 2020, at 10:21 PM, Sylvain Julmy <syl...@...> wrote:

Hi Boxuan,

I have put an example of a query we are trying to get working with indexes at the end of this message; it is a test case I wrote in the QueryTest.java file.

> It does not work in the way you presume.

from GraphCentricQueryBuilder.java:261
indexType -> indexType.getElement() == resultType && !(conditions instanceof Or && (indexType.isCompositeIndex() || !serializer.features((MixedIndexType) indexType).supportNotQueryNormalForm()))));

Maybe I am just stupid and I don't see it, but conditions instanceof Or is always true when the top-level query is an Or (which is the case for our queries), and we only have composite indexes, so no indexType would ever be picked into the indexCandidates set, right?
Therefore every indexType would be filtered out of the collection and no index would be used for the query.
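
The reasoning above can be checked with a small stand-alone model of the quoted predicate. This is only a sketch: the Index class and the candidates method are hypothetical stand-ins, not JanusGraph types, and the element-type check (indexType.getElement() == resultType) is left out because it does not affect the argument.

```java
import java.util.List;
import java.util.stream.Collectors;

public class IndexCandidateFilterSketch {

    // Hypothetical stand-in for an index definition; not a JanusGraph class.
    static final class Index {
        final String name;
        final boolean composite;
        final boolean supportsNotQueryNormalForm;

        Index(String name, boolean composite, boolean supportsNotQueryNormalForm) {
            this.name = name;
            this.composite = composite;
            this.supportsNotQueryNormalForm = supportsNotQueryNormalForm;
        }
    }

    // Same boolean shape as the quoted predicate: with a top-level Or,
    // composite indexes are rejected, and so are mixed indexes whose
    // backend does not support non-QNF queries.
    static List<String> candidates(List<Index> indexes, boolean topLevelIsOr) {
        return indexes.stream()
            .filter(ix -> !(topLevelIsOr && (ix.composite || !ix.supportsNotQueryNormalForm)))
            .map(ix -> ix.name)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Index> indexes = List.of(
            new Index("prop1_idx", true, false),
            new Index("prop2_idx", true, false));
        System.out.println(candidates(indexes, false)); // prints [prop1_idx, prop2_idx]
        System.out.println(candidates(indexes, true));  // prints []
    }
}
```

With only composite indexes defined, the candidate list for a top-level Or is empty in this model, which matches what the debugger shows at that line.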

We are using JanusGraph 0.5.2 (and are impatient to move to 0.6 :) ).

Thanks for your time and best wishes!
Sylvain

--------------------

@Test
public void testTopLevelOrUseIndexesForSubQuery() {
    // Schema: two string properties, each covered by its own composite index.
    JanusGraphManagement mgmt = graph.openManagement();
    PropertyKey prop1Key = mgmt.makePropertyKey("prop1").dataType(String.class).make();
    PropertyKey prop2Key = mgmt.makePropertyKey("prop2").dataType(String.class).make();

    mgmt.buildIndex("prop1_idx", Vertex.class).addKey(prop1Key).buildCompositeIndex();
    mgmt.buildIndex("prop2_idx", Vertex.class).addKey(prop2Key).buildCompositeIndex();

    mgmt.commit();

    // Data: 20 "file" vertices with distinct prop1/prop2 values.
    for (int i = 0; i < 20; i++) {
        tx.addVertex("file").property("prop1", "p1_" + i).element().property("prop2", "p2_" + i);
    }

    // Case 1: a plain And query.
    GraphCentricQueryBuilder andQueryBuilder = (GraphCentricQueryBuilder) tx.query();
    andQueryBuilder.has("prop1", "p1_9").has("~label", "file");

    // This is good: andQuery.indexQuery.backendQuery.queries contains one JointIndexQuery
    // and uses the prop1_idx:multiKSQ[1]@2005 index.
    GraphCentricQuery andQuery = andQueryBuilder.constructQuery(ElementCategory.VERTEX);

    Iterable<JanusGraphVertex> resultAnd = andQueryBuilder.vertices();

    // Case 2: a top-level Or over two And sub-queries.
    GraphCentricQueryBuilder orQueryBuilder = (GraphCentricQueryBuilder) tx.query();

    GraphCentricQueryBuilder subQuery1 = (GraphCentricQueryBuilder) tx.query();
    GraphCentricQueryBuilder subQuery2 = (GraphCentricQueryBuilder) tx.query();

    subQuery1.has("prop1", "p1_9").has("~label", "file");
    subQuery2.has("prop2", "p2_9").has("~label", "file");

    orQueryBuilder.or(subQuery1).or(subQuery2);

    // This is the problem: orQuery.indexQuery.backendQuery.queries contains nothing,
    // so no index is used.
    GraphCentricQuery orQuery = orQueryBuilder.constructQuery(ElementCategory.VERTEX);

    Iterable<JanusGraphVertex> resultOr = orQueryBuilder.vertices();
}
On Monday, 21 September 2020 at 15:17:52 UTC+2, libo...@connect.hku.hk wrote:
Hi Sylvain,

> well we know that JanusGraph does not use indexes because it logs a warning message that it would do a full scan

This is not completely correct. If you see this warning message, it means JanusGraph does not use indexes for at least one condition in your query. It could have used indexes for other conditions.

If I understand correctly, you have two “And” conditions, each of which, when used independently, is satisfied by some index. However, when they are combined using an “Or” clause, indexes are not used. If true, then this looks like a bug to me, but I cannot reproduce it on 0.5.2. Which version are you using? Can you provide a minimal example that showcases it?

> you can look at the precise part of the code I gave: when selecting index candidates, if the top-level condition is an Or, nothing is selected

It does not work in the way you presume. You could set a breakpoint at that line and observe how it is invoked multiple times. JanusGraph first tries to pick a single mixed index that can cover both conditions in the “Or” (as you described, nothing is selected), and then picks an index for each condition in the “Or” clause separately, so that it can merge the results later. If one condition uses some index while another condition does not, then a full scan is still needed and you would still see the full scan warning message.
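
As a rough illustration of the per-branch selection described above, here is a minimal sketch that models only the behaviour stated in this message. The names (INDEX_BY_KEY, planBranches, warnsFullScan) and the "prop3" key are hypothetical and do not correspond to JanusGraph internals.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class OrBranchPlanSketch {

    // Hypothetical catalogue: property key -> name of an index covering it.
    static final Map<String, String> INDEX_BY_KEY =
        Map.of("prop1", "prop1_idx", "prop2", "prop2_idx");

    // One plan entry per Or branch: the covering index name, or "FULL_SCAN"
    // when no index covers that branch.
    static List<String> planBranches(List<String> branchKeys) {
        List<String> plan = new ArrayList<>();
        for (String key : branchKeys) {
            plan.add(INDEX_BY_KEY.getOrDefault(key, "FULL_SCAN"));
        }
        return plan;
    }

    // In this model the full scan warning appears as soon as at least one
    // branch is not covered, even if other branches do use an index.
    static boolean warnsFullScan(List<String> branchKeys) {
        return planBranches(branchKeys).contains("FULL_SCAN");
    }

    public static void main(String[] args) {
        System.out.println(planBranches(List.of("prop1", "prop2"))); // prints [prop1_idx, prop2_idx]
        System.out.println(planBranches(List.of("prop1", "prop3"))); // prints [prop1_idx, FULL_SCAN]
        System.out.println(warnsFullScan(List.of("prop1", "prop3"))); // prints true
    }
}
```

The point of the model is that the warning alone does not prove that no index was used: it only shows that at least one branch was left uncovered.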

Hope this helps,
Boxuan

On Sep 21, 2020, at 12:41 PM, Sylvain Julmy <syl...@...> wrote:

Hi Boxuan,

Well, we know that JanusGraph does not use indexes because it logs a warning message that it would do a full scan. Also, with the debugger, you can look at the precise part of the code I gave: when selecting index candidates, if the top-level condition is an Or, nothing is selected.

When we only have the And() condition, indexes are selected correctly.

Best wishes,
Sylvain

On Friday, 18 September 2020 at 15:46:59 UTC+2, libo...@connect.hku.hk wrote:
Sorry, I was wrong. JanusGraph should still fire index queries for your given Or condition, even if one or more of the other Or conditions require a full scan.

Sylvain, how did you know JanusGraph did not use indexes for your sub And query? Does it use indexes when you only have this And condition?

Best regards,
Boxuan


