Re: Chained or-clauses return incorrect result set


Ben Sheppard <shepp...@...>
 

Ive retested JanusGraph+inmemory and found it to be susceptible as well (my first test was in the same shell as the TinkerGraph):

gremlin> graph = JanusGraphFactory.build().set('storage.backend', 'inmemory').open()
==>standardjanusgraph[inmemory:[127.0.0.1]]
gremlin> mgmt = graph.openManagement()
==>org.janusgraph.graphdb.database.management.ManagementSystem.798dad3d
gremlin> sample = mgmt.makeVertexLabel('sample').make()
==>sample
gremlin> prop_a = mgmt.makePropertyKey('a').dataType(Boolean.class).make()
==>a
gremlin> prop_b = mgmt.makePropertyKey('b').dataType(Boolean.class).make()
==>b
gremlin> prop_c = mgmt.makePropertyKey('c').dataType(Boolean.class).make()
==>c
gremlin> prop_d = mgmt.makePropertyKey('d').dataType(Boolean.class).make()
==>d
gremlin> mgmt.addProperties(sample, prop_a, prop_b, prop_c, prop_d)
==>sample
gremlin> mgmt.commit()
==>null
gremlin> g = graph.traversal()
==>graphtraversalsource[standardjanusgraph[inmemory:[127.0.0.1]], standard]
gremlin> g.addV('test').property('a', true).property('b', true).property('c', true).property('d', true).
......1>   addV('test').property('a', true).property('b', false).property('c', true).property('d', false).
......2>   addV('test').property('a', false).property('b', true).property('c', false).property('d', true).
......3>   addV('test').property('a', false).property('b', false).property('c', true).property('d', false)
==>v[4176]
gremlin> g.tx().commit()
==>null
gremlin> g.V().
......1>     or(
......2>         has('a', true),
......3>         has('b', true)
......4>     ).
......5>     or(
......6>         has('c', false),
......7>         has('d', true)
......8>     )
==>[a:[vp[a->true]],b:[vp[b->true]],c:[vp[c->true]],d:[vp[d->true]]]
==>[a:[vp[a->true]],b:[vp[b->false]],c:[vp[c->true]],d:[vp[d->false]]]
==>[a:[vp[a->false]],b:[vp[b->true]],c:[vp[c->false]],d:[vp[d->true]]]


On Wednesday, October 7, 2020 at 10:24:15 AM UTC-4 Ben Sheppard wrote:
If I execute a gremlin query for two or-clauses, containing two conditions each, the resulting query JanusGraph executes a single or-clause with four conditions. This is problematic as the result set is a superset of what it should actually be.

A basic setup using TinkerGraph or JanusGraph+inmemory does not seem to exhibit this issue.

To demonstrate, Ive set up a simple example which uses 4 boolean flags below. In the result query, the third element should not have been returned as it fails the second or-clause.  My test setup is a simple Cassandra/Elasticsearch backend (JanusGraph 0.5.2):

gremlin> graph = JanusGraphFactory.open('test.conf')
==>standardjanusgraph[cql:[172.30.30.221]]
gremlin> mgmt = graph.openManagement()
==>org.janusgraph.graphdb.database.management.ManagementSystem.643ecfef
gremlin> sample = mgmt.makeVertexLabel('sample').make()
==>sample
gremlin> prop_a = mgmt.makePropertyKey('a').dataType(Boolean.class).make()
==>a
gremlin> prop_b = mgmt.makePropertyKey('b').dataType(Boolean.class).make()
==>b
gremlin> prop_c = mgmt.makePropertyKey('c').dataType(Boolean.class).make()
==>c
gremlin> prop_d = mgmt.makePropertyKey('d').dataType(Boolean.class).make()
==>d
gremlin> mgmt.addProperties(sample, prop_a, prop_b, prop_c, prop_d)
==>sample
gremlin> mgmt.buildIndex('propTest', Vertex.class).addKey(prop_a).addKey(prop_b).addKey(prop_c).addKey(prop_d).buildMixedIndex('search')
==>propTest
gremlin> mgmt.commit()
==>null
gremlin> g = graph.traversal()
==>graphtraversalsource[standardjanusgraph[cql:[172.30.30.221]], standard]
gremlin> g.addV('test').property('a', true).property('b', true).property('c', true).property('d', true).
......1>   addV('test').property('a', true).property('b', false).property('c', true).property('d', false).
......2>   addV('test').property('a', false).property('b', true).property('c', false).property('d', true).
......3>   addV('test').property('a', false).property('b', false).property('c', true).property('d', false)
==>v[4280]
gremlin> g.tx().commit()
==>null
gremlin> 
gremlin> 
gremlin> g.V().
......1>     or(
......2>         has('a', true),
......3>         has('b', true)
......4>     ).
......5>     or(
......6>         has('c', false),
......7>         has('d', true)
......8>     ).propertyMap()
==>[a:[vp[a->false]],b:[vp[b->true]],c:[vp[c->false]],d:[vp[d->true]]]
==>[a:[vp[a->true]],b:[vp[b->true]],c:[vp[c->true]],d:[vp[d->true]]]
==>[a:[vp[a->true]],b:[vp[b->false]],c:[vp[c->true]],d:[vp[d->false]]]
gremlin> g.V().
......1>     or(
......2>         has('a', true),
......3>         has('b', true)
......4>     ).
......5>     or(
......6>         has('c', false),
......7>         has('d', true)
......8>     ).explain()
==>Traversal Explanation

===========================================================================================================================================================================================

Original Traversal                          [GraphStep(vertex,[]), OrStep([[HasStep([a.eq(true)])], [HasStep([b.eq(true)])]]), OrStep([[HasStep([c.eq(false)])], [HasStep([d.eq(true)])]])]

ConnectiveStrategy                    [D]   [GraphStep(vertex,[]), OrStep([[HasStep([a.eq(true)])], [HasStep([b.eq(true)])]]), OrStep([[HasStep([c.eq(false)])], [HasStep([d.eq(true)])]])]

MatchPredicateStrategy                [O]   [GraphStep(vertex,[]), OrStep([[HasStep([a.eq(true)])], [HasStep([b.eq(true)])]]), OrStep([[HasStep([c.eq(false)])], [HasStep([d.eq(true)])]])]

FilterRankingStrategy                 [O]   [GraphStep(vertex,[]), OrStep([[HasStep([a.eq(true)])], [HasStep([b.eq(true)])]]), OrStep([[HasStep([c.eq(false)])], [HasStep([d.eq(true)])]])]

InlineFilterStrategy                  [O]   [GraphStep(vertex,[]), OrStep([[HasStep([a.eq(true)])], [HasStep([b.eq(true)])]]), OrStep([[HasStep([c.eq(false)])], [HasStep([d.eq(true)])]])]

CountStrategy                         [O]   [GraphStep(vertex,[]), OrStep([[HasStep([a.eq(true)])], [HasStep([b.eq(true)])]]), OrStep([[HasStep([c.eq(false)])], [HasStep([d.eq(true)])]])]

IncidentToAdjacentStrategy            [O]   [GraphStep(vertex,[]), OrStep([[HasStep([a.eq(true)])], [HasStep([b.eq(true)])]]), OrStep([[HasStep([c.eq(false)])], [HasStep([d.eq(true)])]])]

RepeatUnrollStrategy                  [O]   [GraphStep(vertex,[]), OrStep([[HasStep([a.eq(true)])], [HasStep([b.eq(true)])]]), OrStep([[HasStep([c.eq(false)])], [HasStep([d.eq(true)])]])]

PathRetractionStrategy                [O]   [GraphStep(vertex,[]), OrStep([[HasStep([a.eq(true)])], [HasStep([b.eq(true)])]]), OrStep([[HasStep([c.eq(false)])], [HasStep([d.eq(true)])]])]

EarlyLimitStrategy                    [O]   [GraphStep(vertex,[]), OrStep([[HasStep([a.eq(true)])], [HasStep([b.eq(true)])]]), OrStep([[HasStep([c.eq(false)])], [HasStep([d.eq(true)])]])]

AdjacentToIncidentStrategy            [O]   [GraphStep(vertex,[]), OrStep([[HasStep([a.eq(true)])], [HasStep([b.eq(true)])]]), OrStep([[HasStep([c.eq(false)])], [HasStep([d.eq(true)])]])]

LazyBarrierStrategy                   [O]   [GraphStep(vertex,[]), OrStep([[HasStep([a.eq(true)])], [HasStep([b.eq(true)])]]), OrStep([[HasStep([c.eq(false)])], [HasStep([d.eq(true)])]])]

AdjacentVertexFilterOptimizerStrategy [P]   [GraphStep(vertex,[]), OrStep([[HasStep([a.eq(true)])], [HasStep([b.eq(true)])]]), OrStep([[HasStep([c.eq(false)])], [HasStep([d.eq(true)])]])]

JanusGraphStepStrategy                [P]   [Or(JanusGraphStep([],[a.eq(true)]),JanusGraphStep([],[b.eq(true)]),JanusGraphStep([],[c.eq(false)]),JanusGraphStep([],[d.eq(true)]))]

AdjacentVertexHasIdOptimizerStrategy  [P]   [Or(JanusGraphStep([],[a.eq(true)]),JanusGraphStep([],[b.eq(true)]),JanusGraphStep([],[c.eq(false)]),JanusGraphStep([],[d.eq(true)]))]

AdjacentVertexIsOptimizerStrategy     [P]   [Or(JanusGraphStep([],[a.eq(true)]),JanusGraphStep([],[b.eq(true)]),JanusGraphStep([],[c.eq(false)]),JanusGraphStep([],[d.eq(true)]))]

JanusGraphLocalQueryOptimizerStrategy [P]   [Or(JanusGraphStep([],[a.eq(true)]),JanusGraphStep([],[b.eq(true)]),JanusGraphStep([],[c.eq(false)]),JanusGraphStep([],[d.eq(true)]))]

JanusGraphIoRegistrationStrategy      [P]   [Or(JanusGraphStep([],[a.eq(true)]),JanusGraphStep([],[b.eq(true)]),JanusGraphStep([],[c.eq(false)]),JanusGraphStep([],[d.eq(true)]))]

ProfileStrategy                       [F]   [Or(JanusGraphStep([],[a.eq(true)]),JanusGraphStep([],[b.eq(true)]),JanusGraphStep([],[c.eq(false)]),JanusGraphStep([],[d.eq(true)]))]

StandardVerificationStrategy          [V]   [Or(JanusGraphStep([],[a.eq(true)]),JanusGraphStep([],[b.eq(true)]),JanusGraphStep([],[c.eq(false)]),JanusGraphStep([],[d.eq(true)]))]

Final Traversal                             [Or(JanusGraphStep([],[a.eq(true)]),JanusGraphStep([],[b.eq(true)]),JanusGraphStep([],[c.eq(false)]),JanusGraphStep([],[d.eq(true)]))]


Ive also tried wrapping the or-clauses in an and-clause, but it gets stripped out by strategies. Ive also inspected the query being sent to ElasticSearch, which still just shows 4 individual bool.must clauses wrapped inside a single bool.should.

Join janusgraph-users@lists.lfaidata.foundation to automatically receive all group messages.