Hello!
I came upon some unexpected behavior when running queries with multiple or()
steps and string searches on mixed indexes and would like some clarification if this is intended or not.
I have a graph with a vertex label of type person
and an edge label of type sent-message-to
. All edges have the properties sender
and receiver
, and some others. A mixed index backed by elasticsearch is created for the edge label. Both sender
and receiver
are indexed (as STRING
types), as well as some others.
The query that's causing me problems is:
g.E() \
.hasLabel('sent-message-to') \
.or( \
has('sender', textRegex('.*alice.*')), \
has('receiver', textRegex('.*alice.*')) \
).or( \
has('sender', textRegex('.*bob.*')), \
has('receiver', textRegex('.*bob.*')) \
).toList()
The query should return (roughly) messages between alice
s and bob
s (and some edge cases where an alice bobowitz
talks to an eve
, but that's not important here). However, I'm getting some unexpected results where, for example, neither the sender nor the recipient contain the substring bob
.
The explain plan for the query is as follows:
gremlin> g.E() \
......1> .hasLabel('sent-message-to') \
......2> .or( \
......3> has('sender', textRegex('.*alice.*')), \
......4> has('receiver', textRegex('.*alice.*')) \
......5> ).or( \
......6> has('sender', textRegex('.*bob.*')), \
......7> has('receiver', textRegex('.*bob.*')) \
......8> ).explain()
==>Traversal Explanation
=======================================================================================================================================================================================================================================================================================================
Original Traversal [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
ConnectiveStrategy [D] [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
IncidentToAdjacentStrategy [O] [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
RepeatUnrollStrategy [O] [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
MatchPredicateStrategy [O] [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
PathRetractionStrategy [O] [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
EarlyLimitStrategy [O] [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
FilterRankingStrategy [O] [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
InlineFilterStrategy [O] [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
AdjacentToIncidentStrategy [O] [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
CountStrategy [O] [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
LazyBarrierStrategy [O] [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
AdjacentVertexFilterOptimizerStrategy [P] [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
AdjacentVertexHasIdOptimizerStrategy [P] [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
AdjacentVertexIsOptimizerStrategy [P] [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
JanusGraphLocalQueryOptimizerStrategy [P] [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
JanusGraphStepStrategy [P] [JanusGraphStep([],[~label.eq(sent-message-to)]).Or(JanusGraphStep([],[sender.textRegex(.*alice.*)]),JanusGraphStep([],[receiver.textRegex(.*alice.*)]),JanusGraphStep([],[sender.textRegex(.*bob.*)]),JanusGraphStep([],[receiver.textRegex(.*bob.*)]))]
JanusGraphIoRegistrationStrategy [P] [JanusGraphStep([],[~label.eq(sent-message-to)]).Or(JanusGraphStep([],[sender.textRegex(.*alice.*)]),JanusGraphStep([],[receiver.textRegex(.*alice.*)]),JanusGraphStep([],[sender.textRegex(.*bob.*)]),JanusGraphStep([],[receiver.textRegex(.*bob.*)]))]
ProfileStrategy [F] [JanusGraphStep([],[~label.eq(sent-message-to)]).Or(JanusGraphStep([],[sender.textRegex(.*alice.*)]),JanusGraphStep([],[receiver.textRegex(.*alice.*)]),JanusGraphStep([],[sender.textRegex(.*bob.*)]),JanusGraphStep([],[receiver.textRegex(.*bob.*)]))]
StandardVerificationStrategy [V] [JanusGraphStep([],[~label.eq(sent-message-to)]).Or(JanusGraphStep([],[sender.textRegex(.*alice.*)]),JanusGraphStep([],[receiver.textRegex(.*alice.*)]),JanusGraphStep([],[sender.textRegex(.*bob.*)]),JanusGraphStep([],[receiver.textRegex(.*bob.*)]))]
Final Traversal [JanusGraphStep([],[~label.eq(sent-message-to)]).Or(JanusGraphStep([],[sender.textRegex(.*alice.*)]),JanusGraphStep([],[receiver.textRegex(.*alice.*)]),JanusGraphStep([],[sender.textRegex(.*bob.*)]),JanusGraphStep([],[receiver.textRegex(.*bob.*)]))]
The final (seemingly incorrect) traversal explains the results I'm getting. However, what strikes me as odd is that the JanusGraphLocalQueryOptimizerStrategy
seems to return a correct traversal with two separate or
steps:
[
GraphStep(edge,[]),
HasStep([~label.eq(sent-message-to)]),
OrStep([
[HasStep([sender.textRegex(.*alice.*)])],
[HasStep([receiver.textRegex(.*alice.*)])]
]),
OrStep([
[HasStep([sender.textRegex(.*bob.*)])],
[HasStep([receiver.textRegex(.*bob.*)])]
])
]
but the following JanusGraphStepStrategy
conflates the two or
steps into a single one:
[
JanusGraphStep(
[],[~label.eq(sent-message-to)]
)
.Or(
JanusGraphStep([],[sender.eq(alice)]),
JanusGraphStep([],[receiver.eq(alice)]),
JanusGraphStep([],[sender.eq(bob)]),
JanusGraphStep([],[receiver.eq(bob)])
)
]
, which should not be correct because (A or B) and (C or D)
is not equal to (A or B or C or D)
.
What's more confusing is that if I replace the textRegex()
predicate with the tinkerpop predicate containing()
, I get the proper results, because the explain plan is different:
gremlin> g.E() \
......1> .hasLabel('sent-message-to') \
......2> .or( \
......3> has('sender', containing('alice')), \
......4> has('receiver', containing('alice')) \
......5> ).or( \
......6> has('sender', containing('bob')), \
......7> has('receiver', containing('bob')) \
......8> ).explain()
==>Traversal Explanation
=======================================================================================================================================================================================================================================================================================
Original Traversal [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
ConnectiveStrategy [D] [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
IncidentToAdjacentStrategy [O] [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
RepeatUnrollStrategy [O] [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
MatchPredicateStrategy [O] [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
PathRetractionStrategy [O] [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
EarlyLimitStrategy [O] [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
FilterRankingStrategy [O] [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
InlineFilterStrategy [O] [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
AdjacentToIncidentStrategy [O] [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
CountStrategy [O] [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
LazyBarrierStrategy [O] [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
AdjacentVertexFilterOptimizerStrategy [P] [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
AdjacentVertexHasIdOptimizerStrategy [P] [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
AdjacentVertexIsOptimizerStrategy [P] [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
JanusGraphLocalQueryOptimizerStrategy [P] [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
JanusGraphStepStrategy [P] [JanusGraphStep([],[~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
JanusGraphIoRegistrationStrategy [P] [JanusGraphStep([],[~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
ProfileStrategy [F] [JanusGraphStep([],[~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
StandardVerificationStrategy [V] [JanusGraphStep([],[~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
Final Traversal [JanusGraphStep([],[~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
and the final traversal contains two or
steps:
[
JanusGraphStep([],[~label.eq(sent-message-to)]),
OrStep([
[HasStep([sender.containing(alice)])],
[HasStep([receiver.containing(alice)])]
]),
OrStep([
[HasStep([sender.containing(bob)])],
[HasStep([receiver.containing(bob)])]
])]
I'd like to use the underlying mixed index to fetch the results, and preferably only one index query should be performed under the hood.
Is there a way to force this query to properly use the mixed index? Why is the explain plan in these two cases different?
Kind regards,
Mladen Marović