Date   

Multiple or-steps are conflated when using the textRegex() predicate

Mladen Marović
 

Hello!

I came upon some unexpected behavior when running queries with multiple or() steps and string searches on mixed indexes and would like some clarification if this is intended or not.

I have a graph with a vertex label of type person and an edge label of type sent-message-to. All edges have the properties sender and receiver, and some others. A mixed index backed by elasticsearch is created for the edge label. Both sender and receiver are indexed (as STRING types), as well as some others.

The query that's causing me problems is:

g.E() \
    .hasLabel('sent-message-to') \
    .or( \
        has('sender', textRegex('.*alice.*')), \
        has('receiver', textRegex('.*alice.*')) \
    ).or( \
        has('sender', textRegex('.*bob.*')), \
        has('receiver', textRegex('.*bob.*')) \
    ).toList()

The query should return (roughly) messages between alices and bobs (and some edge cases where an alice bobowitz talks to an eve, but that's not important here). However, I'm getting some unexpected results where, for example, neither the sender nor the recipient contain the substring bob.

The explain plan for the query is as follows:

gremlin> g.E() \
......1>     .hasLabel('sent-message-to') \
......2>     .or( \
......3>         has('sender', textRegex('.*alice.*')), \
......4>         has('receiver', textRegex('.*alice.*')) \
......5>     ).or( \
......6>         has('sender', textRegex('.*bob.*')), \
......7>         has('receiver', textRegex('.*bob.*')) \
......8>     ).explain()
==>Traversal Explanation
=======================================================================================================================================================================================================================================================================================================
Original Traversal                          [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]

ConnectiveStrategy                    [D]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
IncidentToAdjacentStrategy            [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
RepeatUnrollStrategy                  [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
MatchPredicateStrategy                [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
PathRetractionStrategy                [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
EarlyLimitStrategy                    [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
FilterRankingStrategy                 [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
InlineFilterStrategy                  [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
AdjacentToIncidentStrategy            [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
CountStrategy                         [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
LazyBarrierStrategy                   [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
AdjacentVertexFilterOptimizerStrategy [P]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
AdjacentVertexHasIdOptimizerStrategy  [P]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
AdjacentVertexIsOptimizerStrategy     [P]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
JanusGraphLocalQueryOptimizerStrategy [P]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
JanusGraphStepStrategy                [P]   [JanusGraphStep([],[~label.eq(sent-message-to)]).Or(JanusGraphStep([],[sender.textRegex(.*alice.*)]),JanusGraphStep([],[receiver.textRegex(.*alice.*)]),JanusGraphStep([],[sender.textRegex(.*bob.*)]),JanusGraphStep([],[receiver.textRegex(.*bob.*)]))]
JanusGraphIoRegistrationStrategy      [P]   [JanusGraphStep([],[~label.eq(sent-message-to)]).Or(JanusGraphStep([],[sender.textRegex(.*alice.*)]),JanusGraphStep([],[receiver.textRegex(.*alice.*)]),JanusGraphStep([],[sender.textRegex(.*bob.*)]),JanusGraphStep([],[receiver.textRegex(.*bob.*)]))]
ProfileStrategy                       [F]   [JanusGraphStep([],[~label.eq(sent-message-to)]).Or(JanusGraphStep([],[sender.textRegex(.*alice.*)]),JanusGraphStep([],[receiver.textRegex(.*alice.*)]),JanusGraphStep([],[sender.textRegex(.*bob.*)]),JanusGraphStep([],[receiver.textRegex(.*bob.*)]))]
StandardVerificationStrategy          [V]   [JanusGraphStep([],[~label.eq(sent-message-to)]).Or(JanusGraphStep([],[sender.textRegex(.*alice.*)]),JanusGraphStep([],[receiver.textRegex(.*alice.*)]),JanusGraphStep([],[sender.textRegex(.*bob.*)]),JanusGraphStep([],[receiver.textRegex(.*bob.*)]))]

Final Traversal                             [JanusGraphStep([],[~label.eq(sent-message-to)]).Or(JanusGraphStep([],[sender.textRegex(.*alice.*)]),JanusGraphStep([],[receiver.textRegex(.*alice.*)]),JanusGraphStep([],[sender.textRegex(.*bob.*)]),JanusGraphStep([],[receiver.textRegex(.*bob.*)]))]

The final (seemingly incorrect) traversal explains the results I'm getting. However, what strikes me as odd is that the JanusGraphLocalQueryOptimizerStrategy seems to return a correct traversal with two separate or steps:

[
	GraphStep(edge,[]),
	HasStep([~label.eq(sent-message-to)]),
	OrStep([
		[HasStep([sender.textRegex(.*alice.*)])],
		[HasStep([receiver.textRegex(.*alice.*)])]
	]),
	OrStep([
		[HasStep([sender.textRegex(.*bob.*)])],
		[HasStep([receiver.textRegex(.*bob.*)])]
	])
]

but the following JanusGraphStepStrategy conflates the two or steps into a single one:

[
	JanusGraphStep(
		[],[~label.eq(sent-message-to)]
	)
	.Or(
		JanusGraphStep([],[sender.eq(alice)]),
		JanusGraphStep([],[receiver.eq(alice)]),
		JanusGraphStep([],[sender.eq(bob)]),
		JanusGraphStep([],[receiver.eq(bob)])
	)
]

, which should not be correct because (A or B) and (C or D) is not equal to (A or B or C or D).

What's more confusing is that if I replace the textRegex() predicate with the tinkerpop predicate containing(), I get the proper results, because the explain plan is different:

gremlin> g.E() \
......1>     .hasLabel('sent-message-to') \
......2>     .or( \
......3>         has('sender', containing('alice')), \
......4>         has('receiver', containing('alice')) \
......5>     ).or( \
......6>         has('sender', containing('bob')), \
......7>         has('receiver', containing('bob')) \
......8>     ).explain()
==>Traversal Explanation
=======================================================================================================================================================================================================================================================================================
Original Traversal                          [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]

ConnectiveStrategy                    [D]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
IncidentToAdjacentStrategy            [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
RepeatUnrollStrategy                  [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
MatchPredicateStrategy                [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
PathRetractionStrategy                [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
EarlyLimitStrategy                    [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
FilterRankingStrategy                 [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
InlineFilterStrategy                  [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
AdjacentToIncidentStrategy            [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
CountStrategy                         [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
LazyBarrierStrategy                   [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
AdjacentVertexFilterOptimizerStrategy [P]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
AdjacentVertexHasIdOptimizerStrategy  [P]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
AdjacentVertexIsOptimizerStrategy     [P]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
JanusGraphLocalQueryOptimizerStrategy [P]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
JanusGraphStepStrategy                [P]   [JanusGraphStep([],[~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
JanusGraphIoRegistrationStrategy      [P]   [JanusGraphStep([],[~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
ProfileStrategy                       [F]   [JanusGraphStep([],[~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
StandardVerificationStrategy          [V]   [JanusGraphStep([],[~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]

Final Traversal                             [JanusGraphStep([],[~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]

and the final traversal contains two or steps:

[
	JanusGraphStep([],[~label.eq(sent-message-to)]),
	OrStep([
		[HasStep([sender.containing(alice)])],
		[HasStep([receiver.containing(alice)])]
	]),
	OrStep([
		[HasStep([sender.containing(bob)])],
		[HasStep([receiver.containing(bob)])]
	])]

I'd like to use the underlying mixed index to fetch the results, and preferably only one index query should be performed under the hood.

Is there a way to force this query to properly use the mixed index? Why is the explain plan in these two cases different?

Kind regards,

Mladen Marović


Re: Indexing on sub-attribute of custom data type

Ronnie
 

Hi Marc,
Thanks for confirming about "creating an associated vertex which defines this custom data type" approach. In which case i would not be experimenting custom data types for now. Thanks for the details regarding the serializers for custom attributes - i am sure these will come handy for me.

Thanks!
Ronnie


Re: Could not call index

schwartz@...
 

At first I thought it might have something to do with 2 new indices I added yesterday, so I re-indexed them just in case. But the result is still the same.
I'm having a hard time what does the error means - is it an issue with a composite index? a mixed index? which index?

The same traversal steps worked in a development Janus container (whichever storage and index that come out of the box)


Re: Could not call index

schwartz@...
 

This is all I have in Stackdriver. If there's a way to see more details logs from inside the container, please tell how to get them and I'll gladly post them here.


Re: Could not call index

Boxuan Li
 

Hi, is this the full stacktrace? Is there a nested exception?


Could not call index

schwartz@...
 

Hi! Running JanusGraph 0.5.3 against BigTable, and ES is used as the Index backend.
For a particular traversal I'm seeing the error message below. No clue what this means and where to look for a solution.
Any assistance will be greatly appreciated! Thanks, Assaf

org.janusgraph.core.JanusGraphException: Could not call index at org.janusgraph.graphdb.util.SubqueryIterator.<init>(SubqueryIterator.java:68) at org.janusgraph.graphdb.transaction.StandardJanusGraphTx$3.execute(StandardJanusGraphTx.java:1354) at org.janusgraph.graphdb.transaction.StandardJanusGraphTx$3.execute(StandardJanusGraphTx.java:1246) at org.janusgraph.graphdb.query.MetricsQueryExecutor.lambda$execute$3(MetricsQueryExecutor.java:60) at org.janusgraph.graphdb.query.MetricsQueryExecutor.runWithMetrics(MetricsQueryExecutor.java:72) at org.janusgraph.graphdb.query.MetricsQueryExecutor.execute(MetricsQueryExecutor.java:60) at org.janusgraph.graphdb.query.QueryProcessor$LimitAdjustingIterator.getNewIterator(QueryProcessor.java:201) at org.janusgraph.graphdb.query.LimitAdjustingIterator.hasNext(LimitAdjustingIterator.java:69) at org.janusgraph.graphdb.query.ResultSetIterator.nextInternal(ResultSetIterator.java:55) at org.janusgraph.graphdb.query.ResultSetIterator.<init>(ResultSetIterator.java:45) at org.janusgraph.graphdb.query.QueryProcessor.iterator(QueryProcessor.java:67) at org.janusgraph.graphdb.query.graph.GraphCentricQueryBuilder$1.iterator(GraphCentricQueryBuilder.java:204) at org.janusgraph.graphdb.query.graph.GraphCentricQueryBuilder$1.iterator(GraphCentricQueryBuilder.java:201) at com.google.common.collect.Iterables.getOnlyElement(Iterables.java:302) at org.janusgraph.graphdb.database.StandardJanusGraph$1.retrieveSchemaByName(StandardJanusGraph.java:383) at org.janusgraph.graphdb.database.cache.MetricInstrumentedSchemaCache$1.retrieveSchemaByName(MetricInstrumentedSchemaCache.java:41) at org.janusgraph.graphdb.database.cache.StandardSchemaCache.getSchemaId(StandardSchemaCache.java:111) at org.janusgraph.graphdb.database.cache.MetricInstrumentedSchemaCache.getSchemaId(MetricInstrumentedSchemaCache.java:59) at org.janusgraph.graphdb.transaction.StandardJanusGraphTx.getSchemaVertex(StandardJanusGraphTx.java:950) at org.janusgraph.graphdb.transaction.StandardJanusGraphTx.getRelationType(StandardJanusGraphTx.java:970) at org.janusgraph.graphdb.query.QueryUtil.getType(QueryUtil.java:68) at org.janusgraph.graphdb.query.vertex.BasicVertexCentricQueryBuilder.constructQueryWithoutProfile(BasicVertexCentricQueryBuilder.java:483) at org.janusgraph.graphdb.query.vertex.BasicVertexCentricQueryBuilder.constructQuery(BasicVertexCentricQueryBuilder.java:416) at org.janusgraph.graphdb.query.vertex.VertexCentricQueryBuilder.execute(VertexCentricQueryBuilder.java:68) at org.janusgraph.graphdb.query.vertex.VertexCentricQueryBuilder.properties(VertexCentricQueryBuilder.java:100) at org.janusgraph.graphdb.tinkerpop.optimize.JanusGraphPropertiesStep.flatMap(JanusGraphPropertiesStep.java:137) at org.apache.tinkerpop.gremlin.process.traversal.step.map.FlatMapStep.processNextStart(FlatMapStep.java:49) at org.janusgraph.graphdb.tinkerpop.optimize.JanusGraphPropertiesStep.processNextStart(JanusGraphPropertiesStep.java:125) at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143) at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.hasNext(DefaultTraversal.java:197) at org.apache.tinkerpop.gremlin.process.traversal.step.map.CoalesceStep.flatMap(CoalesceStep.java:58) at org.apache.tinkerpop.gremlin.process.traversal.step.map.FlatMapStep.processNextStart(FlatMapStep.java:49) at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143) at org.apache.tinkerpop.gremlin.process.traversal.step.branch.BranchStep.standardAlgorithm(BranchStep.java:126) at org.apache.tinkerpop.gremlin.process.traversal.step.util.ComputerAwareStep.processNextStart(ComputerAwareStep.java:46) at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143) at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.hasNext(ExpandableStepIterator.java:42) at org.apache.tinkerpop.gremlin.process.traversal.step.util.ReducingBarrierStep.processAllStarts(ReducingBarrierStep.java:82) at org.apache.tinkerpop.gremlin.process.traversal.step.util.ReducingBarrierStep.processNextStart(ReducingBarrierStep.java:112) at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:128) at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:38) at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.next(DefaultTraversal.java:205) at org.apache.tinkerpop.gremlin.process.traversal.util.TraversalUtil.apply(TraversalUtil.java:44) at org.apache.tinkerpop.gremlin.process.traversal.util.TraversalUtil.applyNullable(TraversalUtil.java:87) at org.apache.tinkerpop.gremlin.process.traversal.step.map.GroupStep.projectTraverser(GroupStep.java:140) at org.apache.tinkerpop.gremlin.process.traversal.step.map.GroupStep.projectTraverser(GroupStep.java:56) at org.apache.tinkerpop.gremlin.process.traversal.step.util.ReducingBarrierStep.processAllStarts(ReducingBarrierStep.java:83) at org.apache.tinkerpop.gremlin.process.traversal.step.util.ReducingBarrierStep.processNextStart(ReducingBarrierStep.java:112) at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143) at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.next(ExpandableStepIterator.java:50) at org.apache.tinkerpop.gremlin.process.traversal.step.map.MapStep.processNextStart(MapStep.java:36) at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143) at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.next(ExpandableStepIterator.java:50) at org.apache.tinkerpop.gremlin.process.traversal.step.map.FlatMapStep.processNextStart(FlatMapStep.java:48) at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143) at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.hasNext(ExpandableStepIterator.java:42) at org.apache.tinkerpop.gremlin.process.traversal.step.util.CollectingBarrierStep.processNextStart(CollectingBarrierStep.java:107) at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143) at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.hasNext(ExpandableStepIterator.java:42) at org.apache.tinkerpop.gremlin.process.traversal.step.util.ReducingBarrierStep.processAllStarts(ReducingBarrierStep.java:82) at org.apache.tinkerpop.gremlin.process.traversal.step.util.ReducingBarrierStep.processNextStart(ReducingBarrierStep.java:112) at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143) at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.next(ExpandableStepIterator.java:50) at org.apache.tinkerpop.gremlin.process.traversal.step.map.MapStep.processNextStart(MapStep.java:36) at org.apache.tinkerpop.gremlin.process.traversal.step.map.SelectStep.processNextStart(SelectStep.java:156) at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143) at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.hasNext(DefaultTraversal.java:197) at org.apache.tinkerpop.gremlin.server.util.TraverserIterator.fillBulker(TraverserIterator.java:69) at org.apache.tinkerpop.gremlin.server.util.TraverserIterator.hasNext(TraverserIterator.java:56) at org.apache.tinkerpop.gremlin.server.op.traversal.TraversalOpProcessor.handleIterator(TraversalOpProcessor.java:512) at org.apache.tinkerpop.gremlin.server.op.traversal.TraversalOpProcessor.lambda$iterateBytecodeTraversal$4(TraversalOpProcessor.java:411) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)


Re: Index stuck on INSTALLED (single instance of JanusGraph)

schwartz@...
 

THANK YOU!!! :)


Re: Index stuck on INSTALLED (single instance of JanusGraph)

sergeymetallic@...
 
Edited

I had a similar problem, how I solved it:
mgmt.getGraphIndex('my_index_name').getIndexStatus(mgmt.getPropertyKey("property_i_ised"))
==>INSTALLED
 graph.getOpenTransactions() - > showed me all the open transactions.
 graph.getOpenTransactions().getAt(0).commit() -> for every open transaction
 mgmt = graph.openManagement()
 mgmt.updateIndex(mgmt.getGraphIndex('my_index_name'), SchemaAction.REGISTER_INDEX).get() 
 mgmt.commit()
mgmt = graph.openManagement()
 mgmt.getGraphIndex('my_index_name').getIndexStatus(mgmt.getPropertyKey("property_i_ised"))
==>REGISTERED
 


Re: Index stuck on INSTALLED (single instance of JanusGraph)

schwartz@...
 

It seems that I had lots of instances registered in the cluster, probably due to shutdowns.
I got the list by using mgmt.getOpenInstances().toList()

I closed all instances expect for the current one, and committed, hoping that this would move the index status to REGISTERED.
Yet, nothing happens


Index stuck on INSTALLED (single instance of JanusGraph)

schwartz@...
 

I tried adding a composite index based on 2 existing properties.
As far as I understand, the initial stated is INSTALLED, then after all instances become aware of it, it should be REGISTERED.
Only then, I should re-index to make the index ENABLED.

My index remains INSTALLED. The JanusGraph server has no other instances (GKE deployment, with just 1 replica).
What needs to be done for the index to transition from INSTALLED to REGISTERED?

Many thanks!
Assaf


Re: How to replay a transaction log from the begining

Boxuan Li
 

Hi Ojas,

Ideally, by using Instant.now() to add your log processor, you should be able to see your callback invoked as soon as the transaction completes (if you are using a single in-memory storage backend), or with a minimal delay (depending on the read latency of your storage backend).

The time difference in your log looks a bit weird to me. Can you check if there is a clock drift among your servers?

Best,
Boxuan

「ojas.dubey via lists.lfaidata.foundation <ojas.dubey=amdocs.com@...>」在 2021年6月30日 週三,下午10:12 寫道:

Hi Boxuan,

Thanks. This indeed helped. Initially nothing happened (or at least it appeared that way) so I changed the start time to EPOCH and left the application running for a while and after sometime the callback was executed. 

So I was wondering how the log processor uses the start time value to replay the log and why did it take a long time to replay the logs. Is there a way by which I can reduce the time by setting the correct UTC time to start time (as i dont want to use EPOCH everytime) so that the callback is executed immediately?

Also is there a difference in the values of Instant.now() used by ReadMarker vs the actual local time used by the applicatioon because the ReadMarker initialization logs showed a different time. e.g.

2021-06-30T13:21:32.003+05:30 INFO |InternalEventLogger|||||||o.j.diskstorage.log.kcvs.KCVSLog|Loaded identified ReadMarker start time 2021-06-30T04:00:00Z into org.janusgraph.diskstorage.log.kcvs.KCVSLog$MessagePuller@4051e47b
2021-06-30T13:21:32.008+05:30 INFO |InternalEventLogger|||||||o.j.diskstorage.log.kcvs.KCVSLog|Loaded identified ReadMarker start time 2021-06-30T04:00:00Z into org.janusgraph.diskstorage.log.kcvs.KCVSLog$MessagePuller@6a332fb7
2021-06-30T13:21:32.013+05:30 INFO |InternalEventLogger|||||||o.j.diskstorage.log.kcvs.KCVSLog|Loaded identified ReadMarker start time 2021-06-30T04:00:00Z into org.janusgraph.diskstorage.log.kcvs.KCVSLog$MessagePuller@1f2cf847
2021-06-30T13:21:32.015+05:30 INFO |InternalEventLogger|||||||o.j.diskstorage.log.kcvs.KCVSLog|Loaded identified ReadMarker start time 2021-06-30T04:00:00Z into org.janusgraph.diskstorage.log.kcvs.KCVSLog$MessagePuller@3cc2c61b

while the application log shows another time

2021-06-30T13:26:15.794+05:30 INFO |InternalEventLogger||c.a.o.s.b.s.i.Test|Started tx standardjanusgraphtx[0x39c4068c] for requestId 5ba073c8-68c2-4356-8097-2e62ef56299a and batchId 9632dceb-7996-4464-91d8-1b157fc8ca00


Regards,
Ojas 


Re: Indexing on sub-attribute of custom data type

hadoopmarc@...
 

Regarding documentation on custom attributes: Jason Plurad published an example project a few years ago (so, for an older JanusGraph version).

See, https://github.com/pluradj/janusgraph-attribute-serializer


Re: Indexing on sub-attribute of custom data type

hadoopmarc@...
 

Hi Ronnie,

Actually, "creating an associated vertex which defines this custom data type" sounds like an excellent idea! If an attribute is important enough to define an index on, it probably deserves to be a first class citizen in the graph.

Answers to the other questions:

  1.     Not for the MixedIndex; it does not support Object type keys (https://docs.janusgraph.org/index-backend/search-predicates/#data-type-support). A CompositeIndex index on an Object type key is possible, but it would still be an ugly approach because the index would compare entire objects based on the implemented "equals" method of the custom attribute (which in your case would compare one specific attribute).
  2.     I tried to test this in an example, but got stuck (for now) on the limited documentation given in https://docs.janusgraph.org/advanced-topics/serializer/. Otherwise, there is no reason why these cardinalities would not be supported. If you are still interested, we can try and get this working (and add it to the documentation).
Best wishes,

Marc


Indexing on sub-attribute of custom data type

Ronnie
 

Hi,
Few questions related to custom data types (https://docs.janusgraph.org/basics/common-questions/#custom-class-datatype)
1. Is it possible to index on a sub-attribute of a custom data type? If not, is there any other alternative other than creating an associated vertex which defines this custom data type?
2. Is attribute cardinality like SET / LIST supported with custom data type?

Thanks,
Ronnie


Re: Union with Count returning unexpected results

hadoopmarc@...
 

Hi Vinayak,

I guess this has to do with differences in lazy vs eager evaluation between the two queries. The TinkerPop ref docs reference the aggregated values with cap('ACount','E1Count','BCount','E2Count','CCount'), rather than with select(), to force eager evaluation, see: https://tinkerpop.apache.org/docs/current/reference/#store-step

Best wishes,    Marc

For other readers, please find the queries from the original post in a better readable format:

g2.inject(1).union(
  V().has('title', 'A').aggregate('v1').union(
    outE().has('title', 'E1').aggregate('e').inV().has('title', 'B'),
    outE().has('title', 'E2').aggregate('e').inV().has('title','C')
    ).aggregate('v2')
  ).
  select('v1').dedup().as('sourceCount').
  select('e').dedup().as('edgeCount').
  select('v2').dedup().as('destinationCount').
  select('sourceCount','edgeCount','destinationCount').by(unfold().count())


g2.inject(1).union(
  V().has('title', 'A').aggregate('A').union(
    outE().has('title', 'E1').aggregate('E1').inV().has('title', 'B').aggregate('B'),
    outE().has('title', 'E2').aggregate('E2').inV().has('title','C').aggregate('C')
    )
  ).
  select('A').dedup().as('ACount').
  select('E1').dedup().as('E1Count').
  select('B').dedup().as('BCount').
  select('E2').dedup().as('E2Count').
  select('C').dedup().as('CCount').
  select('ACount','E1Count','BCount','E2Count','CCount').by(unfold().count())


Re: How to replay a transaction log from the begining

ojas.dubey@...
 

Hi Boxuan,

Thanks. This indeed helped. Initially nothing happened (or at least it appeared that way) so I changed the start time to EPOCH and left the application running for a while and after sometime the callback was executed. 

So I was wondering how the log processor uses the start time value to replay the log and why did it take a long time to replay the logs. Is there a way by which I can reduce the time by setting the correct UTC time to start time (as i dont want to use EPOCH everytime) so that the callback is executed immediately?

Also is there a difference in the values of Instant.now() used by ReadMarker vs the actual local time used by the applicatioon because the ReadMarker initialization logs showed a different time. e.g.

2021-06-30T13:21:32.003+05:30 INFO |InternalEventLogger|||||||o.j.diskstorage.log.kcvs.KCVSLog|Loaded identified ReadMarker start time 2021-06-30T04:00:00Z into org.janusgraph.diskstorage.log.kcvs.KCVSLog$MessagePuller@4051e47b
2021-06-30T13:21:32.008+05:30 INFO |InternalEventLogger|||||||o.j.diskstorage.log.kcvs.KCVSLog|Loaded identified ReadMarker start time 2021-06-30T04:00:00Z into org.janusgraph.diskstorage.log.kcvs.KCVSLog$MessagePuller@6a332fb7
2021-06-30T13:21:32.013+05:30 INFO |InternalEventLogger|||||||o.j.diskstorage.log.kcvs.KCVSLog|Loaded identified ReadMarker start time 2021-06-30T04:00:00Z into org.janusgraph.diskstorage.log.kcvs.KCVSLog$MessagePuller@1f2cf847
2021-06-30T13:21:32.015+05:30 INFO |InternalEventLogger|||||||o.j.diskstorage.log.kcvs.KCVSLog|Loaded identified ReadMarker start time 2021-06-30T04:00:00Z into org.janusgraph.diskstorage.log.kcvs.KCVSLog$MessagePuller@3cc2c61b

while the application log shows another time

2021-06-30T13:26:15.794+05:30 INFO |InternalEventLogger||c.a.o.s.b.s.i.Test|Started tx standardjanusgraphtx[0x39c4068c] for requestId 5ba073c8-68c2-4356-8097-2e62ef56299a and batchId 9632dceb-7996-4464-91d8-1b157fc8ca00


Regards,
Ojas 


Re: How to replay a transaction log from the begining

Boxuan Li
 

Hi Ojas,

Your `startLogProcessor` method looks good to me. I suspect that you are not using the transaction returned in step 1 to do the vertex/edge operations. In step 2, you are using `g.addV` which automatically starts a new anonymous transaction. To commit using that transaction, you will do `g.tx().commit()`, and of course, it will not be captured by your log processor. Therefore, you need to make sure you are using the transaction associated with the log processor to do the mutations.

Try replacing `g` with `tx.traversal()` where `tx` is returned in step 1. Then, your code should look like this:

JanusGraphTransaction tx = startJanusGraphTransaction(identifier);
tx.traversal().addV("idVertex").property(id, "uuid").next();
tx.commit();

Hope this helps.

Best,
Boxuan
 


Re: How to replay a transaction log from the begining

ojas.dubey@...
 

Hi Boxuan,
Please find the code below:

1. Starting the transaction (identifier value is TestBatchLogger)

public JanusGraphTransaction startJanusGraphTransaction(String identifier) {
    return janusGraphSchema.getConfiguredGraph().buildTransaction().logIdentifier(identifier).start();
}

2. Multiple add vertex/edge operations on the graph through  (e.g.) 

GraphTraversal<Vertex, Vertex> traversal = g.addV("idVertex")
.property(id, "uuid");
return traversal.next();

Here g is the gremlin GraphTraversalSource object obtained from JanusGraphFactory.open(<graphConfigPropertiesFile>).traversal()

3. Commit on the transaction object returned by the start transaction method.

So I wanted to replay the logs of this transaction. For this I made a call to the below method

public void startLogProcessor(String identifier) {
LogProcessorFramework logProcessor =
JanusGraphFactory.openTransactionLog(graph);
logProcessor.addLogProcessor(identifier).
setProcessorIdentifier("BatchTxLogger").
setStartTime(Instant.now()).
addProcessor((tx, txId, changeState) -> {
System.out.println("tx--"+tx.toString() + "  txId--"+txId.toString()
+"  changeState--"+changeState.toString());
for (JanusGraphVertex v : changeState.getVertices(Change.ANY)) {
System.out.println(v.label());
}
}).build();
}

But here I am unable to get the sysout. Tried different combinations of startTime (Instance.EPOCH, Instance.now().minusMillis(500) etc.) but did not get the println output on the console (No exception or error in any case).
I also tried removing the identifier which gave the invalid readmarker error. So after checking the class files I also removed the start time to resolve the error. But still no console output :(

Regards,
Ojas


Re: How to replay a transaction log from the begining

Boxuan Li
 

Hi Ojas,

Can you share your code and explain what you mean by "unable to work"? Is it running but not producing results as you expected, or encountering errors/exceptions?

Best,
Boxuan


Re: How to replay a transaction log from the begining

ojas.dubey@...
 

Hi,
 
Was wondering if this had been implemented.
 
I am running JanusGraph over Cassandra and was trying to work with the transaction log feature using the provided documentation.
 
So far I have managed to start the transaction with the identifier (the ulog tables are created in cassandra) but am still unable to get the Java callback to work. Have browsed through some threads here as well but still not able to get it to work.
 
Any help is appreciated.
 
 
Regards,
Ojas