Multiple or-steps are conflated when using the textRegex() predicate
Mladen Marović
Hello! I came upon some unexpected behavior when running queries with multiple I have a graph with a vertex label of type The query that's causing me problems is:
The query should return (roughly) messages between The explain plan for the query is as follows:
The final (seemingly incorrect) traversal explains the results I'm getting. However, what strikes me as odd is that the
but the following
, which should not be correct because What's more confusing is that if I replace the
and the final traversal contains two
I'd like to use the underlying mixed index to fetch the results, and preferably only one index query should be performed under the hood. Is there a way to force this query to properly use the mixed index? Why is the explain plan in these two cases different? Kind regards, Mladen Marović
|
|
Re: Indexing on sub-attribute of custom data type
Ronnie
Hi Marc,
Thanks for confirming about "creating an associated vertex which defines this custom data type" approach. In which case i would not be experimenting custom data types for now. Thanks for the details regarding the serializers for custom attributes - i am sure these will come handy for me. Thanks! Ronnie
|
|
Re: Could not call index
schwartz@...
At first I thought it might have something to do with 2 new indices I added yesterday, so I re-indexed them just in case. But the result is still the same.
I'm having a hard time what does the error means - is it an issue with a composite index? a mixed index? which index? The same traversal steps worked in a development Janus container (whichever storage and index that come out of the box)
|
|
Re: Could not call index
schwartz@...
This is all I have in Stackdriver. If there's a way to see more details logs from inside the container, please tell how to get them and I'll gladly post them here.
|
|
Re: Could not call index
Boxuan Li
Hi, is this the full stacktrace? Is there a nested exception?
|
|
Could not call index
schwartz@...
Hi! Running JanusGraph 0.5.3 against BigTable, and ES is used as the Index backend.
For a particular traversal I'm seeing the error message below. No clue what this means and where to look for a solution. Any assistance will be greatly appreciated! Thanks, Assaf org.janusgraph.core.JanusGraphException: Could not call index at org.janusgraph.graphdb.util.SubqueryIterator.<init>(SubqueryIterator.java:68) at org.janusgraph.graphdb.transaction.StandardJanusGraphTx$3.execute(StandardJanusGraphTx.java:1354) at org.janusgraph.graphdb.transaction.StandardJanusGraphTx$3.execute(StandardJanusGraphTx.java:1246) at org.janusgraph.graphdb.query.MetricsQueryExecutor.lambda$execute$3(MetricsQueryExecutor.java:60) at org.janusgraph.graphdb.query.MetricsQueryExecutor.runWithMetrics(MetricsQueryExecutor.java:72) at org.janusgraph.graphdb.query.MetricsQueryExecutor.execute(MetricsQueryExecutor.java:60) at org.janusgraph.graphdb.query.QueryProcessor$LimitAdjustingIterator.getNewIterator(QueryProcessor.java:201) at org.janusgraph.graphdb.query.LimitAdjustingIterator.hasNext(LimitAdjustingIterator.java:69) at org.janusgraph.graphdb.query.ResultSetIterator.nextInternal(ResultSetIterator.java:55) at org.janusgraph.graphdb.query.ResultSetIterator.<init>(ResultSetIterator.java:45) at org.janusgraph.graphdb.query.QueryProcessor.iterator(QueryProcessor.java:67) at org.janusgraph.graphdb.query.graph.GraphCentricQueryBuilder$1.iterator(GraphCentricQueryBuilder.java:204) at org.janusgraph.graphdb.query.graph.GraphCentricQueryBuilder$1.iterator(GraphCentricQueryBuilder.java:201) at com.google.common.collect.Iterables.getOnlyElement(Iterables.java:302) at org.janusgraph.graphdb.database.StandardJanusGraph$1.retrieveSchemaByName(StandardJanusGraph.java:383) at org.janusgraph.graphdb.database.cache.MetricInstrumentedSchemaCache$1.retrieveSchemaByName(MetricInstrumentedSchemaCache.java:41) at org.janusgraph.graphdb.database.cache.StandardSchemaCache.getSchemaId(StandardSchemaCache.java:111) at org.janusgraph.graphdb.database.cache.MetricInstrumentedSchemaCache.getSchemaId(MetricInstrumentedSchemaCache.java:59) at org.janusgraph.graphdb.transaction.StandardJanusGraphTx.getSchemaVertex(StandardJanusGraphTx.java:950) at org.janusgraph.graphdb.transaction.StandardJanusGraphTx.getRelationType(StandardJanusGraphTx.java:970) at org.janusgraph.graphdb.query.QueryUtil.getType(QueryUtil.java:68) at org.janusgraph.graphdb.query.vertex.BasicVertexCentricQueryBuilder.constructQueryWithoutProfile(BasicVertexCentricQueryBuilder.java:483) at org.janusgraph.graphdb.query.vertex.BasicVertexCentricQueryBuilder.constructQuery(BasicVertexCentricQueryBuilder.java:416) at org.janusgraph.graphdb.query.vertex.VertexCentricQueryBuilder.execute(VertexCentricQueryBuilder.java:68) at org.janusgraph.graphdb.query.vertex.VertexCentricQueryBuilder.properties(VertexCentricQueryBuilder.java:100) at org.janusgraph.graphdb.tinkerpop.optimize.JanusGraphPropertiesStep.flatMap(JanusGraphPropertiesStep.java:137) at org.apache.tinkerpop.gremlin.process.traversal.step.map.FlatMapStep.processNextStart(FlatMapStep.java:49) at org.janusgraph.graphdb.tinkerpop.optimize.JanusGraphPropertiesStep.processNextStart(JanusGraphPropertiesStep.java:125) at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143) at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.hasNext(DefaultTraversal.java:197) at org.apache.tinkerpop.gremlin.process.traversal.step.map.CoalesceStep.flatMap(CoalesceStep.java:58) at org.apache.tinkerpop.gremlin.process.traversal.step.map.FlatMapStep.processNextStart(FlatMapStep.java:49) at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143) at org.apache.tinkerpop.gremlin.process.traversal.step.branch.BranchStep.standardAlgorithm(BranchStep.java:126) at org.apache.tinkerpop.gremlin.process.traversal.step.util.ComputerAwareStep.processNextStart(ComputerAwareStep.java:46) at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143) at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.hasNext(ExpandableStepIterator.java:42) at org.apache.tinkerpop.gremlin.process.traversal.step.util.ReducingBarrierStep.processAllStarts(ReducingBarrierStep.java:82) at org.apache.tinkerpop.gremlin.process.traversal.step.util.ReducingBarrierStep.processNextStart(ReducingBarrierStep.java:112) at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:128) at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:38) at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.next(DefaultTraversal.java:205) at org.apache.tinkerpop.gremlin.process.traversal.util.TraversalUtil.apply(TraversalUtil.java:44) at org.apache.tinkerpop.gremlin.process.traversal.util.TraversalUtil.applyNullable(TraversalUtil.java:87) at org.apache.tinkerpop.gremlin.process.traversal.step.map.GroupStep.projectTraverser(GroupStep.java:140) at org.apache.tinkerpop.gremlin.process.traversal.step.map.GroupStep.projectTraverser(GroupStep.java:56) at org.apache.tinkerpop.gremlin.process.traversal.step.util.ReducingBarrierStep.processAllStarts(ReducingBarrierStep.java:83) at org.apache.tinkerpop.gremlin.process.traversal.step.util.ReducingBarrierStep.processNextStart(ReducingBarrierStep.java:112) at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143) at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.next(ExpandableStepIterator.java:50) at org.apache.tinkerpop.gremlin.process.traversal.step.map.MapStep.processNextStart(MapStep.java:36) at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143) at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.next(ExpandableStepIterator.java:50) at org.apache.tinkerpop.gremlin.process.traversal.step.map.FlatMapStep.processNextStart(FlatMapStep.java:48) at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143) at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.hasNext(ExpandableStepIterator.java:42) at org.apache.tinkerpop.gremlin.process.traversal.step.util.CollectingBarrierStep.processNextStart(CollectingBarrierStep.java:107) at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143) at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.hasNext(ExpandableStepIterator.java:42) at org.apache.tinkerpop.gremlin.process.traversal.step.util.ReducingBarrierStep.processAllStarts(ReducingBarrierStep.java:82) at org.apache.tinkerpop.gremlin.process.traversal.step.util.ReducingBarrierStep.processNextStart(ReducingBarrierStep.java:112) at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143) at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.next(ExpandableStepIterator.java:50) at org.apache.tinkerpop.gremlin.process.traversal.step.map.MapStep.processNextStart(MapStep.java:36) at org.apache.tinkerpop.gremlin.process.traversal.step.map.SelectStep.processNextStart(SelectStep.java:156) at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143) at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.hasNext(DefaultTraversal.java:197) at org.apache.tinkerpop.gremlin.server.util.TraverserIterator.fillBulker(TraverserIterator.java:69) at org.apache.tinkerpop.gremlin.server.util.TraverserIterator.hasNext(TraverserIterator.java:56) at org.apache.tinkerpop.gremlin.server.op.traversal.TraversalOpProcessor.handleIterator(TraversalOpProcessor.java:512) at org.apache.tinkerpop.gremlin.server.op.traversal.TraversalOpProcessor.lambda$iterateBytecodeTraversal$4(TraversalOpProcessor.java:411) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)
|
|
Re: Index stuck on INSTALLED (single instance of JanusGraph)
schwartz@...
THANK YOU!!! :)
|
|
Re: Index stuck on INSTALLED (single instance of JanusGraph)
I had a similar problem, how I solved it:
mgmt.getGraphIndex('my_index_name').getIndexStatus(mgmt.getPropertyKey("property_i_ised"))
==>INSTALLED
graph.getOpenTransactions() - > showed me all the open transactions.graph.getOpenTransactions().getAt(0).commit() -> for every open transaction mgmt = graph.openManagement() mgmt.updateIndex(mgmt.getGraphIndex('my_index_name'), SchemaAction.REGISTER_INDEX).get() mgmt.commit() mgmt = graph.openManagement() mgmt.getGraphIndex('my_index_name').getIndexStatus(mgmt.getPropertyKey("property_i_ised")) ==>REGISTERED
|
|
Re: Index stuck on INSTALLED (single instance of JanusGraph)
schwartz@...
It seems that I had lots of instances registered in the cluster, probably due to shutdowns.
I got the list by using mgmt.getOpenInstances().toList() I closed all instances expect for the current one, and committed, hoping that this would move the index status to REGISTERED. Yet, nothing happens
|
|
Index stuck on INSTALLED (single instance of JanusGraph)
schwartz@...
I tried adding a composite index based on 2 existing properties.
As far as I understand, the initial stated is INSTALLED, then after all instances become aware of it, it should be REGISTERED. Only then, I should re-index to make the index ENABLED. My index remains INSTALLED. The JanusGraph server has no other instances (GKE deployment, with just 1 replica). What needs to be done for the index to transition from INSTALLED to REGISTERED? Many thanks! Assaf
|
|
Re: How to replay a transaction log from the begining
Boxuan Li
Hi Ojas, Ideally, by using Instant.now() to add your log processor, you should be able to see your callback invoked as soon as the transaction completes (if you are using a single in-memory storage backend), or with a minimal delay (depending on the read latency of your storage backend). The time difference in your log looks a bit weird to me. Can you check if there is a clock drift among your servers? Best, Boxuan 「ojas.dubey via lists.lfaidata.foundation <ojas.dubey=amdocs.com@...>」在 2021年6月30日 週三,下午10:12 寫道:
Hi Boxuan,
|
|
Re: Indexing on sub-attribute of custom data type
hadoopmarc@...
Regarding documentation on custom attributes: Jason Plurad published an example project a few years ago (so, for an older JanusGraph version).
See, https://github.com/pluradj/janusgraph-attribute-serializer
|
|
Re: Indexing on sub-attribute of custom data type
hadoopmarc@...
Hi Ronnie,
Actually, "creating an associated vertex which defines this custom data type" sounds like an excellent idea! If an attribute is important enough to define an index on, it probably deserves to be a first class citizen in the graph. Answers to the other questions:
Marc
|
|
Indexing on sub-attribute of custom data type
Ronnie
Hi,
Few questions related to custom data types (https://docs.janusgraph.org/basics/common-questions/#custom-class-datatype) 1. Is it possible to index on a sub-attribute of a custom data type? If not, is there any other alternative other than creating an associated vertex which defines this custom data type? 2. Is attribute cardinality like SET / LIST supported with custom data type? Thanks, Ronnie
|
|
Re: Union with Count returning unexpected results
hadoopmarc@...
Hi Vinayak,
I guess this has to do with differences in lazy vs eager evaluation between the two queries. The TinkerPop ref docs reference the aggregated values with cap('ACount','E1Count','BCount','E2Count','CCount'), rather than with select(), to force eager evaluation, see: https://tinkerpop.apache.org/docs/current/reference/#store-step Best wishes, Marc For other readers, please find the queries from the original post in a better readable format: g2.inject(1).union( V().has('title', 'A').aggregate('v1').union( outE().has('title', 'E1').aggregate('e').inV().has('title', 'B'), outE().has('title', 'E2').aggregate('e').inV().has('title','C') ).aggregate('v2') ). select('v1').dedup().as('sourceCount'). select('e').dedup().as('edgeCount'). select('v2').dedup().as('destinationCount'). select('sourceCount','edgeCount','destinationCount').by(unfold().count()) g2.inject(1).union( V().has('title', 'A').aggregate('A').union( outE().has('title', 'E1').aggregate('E1').inV().has('title', 'B').aggregate('B'), outE().has('title', 'E2').aggregate('E2').inV().has('title','C').aggregate('C') ) ). select('A').dedup().as('ACount'). select('E1').dedup().as('E1Count'). select('B').dedup().as('BCount'). select('E2').dedup().as('E2Count'). select('C').dedup().as('CCount'). select('ACount','E1Count','BCount','E2Count','CCount').by(unfold().count())
|
|
Re: How to replay a transaction log from the begining
ojas.dubey@...
Hi Boxuan,
Thanks. This indeed helped. Initially nothing happened (or at least it appeared that way) so I changed the start time to EPOCH and left the application running for a while and after sometime the callback was executed. So I was wondering how the log processor uses the start time value to replay the log and why did it take a long time to replay the logs. Is there a way by which I can reduce the time by setting the correct UTC time to start time (as i dont want to use EPOCH everytime) so that the callback is executed immediately? Also is there a difference in the values of Instant.now() used by ReadMarker vs the actual local time used by the applicatioon because the ReadMarker initialization logs showed a different time. e.g. 2021-06-30T13:21:32.003+05:30 INFO |InternalEventLogger|||||||o.j.diskstorage.log.kcvs.KCVSLog|Loaded identified ReadMarker start time 2021-06-30T04:00:00Z into org.janusgraph.diskstorage.log.kcvs.KCVSLog$MessagePuller@4051e47b
2021-06-30T13:21:32.008+05:30 INFO |InternalEventLogger|||||||o.j.diskstorage.log.kcvs.KCVSLog|Loaded identified ReadMarker start time 2021-06-30T04:00:00Z into org.janusgraph.diskstorage.log.kcvs.KCVSLog$MessagePuller@6a332fb7
2021-06-30T13:21:32.013+05:30 INFO |InternalEventLogger|||||||o.j.diskstorage.log.kcvs.KCVSLog|Loaded identified ReadMarker start time 2021-06-30T04:00:00Z into org.janusgraph.diskstorage.log.kcvs.KCVSLog$MessagePuller@1f2cf847
2021-06-30T13:21:32.015+05:30 INFO |InternalEventLogger|||||||o.j.diskstorage.log.kcvs.KCVSLog|Loaded identified ReadMarker start time 2021-06-30T04:00:00Z into org.janusgraph.diskstorage.log.kcvs.KCVSLog$MessagePuller@3cc2c61b
while the application log shows another time 2021-06-30T13:26:15.794+05:30 INFO |InternalEventLogger||c.a.o.s.b.s.i.Test|Started tx standardjanusgraphtx[0x39c4068c] for requestId 5ba073c8-68c2-4356-8097-2e62ef56299a and batchId 9632dceb-7996-4464-91d8-1b157fc8ca00 Regards, Ojas
|
|
Re: How to replay a transaction log from the begining
Boxuan Li
Hi Ojas,
Your `startLogProcessor` method looks good to me. I suspect that you are not using the transaction returned in step 1 to do the vertex/edge operations. In step 2, you are using `g.addV` which automatically starts a new anonymous transaction. To commit using that transaction, you will do `g.tx().commit()`, and of course, it will not be captured by your log processor. Therefore, you need to make sure you are using the transaction associated with the log processor to do the mutations. Try replacing `g` with `tx.traversal()` where `tx` is returned in step 1. Then, your code should look like this: JanusGraphTransaction tx = startJanusGraphTransaction(identifier); Hope this helps. Best, Boxuan
|
|
Re: How to replay a transaction log from the begining
ojas.dubey@...
Hi Boxuan,
Please find the code below: 1. Starting the transaction (identifier value is TestBatchLogger) public JanusGraphTransaction startJanusGraphTransaction(String identifier) {
return janusGraphSchema.getConfiguredGraph().buildTransaction().logIdentifier(identifier).start();
}
2. Multiple add vertex/edge operations on the graph through (e.g.) GraphTraversal<Vertex, Vertex> traversal = g.addV("idVertex")
.property(id, "uuid");
return traversal.next();
Here g is the gremlin GraphTraversalSource object obtained from JanusGraphFactory.open(<graphConfigPropertiesFile>).traversal() 3. Commit on the transaction object returned by the start transaction method. So I wanted to replay the logs of this transaction. For this I made a call to the below method public void startLogProcessor(String identifier) {
LogProcessorFramework logProcessor =
JanusGraphFactory.openTransactionLog(graph);
logProcessor.addLogProcessor(identifier).
setProcessorIdentifier("BatchTxLogger").
setStartTime(Instant.now()).
addProcessor((tx, txId, changeState) -> {
System.out.println("tx--"+tx.toString() + " txId--"+txId.toString()
+" changeState--"+changeState.toString());
for (JanusGraphVertex v : changeState.getVertices(Change.ANY)) {
System.out.println(v.label());
}
}).build();
} But here I am unable to get the sysout. Tried different combinations of startTime (Instance.EPOCH, Instance.now().minusMillis(500) etc.) but did not get the println output on the console (No exception or error in any case). I also tried removing the identifier which gave the invalid readmarker error. So after checking the class files I also removed the start time to resolve the error. But still no console output :( Regards, Ojas
|
|
Re: How to replay a transaction log from the begining
Boxuan Li
Hi Ojas,
Can you share your code and explain what you mean by "unable to work"? Is it running but not producing results as you expected, or encountering errors/exceptions? Best, Boxuan
|
|
Re: How to replay a transaction log from the begining
ojas.dubey@...
Hi,
Was wondering if this had been implemented.
I am running JanusGraph over Cassandra and was trying to work with the transaction log feature using the provided documentation.
So far I have managed to start the transaction with the identifier (the ulog tables are created in cassandra) but am still unable to get the Java callback to work. Have browsed through some threads here as well but still not able to get it to work.
Any help is appreciated.
Regards,
Ojas
|
|