Index corruption after reindexing vertex-centric indices with direction 'BOTH'


Shiva Krishnan <shivain...@...>
 

I have noticed index corruption after reindexing a vertex-centric index with direction 'BOTH'.

Issue: a self-link is added to every vertex after reindexing.

To demonstrate this, I have taken a very small graph consisting of two vertices (A and B) and one bidirectional edge connecting the two.

I have also created a vertex-centric index.

gremlin> edgeLabel = mgmt.getEdgeLabel("link")
gremlin> assocKind = mgmt.getPropertyKey("assocKind")
gremlin> mgmt.buildEdgeIndex(edgeLabel, "myVertexCentricIndex", Direction.BOTH, assocKind)
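
For anyone trying to reproduce this, the three lines above do not show how `mgmt` was opened or committed. A minimal sketch of the full setup follows. It assumes `graph` is an open JanusGraph instance and that the `link` label and `assocKind` key already exist. The await step is modelled on the JanusGraph index-management docs and is not taken from the original session.

```groovy
// Sketch only: assumes `graph` is an open JanusGraph instance in the Gremlin console.
import org.janusgraph.graphdb.database.management.ManagementSystem

mgmt = graph.openManagement()
edgeLabel = mgmt.getEdgeLabel("link")
assocKind = mgmt.getPropertyKey("assocKind")

// Direction.BOTH asks for the index to be maintained on both end vertices of each edge.
// This call fails if an index with the same name already exists on the label.
mgmt.buildEdgeIndex(edgeLabel, "myVertexCentricIndex", Direction.BOTH, assocKind)
mgmt.commit()

// Wait (with the watcher's default timeout) for the index to become available,
// then print the resulting status report.
report = ManagementSystem.awaitRelationIndexStatus(graph, "myVertexCentricIndex", "link").call()
println report
```

If the edge label already holds data when the index is built, the index is not usable for queries until a reindex has run, which is the step where the problem described below appears.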

Please note that I have given the direction BOTH for the index.

I tried querying for the edge using the vertex-centric index.
Expectation: [A -> B] ==> one outE on vertex A and one inE on vertex B.

//Gremlin output:
gremlin> g.V().has('objId' , 'A').inE().hasLabel('link').has('assocKind',16)
//no IN edges to vertex A  Correct

gremlin> g.V().has('objId' , 'A').outE().hasLabel('link').has('assocKind',16)
==>e[4e1f-b6g-1bit-1pqw][14488-link->80024]   Correct

Then I ran a reindex on the index.

// 'index' is the vertex-centric index created above
gremlin> m.updateIndex(index, SchemaAction.REINDEX).get()
==>org.janusgraph.diskstorage.keycolumnvalue.scan.StandardScanMetrics@28babeca
gremlin> m.commit()
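
The transcript above does not show how `m` and `index` were obtained. One plausible way to get them, as a hedged sketch: it assumes `graph` is the open JanusGraph instance and reuses the label and index names from the schema step. The two `println` lines are added purely for diagnosis.

```groovy
// Sketch only: assumes `graph` is an open JanusGraph instance in the Gremlin console.
m = graph.openManagement()

// Vertex-centric (relation) indexes are looked up by relation type plus index name.
index = m.getRelationIndex(m.getEdgeLabel("link"), "myVertexCentricIndex")

// Diagnostic output: confirm which index is about to be rebuilt.
println index.getDirection()    // the direction the index was built with
println index.getIndexStatus()  // current schema status of the index

// Run the reindex job and block until it completes; get() returns the scan metrics.
metrics = m.updateIndex(index, SchemaAction.REINDEX).get()
m.commit()
```

Printing the direction before the job runs helps confirm that the index being rebuilt really is the `Direction.BOTH` one.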

After this, when I execute the same queries as above, I see an unexpected self-link on vertex A:

gremlin> g.V().has('_objId','A').inE().hasLabel('link').has('assocKind',16)
==>e[4e1f-b6g-1bit-b6g][14488-link->14488] Wrong

gremlin> g.V().has('_objId','A').bothE().hasLabel('link').has('assocKind',16)
==>e[4e1f-b6g-1bit-1pqw][14488-link->80024]
==>e[4e1f-b6g-1bit-b6g][14488-link->14488] Unexpected Link

The primary representation of the graph is not affected, though.

gremlin> g.E()
14:17:51 WARN  org.janusgraph.graphdb.transaction.StandardJanusGraphTx  - Query requires iterating over all vertices [()]. For better performance, use indexes
==>e[4e1f-b6g-1bit-1pqw][14488-link->80024] Correct
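
To double-check that the self-link exists only in the index and not in the stored edges, a full scan can be filtered for edges whose two end vertices are the same. This is a sketch, assuming `g` is the traversal source used above. It uses a lambda filter, which is fine for a tiny test graph but iterates over every edge.

```groovy
// Sketch only: full scan over all 'link' edges, keeping those that start and end
// on the same vertex. No vertex-centric index is involved in this traversal.
selfLoops = g.E().hasLabel('link').
        filter { it.get().outVertex().id() == it.get().inVertex().id() }.
        toList()

println "self-loop edges found by full scan: ${selfLoops.size()}"
```

Given the `g.E()` output above, which lists only the A -> B edge, this scan would be expected to find no self-loops. That would be consistent with the extra edge coming from the reindexed vertex-centric index rather than from the primary edge data.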

After reindexing, the vertex-centric index with direction BOTH is corrupted.

I'm using JanusGraph 0.2.3 for my testing.

Could anyone please help me with this issue?

Thanks
Shiva


Shiva Krishnan <shivain...@...>
 

Any help on this?


On Tuesday, May 5, 2020 at 2:24:30 PM UTC+5:30, Shiva Krishnan wrote:
I have noticed a corruption in index after reindexing vertex-centric indices with direction 'BOTH'.



Andrew Grosser <dio...@...>
 

Did you ever solve this?


On Friday, June 5, 2020 at 9:59:53 AM UTC-7 shi...@... wrote:
Any help on this?

