JG Schema - addConnection seems to create duplicate connections


Peter Molnar
 

Hi All,

I am seeing strange behaviour when using the addConnection method to create JG schema constraints. It seems to create duplicate connections in some cases. Please see below how to reproduce this with JG v0.6 and a Cassandra v3.11.11 backend. I ran the snippets below in Gremlin Console 3.5.1, connected to JG remotely.

The graph is supposed to model transactions and the parties involved in each transaction. It has transaction, person and entity nodes. A transaction is supposed to have one "from party" (either a person or an entity) connected with a FROM edge and one "to party" (either a person or an entity) connected with a TO edge. For example, (Person A) --FROM--> (Transaction #1) --TO--> (Entity A) expresses that "Transaction #1" was performed between "Person A" and "Entity A" and that the source of the transaction was "Person A".

Creating dynamic graph and schema
-------
map = new HashMap();
map.put("storage.backend","cql");
map.put("storage.hostname","cassandra");
map.put("query.force-index", "false");
map.put("schema.default", "default");
map.put("schema.constraints", "false");
map.put("graph.graphname", "transactionGraph")
ConfiguredGraphFactory.createConfiguration(new MapConfiguration(map));
 
graph = ConfiguredGraphFactory.open("transactionGraph");
mgmt = graph.openManagement();
 
transaction = mgmt.makeVertexLabel('transaction').make();
person = mgmt.makeVertexLabel('person').make();
entity = mgmt.makeVertexLabel('entity').make();
 
fromEdge = mgmt.makeEdgeLabel('FROM').multiplicity(ONE2ONE).make()
mgmt.addConnection(fromEdge, person, transaction)
mgmt.addConnection(fromEdge, entity, transaction)
 
toEdge = mgmt.makeEdgeLabel('TO').multiplicity(ONE2ONE).make()
mgmt.addConnection(toEdge, transaction, person)
mgmt.addConnection(toEdge, transaction, entity)
 
mgmt.commit()

Checking connections of the schema
----------------
mgmt = graph.openManagement()
edges = mgmt.getRelationTypes(EdgeLabel.class)
fromEdge = edges[0]
toEdge = edges[1]
 
fromEdge.mappedConnections().size() // as I would expect, it has two connections
toEdge.mappedConnections().size() // why are there 4 connections here? I would expect only two, as for the FROM edge
 
mgmt.close()

--------------

Could you please have a look and let me know if this is a feature or a bug?

Thanks, Peter


hadoopmarc@...
 

Hi Peter,

Thanks for reporting. I think it is a bug. I checked with the standalone gremlin REPL of janusgraph-0.6.0, using:
graph = JanusGraphFactory.open('conf/janusgraph-inmemory.properties')

This gives the same results, and if you add the toEdge connections first, the fromEdge gets 4 connections instead.

You can check that two of the four connections are redundant, that is, they refer to the same edge in the schema:

gremlin> edges[1].mappedConnections()
==>org.janusgraph.core.Connection@1fecfaea
==>org.janusgraph.core.Connection@4872669f
==>org.janusgraph.core.Connection@483f286e
==>org.janusgraph.core.Connection@4bb147ec
gremlin> edges[1].mappedConnections()[0].getConnectionEdge()
==>e[hs0-el-1th-st][525-~T$SchemaRelated->1037]
gremlin> edges[1].mappedConnections()[1].getConnectionEdge()
==>e[ikg-el-1th-171][525-~T$SchemaRelated->1549]
gremlin> edges[1].mappedConnections()[2].getConnectionEdge()
==>e[hs0-el-1th-st][525-~T$SchemaRelated->1037]
gremlin> edges[1].mappedConnections()[3].getConnectionEdge()
==>e[ikg-el-1th-171][525-~T$SchemaRelated->1549]
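
The same check can be done without comparing the output by eye. The following is only a sketch for the Gremlin Console, assuming `graph` is the graph opened earlier in this thread. It groups the reported connections by the id of their underlying schema edge, so redundant entries collapse into one group. Looking the label up by name with getEdgeLabel('TO') is my choice here instead of indexing into edges; the variable name byEdge is just illustrative.

```groovy
// Sketch only: assumes `graph` is an already opened JanusGraph instance.
mgmt = graph.openManagement()
toEdge = mgmt.getEdgeLabel('TO')   // look the label up by name rather than by position

// Group the reported connections by the id of the schema edge they wrap.
// Connections that refer to the same schema edge end up in the same group.
byEdge = toEdge.mappedConnections().groupBy { it.getConnectionEdge().id() }

toEdge.mappedConnections().size()  // raw count, redundant entries included
byEdge.size()                      // number of distinct schema edges

mgmt.rollback()                    // read-only check, nothing to commit
```

If the two sizes differ, the extra entries are repeats of an existing schema edge rather than additional constraints.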

Finally, I checked that the schema results remain the same if you add the following config properties to the graph (as suggested by the ref docs):
schema.default=none
schema.constraints=true
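
For anyone who wants to repeat this without editing a properties file, the same two settings can probably be passed programmatically. This is a minimal sketch, assuming the in-memory backend and the Gremlin Console; it is not the exact setup I used above, which relied on conf/janusgraph-inmemory.properties.

```groovy
// Sketch only: in-memory graph with schema constraints switched on.
graph = JanusGraphFactory.build().
    set('storage.backend', 'inmemory').
    set('schema.default', 'none').       // no automatic schema creation
    set('schema.constraints', true).     // enforce the addConnection constraints
    open()
```

The schema creation snippet from the original post can then be run against this graph unchanged.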

Can you please report this as an issue on: https://github.com/JanusGraph/janusgraph/issues

Best wishes,   Marc


On Tue, Jan 11, 2022 at 01:06 PM, Peter Molnar wrote:
mgmt = graph.openManagement();


Peter Molnar
 

Hi Marc,

Thanks a lot for looking into this. As requested, I filed an issue about this on GitHub: https://github.com/JanusGraph/janusgraph/issues/2950

Thanks,
Peter