Importing GraphSON with preset IDs


Gwiz <feed...@...>
 

I have some GraphSON data that I am importing using the API. The graph gets imported, but the IDs are not preserved: all vertices get random IDs. Is there a way to preserve the IDs?

Code to import:

graph.io(IoCore.graphson()).readGraph("foo.json");

I also tried the following:

final GraphReader reader = GraphSONReader.build().create();
reader.readGraph(new FileInputStream("foo.json"), graph);


Sample GraphSON:

{ "inV": [], "id": 1, "label": "vertex", "type": "vertex", "outV": [ { "inV": 3, "inVLabel": "vertex", "outVLabel": "vertex", "id": 9, "label": "created", "type": "edge", "outV": 1, "properties": { "weight": 0.4 } }, { "inV": 2, "inVLabel": "vertex", "outVLabel": "vertex", "id": 7, "label": "knows", "type": "edge", "outV": 1, "properties": { "weight": 0.5 } }, { "inV": 4, "inVLabel": "vertex", "outVLabel": "vertex", "id": 8, "label": "knows", "type": "edge", "outV": 1, "properties": { "weight": 1 } } ], "properties": { "name": "marko", "age": 29 } }


Jason Plurad <plu...@...>
 

No. Like many other graph systems, JanusGraph assigns its own IDs. Your best approach for now is to create a composite index on a vertex property and look vertices up by that property instead of by ID, e.g.

mgmt = graph.openManagement();
name = mgmt.makePropertyKey("name").dataType(String.class).make();
byName = mgmt.buildIndex("byName", Vertex.class).addKey(name).buildCompositeIndex();
mgmt.commit();

http://docs.janusgraph.org/latest/indexes.html#_composite_index
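
To illustrate the idea behind the suggestion above, here is a minimal sketch in plain Java. It is not the JanusGraph API: the class SourceIdIndexSketch, the property name "sourceId", and the ID numbering are all hypothetical, and a HashMap stands in for the composite index. The store assigns its own vertex IDs, while the ID from the GraphSON file is kept as an indexed property so edge endpoints can still be resolved.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory stand-in for a graph store that assigns its own vertex IDs.
// NOT the JanusGraph API; it only illustrates the lookup pattern.
class SourceIdIndexSketch {
    // Arbitrary starting point and step for store-assigned IDs (assumption).
    private long nextId = 4096;

    // Plays the role of a composite index on a hypothetical "sourceId" property.
    private final Map<Long, Long> bySourceId = new HashMap<>();

    // Adds a vertex. The store picks the ID; the ID from the source file
    // is only remembered as an indexed property.
    long addVertex(long sourceId) {
        long assigned = nextId;
        nextId += 8;
        bySourceId.put(sourceId, assigned);
        return assigned;
    }

    // Index lookup: resolves a source-file ID to the store-assigned ID.
    long lookup(long sourceId) {
        Long assigned = bySourceId.get(sourceId);
        if (assigned == null) {
            throw new IllegalArgumentException("no vertex with sourceId " + sourceId);
        }
        return assigned;
    }

    public static void main(String[] args) {
        SourceIdIndexSketch store = new SourceIdIndexSketch();
        // Vertices 1..4 are the IDs referenced by the sample GraphSON.
        for (long id = 1; id <= 4; id++) {
            store.addVertex(id);
        }
        // Edge 9 in the sample: 1 -[created]-> 3, resolved through the index.
        System.out.println(store.lookup(1) + " -created-> " + store.lookup(3));
    }
}
```

In a real graph the map would be replaced by a query against the indexed property, so the original IDs stay usable as lookup keys even though they are not the internal IDs.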

On Monday, April 24, 2017 at 12:21:38 PM UTC-4, Gwiz wrote:


tschu...@...
 

Hi all,

So I'm trying to understand the best way of loading moderately sized chunks of a graph into JanusGraph. I wrote some Python mining routines which digest and parse data from various sources and convert them into graph form. I used a pre-2015 version of Titan and the pybulbs library, which used Rexster endpoints to call createOrGetVertex and then createEdge based on the node ID returned from createOrGetVertex. It wasn't really efficient, but for the individual load sizes it was manageable.

The pybulbs library is defunct and it looks like Rexster was removed from newer versions of Gremlin. As such, I'm looking for another way of getting the data into a new JanusGraph instance. The most obvious way would be to have the Python routine convert the data to a GraphSON structure and import it as outlined in this thread.

But I'm confused about how to handle the internal IDs when establishing the edges. If I run two load batches, and a NodeA instance is created in the first batch and assigned an internal ID, and is then referenced in an edge relationship in the second batch, how am I supposed to set the _outV and _inV IDs appropriately?

At least createOrGetVertex would return the internal vertex ID if the vertex had already been created by another process.

Is there any consensus on the best approach for handling these cases?
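
One way to picture the two-batch scenario described above is a get-or-create keyed on a natural property rather than on the internal ID. The following is only a sketch in plain Java: BatchLoaderSketch and its methods are hypothetical names, and a HashMap stands in for the index that a real graph would provide. On a TinkerPop 3 graph the same idea is commonly written as a traversal along the lines of g.V().has("name", x).fold().coalesce(unfold(), addV().property("name", x)), but that is not what this block runs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for a graph loaded in several batches.
// NOT the JanusGraph API; names and ID numbering are hypothetical.
class BatchLoaderSketch {
    private long nextId = 1;                                   // store-assigned internal IDs
    private final Map<String, Long> byName = new HashMap<>();  // stands in for an index on "name"
    private final List<long[]> edges = new ArrayList<>();      // each entry is {outV, inV}

    // Returns the internal ID for the given name, creating the vertex on first use.
    long getOrCreateVertex(String name) {
        return byName.computeIfAbsent(name, k -> nextId++);
    }

    // Edges are expressed in terms of names; internal IDs are resolved at load time.
    void addEdge(String outName, String inName) {
        long outV = getOrCreateVertex(outName);
        long inV = getOrCreateVertex(inName);
        edges.add(new long[] { outV, inV });
    }

    int vertexCount() { return byName.size(); }

    int edgeCount() { return edges.size(); }

    public static void main(String[] args) {
        BatchLoaderSketch graph = new BatchLoaderSketch();
        // Batch 1 creates NodeA and NodeB.
        graph.addEdge("NodeA", "NodeB");
        // Batch 2 references NodeA again; it is found, not duplicated.
        graph.addEdge("NodeC", "NodeA");
        System.out.println(graph.vertexCount() + " vertices, " + graph.edgeCount() + " edges");
    }
}
```

With this shape the load files never need to carry internal IDs: each batch refers to vertices by the indexed property, and the lookup supplies whatever ID the store assigned earlier.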

Thank you in advance!

Regards,

Tim


On Tuesday, April 25, 2017 at 9:47:26 AM UTC-4, Jason Plurad wrote: