Re: Importing Graphson with pre set IDs.
tschu...@...
Hi all,

I'm trying to understand the best way to load moderately sized chunks of a graph into JanusGraph. I wrote some Python mining routines that digest and parse data from various sources and convert it into graph form. I previously used a pre-2015 version of Titan with the pybulbs library, which used Rexster endpoints to call createOrGetVertex and then createEdge based on the node ID returned from createOrGetVertex. It wasn't really efficient, but for the individual load sizes it was manageable.

The pybulbs library is defunct, and it looks like Rexster was removed from newer versions of Gremlin. As such, I'm looking for another way of getting the data into a new JanusGraph instance. The most obvious way would be to have the Python routine convert the data to a GraphSON structure and import it as outlined in this thread.

But I'm confused about how to handle the internal IDs when establishing the edges. If I run two load batches, and NodeA is created in the first batch and assigned an internal ID, and is then referenced in an edge relationship in the second batch, how am I supposed to set the _outV and _inV IDs appropriately? At least createOrGetVertex would return the internal vertex ID if the vertex had already been created by another process.

Is there any consensus on the best approach for handling these cases?

Thank you in advance!

Regards,
Tim

On Tuesday, April 25, 2017 at 9:47:26 AM UTC-4, Jason Plurad wrote: