Changing graphname at runtime
Diglio A. Simoni
Note: I cross-posted this to https://groups.google.com/g/gremlin-users to reach a broader audience, so if you're a member of both groups, please accept my apologies!
Hello, I have a consumer API that talks to JanusGraph. It's configured to connect to a ConfiguredGraphFactory graph with graphname "Graph_A". I want to update Graph_A in *its entirety*, which implies dropping it and recreating it from scratch. The problem is that this process takes a long time, and I don't want the system to be down while Graph_A is being rebuilt. So I have another ConfiguredGraphFactory graph with graphname "Graph_B". I take whatever time is required to create Graph_B, and when it's done, I need to stop the system, change the configuration of the API to connect to Graph_B, and restart the system.
But I don't want to do that.
Instead, what I'd like to do is this: when Graph_B is ready, I would call ConfiguredGraphFactory.drop('Graph_A') and *rename* Graph_B to Graph_A.
Is that possible? If not, does anybody have another solution? This is akin to double buffering in computer graphics.
hadoopmarc@...
Is this what you are looking for (it includes an explicit example):
https://docs.janusgraph.org/basics/configured-graph-factory/#updating-configurations
You can version your graph in the storage and indexing backends, but keep the graph name facing the end user the same.
Best wishes, Marc
Diglio A. Simoni
OK, so that I'm clear, what you're suggesting is that I try something like:
// Create and open main graph
map = new HashMap();
map.put("storage.backend", "hbase");
map.put("storage.hostname", "xx.xx.xx.xx,yy.yy.yy.yy,zz.zz.zz.zz");
map.put("storage.hbase.table", "TABLE_A");
map.put("graph.graphname", "GRAPH");
configuration = new MapConfiguration(map);
configuration.setDelimiterParsingDisabled(true);
ConfiguredGraphFactory.createConfiguration(configuration);
graph = ConfiguredGraphFactory.open("GRAPH");

// Create, open and update replacement graph
map = new HashMap();
map.put("storage.backend", "hbase");
map.put("storage.hostname", "xx.xx.xx.xx,yy.yy.yy.yy,zz.zz.zz.zz");
map.put("storage.hbase.table", "TABLE_B");
map.put("graph.graphname", "GRAPH_TEMP");
configuration = new MapConfiguration(map);
configuration.setDelimiterParsingDisabled(true);
ConfiguredGraphFactory.createConfiguration(configuration);
graph = ConfiguredGraphFactory.open("GRAPH_TEMP");

// Modify GRAPH_TEMP and when it's time to make that the live one:
map = new HashMap();
map.put("storage.hbase.table", "TABLE_B");
// updateConfiguration takes a Configuration, so wrap the map
ConfiguredGraphFactory.updateConfiguration("GRAPH", new MapConfiguration(map));
graph = ConfiguredGraphFactory.open("GRAPH");

But that raises some additional questions:
hadoopmarc@...
You really have to try this out and see. I can only answer from what I read in the ref docs.
> Do I need to ConfiguredGraphFactory.close(GRAPH) before I update its configuration?
The docs say the binding between graph name and graph instance renews every 20 seconds, so maybe this is not necessary.

> What happens to GRAPH_TEMP? Wouldn't it be still pointing to the same storage backend HBase table as GRAPH, i.e. to TABLE_B?
GRAPH_TEMP is just a name in the JanusGraphManager memory. It does not matter.

> If I want to reuse the same scheme, I'd have to have some logic that the next time around I need to renew GRAPH, I have GRAPH_TEMP talk to TABLE_A instead and then switch GRAPH to use TABLE_A, correct?
You are right. I would prefer straight versioning or a timestamp in the table name, or the reuse of names will bite you some day. Of course, you would drop TABLE_A from the storage backend if not needed anymore.

Best wishes, Marc
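
The "timestamp in the table name" suggestion could be sketched roughly as below. This is only an illustration, not something from the JanusGraph docs: the class name VersionedTableName, the helper tableNameFor, the "GRAPH" prefix and the yyyyMMdd_HHmmss format are all assumptions made up for this example. The idea is that each rebuild gets a fresh table name, so TABLE_A and TABLE_B never need to be reused in alternation.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: derives a unique storage table name per rebuild.
public class VersionedTableName {
    // UTC timestamp, so names sort chronologically regardless of server time zone.
    private static final DateTimeFormatter STAMP =
        DateTimeFormatter.ofPattern("yyyyMMdd_HHmmss").withZone(ZoneOffset.UTC);

    // prefix and builtAt are supplied by the caller; nothing here talks to JanusGraph.
    public static String tableNameFor(String prefix, Instant builtAt) {
        return prefix + "_" + STAMP.format(builtAt);
    }

    public static void main(String[] args) {
        // Fixed instant so the printed name is reproducible.
        String table = tableNameFor("GRAPH", Instant.parse("2021-03-01T12:30:00Z"));
        System.out.println(table); // prints GRAPH_20210301_123000
        // In the snippets earlier in this thread, this value would take the place
        // of "TABLE_B" in map.put("storage.hbase.table", ...).
    }
}
```

With names like these, the table that is no longer live can be dropped from the storage backend after the switch, and the next rebuild simply generates a new name.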