Re: Parameterized bulk insert (addV) script in gremlin-python
Scott Friedman
Wow, works like a charm using gremlin-python, and I don't even have to use a script!
Thanks for the quick wisdom! SF |
|
Re: Parameterized bulk insert (addV) script in gremlin-python
hadoopmarc@...
Hi Scott,
You can try to use this thread for inspiration: https://groups.google.com/g/gremlin-users/c/HtBRwaU0pnQ/m/duFs5-imBAAJ
I was impressed by this solution 2 1/2 years ago! It really iterates over the input data and adds multiple vertices.
Best wishes, Marc
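For readers who want a starting point, here is a minimal sketch of one way to loop over a bound list within a single request. It is an assumption about the general pattern (inject the list, unfold it, add one vertex per element), not necessarily the exact code from the linked thread. BULK_ADD_SCRIPT, chunked, bulk_add_names and the batch size of 500 are hypothetical names and values, and client is assumed to be a gremlin_python driver Client.

```python
from typing import Any, Iterator, List, Sequence

# Groovy script evaluated on the server; `values` is supplied as a binding.
# unfold() turns the bound list into one traverser per element, so addV()
# runs once per name. count() forces iteration and returns how many were added.
BULK_ADD_SCRIPT = (
    'g.inject(values).unfold().as("v")'
    '.addV().property("name", __.select("v")).count()'
)


def chunked(items: Sequence[Any], size: int) -> Iterator[List[Any]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def bulk_add_names(client: Any, names: Sequence[str], batch_size: int = 500) -> int:
    """Submit one parameterized request per batch; return the vertices added."""
    added = 0
    for batch in chunked(names, batch_size):
        # Same submit/all/result chain as the driver Client offers.
        result = client.submit(BULK_ADD_SCRIPT, {"values": batch}).all().result()
        added += int(result[0])
    return added
```

Batching keeps each request small; whether 500 is a sensible size depends on the server's frame and timeout settings, so treat it as a placeholder.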
|
Parameterized bulk insert (addV) script in gremlin-python
Scott Friedman
Good afternoon,
I'm attempting to use gremlin-python to do bulk vertex or edge inserts, and I'd figured I could use params to send in a simple script. A simple proof of concept would be:
cmd = 'g.addV().property("name", values)'
params = { 'values': ['name1', 'name2'] }
result_set = conn._client.submit(cmd, params)
...but when I execute that, I get a single vertex added with "[name1, name2]" as its name. I suppose this makes sense. Is there a way to issue a compact loop-based script over an arbitrary list in my parameters?
I could always forgo the script-based approach and use the Python API to make a massive query of repeated addV() calls (which is my present implementation), but I'd hoped that a parameter-based, script-based solution would be more efficient (and elegant).
Suggestions are very welcome!
Regards, Scott
|
Re: JG as a 3store, rdf support
Matthew Nguyen <nguyenm9@...>
Thanks Marc, hadn't seen ERGS but looks interesting and will take a look.
|
|
Re: Using a user-supplied string as vertex ID
Scott Friedman
Thanks, Boxuan! Looks like a great discussion in that github issue; I hope something eventually comes of it!
|
|
Re: Python output to mgmt queries
dimi
Hi Marc,
Thank you for your reply. I would like to access this information only from the schema (my graph is empty now). Your second solution could work, but it only prints the result, so to get the labels I would need to extract them with some regex.
However, I think I have found a solution. It seems that Python converts the object org.janusgraph.graphdb.types.VertexLabelVertex to gremlin_python.structure.graph.Vertex. I do not know if this is intended or accidental (for janusgraph-0.6.x with gremlin_python-3.5.1). However, I can get a list of labels if I convert the labels to strings before returning the result to Python. For example:
from gremlin_python.driver.client import Client
client = Client('ws://localhost:8182/gremlin', 'mygraph')
mgmt = "mygraph.openManagement()"
get_v_labels = mgmt + ".getVertexLabels().collect {a -> a.name()}"
client.submit(get_v_labels).all().result()
In this way, I can also get properties etc.
Thanks again and best wishes, Dimi
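The same idea can be wrapped in a small helper. This is a sketch under assumptions: schema_names_script and fetch_schema_names are hypothetical names, the graph is assumed to be bound on the server under the name passed in (mygraph above), and the rollback at the end is an addition so that the read-only management transaction is not left open.

```python
from typing import Any, List


def schema_names_script(graph_name: str, getter: str) -> str:
    """Build a Groovy script that returns plain strings instead of schema objects.

    `getter` is a management call such as "getVertexLabels()". The last
    expression, `names`, is what the server sends back.
    """
    if not graph_name.isidentifier():
        raise ValueError("graph_name must be a simple identifier")
    return (
        f"def mgmt = {graph_name}.openManagement(); "
        f"def names = mgmt.{getter}.collect {{ it.name() }}; "
        "mgmt.rollback(); "
        "names"
    )


def fetch_schema_names(client: Any, graph_name: str, getter: str) -> List[str]:
    """Run the script through a gremlin_python driver Client."""
    return client.submit(schema_names_script(graph_name, getter)).all().result()
```

For example, fetch_schema_names(client, "mygraph", "getVertexLabels()") should return the label names as strings. Other getters that return named schema elements, such as getRelationTypes(org.janusgraph.core.PropertyKey.class) for property keys, may work the same way, but I have not verified that against a running server.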
|
Re: JG as a 3store, rdf support
hadoopmarc@...
Hi Matthew,
Not an answer to your questions, but a few remarks that might help anyway:
https://github.com/IBM/expressive-reasoning-graph-store which seems pretty recent, though immature. Best wishes, Marc |
|
Re: Python output to mgmt queries
hadoopmarc@...
I am not sure what you are up to and the API changes in remote connections may have confused you.
If you want to see the labels of all vertices in the graph (for janusgraph-0.6.x with gremlin_python-3.5.1):
from gremlin_python.process.anonymous_traversal import traversal
If you want to see the vertex labels that you defined in the JanusGraph schema:
from gremlin_python.driver.client import Client
Best wishes, Marc
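The code in this post did not survive beyond its import lines, so the following is not a reconstruction of it. It is only a minimal sketch of how the first case (labels that actually occur on vertices) could look with a remote traversal source. distinct_vertex_labels and connect are hypothetical names, and the URL and the traversal source name 'g' are assumptions about the server configuration.

```python
from typing import Any, List


def distinct_vertex_labels(g: Any) -> List[str]:
    """Return each label that occurs on at least one vertex in the graph."""
    # This scans all vertices, so an empty graph yields an empty list.
    return g.V().label().dedup().toList()


def connect(url: str = "ws://localhost:8182/gremlin", source: str = "g") -> Any:
    """Open a remote traversal source; requires the gremlin_python package."""
    from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
    from gremlin_python.process.anonymous_traversal import traversal

    return traversal().withRemote(DriverRemoteConnection(url, source))
```

Because this reads labels from the data rather than from the schema, it does not help with an empty graph, which is the case discussed in the rest of this thread.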
|
Re: Using a user-supplied string as vertex ID
Boxuan Li
Hi Scott,
Currently, JanusGraph does not support user-specified string identifiers. You could check out https://github.com/JanusGraph/janusgraph/issues/1221 to see discussions on this topic. Best, Boxuan |
|
Re: Important | Queries for edge label connections
Hi Pawan,
Regarding your first question, try this in Java:
mgmt.getEdgeLabel("belongsTo").mappedConnections()
which should give you a list of Java objects that contain the outgoing and incoming labels for each connection. In the Gremlin console, you could do this:
mgmt.getEdgeLabel("knows").mappedConnections()[0].incomingVertexLabel
which is a bit hacky but should hopefully work. Feel free to create a feature request on GitHub and link to this thread.
Regarding your second question, IIRC there is no such API available.
Best regards, Boxuan
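For anyone working from Python, the call above can be sent as a script and the result tallied on the client side. This is a sketch with several assumptions: edge_connections_request and count_connections are hypothetical names, the graph is assumed to be bound on the server under graph_name, and outgoingVertexLabel is assumed to exist by symmetry with the incomingVertexLabel property shown above.

```python
from collections import Counter
from typing import Dict, Iterable, Sequence, Tuple


def edge_connections_request(graph_name: str, edge_label: str) -> Tuple[str, Dict[str, str]]:
    """Build a script and bindings that list (out label, in label) name pairs."""
    if not graph_name.isidentifier():
        raise ValueError("graph_name must be a simple identifier")
    script = (
        f"def mgmt = {graph_name}.openManagement(); "
        # The edge label is passed as a binding rather than spliced into the script.
        "def pairs = mgmt.getEdgeLabel(edgeLabel).mappedConnections()"
        ".collect { [it.outgoingVertexLabel.name(), it.incomingVertexLabel.name()] }; "
        "mgmt.rollback(); "
        "pairs"
    )
    return script, {"edgeLabel": edge_label}


def count_connections(pairs: Iterable[Sequence[str]]) -> Counter:
    """Count how often each (out label, in label) pair occurs."""
    return Counter((out_label, in_label) for out_label, in_label in pairs)
```

With a gremlin_python driver Client, usage would be along the lines of script, bindings = edge_connections_request("mygraph", "belongsTo") followed by count_connections(client.submit(script, bindings).all().result()).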
|
Using a user-supplied string as vertex ID
Scott Friedman
Greetings,
I'd like to specify unique string IDs for newly-added vertices in JanusGraph. I've verified that I can set graph.set-vertex-id to True and then add integer IDs via my (python) client as expected. Does JanusGraph support user-specified string identifiers in any fashion? If not, is there a recommended way to map into integers (e.g., a potentially lengthy MD5 hash?) or will such a long number damage JanusGraph's ID indexing? Thanks much for your time! Scott |
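To make the hashing idea concrete, here is a minimal sketch that derives a positive integer from a string by truncating its MD5 digest. string_to_vertex_id and the default width of 48 bits are hypothetical. JanusGraph reserves part of its 64-bit ID space for its own bookkeeping, so the usable width is an assumption that needs checking against the actual configuration.

```python
import hashlib


def string_to_vertex_id(key: str, bits: int = 48) -> int:
    """Map a string to a positive integer that fits in `bits` bits."""
    if not 1 <= bits <= 63:
        raise ValueError("bits must be between 1 and 63")
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Take the first 8 bytes of the digest and keep only the low `bits` bits.
    value = int.from_bytes(digest[:8], "big") & ((1 << bits) - 1)
    # Avoid 0 so that the result is always strictly positive.
    return value or 1
```

A truncated hash is not unique: collisions become likely once the number of keys approaches the square root of the ID space (on the order of 2^24 keys for 48 bits). Storing the original string as an indexed property and looking vertices up by it avoids that problem, at the cost of an index lookup.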
|
JG as a 3store, rdf support
Matthew Nguyen <nguyenm9@...>
Hey folks, been playing with JG the last couple weeks and am able to import a few million triples using rdf2g (cassandra/solr backend). I'm processing around 1000 triples/sec currently after turning on batch-loading and disabling a few pre-conditions :-). While this may be suitable for loading a few million triples, it will take far too long to load a billion+. I've also gotten sparql-gremlin working but haven't yet run it through its paces though I'm disheartened to see that the project appears to have been abandoned. Has anyone on here made any significant strides with rdf & JG and can share their experiences? And if there's a better place to discuss this topic, please advise. thx, matt |
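To put the numbers above in perspective, a quick back-of-the-envelope calculation (load_time_days is a hypothetical helper; the rate is the 1000 triples/sec quoted above and is assumed to stay constant):

```python
def load_time_days(triples: int, triples_per_second: float) -> float:
    """Estimate load time in days at a constant ingest rate."""
    seconds_per_day = 86_400
    return triples / triples_per_second / seconds_per_day


# One billion triples at 1000 triples/sec is 1,000,000 seconds.
billion_days = load_time_days(1_000_000_000, 1000)
```

That works out to roughly 11.6 days for a billion triples, compared with well under two hours for five million.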
|
Important | Queries for edge label connections
Pawan Shriwas
Hi All,
I need a solution for the following two things; I have tried but was not able to find one.
1. I want to list the edge label connections created in JanusGraph:
mgmt.addConnection("belongsTo", vertexLabel1, vertexLabel3);
mgmt.addConnection("belongsTo", vertexLabel3, vertexLabel4);
mgmt.addConnection("belongsTo", vertexLabel5, vertexLabel6);
printSchema shows only the edge labels, not how many times each one is used between vertex labels after the steps above.
2. I want to update the direction of one connection of an edge label.
Current: mgmt.addConnection("belongsTo", vertexLabel1, vertexLabel2); // direction is out, towards vertexLabel2
Expected: mgmt.addConnection("belongsTo", vertexLabel2, vertexLabel1);
I want only one direction to exist between these two vertex labels. If this call creates another connection, then I want the previous direction to be removed.
Please review and let me know how I can achieve this. Thanks in advance.
Thanks, Pawan
|
|
Python output to mgmt queries
Hi! I am trying to parse some basic info from the schema via Python but I am probably doing something wrong.
I can request the management info with the Client object in Python:
from gremlin_python.driver.client import Client
client = Client('ws://localhost:8182/gremlin', 'mygraph')
mgmt = "mygraph.openManagement()"
get_v_labels = mgmt + ".getVertexLabels()"
tt = client.submit(get_v_labels).all().result()
I have 16 labels, and if I run it in the Gremlin console I obtain a list of labels. In Python, instead, I get
[v[74253], v[74765], v[75277], v[75789], v[76301], v[76813], v[77325], v[77837], v[78349], v[78861], v[79373], v[79885], v[80397], v[80909], v[81421], v[81933]]
If I do
for t in tt:
    for p in g.V(t.id).properties():
        print("key:", p.label, "| value: ", p.value)
I do not get any output. How can I get the list of labels from the schema?
|
Re: high-scale-lib dependency
sergeymetallic@...
Hm, JanusGraph version 0.6.0 uses a different library, https://github.com/datastax/java-driver. Is there any point in keeping the dependency on Apache Cassandra?
|
|
Re: high-scale-lib dependency
Clement de Groc
Hey! Just wanted to report that we had a similar issue with high-scale-lib.
Replacing high-scale-lib with JCTools sounds like a good option, but I'm not sure it will work for all modules: if I'm not mistaken, Cassandra relies on `high-scale-lib` too. Another solution could be to exclude all classes under `java/util` from JanusGraph uber-jars. |
|
high-scale-lib dependency
There is a Java library dependency in JanusGraph, "high-scale-lib", which is old and unsupported (the last update was 8 years ago, see https://github.com/boundary/high-scale-lib). When JanusGraph is included as a library in another project, it causes IDE issues when loaded into Eclipse or VS Code with Java 11, because it contains packages like "java.*".
As a solution, I would suggest migrating to a successor project that is actively developed, https://github.com/JCTools/JCTools, which does not have such issues.
|
Re: High HBase backend 'configuration' row contention
hadoopmarc@...
Hi Tendai,
Just one thing came to my mind: did you apply the JanusGraphFactory inside a singleton object so that all tasks from all cores in a spark executor use the same JanusGraph instance? If not, this is an easy change to lower overhead due to connection setups. Best wishes, Marc |
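The singleton idea can be sketched as follows. It is shown in Python only to keep the examples on this page in one language; in a JVM Spark job the equivalent would be a Scala object or a Java static holder wrapping JanusGraphFactory. SharedGraph and opener are hypothetical names, and opener stands in for whatever call opens the graph.

```python
import threading
from typing import Any, Callable, Optional


class SharedGraph:
    """Process-wide holder so that all tasks reuse one graph instance."""

    _lock = threading.Lock()
    _instance: Optional[Any] = None

    @classmethod
    def get(cls, opener: Callable[[], Any]) -> Any:
        """Return the shared instance, calling `opener` only on first use."""
        if cls._instance is None:
            with cls._lock:
                # Re-check inside the lock: another thread may have won the race.
                if cls._instance is None:
                    cls._instance = opener()
        return cls._instance
```

With this in place, each task calls SharedGraph.get(...) instead of opening the graph itself, so the open call happens once per executor process rather than once per task.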
|
Re: High HBase backend 'configuration' row contention
Tendai Munetsi
Hi Marc,
Thanks for responding. The configuration row in question, which JanusGraph creates when the HBase table is first initialized, has slow read performance because the Spark executors (400+) access it simultaneously. Again, each executor creates an embedded JanusGraph instance, and we found that the JanusGraph instance accesses the config row every time the JanusGraphFactory's open() method is called (numerous times per executor). As a result, the executors try to read this row at the same time, which makes it slow to respond. The other graph data rows do NOT show this latency when reading or writing the graph. I hope that provides some clarification on the issue.
|
Re: Bindings for graphs created using ConfiguredGraphFactory not working as expected
anya.sharma@...
Hello Marc,
Thank you for your reply. In response to your suggestions and the questions you posed:
Are you sure you did start Cassandra ("cassandra/bin/cassandra") before starting JanusGraph?
- Yes, the Cassandra server is running. I am not using the Cassandra bundled with JanusGraph, but my own separate installation which was running at the time when I faced this issue.
Also check whether you did not mix up the graph1 and graph1_config graph.graphname values
- I double-checked, and the values for the graph name that I am using are correct.
I guess you found out to do (before running bin/janusgraph-server.sh): export JANUSGRAPH_YAML=conf/gremlin-server/gremlin-server-configuration.yaml
- I am running the server using the command gremlin-server.bat conf/gremlin-server/gremlin-server-configuration.yaml instead of storing the yaml file in an environment variable.
Apart from the above, I also ran the same commands as mentioned in my question on a 0.3.1 JanusGraph server and it worked:
Creating the graph:
Accessing the graph through the implicit variables:
The same steps give the issue posed in the question when run using JanusGraph 0.6.0:
Thanks
Anya
|