Bindings for graphs created using ConfiguredGraphFactory not working as expected
Hello,
I have a local setup of JanusGraph 0.6.0 with Cassandra 3.11.9. I am creating a graph using the ConfiguredGraphFactory. For this, I am using the bundled properties and YAML files and creating the graph by running the following commands from the Gremlin console (also bundled with the JanusGraph installation):
gremlin> :remote connect tinkerpop.server conf/remote.yaml session
gremlin> :remote console
gremlin> map.put('storage.backend', 'cql');
gremlin> map.put('storage.hostname', '127.0.0.1');
gremlin> map.put('graph.graphname', 'graph1');
gremlin> map.put('storage.username', 'myDBUsername');
gremlin> map.put('storage.password', 'myDBPassword');
gremlin> ConfiguredGraphFactory.createConfiguration(new MapConfiguration(map));
Once I have created the map, I try to access the graph and the traversal variables bound to it, but I get the following response:
gremlin> ConfiguredGraphFactory.open('graph1')
gremlin> graph1
No such property: graph1 for class: Script7
gremlin> graph1_traversal
No such property: graph1_traversal for class: Script8
graph.graphname=graph1_config
storage.hostname=127.0.0.1
storage.username=myDBUsername
storage.password=myDBPassword
According to the documentation, I should be able to access the bound variables. I was able to do this in version 0.3.1 of JanusGraph. What could I be missing or doing wrong? Thanks, Anya |
|
Re: Duplicate vertex issue with Uniqueness constraints | Janusgraph CQL
Pawan Shriwas
Hi Marc,
Checking for duplicate data with the uniqueness constraint on the name_cons field:
gremlin> g.V().has('gId',P.within('da209078-4a2f-4db2-b489-27da028df983','ba81f5d3-a29b-4a2c-88c3-c265ce3f68a5','9804b32d-31d9-409a-a441-a38fdbf998f7')).valueMap()
==>[gId:[da209078-4a2f-4db2-b489-27da028df983],entityGId:[9e51c70d-f148-401f-8eea-53b767d9bbb6],name_cons:[CGNAT_NS2]]
==>[gId:[ba81f5d3-a29b-4a2c-88c3-c265ce3f68a5],entityGId:[7e763ebc-b2e0-4d04-baaa-4463d04ca436],name_cons:[CGNAT_NS2]]
==>[gId:[9804b32d-31d9-409a-a441-a38fdbf998f7],entityGId:[23fd7efd-3688-4b58-aab6-173d25a8dd63],name_cons:[CGNAT_NS2]]
Reading the data through the unique index property (with consistency lock) returns only one record:
gremlin> g.V().has('name_cons','CGNAT_NS2').valueMap()
==>[gId:[290cc878-19e1-44f6-9f6c-62b7471e21bc],entityGId:[0b59889d-e725-46e5-9f42-d96daaeaa21d],name_cons:[CGNAT_NS2]]
Hope this clarifies!
On Mon, Nov 22, 2021 at 12:39 PM Pawan Shriwas via lists.lfaidata.foundation <shriwas.pawan=gmail.com@...> wrote:
--
Thanks & Regard PAWAN SHRIWAS |
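The two query results above (three vertices found by gId, one vertex found through the unique index) can be pictured with a toy model. The sketch below is plain Java, not JanusGraph code: the class `UniqueIndexModel` and its methods are hypothetical, and the assumption that a unique index keeps a single entry per value which a later write overwrites is an illustration, not a confirmed diagnosis of this case.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model only (hypothetical, not JanusGraph internals). */
public class UniqueIndexModel {
    // vertex id -> name property; TreeMap keeps ids in ascending order
    private final Map<Long, String> vertices = new TreeMap<>();
    // name -> the single vertex id the "unique index" points to (assumption)
    private final Map<String, Long> uniqueIndex = new HashMap<>();
    private long nextId = 1;

    /** Adds a vertex as if the uniqueness check were skipped or lost a race. */
    public long addWithoutCheck(String name) {
        long id = nextId++;
        vertices.put(id, name);
        uniqueIndex.put(name, id); // overwrites the entry of any earlier vertex
        return id;
    }

    /** Finds every vertex carrying the name, bypassing the unique index. */
    public List<Long> scanByName(String name) {
        List<Long> ids = new ArrayList<>();
        for (Map.Entry<Long, String> e : vertices.entrySet()) {
            if (e.getValue().equals(name)) {
                ids.add(e.getKey());
            }
        }
        return ids;
    }

    /** Looks the name up through the index: at most one id, or null. */
    public Long lookupByIndex(String name) {
        return uniqueIndex.get(name);
    }
}
```

In this model, three unchecked inserts of the same name leave three vertices behind while the index lookup reports only the most recent one, which matches the shape of the output above.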
|
Cleaning up old data in large graphs
Mladen Marović
Hello, I have a graph (JanusGraph 0.5.3 running on a CQL backend with an Elasticsearch index) that is updated in near real-time. About 50M new vertices and 100M new edges are added every month. A large part of these (around 90%) should be deleted after 1 year, and the customer may ask to change this period at a later date. The remaining 10% of the data has no fixed expiration period, but vertices are expected to be deleted when they have no more edges. Currently, I have a daily Spark job that deletes vertices and their edges by checking their
I was wondering if anyone has some suggestions or best practices on how to manage graph data with a retention period (that could change over time)? Best regards, Mladen Marović |
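One way to keep the retention period adjustable is to store only a creation timestamp on each element and compute the cutoff at deletion time, rather than writing an expiry date into the data. The sketch below shows that idea in plain Java. The class `RetentionPolicy` and its methods are hypothetical; they are not part of JanusGraph, and nothing here reflects how the Spark job described above is actually written.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper: decides which elements are past a configurable retention period. */
public class RetentionPolicy {
    private final Duration retention;

    public RetentionPolicy(Duration retention) {
        this.retention = retention;
    }

    /** Elements created strictly before this instant are candidates for deletion. */
    public Instant cutoff(Instant now) {
        return now.minus(retention);
    }

    /** True if the element was created before the cutoff derived from 'now'. */
    public boolean isExpired(Instant createdAt, Instant now) {
        return createdAt.isBefore(cutoff(now));
    }

    /** Returns the creation timestamps that fall before the cutoff. */
    public List<Instant> expired(List<Instant> createdAts, Instant now) {
        return createdAts.stream()
                .filter(t -> isExpired(t, now))
                .collect(Collectors.toList());
    }
}
```

Because the period is only an input to the cutoff calculation, changing it from one year to two requires a new `RetentionPolicy` instance and no rewrite of stored data.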
|
Re: Duplicate vertex issue with Uniqueness constraints | Janusgraph CQL
Pawan Shriwas
Hi Marc,
Yes, we are committing the transaction after each operation.
To your question (how do you know about "duplicate vertex creation" when "it returns only 1 record"?): the vertex is ingested several times with the same data, and the graph generates a different id each time. When we query the graph with these different ids, the returned list contains the same name multiple times, but when we retrieve the data by the name parameter (which has a unique index with lock consistency), the graph returns only one record. Hope this helps. Thanks, Pawan
On Sun, Nov 21, 2021 at 4:01 PM <hadoopmarc@...> wrote: Hi Pawan, --
Thanks & Regard PAWAN SHRIWAS |
|
Re: Duplicate vertex issue with Uniqueness constraints | Janusgraph CQL
hadoopmarc@...
Hi Pawan,
Your code mirrors the example at https://docs.janusgraph.org/advanced-topics/eventual-consistency/#data-consistency for the greatest part. Are you sure the changes on graphMgmt get committed? Also, how do you know about "duplicate vertex creation" when "it returns only 1 record"? Best wishes, Marc PS. Most of the software community reserves names starting with a verb for functions and class methods. Violating this convention (e.g. PropertyKey makePropertyKey) makes your code almost unreadable to others. |
|
Re: jvm.options broken
hadoopmarc@...
Hi Matthias,
Thanks for taking the trouble to report this. It took a while, but your report did not go unnoticed: https://github.com/JanusGraph/janusgraph/issues/2857 Best wishes, Marc |
|
Duplicate vertex issue with Uniqueness constraints | Janusgraph CQL
Pawan Shriwas
Hi Everyone, I am facing a duplicate vertex creation issue even though a unique index is present on that property, and when I retrieve the data through the same index it returns only 1 record. Please see the information below.
Storage Backend - Cassandra CQL
JanusGraph version - 0.5.2
Index - Composite
Uniqueness - True
Consistency - yes
Index Status - ENABLED
Below is the code snippet -
Index Status :
Thanks, Pawan |
|
Re: Diagnosing slow write speeds to BigTable
AC
I have a follow-up question in addition to my reply above: Is there any guide for understanding the JanusGraph metrics available? I have written a basic metrics integration but I'm finding it quite hard to interpret the metrics that are being produced. On Tue, Nov 16, 2021 at 12:35 PM AC via lists.lfaidata.foundation <acrane=twitter.com@...> wrote:
|
|
Re: Diagnosing slow write speeds to BigTable
AC
Hey again Boxuan, thanks for your help in this thread! 2) That is a good idea, I will try making some writes to BigTable outside of JanusGraph in this container. However, considering that the BigTable client stats and BigTable server stats both report low latencies from within the JanusGraph application, this is looking like a JanusGraph-related issue. I will report back with results today. On Tue, Nov 16, 2021 at 11:48 AM Boxuan Li <liboxuan@...> wrote: I am not an expert on this and I've never used BigTable or GCP before, but here are my two cents: |
|
Re: Diagnosing slow write speeds to BigTable
Boxuan Li
I am not an expert on this and I've never used BigTable or GCP before, but here are my two cents:
1) Did you test the read speed? Is it also very slow compared to writing? 2) Did you try using an HBase/Bigtable client (in the same GCP container as your JanusGraph instance) to write to your BigTable cluster? If it's also very slow then the problem might be with your network or other setups. Best, Boxuan |
|
Diagnosing slow write speeds to BigTable
AC
Hey there, folks. Firstly I want to say thanks for your help with the previous bug we uncovered.
I'm evaluating JanusGraph performance on BigTable and observing very slow write speeds when writing even a single vertex and committing a transaction. Starting a new transaction, writing a single vertex, and committing the transaction takes at minimum 5-6 seconds. BigTable metrics indicate that the backend is never taking more than 100ms (max) to perform a write. It's hard to imagine that any amount of overhead on the BigTable side would bring this up to 5-6 seconds. The basic BigTable stats inside our application also look reasonable.
Here is the current configuration:
"storage.backend": "hbase"
"metrics.enabled": true
"cache.db-cache": false
"query.batch": true
"storage.page-size": 1000
"storage.hbase.ext.hbase.client.connection.impl": "com.google.cloud.bigtable.hbase2_x.BigtableConnection"
"storage.hbase.ext.google.bigtable.grpc.retry.deadlineexceeded.enable": true
"storage.hbase.ext.google.bigtable.grpc.channel.count": 50
"storage.lock.retries": 5
"storage.lock.wait-time": 50.millis
This is running in a GCP container that is rather beefy and not doing anything else, and is located in the same region as the BigTable cluster. Other traffic to/from the container seems fine.
I'm currently using hbase-shaded-client rev 2.1.5 since that's aligned to JanusGraph 0.5.3 which we are currently using. I experimented with up to 2.4.8 and saw no difference. I'm also using bigtable-hbase-2.x-shaded 1.25.1, the latest stable revision.
I'm at a loss how to progress further with my diagnosis, as all evidence indicates that the latency is originating with JanusGraph's operation. How can I better find and eliminate the source of this latency? Thanks! |
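To narrow down where the 5-6 seconds go, it can help to time each phase (starting the transaction, writing the vertex, committing) separately. The following is a minimal sketch in plain Java. `PhaseTimer` is a hypothetical helper, and the phases are passed in as `Runnable`s so that the real JanusGraph calls, which are not shown here, can be substituted for the placeholders.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that records the wall-clock time of named phases. */
public class PhaseTimer {
    // LinkedHashMap keeps phases in the order they were first timed
    private final Map<String, Long> nanosByPhase = new LinkedHashMap<>();

    /** Runs the phase and adds its elapsed time under the given name. */
    public void time(String phase, Runnable body) {
        long start = System.nanoTime();
        try {
            body.run();
        } finally {
            // Recorded even if the phase throws, so failed commits are visible too
            long elapsed = System.nanoTime() - start;
            nanosByPhase.merge(phase, elapsed, Long::sum);
        }
    }

    /** Total time spent in the phase, in whole milliseconds (0 if never timed). */
    public long millis(String phase) {
        return nanosByPhase.getOrDefault(phase, 0L) / 1_000_000;
    }

    public Map<String, Long> nanosByPhase() {
        return nanosByPhase;
    }

    public static void main(String[] args) {
        PhaseTimer timer = new PhaseTimer();
        // Placeholders: replace the empty bodies with the real calls.
        timer.time("open transaction", () -> { });
        timer.time("add vertex", () -> { });
        timer.time("commit", () -> { });
        timer.nanosByPhase().forEach((phase, nanos) ->
                System.out.println(phase + ": " + nanos / 1_000_000.0 + " ms"));
    }
}
```

If one phase accounts for nearly all of the elapsed time, that narrows the search to the work done in that phase; if the time is spread evenly, the overhead is more likely per-call.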
|
Re: How to change GLOBAL_OFFLINE configuration when graph can't be instantiated
toom@...
Hi Marc,
Your solution works if the configuration hasn't been changed yet. If you change the index backend and set a wrong hostname, you cannot access your data anymore:
mgmt = graph.openManagement()
mgmt.set("index.search.backend", "elasticsearch")
mgmt.set("index.search.hostname", "non-existant.hostname")
mgmt.commit()
After that, the database cannot be opened. Regards, Toom. |
|
Re: Potential transaction issue (JG 0.6.0)
Boxuan Li
I agree with Sergey that "this problem was just hidden in the previous version as resources were not released properly".
I tried to reproduce in Java (not remote graph) but failed. @Charles, are you able to release the complete recipe of your code, or spot anything that I am missing? My code is as follows (you can put it in JanusGraphTest.java and run): @Test |
|
Re: Cassandra 4
hadoopmarc@...
Hi,
There is an issue tracking this, but no PR's yet, see: https://github.com/JanusGraph/janusgraph/issues/2325 Best wishes, Marc |
|
Cassandra 4
Kusnierz.Krzysztof@...
Hi, has anyone tried JG with Cassandra 4? Does it work?
|
|
Re: How to Merge Two Vertices in JanusGraph into single vertex
hadoopmarc@...
Hi Krishna,
Nope. However, you are not the first to ask, see: https://stackoverflow.com/questions/46363737/tinkerpop-gremlin-merge-vertices-and-edges/46435070#46435070 Best wishes, Marc |
|
How to Merge Two Vertices in JanusGraph into single vertex
krishna.sailesh2@...
Hi Folks |
|
Re: Usage of CustomID on Vertexes
hadoopmarc@...
Hi Hazal,
Your comment is correct: the graph.set-vertex-id feature is not documented further than this, so using it is not advised. You are also right that lookups in the index require additional processing. However, depending on the ordering of inserts and their distribution across JanusGraph instances, many lookups can be avoided if vertices are still in the JanusGraph cache. Also, using storage backend ids assigned by JanusGraph will be more efficient for vertex reads later on because of the id partitioning applied. So I support your presumption that using an indexed property is to be preferred. Best wishes, Marc |
|
Re: JanusGraph server clustering with NodeJS
hadoopmarc@...
I read some conceptual confusion, so let me try:
|
|
Re: JanusGraph server clustering with NodeJS
sppriyaindu@...
We are also facing a similar issue. Could you please tell us how to handle a JanusGraph cluster using Node.js?
|
|