
Re: Removing a vertex is not removing recently added properties in different transaction

Priyanka Jindal
 

Please find my answers inline:

  • do you use CompositeIndex or MixedIndex?
- I am using CompositeIndex.
  • is it certain that the two transactions do not overlap in time (as "next" suggests)?
- They do not overlap in time.
  • do the two transactions occur in the same janusgraph instance?
- Yes, they do.
  • is hbase configured as a single host or as a cluster?
- It's a cluster.

If I add some delay between the two operations, then the vertices are removed correctly.


Re: Confused about GraphSON edges definition

hadoopmarc@...
 

Hi Laura,

If you want to know, you had better ask on the TinkerPop users list. Note that graphSON is not designed as a human-readable or standardized interchange format, but rather as an interchange format between TinkerPop-compatible processes. If you want to create or modify a graphSON file, it is easier to instantiate a TinkerGraph and use the TinkerPop API.
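A minimal sketch of that approach in the Gremlin Console (the file name and property value are just examples):
gremlin> graph = TinkerGraph.open()
gremlin> g = graph.traversal()
gremlin> g.addV('person').property('name', 'marko').iterate()
gremlin> g.io('data/mygraph.json').write().iterate()
The write() step picks the graphSON format from the .json extension; you can then edit the file and load it back with g.io(...).read().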

Best wishes,   Marc


Re: CQL scaling limit?

hadoopmarc@...
 

Nice work!


Confused about GraphSON edges definition

Laura Morales <lauretas@...>
 

I'm looking at this example from TinkerPop https://tinkerpop.apache.org/docs/current/dev/io/#graphson

{"id":{"@type":"g:Int32","@value":1},"label":"person","outE":{"created":[{"id":{"@type":"g:Int32","@value":9},"inV":{"@type":"g:Int32","@value":3},"properties":{"weight":{"@type":"g:Double","@value":0.4}}}],"knows":[{"id":{"@type":"g:Int32","@value":7},"inV":{"@type":"g:Int32","@value":2},"properties":{"weight":{"@type":"g:Double","@value":0.5}}},{"id":{"@type":"g:Int32","@value":8},"inV":{"@type":"g:Int32","@value":4},"properties":{"weight":{"@type":"g:Double","@value":1.0}}}]},"properties":{"name":[{"id":{"@type":"g:Int64","@value":0},"value":"marko"}],"age":[{"id":{"@type":"g:Int64","@value":1},"value":{"@type":"g:Int32","@value":29}}]}}
{"id":{"@type":"g:Int32","@value":2},"label":"person","inE":{"knows":[{"id":{"@type":"g:Int32","@value":7},"outV":{"@type":"g:Int32","@value":1},"properties":{"weight":{"@type":"g:Double","@value":0.5}}}]},"properties":{"name":[{"id":{"@type":"g:Int64","@value":2},"value":"vadas"}],"age":[{"id":{"@type":"g:Int64","@value":3},"value":{"@type":"g:Int32","@value":27}}]}}
{"id":{"@type":"g:Int32","@value":3},"label":"software","inE":{"created":[{"id":{"@type":"g:Int32","@value":9},"outV":{"@type":"g:Int32","@value":1},"properties":{"weight":{"@type":"g:Double","@value":0.4}}},{"id":{"@type":"g:Int32","@value":11},"outV":{"@type":"g:Int32","@value":4},"properties":{"weight":{"@type":"g:Double","@value":0.4}}},{"id":{"@type":"g:Int32","@value":12},"outV":{"@type":"g:Int32","@value":6},"properties":{"weight":{"@type":"g:Double","@value":0.2}}}]},"properties":{"name":[{"id":{"@type":"g:Int64","@value":4},"value":"lop"}],"lang":[{"id":{"@type":"g:Int64","@value":5},"value":"java"}]}}
{"id":{"@type":"g:Int32","@value":4},"label":"person","inE":{"knows":[{"id":{"@type":"g:Int32","@value":8},"outV":{"@type":"g:Int32","@value":1},"properties":{"weight":{"@type":"g:Double","@value":1.0}}}]},"outE":{"created":[{"id":{"@type":"g:Int32","@value":10},"inV":{"@type":"g:Int32","@value":5},"properties":{"weight":{"@type":"g:Double","@value":1.0}}},{"id":{"@type":"g:Int32","@value":11},"inV":{"@type":"g:Int32","@value":3},"properties":{"weight":{"@type":"g:Double","@value":0.4}}}]},"properties":{"name":[{"id":{"@type":"g:Int64","@value":6},"value":"josh"}],"age":[{"id":{"@type":"g:Int64","@value":7},"value":{"@type":"g:Int32","@value":32}}]}}
{"id":{"@type":"g:Int32","@value":5},"label":"software","inE":{"created":[{"id":{"@type":"g:Int32","@value":10},"outV":{"@type":"g:Int32","@value":4},"properties":{"weight":{"@type":"g:Double","@value":1.0}}}]},"properties":{"name":[{"id":{"@type":"g:Int64","@value":8},"value":"ripple"}],"lang":[{"id":{"@type":"g:Int64","@value":9},"value":"java"}]}}
{"id":{"@type":"g:Int32","@value":6},"label":"person","outE":{"created":[{"id":{"@type":"g:Int32","@value":12},"inV":{"@type":"g:Int32","@value":3},"properties":{"weight":{"@type":"g:Double","@value":0.2}}}]},"properties":{"name":[{"id":{"@type":"g:Int64","@value":10},"value":"peter"}],"age":[{"id":{"@type":"g:Int64","@value":11},"value":{"@type":"g:Int32","@value":35}}]}}

There are two things I don't understand; can anyone help me with them?

- why do I need an "outE" *and* an "inE" definition for the same edge? Why can't I just define one or the other? If I define both, the edge is created when importing the file; if I only use "outE", the edge is not created

- why is everything given an id, including edges and properties (for example "properties":{"name":[{"id":{"@type":"g:Int64","@value":0},"value":"marko"})? Removing all the "id" fields except for node IDs seems to work fine


Re: Removing a vertex is not removing recently added properties in different transaction

hadoopmarc@...
 

The behavior you describe sounds like the behavior one experiences for transactions occurring in parallel. So let us investigate a bit further:
  • do you use CompositeIndex or MixedIndex?
  • is it certain that the two transactions do not overlap in time (as "next" suggests)?
  • do the two transactions occur in the same JanusGraph instance?
  • is hbase configured as a single host or as a cluster?

Marc


Re: Index stuck on INSTALLED (single instance of JanusGraph)

fredrick.eisele@...
 

It still does not work for me.

graph.getOpenTransactions().forEach { tx -> tx.commit() }


Re: CQL scaling limit?

madams@...
 

Hi Marc,

I tried rerunning the scaling test on a fresh graph with ids.block-size=10000000; unfortunately I haven't seen any performance gain.

I also tried ids.block-size=10000000 together with ids.authority.conflict-avoidance-mode=GLOBAL_AUTO, but there too there was no performance gain.
I used GLOBAL_AUTO as it was the easiest to test, and I ran the test twice to make sure the result was not just due to an unlucky random tag assignment. I didn't do the math, but I guess I would have to be very unlucky to get a very bad random tag allocation twice!

 

I tried something else which turned out to be very successful:

instead of inserting all the properties into the graph, I tried inserting only the ones necessary to feed the composite indexes and vertex-centric indexes. The indexes are used to efficiently execute the "get element or create it" logic. This test scaled quite nicely up to 64 indexers (instead of 4 before)!




Out of all the tests I tried so far, the two most successful ones were:

  1. decreasing the cql consistency level (from Quorum to ANY/ONE)
  2. decreasing the number of properties


What's interesting about these two cases is that they didn't significantly increase the performance of a single indexer; they really increased the horizontal scalability we could achieve.

My best guess for why this is the case: they reduced the amount of work the ScyllaDB coordinators had to do by:

  1. decreasing the amount of coordination necessary to get a majority answer (Quorum)
  2. decreasing the size in bytes of the CQL unlogged batches; some of our properties can be quite big (> 1 KB)
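For reference, the consistency change from case 1 can be sketched in the JanusGraph properties file as follows (these are the values from our test, not a general recommendation):
storage.cql.read-consistency-level=ONE
storage.cql.write-consistency-level=ANY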

I would happily continue digging into this, but unfortunately other priorities have turned up. We're putting the testing aside for the moment.

I thought I would post my complete findings/guesses anyway, in case they are useful to someone.

 

Thank you so much for your help!
Cheers,
Marc


Removing a vertex is not removing recently added properties in different transaction

Priyanka Jindal
 

I am using the JanusGraph client with HBase as the storage backend.
In my case, I am using index ind1 to fetch vertices from the graph. Upon fetching, I add some properties (e.g. one such property is p1) to the vertices and commit the transaction.
In the next transaction, I fetch the vertices using index ind2, where one key in the index is the property (p1) added in the last transaction. I get the vertices and remove them. The vertices are reported to be removed successfully, but sometimes they are still present, with only the properties (p1) added in the previous transaction, although the other properties/edges have been removed. This happens very intermittently.
It would be really helpful if someone has an idea about this and can explain it to me.
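A minimal sketch of the sequence (the keys k1 and p1 and the values are hypothetical placeholders for my real schema):
// transaction 1: fetch via a key backed by ind1, add p1, commit
g.V().has('k1', 'v1').property('p1', 'v2').iterate()
g.tx().commit()
// transaction 2: fetch via ind2, which includes p1, and remove the vertices
g.V().has('p1', 'v2').drop().iterate()
g.tx().commit()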


TTL for write-ahead logs not working

Radhika Kundam
 

Hi,

I enabled write-ahead logs to support index recovery from secondary persistence failures. I am trying to set a TTL for the write-ahead logs through the JanusGraphManagement setting "log.tx.ttl".
I tried the following use case:
1. Set the write-ahead log TTL to 10 min.
2. Created a few failed entries by bringing Solr (the index client) down.
3. Waited for more than the TTL (even waited for 1 hr) and brought Solr up.
The expected behavior is that the failed entries should not be recovered, as the write-ahead log should be gone by then.
The actual behavior is that the failed entries are recovered successfully.

I triaged this and was able to see that "root.log.ttl" is updated properly while creating the instance of KCVSLogManager for the tx log.
Please let me know if any additional configuration is required, or if my understanding of the expected behavior is not correct.
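For reference, the way I set the TTL looks roughly like this (a sketch, assuming the setting is changed through JanusGraphManagement as described above):
mgmt = graph.openManagement()
mgmt.set('log.tx.ttl', java.time.Duration.ofMinutes(10))
mgmt.commit()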

Thank you,
Radhika


Re: Not able to enable Write-ahead logs using tx.log-tx for existing JanusGraph setup

Radhika Kundam
 

Thank you, Boxuan, for the confirmation. I have used the same approach of reopening the graph for now.

It would be good if "logTransactions" could be refreshed on an update of tx.log-tx by providing a setter method, without reopening the graph. As per my understanding, reopening the graph is required only for this tx.log-tx management setting (and not for any other management settings), because this property must be reflected in logTransactions.
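For completeness, the reopen workaround I used is just the following (the properties file path is an example):
graph.close()
graph = JanusGraphFactory.open('conf/janusgraph.properties')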


Re: Too low Performance when running PageRank and WCC on Graph500

shepherdkingqsp@...
 

Well, the spec I am using is 32 vCPU and 128 GB memory, but I am testing JanusGraph with Cassandra as the storage backend.

I don't think this is a matter of the memory spec; maybe it is a matter of configuration. (As you can see, the K-hop result is reasonable.)

Best regards,
Shipeng


Re: Not able to enable Write-ahead logs using tx.log-tx for existing JanusGraph setup

Boxuan Li
 

Hi Radhika,

Got it. You are right, you need to reopen the graph after setting `tx.log-tx`.


Re: multilevel properties depth

hadoopmarc@...
 

Hi Laura,

You can only add a single level of metaproperties. One can understand this from the Java docs:
gremlin> g.V(1).properties('name').next().getClass()
==>class org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerVertexProperty
gremlin> g.V(1).properties('name').properties('metatest').next().getClass()
==>class org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerProperty
In a TinkerGraph, a regular property is a TinkerVertexProperty, with a property() method to add metaproperties.
In a TinkerGraph, a metaproperty is a TinkerProperty, without a property() method.
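You can check this in the console (the property key 'deep' is just an example):
gremlin> g.V(1).properties('name').next().property('metatest', 'hi')   // works: a TinkerVertexProperty accepts metaproperties
gremlin> g.V(1).properties('name').properties('metatest').next().property('deep', 'x')   // fails: a TinkerProperty has no property() method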

Best wishes,    Marc


Re: graphml properties of properties

hadoopmarc@...
 

Hi Laura,

Using TinkerGraph I exported a graph to graphSON in the way shown above. I reloaded it as follows:
gremlin> graph = TinkerGraph.open();
==>tinkergraph[vertices:0 edges:0]
gremlin> g = graph.traversal()
==>graphtraversalsource[tinkergraph[vertices:0 edges:0], standard]
gremlin> g.io('data/metatest.json').read().iterate()
gremlin> g.V().elementMap()
==>[id:1,label:person,name:marko,age:29]
==>[id:2,label:person,name:vadas,age:27]
==>[id:3,label:software,name:lop,lang:java]
==>[id:4,label:person,name:josh,age:32]
==>[id:5,label:software,name:ripple,lang:java]
==>[id:6,label:person,name:peter,age:35]
==>[id:13,label:person,name:turing]
gremlin> g.V(1).properties('name').elementMap()
==>[id:0,key:name,value:marko,metatest:hi]
gremlin>
So, the metaproperty that was added is read back from graphSON. Do you mean to say that you cannot do the same with JanusGraph? I did not check myself.

Best wishes,    Marc


Re: Too low Performance when running PageRank and WCC on Graph500

hadoopmarc@...
 

Hi Shipeng,

Did you use their machine specs: 32 vCPU and 244 GB memory?  The graph is pretty big for in-memory use during OLAP:
marc@antecmarc:~$ curl http://service.tigergraph.com/download/benchmark/dataset/graph500-22/graph500-22_unique_node | wc -l
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 17.6M  100 17.6M    0     0  3221k      0  0:00:05  0:00:05 --:--:-- 4166k
2396019
marc@antecmarc:~$ curl http://service.tigergraph.com/download/benchmark/dataset/graph500-22/graph500-22 | wc -l
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  989M  100  989M    0     0  5427k      0  0:03:06  0:03:06 --:--:-- 6123k
67108864
Best wishes,    Marc


Re: CQL scaling limit?

hadoopmarc@...
 

Just one more thing to rule out: did you set cpu.request and cpu.limit of the indexer containers to the same value? You want the pods to be truly independent for this test.


multilevel properties depth

Laura Morales <lauretas@...>
 

How many "levels" of multilevel properties are supported by Janus? What I mean is: can I only add properties about other properties, or can I add an arbitrary number of levels, that is, properties about properties about properties, and so on?


Re: CQL scaling limit?

hadoopmarc@...
 

Hi Marc,
If you know how to handle MetricManager, that sounds fine. I was thinking in more basic terms: adding some log statements to your indexer Java code.

Regarding the id block allocation, some features seem to have been added, which are still largely undocumented, see:
https://github.com/JanusGraph/janusgraph/blob/83c93fe717453ec31086ca1a208217a747ebd1a8/janusgraph-core/src/main/java/org/janusgraph/diskstorage/idmanagement/ConflictAvoidanceMode.java
https://docs.janusgraph.org/basics/janusgraph-cfg/#idsauthority
Notice that the default value for ids.authority.conflict-avoidance-mode is NONE. Given the rigor you show in your attempts, trying other values seems worthwhile too!
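For example, a non-default mode could be tried with a properties-file sketch like the following (GLOBAL_AUTO is just one of the values listed in the linked class):
ids.authority.conflict-avoidance-mode=GLOBAL_AUTO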

Best wishes,    Marc


Re: CQL scaling limit?

madams@...
 

Hi Marc,

We're running on Kubernetes, and there are no CPU limitations on the indexers.
Thanks for pointing this out; I actually hadn't checked the overall resources of the cluster... It's a good sanity check:

From left to right, top to bottom:

  • The total CPU usage per node in the cluster; there are about 20 nodes, and other non-JanusGraph applications are running, which is why the usage is not 0 when the tests are not running
  • The processed records per indexer; there are two tests here in one panel:
    1. The first group is the scaling test with the CQL QUORUM consistency
    2. The second group is the scaling test with write consistency ANY and read consistency ONE
  • The IO waiting time per node in the cluster (unfortunately I don't have this metric per indexer)
  • ID block allocation warning log events (the exact log message looks like "Temporary storage exception while acquiring id block - retrying in PT2.4S: {}")

The grey areas represent the moments when the overall performance stopped scaling linearly with the number of indexers.

We're not maxing out the CPUs yet, so it looks like we can still push the cluster. I don't have the IO waiting time per indexer unfortunately, but the node-exporter metric on IO waiting time fits the grey areas in the graphs.

As you mentioned the ID block allocation, I checked the logs for warning messages, and they are indeed ID allocation warnings; I looked for other warning messages but didn't find any.

I tried increasing the ID block size to 10,000,000 but didn't see any improvement. That said, from my understanding of the ID allocation, it is the perfect suspect. I'll rerun these tests on a completely fresh graph with ids.block-size=10000000 to double-check.

If that does not work, I'll try upgrading to the master version and rerunning the test. Any tips on how to log which part is slowing down the insertion? I was thinking of using org.janusgraph.util.stats.MetricManager to time the execution of parts of the org.janusgraph.graphdb.database.StandardJanusGraph.commit() method.
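A sketch of that idea (assuming MetricManager exposes a Dropwizard timer via a getTimer method; the metric name is arbitrary and the exact signature may differ):
timer = org.janusgraph.util.stats.MetricManager.INSTANCE.getTimer('indexer.commit')
ctx = timer.time()
graph.tx().commit()
ctx.stop()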


Thanks a lot,
Cheers,
Marc


Re: CQL scaling limit?

hadoopmarc@...
 

Hi Marc,

Just to be sure: are the indexers themselves not limited in the number of CPUs they can grab? Do the 60 indexers run on the same machine, or in independent cloud containers?

If the indexers are not CPU limited, it would be interesting to log where the time is spent: in their own Java code, waiting for transactions to complete, or waiting for the ID manager to return ID blocks?

Best wishes,   Marc
