
Graph corruption?

Joe Obernberger
 

Hi all - I'm seeing this from a recent graph I built:

gremlin> :> g.V(4162).valueMap()
==>{source=[DS_106], sourceName=[GDELTRecord3]}
gremlin> :> g.V(4146).valueMap()
==>{source=[DS_106], sourceName=[GDELTRecord3]}
gremlin> :> g.V(4226).valueMap()
==>{source=[DS_106], sourceName=[GDELTRecord3]}
gremlin> :> g.V(4250).valueMap()
==>{source=[DS_106], sourceName=[GDELTRecord3]}
gremlin>
gremlin>
gremlin> :> g.V().has("source","DS_106")
==>v[4226]

The graph has an index on source like this:

PropertyKey sourceProperty = mgmt.makePropertyKey("source").dataType(String.class).cardinality(Cardinality.SINGLE).make();
JanusGraphIndex sourceIndex = mgmt.buildIndex("bySourceComposite", Vertex.class).addKey(sourceProperty).unique().buildCompositeIndex();
mgmt.setConsistency(sourceProperty, ConsistencyModifier.LOCK);
mgmt.setConsistency(sourceIndex, ConsistencyModifier.LOCK);

How could the graph end up with several vertices carrying the same source string when the composite index is unique?  Still learning graphs...

Thank you!

-Joe


--
This email has been checked for viruses by AVG.
https://www.avg.com


Re: Getting Edges - Performance

Joe Obernberger
 

Hi Boxuan -

Cluster is a 15 node cassandra cluster:
nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load        Tokens  Owns  Host ID                               Rack
UN  172.16.100.45   961.03 GiB  250     ?     07bccfce-45f1-41a3-a5c4-ee748a7a9b98  rack1
UN  172.16.100.251  946.8 GiB   200     ?     274a6e8d-de37-4e0b-b000-02d221d858a5  rack1
UN  172.16.100.35   688.11 GiB  200     ?     59150c47-274a-46fb-9d5e-bed468d36797  rack1
UN  172.16.100.252  909.26 GiB  200     ?     e83aa851-69b4-478f-88f6-60e657ea6539  rack1
UN  172.16.100.249  1.01 TiB    200     ?     49e4f571-7d1c-4e1e-aca7-5bbe076596f7  rack1
UN  172.16.100.38   809.11 GiB  200     ?     0d9509cc-2f23-4117-a883-469a1be54baf  rack1
UN  172.16.100.36   909.15 GiB  200     ?     d9702f96-256e-45ae-8e12-69a42712be50  rack1
UN  172.16.100.39   990.91 GiB  200     ?     93f9cb0f-ea71-4e3d-b62a-f0ea0e888c47  rack1
UN  172.16.100.248  867.42 GiB  200     ?     4bbbe57c-6219-41e5-bbac-de92a9594d53  rack1
UN  172.16.100.44   891.53 GiB  200     ?     b2e5366e-8386-40ec-a641-27944a5a7cfa  rack1
UN  172.16.100.37   493.34 GiB  120     ?     08a19658-40be-4e55-8709-812b3d4ac750  rack1
UN  172.16.100.250  972.12 GiB  200     ?     b74b6e65-af63-486a-b07f-9e304ec30a39  rack1
UN  172.16.20.16    1.84 TiB    500     ?     1ccd2cc5-3ee5-43c5-a8c3-7065bdc24297  rack1
UN  172.16.100.34   782.35 GiB  200     ?     352fd049-32f8-4be8-9275-68b145ac2832  rack1
UN  172.16.100.42   1.81 TiB    500     ?     b088a8e6-42f3-4331-a583-47ef5149598f  rack1

JanusGraph instance?  The Java code talks directly to Cassandra via the JanusGraph library.  Should I be using a Gremlin Server instead?
Right now, I'm trying to load data into the graph as quickly as possible.  The service that does that needs to query the graph many times in order to insert the correct nodes/edges.  Slowness happens frequently:
Took: 965mSec - [DS_106_2022-06-18T12-15-58.184Z_4iaiTwYa] - num handleCID calls: 234 edgeHits: 819 vertHits: 231 - eCacheSize: 3 vertCacheSize: 3 vertMiss: 4 edgeMiss: 6
Took: 2072mSec - [DS_106_2022-06-18T10-20-54.955Z_VHf1MKA2] - num handleCID calls: 701 edgeHits: 2467 vertHits: 697 - eCacheSize: 3 vertCacheSize: 3 vertMiss: 3 edgeMiss: 2
Took: 2203mSec - [DS_106_2022-06-18T09-44-20.106Z_jUcshXdg] - num handleCID calls: 511 edgeHits: 1735 vertHits: 507 - eCacheSize: 3 vertCacheSize: 3 vertMiss: 5 edgeMiss: 3
Took: 1925mSec - [DS_106_2022-06-18T12-15-58.205Z_DFnPCE0w] - num handleCID calls: 157 edgeHits: 499 vertHits: 155 - eCacheSize: 2 vertCacheSize: 2 vertMiss: 2 edgeMiss: 1
Took: 94mSec - [DS_91_2022-06-18T12-15-58.724Z_P6nyGwbl] - num handleCID calls: 124 edgeHits: 407 vertHits: 124 - eCacheSize: 3 vertCacheSize: 3 vertMiss: 2 edgeMiss: 3
Took: 846mSec - [DS_106_2022-06-18T10-20-54.985Z_XO7LVfDU] - num handleCID calls: 52 edgeHits: 170 vertHits: 50 - eCacheSize: 2 vertCacheSize: 2 vertMiss: 2 edgeMiss: 3
Took: 675mSec - [DS_106_2022-06-18T09-44-19.922Z_d2qo7ioj] - num handleCID calls: 650 edgeHits: 2612 vertHits: 650 - eCacheSize: 3 vertCacheSize: 3 vertMiss: 2 edgeMiss: 5
Took: 705mSec - [DS_106_2022-06-18T12-15-58.724Z_U5seNz1V] - num handleCID calls: 273 edgeHits: 926 vertHits: 272 - eCacheSize: 3 vertCacheSize: 3 vertMiss: 3 edgeMiss: 1
Took: 687mSec - [DS_106_2022-06-18T10-21-01.336Z_7kToNeIj] - num handleCID calls: 29 edgeHits: 44 vertHits: 24 - eCacheSize: 2 vertCacheSize: 2 vertMiss: 5 edgeMiss: 2
Took: 4024mSec - [DS_106_2022-06-18T09-44-19.846Z_Dc3sFChv] - num handleCID calls: 777 edgeHits: 3233 vertHits: 773 - eCacheSize: 3 vertCacheSize: 3 vertMiss: 5 edgeMiss: 11
Took: 4641mSec - [DS_106_2022-06-18T10-21-03.304Z_bup3D681] - num handleCID calls: 307 edgeHits: 1272 vertHits: 300 - eCacheSize: 3 vertCacheSize: 3 vertMiss: 8 edgeMiss: 15
Took: 4665mSec - [DS_106_2022-06-18T12-16-02.292Z_XwMKRm0i] - num handleCID calls: 310 edgeHits: 1290 vertHits: 303 - eCacheSize: 3 vertCacheSize: 3 vertMiss: 8 edgeMiss: 15
Took: 1939mSec - [DS_106_2022-06-18T09-44-21.925Z_uGFYTu1V] - num handleCID calls: 383 edgeHits: 1350 vertHits: 379 - eCacheSize: 3 vertCacheSize: 3 vertMiss: 4 edgeMiss: 5
Took: 3032mSec - [DS_106_2022-06-18T10-21-14.617Z_TbwWwuSG] - num handleCID calls: 826 edgeHits: 2900 vertHits: 823 - eCacheSize: 3 vertCacheSize: 3 vertMiss: 5 edgeMiss: 5
Took: 3217mSec - [DS_106_2022-06-18T12-16-02.275Z_xBhlZITR] - num handleCID calls: 183 edgeHits: 580 vertHits: 179 - eCacheSize: 3 vertCacheSize: 2 vertMiss: 4 edgeMiss: 2
Took: 2606mSec - [DS_106_2022-06-18T09-44-21.877Z_YClmiFCl] - num handleCID calls: 335 edgeHits: 1156 vertHits: 332 - eCacheSize: 3 vertCacheSize: 3 vertMiss: 4 edgeMiss: 3
Took: 2276mSec - [DS_106_2022-06-18T10-21-14.630Z_bv48tkF8] - num handleCID calls: 507 edgeHits: 1782 vertHits: 504 - eCacheSize: 3 vertCacheSize: 3 vertMiss: 4 edgeMiss: 3
Took: 1908mSec - [DS_106_2022-06-18T09-44-20.254Z_XhGH0KI4] - num handleCID calls: 583 edgeHits: 2153 vertHits: 581 - eCacheSize: 3 vertCacheSize: 3 vertMiss: 3 edgeMiss: 6
Took: 2463mSec - [DS_106_2022-06-18T12-16-07.498Z_acwPqSnu] - num handleCID calls: 48 edgeHits: 162 vertHits: 48 - eCacheSize: 2 vertCacheSize: 2 vertMiss: 1 edgeMiss: 6
Took: 386mSec - [DS_106_2022-06-18T10-21-15.651Z_KWOek5Nm] - num handleCID calls: 11 edgeHits: 19 vertHits: 8 - eCacheSize: 2 vertCacheSize: 2 vertMiss: 4 edgeMiss: 5
Took: 78mSec - [DS_91_2022-06-18T12-16-10.616Z_jQPoltfT] - num handleCID calls: 56 edgeHits: 268 vertHits: 54 - eCacheSize: 3 vertCacheSize: 2 vertMiss: 2 edgeMiss: 2
Took: 64mSec - [DS_91_2022-06-18T12-16-18.846Z_vzIEpQFH] - num handleCID calls: 96 edgeHits: 278 vertHits: 91 - eCacheSize: 3 vertCacheSize: 3 vertMiss: 6 edgeMiss: 3
Took: 1425mSec - [DS_106_2022-06-18T09-44-21.953Z_aOwVhMAS] - num handleCID calls: 979 edgeHits: 4172 vertHits: 977 - eCacheSize: 3 vertCacheSize: 3 vertMiss: 4 edgeMiss: 7

Each handleCID call queries JanusGraph (Cassandra) several times, but I'm using internal HashMaps of Vertex and Edge objects to help speed that up.

Could you explain p50, p90, p99 latency?
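For reference: p50/p90/p99 are the 50th/90th/99th percentile latencies, i.e. the time under which 50%/90%/99% of requests complete. A minimal sketch (nearest-rank method) that computes them from the "Took" values above; class and method names here are illustrative:

```java
import java.util.Arrays;

public class Percentiles {
    // Nearest-rank percentile: smallest sample such that at least p% of samples are <= it.
    static long percentile(long[] sortedMillis, double p) {
        int rank = (int) Math.ceil(p / 100.0 * sortedMillis.length);
        return sortedMillis[Math.max(rank - 1, 0)];
    }

    public static void main(String[] args) {
        // The "Took" values (mSec) from the log lines above.
        long[] took = {965, 2072, 2203, 1925, 94, 846, 675, 705, 687, 4024, 4641,
                       4665, 1939, 3032, 3217, 2606, 2276, 1908, 2463, 386, 78, 64, 1425};
        Arrays.sort(took);
        System.out.println("p50=" + percentile(took, 50)
                + " p90=" + percentile(took, 90)
                + " p99=" + percentile(took, 99));
        // prints: p50=1925 p90=4024 p99=4665
    }
}
```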

-Joe

On 6/23/2022 7:33 PM, Boxuan Li wrote:

Your Cassandra statistics show pretty low latency, so the problem does not seem to be related to Cassandra. I would check whether it’s a network problem. It might also be helpful if you could provide more details, e.g.:

1. How is your cluster set up? Is JanusGraph instance in the same data center as your Cassandra cluster?
2. How is your workload? QPS?
3. How often do you see such slowness? What is p50, p90, p99 latency of such kind of queries?

Best,
Boxuan

On Jun 23, 2022, at 5:03 PM, Joe Obernberger via lists.lfaidata.foundation <joseph.obernberger=gmail.com@...> wrote:

One thing of note is the tablehistogram for the graphindex table:

nodetool tablehistograms graphsource.graphindex
graphsource/graphindex histograms
Percentile      Read Latency     Write Latency          SSTables    Partition Size        Cell Count
                    (micros)          (micros)                             (bytes)
50%                   152.32             35.43              0.00               179                 1
75%                   545.79             51.01              2.00               179                 1
95%                   943.13             61.21              4.00               215                 1
98%                  1131.75             73.46              5.00               215                 1
99%                  1131.75             88.15              5.00               215                 1
Min                    42.51             11.87              0.00                36                 0
Max                  1131.75             88.15              5.00         802187438          25109160

Stats:

nodetool tablestats graphsource.graphindex
Total number of tables: 95
----------------
Keyspace : graphsource
        Read Count: 28106793
        Read Latency: 2.0613708762860283 ms
        Write Count: 54754221
        Write Latency: 0.089838214609975 ms
        Pending Flushes: 0
                Table: graphindex
                SSTable count: 25
                Old SSTable count: 0
                Space used (live): 2858126352
                Space used (total): 2858126352
                Space used by snapshots (total): 2032016596
                Off heap memory used (total): 22737395
                SSTable Compression Ratio: 0.4225597463117299
                Number of partitions (estimate): 10030080
                Memtable cell count: 6363776
                Memtable data size: 217020150
                Memtable off heap memory used: 0
                Memtable switch count: 77
                Local read count: 11659961
                Local read latency: 0.315 ms
                Local write count: 41850997
                Local write latency: 0.039 ms
                Pending flushes: 0
                Percent repaired: 0.0
                Bytes repaired: 0.000KiB
                Bytes unrepaired: 3.628GiB
                Bytes pending repair: 0.000KiB
                Bloom filter false positives: 40827
                Bloom filter false ratio: 0.00000
                Bloom filter space used: 12591976
                Bloom filter off heap memory used: 12591776
                Index summary off heap memory used: 9669843
                Compression metadata off heap memory used: 475776
                Compacted partition minimum bytes: 36
                Compacted partition maximum bytes: 802187438
                Compacted partition mean bytes: 423
                Average live cells per slice (last five minutes): 1.0
                Maximum live cells per slice (last five minutes): 1
                Average tombstones per slice (last five minutes): 1.0
                Maximum tombstones per slice (last five minutes): 1
                Dropped Mutations: 0
                Droppable tombstone ratio: 0.00000

It looks like the index is not well partitioned?

-Joe

On 6/23/2022 4:47 PM, Joe Obernberger via lists.lfaidata.foundation wrote:

Thank you Boxuan - the code (REST service) that is modifying the graph is being called continuously when running.  A slow example looks like this:

Metrics: Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
JanusGraphStep([],[edgeID.eq(DS_106|geotime|hou...                     1           1        1393.727   100.00
  constructGraphCentricQuery                                                                   0.011
  GraphCentricQuery                                                                         1393.698
    \_condition=(edgeID = DS_106|geotime|hour 1km|2022-6-14T4|18SVK 86 22|Date,Actor2Geo_Long_Actor2Geo_Lat)
    \_orders=[]
    \_isFitted=true
    \_isOrdered=true
    \_query=multiKSQ[1]
    \_index=byEdgeIDComposite
    backend-query                                                      1                    1393.128
    \_query=byEdgeIDComposite:multiKSQ[1]
                                            >TOTAL                     -           -        1393.727        -

Code is:
System.out.println("Metrics: "+traversal.E().has("edgeID", edgeID).profile().next().toString());
dataSourceToCorrelationEdge = traversal.E().has("edgeID", edgeID).next();

If the time to run that exceeds 500mSec, I print the message.

-Joe

On 6/23/2022 4:15 PM, Boxuan Li wrote:
Profiler documentation is available here: https://tinkerpop.apache.org/docs/current/reference/#profile-step 

Can you do

traversal.E().has("edgeID", edgeID).profile().next()

and paste the output (when you experience the slowness) here?

Best,
Boxuan

On Jun 23, 2022, at 2:29 PM, Joe Obernberger via lists.lfaidata.foundation <joseph.obernberger=gmail.com@...> wrote:

Looking for documentation on how to do profile() - do you have any?

Queries like this are also slow:
Edge dataSourceToCorrelationEdge = traversal.E().has("edgeID", edgeID).next();

There is an index on edgeID (mgmt.buildIndex("byEdgeIDComposite", Edge.class).addKey(edgeID).unique().buildCompositeIndex();)
I'm trying to cache the actual JanusGraph edgeID string in an outside cache to help with this, but when it's not in the cache, it can take 2 seconds to return.
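For that outside cache, a bounded LRU map keeps memory in check as the number of distinct edgeIDs grows; a minimal sketch using LinkedHashMap (class and variable names are illustrative, not from the original code):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Bounded LRU cache: evicts the least-recently-accessed entry once maxEntries is exceeded.
public class EdgeIdCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public EdgeIdCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder=true gives LRU (not insertion-order) eviction
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

If the cache is shared across request threads, wrap it with Collections.synchronizedMap(...) since LinkedHashMap is not thread-safe.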

-Joe

On 6/22/2022 4:53 PM, Boxuan Li wrote:
It's very suspicious. It shouldn't take 3 seconds just to load 3 edges. Can you provide the profile() output here when you see such slowness? It is also worth trying if other queries (e.g. loading a vertex) experience similar slowness.

From: janusgraph-users@... <janusgraph-users@...> on behalf of Joe Obernberger via lists.lfaidata.foundation<joseph.obernberger=gmail.com@...>
Sent: Wednesday, June 22, 2022 4:51 PM
To: janusgraph-users@... <janusgraph-users@...>
Subject: Re: [janusgraph-users] Getting Edges - Performance
 
It's usually small; typically around 3 edges.
Reading your article (https://li-boxuan.medium.com/janusgraph-deep-dive-part-3-speed-up-edge-queries-3b9eb5ba34f8).
Outbound edges could be large - 100s to 10000s.

-Joe

On 6/22/2022 4:49 PM, Boxuan Li wrote:
When it takes over 3 seconds to return, how many edges does it return?

From: janusgraph-users@... <janusgraph-users@...> on behalf of Joe Obernberger via lists.lfaidata.foundation<joseph.obernberger=gmail.com@...>
Sent: Wednesday, June 22, 2022 1:06 PM
To: janusgraph-users@... <janusgraph-users@...>
Subject: [janusgraph-users] Getting Edges - Performance
 
Hi All - I'm seeing a performance issue with this statement in Java code:

Iterator<Edge> edgeIt = vert.edges(Direction.IN);

in some cases this is taking over 3 seconds to return.  What can I do to improve this performance?
Thank you!

-Joe















Re: Threaded Operations - Quarkus

Boxuan Li
 

Gotcha. The recommended way is to use the TinkerPop API for CRUD operations, i.e. GraphTraversalSource.addV(), as it is considered the “standard” way.

The `addV` method inherently calls `JanusGraph.addVertex` method, so technically they are the same.

Best,
Boxuan

On Jun 23, 2022, at 2:22 PM, Joe Obernberger via lists.lfaidata.foundation <joseph.obernberger=gmail.com@...> wrote:

Sorry for the delay.
When a request comes in via REST, Quarkus creates a thread to handle it; I believe it actually comes from a thread pool.
This code now does:
GraphTraversalSource traversal = StaticInfo.getSingleton().getGraph().buildTransaction().checkExternalVertexExistence(false).consistencyChecks(false).propertyPrefetching(true).start().traversal();
for each incoming request.  It then does a tx().commit() and a traversal.close() when complete.
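To guarantee that the commit and close always run even when a request handler throws, the per-request pattern can be wrapped in a small helper. This is a sketch with stand-in types (in the real code the transaction would be the JanusGraph transaction and traversal, not this Tx interface):

```java
import java.util.function.Function;

public class TxRunner {
    // Minimal stand-in for the transaction object; in the real code this would be
    // the JanusGraph transaction plus its GraphTraversalSource.
    public interface Tx {
        void commit();
        void rollback();
        void close();
    }

    // Runs work inside a transaction: commit on success, rollback on failure, always close.
    public static <R> R inTransaction(Tx tx, Function<Tx, R> work) {
        try {
            R result = work.apply(tx);
            tx.commit();
            return result;
        } catch (RuntimeException e) {
            tx.rollback();
            throw e;
        } finally {
            tx.close();
        }
    }
}
```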

The StaticInfo class contains a JanusGraph object (getGraph()).
What I'm wondering about is that there appear to be two ways to add nodes and edges.  One is using a GraphTraversalSource, and one is using a JanusGraph:
GraphTraversalSource.addV()
Or
JanusGraph.addVertex

-Joe

On 6/22/2022 4:51 PM, Boxuan Li wrote:
When you say use JanusGraph.tx().createThreadedTx() directly, what do you mean? Can you give an example?

From: janusgraph-users@... <janusgraph-users@...> on behalf of Joe Obernberger via lists.lfaidata.foundation<joseph.obernberger=gmail.com@...>
Sent: Tuesday, June 21, 2022 4:25 PM
To: janusgraph-users@... <janusgraph-users@...>
Subject: Re: [janusgraph-users] Threaded Operations - Quarkus
 
Thanks for all the help on this.  I'm coming closer to a solution thanks to you all.
Question - I've been using GraphTraversalSource to do all the adding vertices and edges to my graph.  Example:
GraphTraversalSource traversal = JanusGraph.tx().createThreadedTx().traversal();
Is it better to use JanusGraph.tx().createThreadedTx() directly?

-Joe

On 6/17/2022 3:03 PM, Boxuan Li wrote:
Yeah using `newTransaction()` won't make a difference in your use case. Based on your input, there are a couple of things you could try:

  1. As suggested by Kevin, you could use locking. See https://docs.janusgraph.org/advanced-topics/eventual-consistency/#data-consistency. It is slow but it will hopefully solve most race conditions you have. Based on my understanding of Cassandra's nature, I think you could still see such inconsistencies but the chance is much lower for sure.
  2. You could periodically identify and remove the inconsistencies using an offline pipeline.
  3. You could use an external locking service on the client side. For example, using Redis to make sure a conflicting transaction won't start in the first place.
These solutions have their own pros & cons, so it really depends on you.
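For option 2, the core of such an offline pass is just grouping element IDs by the supposedly-unique key and flagging groups larger than one. A self-contained sketch of that step (the real pipeline would pull the IDs and key values out of the graph, e.g. with Spark; names here are illustrative):

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class DuplicateFinder {
    // Groups vertex IDs by key value and keeps only keys that map to more than one ID.
    public static Map<String, List<Long>> findDuplicates(Map<Long, String> vertexIdToKey) {
        return vertexIdToKey.entrySet().stream()
                .collect(Collectors.groupingBy(
                        Map.Entry::getValue,
                        Collectors.mapping(Map.Entry::getKey, Collectors.toList())))
                .entrySet().stream()
                .filter(e -> e.getValue().size() > 1)
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }
}
```

Once duplicates are found, the cleanup step would merge the edges onto one surviving vertex and drop the rest.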

Best,
Boxuan


From: janusgraph-users@... <janusgraph-users@...> on behalf of Joe Obernberger via lists.lfaidata.foundation<joseph.obernberger=gmail.com@...>
Sent: Friday, June 17, 2022 2:24 PM
To: janusgraph-users@... <janusgraph-users@...>
Subject: Re: [janusgraph-users] Threaded Operations - Quarkus
 
So - unsurprisingly, Boxuan is correct.
Code like this:
GraphTraversalSource traversal = StaticInfo.getGraph().newTransaction().traversal();
try {
    datasourceVertex = traversal.V().has("someID", id).next();
} catch (java.util.NoSuchElementException nse) {
    datasourceVertex = traversal.addV("source").property("someID", id).next();
}
being called from multiple threads results in several vertices with the same 'someID'.
Not sure how to fix this.
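Within a single JVM, one way to close that check-then-insert window is to serialize get-or-create per key, so two threads cannot both miss the lookup. A sketch with hypothetical names (note this does not help across multiple JVMs, which is where a distributed lock or offline cleanup comes in):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

public class KeyedLocks {
    private final ConcurrentHashMap<String, Object> locks = new ConcurrentHashMap<>();

    // Runs lookupOrInsert while holding a mutex for this key, so two threads in this JVM
    // can't both miss the lookup and insert duplicate vertices for the same someID.
    public <V> V withKeyLock(String key, Supplier<V> lookupOrInsert) {
        Object lock = locks.computeIfAbsent(key, k -> new Object());
        synchronized (lock) {
            return lookupOrInsert.get();
        }
    }
}
```

Threads working on different keys still proceed in parallel; only same-key callers are serialized.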
-Joe
On 6/17/2022 10:28 AM, Joe Obernberger via lists.lfaidata.foundation wrote:
Good stuff - thank you Boxuan.
Backend is Cassandra running on bare metal on 15 nodes.
Race condition is rare.
When the race condition happens, I'm seeing duplicate nodes/edges; basically the graph becomes invalid.
Yes.  This is a good idea.  I could write a Spark job to examine the graph and fix up discrepancies.  Smart.
Not sure what a locking service is - example?
My current plan (not tested yet) is to use a static class that contains the JanusGraph 'graph'.  On Quarkus when a REST call comes in, a new thread is created.  That thread will use Marc's idea of 
GraphTraversalSource traversal = StaticInfo.getGraph().newTransaction().traversal();
Do stuff and then traversal.tx().commit();
That will be done in a loop so that if the commit fails, it will retry X times.
At least that's my current plan.  Not sure if it will work.
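The commit-with-retries part of that plan can be sketched generically like this (stand-in names; in the real code the action would run the traversal mutations and traversal.tx().commit()):

```java
import java.util.function.Supplier;

public class Retry {
    // Runs the action up to maxAttempts times, rethrowing the last failure.
    public static <R> R withRetries(int maxAttempts, Supplier<R> action) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e; // could also back off here before retrying
            }
        }
        throw last;
    }
}
```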
-Joe
On 6/17/2022 8:52 AM, Boxuan Li wrote:
Hi Joe,

Unfortunately the way Marc suggests won’t help with your use case. To be honest, I would have suggested the same answer as Marc before I saw your second post. If one JVM thread were handling multiple transactions (I'm not familiar with Quarkus, so I'm not sure whether that is possible), then one would have to do what Marc suggested. But in your use case it won't be any different from your current usage, because JanusGraph automatically creates a threaded transaction for each thread (using ThreadLocal) when you use the traversal object.

The real issue in your use case is that you want ACID support, which really depends on your backend storage. At least in our officially supported Cassandra, HBase, and BigTable adapters, this is not (yet) supported.

There are a few workarounds, though. Before discussing that further, I would like to ask a few questions:

  1. What is your backend storage and is it distributed?
  2. How often does this “race condition” happen? Is it very rare or it’s fairly common?
  3. What is your end goal? Do you want to reduce the chance of this “race condition”, or you want to make sure this does not happen at all?
  4. Are you willing to resolve such duplicate vertices/edges at either read time or offline?
  5. Are you willing to introduce a third-party dependency, e.g. a distributed locking service?

Best,
Boxuan

From: janusgraph-users@... <janusgraph-users@...> on behalf of Joe Obernberger via lists.lfaidata.foundation<joseph.obernberger=gmail.com@...>
Sent: Friday, June 17, 2022 8:12:04 AM
To: janusgraph-users@... <janusgraph-users@...>
Subject: Re: [janusgraph-users] Threaded Operations - Quarkus
 
Thank you Marc.  I'm currently doing everything with a traversal, and then doing a traversal.tx().commit()
Sounds like what you suggested is what I want, but just to be clear:
Here's what I'm trying to do.
Thread 1/JVM1 gets a request that requires adding new vertices and edges to the graph.
Thread 2/JVM1 gets a similar request.  
Some of the vertices added in Thread 1 end up having the same attributes/name as vertices from Thread 2, but I only want one vertex if it's going to have the same attributes.
If Thread 1 adds that vertex but hasn't yet committed, then Thread 2 won't find it when it looks it up, so it will add it again.
Code example (traversal is a GraphTraversalSource gotten from JanusGraphFactory.traversal())
try {
    correlationVertex = traversal.V().has("correlationID", correlationID).next();
} catch (java.util.NoSuchElementException nse) {
    correlationVertex = null;
}
...
if (correlationVertex == null) {
    correlationVertex = traversal.addV("correlation").property("correlationID", correlationID).next();
    correlationVertex.property("a", blah1);
    correlationVertex.property("b", blah2);
}
I do similar things with edges:

try {
    dataSourceToCorrelationEdge = traversal.E().has("edgeID", edgeID).next();
} catch (NoSuchElementException nse) {
    dataSourceToCorrelationEdge = null;
}
Ultimately, I'd like to have several JVMs handling these requests; each which runs multiple threads.
I'll look at using a new transaction per call.  Thank you!
-Joe
On 6/17/2022 8:01 AM, hadoopmarc@... wrote:
Hi Joe,

Do you mean with threadsafe transactions that requests from different client threads should be handled independently, that is in different JanusGraph Transactions?

In that case, I think you want to use a GraphTraversalSource per request like this:

g = graph.newTransaction().traversal()

Best wishes,    Marc 






Re: Getting Edges - Performance

Boxuan Li
 

Your Cassandra statistics show pretty low latency, so the problem does not seem to be related to Cassandra. I would check whether it’s a network problem. It might also be helpful if you could provide more details, e.g.:

1. How is your cluster set up? Is JanusGraph instance in the same data center as your Cassandra cluster?
2. How is your workload? QPS?
3. How often do you see such slowness? What is p50, p90, p99 latency of such kind of queries?

Best,
Boxuan


                Compacted partition minimum bytes: 36
                Compacted partition maximum bytes: 802187438
                Compacted partition mean bytes: 423
                Average live cells per slice (last five minutes): 1.0
                Maximum live cells per slice (last five minutes): 1
                Average tombstones per slice (last five minutes): 1.0
                Maximum tombstones per slice (last five minutes): 1
                Dropped Mutations: 0
                Droppable tombstone ratio: 0.00000

It looks like the index is not well partitioned: the compacted partition maximum is about 800 MB while the mean is only 423 bytes, so a few huge partitions dominate.

-Joe

On 6/23/2022 4:47 PM, Joe Obernberger via lists.lfaidata.foundation wrote:

Thank you Boxuan - the code (REST service) that is modifying the graph is being called continuously when running.  A slow example looks like this:

Metrics: Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
JanusGraphStep([],[edgeID.eq(DS_106|geotime|hou...                     1           1        1393.727   100.00
  constructGraphCentricQuery                                                                   0.011
  GraphCentricQuery                                                                         1393.698
    \_condition=(edgeID = DS_106|geotime|hour 1km|2022-6-14T4|18SVK 86 22|Date,Actor2Geo_Long_Actor2Geo_Lat)
    \_orders=[]
    \_isFitted=true
    \_isOrdered=true
    \_query=multiKSQ[1]
    \_index=byEdgeIDComposite
    backend-query                                                      1                    1393.128
    \_query=byEdgeIDComposite:multiKSQ[1]
                                            >TOTAL                     -           -        1393.727        -

Code is:
System.out.println("Metrics: "+traversal.E().has("edgeID", edgeID).profile().next().toString());
dataSourceToCorrelationEdge = traversal.E().has("edgeID", edgeID).next();

If the time to run that exceeds 500 ms, I print the message.

-Joe

On 6/23/2022 4:15 PM, Boxuan Li wrote:
Profiler documentation is available here: https://tinkerpop.apache.org/docs/current/reference/#profile-step 

Can you do

traversal.E().has("edgeID", edgeID).profile().next()

and paste the output (when you experience the slowness) here?

Best,
Boxuan

On Jun 23, 2022, at 2:29 PM, Joe Obernberger via lists.lfaidata.foundation <joseph.obernberger=gmail.com@...> wrote:

Looking for documentation on how to do profile() - do you have any?

Queries like this are also slow:
Edge dataSourceToCorrelationEdge = traversal.E().has("edgeID", edgeID).next();

There is an index on edgeID (mgmt.buildIndex("byEdgeIDComposite", Edge.class).addKey(edgeID).unique().buildCompositeIndex();)
I'm trying to cache the actual JanusGraph edgeID string in an outside cache to help with this, but when it's not in the cache, it can take 2 seconds to return.

-Joe

On 6/22/2022 4:53 PM, Boxuan Li wrote:
It's very suspicious. It shouldn't take 3 seconds just to load 3 edges. Can you provide the profile() output here when you see such slowness? It is also worth checking whether other queries (e.g. loading a vertex) experience similar slowness.

From: janusgraph-users@... <janusgraph-users@...> on behalf of Joe Obernberger via lists.lfaidata.foundation<joseph.obernberger=gmail.com@...>
Sent: Wednesday, June 22, 2022 4:51 PM
To: janusgraph-users@... <janusgraph-users@...>
Subject: Re: [janusgraph-users] Getting Edges - Performance
 
It's usually small; typically around 3 edges.
Reading your article (https://li-boxuan.medium.com/janusgraph-deep-dive-part-3-speed-up-edge-queries-3b9eb5ba34f8).
Outbound edges could be large - 100s to 10000s.

-Joe

On 6/22/2022 4:49 PM, Boxuan Li wrote:
When it takes over 3 seconds to return, how many edges does it return?

From: janusgraph-users@... <janusgraph-users@...> on behalf of Joe Obernberger via lists.lfaidata.foundation<joseph.obernberger=gmail.com@...>
Sent: Wednesday, June 22, 2022 1:06 PM
To: janusgraph-users@... <janusgraph-users@...>
Subject: [janusgraph-users] Getting Edges - Performance
 
Hi All - I'm seeing a performance issue with this statement in Java code:

Iterator<Edge> edgeIt = vert.edges(Direction.IN);

In some cases this takes over 3 seconds to return.  What can I do to
improve this performance?
Thank you!

-Joe





Re: Poor load balancing in 0.5.3

Boxuan Li
 

I suspect this is an issue with gremlin java-driver. I just reported here: https://issues.apache.org/jira/browse/TINKERPOP-2766

I believe you should provide all hostnames for now.

Best regards,
Boxuan

On Jun 23, 2022, at 2:29 PM, Doug Whitfield via lists.lfaidata.foundation <dwhitfield=perforce.com@...> wrote:

Hi folks,

From what we are seeing in the gremlin driver code, it is expected that all gremlin server host names are provided and not a VIP. Is that correct?  
For example,
  Cluster cluster = Cluster.build()
    .addContactPoints("localhost")
    .port(server.getPort())
    .create();
In the example, 'localhost' is where we put in a generic VIP that points to 3 servers. Is this a bad approach?
In the end, what we are seeing is that load is severely unbalanced.
 
Despite my many protests to upgrade, we are running JanusGraph v0.5.3 and tinkerpop-3.4.6.
 
Please let us know if there is any additional information that would be helpful.

Best Regards,
 
Douglas Whitfield | Enterprise Architect
 

This e-mail may contain information that is privileged or confidential. If you are not the intended recipient, please delete the e-mail and any attachments and notify us immediately.




Re: Getting Edges - Performance

Joe Obernberger
 

Thank you Boxuan - the code (REST service) that is modifying the graph is being called continuously when running.  A slow example looks like this:

Metrics: Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
JanusGraphStep([],[edgeID.eq(DS_106|geotime|hou...                     1           1        1393.727   100.00
  constructGraphCentricQuery                                                                   0.011
  GraphCentricQuery                                                                         1393.698
    \_condition=(edgeID = DS_106|geotime|hour 1km|2022-6-14T4|18SVK 86 22|Date,Actor2Geo_Long_Actor2Geo_Lat)
    \_orders=[]
    \_isFitted=true
    \_isOrdered=true
    \_query=multiKSQ[1]
    \_index=byEdgeIDComposite
    backend-query                                                      1                    1393.128
    \_query=byEdgeIDComposite:multiKSQ[1]
                                            >TOTAL                     -           -        1393.727        -

Code is:
System.out.println("Metrics: "+traversal.E().has("edgeID", edgeID).profile().next().toString());
dataSourceToCorrelationEdge = traversal.E().has("edgeID", edgeID).next();

If the time to run that exceeds 500 ms, I print the message.

-Joe

On 6/23/2022 4:15 PM, Boxuan Li wrote:

Profiler documentation is available here: https://tinkerpop.apache.org/docs/current/reference/#profile-step 

Can you do

traversal.E().has("edgeID", edgeID).profile().next()

and paste the output (when you experience the slowness) here?

Best,
Boxuan

On Jun 23, 2022, at 2:29 PM, Joe Obernberger via lists.lfaidata.foundation <joseph.obernberger=gmail.com@...> wrote:

Looking for documentation on how to do profile() - do you have any?

Queries like this are also slow:
Edge dataSourceToCorrelationEdge = traversal.E().has("edgeID", edgeID).next();

There is an index on edgeID (mgmt.buildIndex("byEdgeIDComposite", Edge.class).addKey(edgeID).unique().buildCompositeIndex();)
I'm trying to cache the actual JanusGraph edgeID string in an outside cache to help with this, but when it's not in the cache, it can take 2 seconds to return.

-Joe

On 6/22/2022 4:53 PM, Boxuan Li wrote:
It's very suspicious. It shouldn't take 3 seconds just to load 3 edges. Can you provide the profile() output here when you see such slowness? It is also worth checking whether other queries (e.g. loading a vertex) experience similar slowness.

From: janusgraph-users@... <janusgraph-users@...> on behalf of Joe Obernberger via lists.lfaidata.foundation<joseph.obernberger=gmail.com@...>
Sent: Wednesday, June 22, 2022 4:51 PM
To: janusgraph-users@... <janusgraph-users@...>
Subject: Re: [janusgraph-users] Getting Edges - Performance
 
It's usually small; typically around 3 edges.
Reading your article (https://li-boxuan.medium.com/janusgraph-deep-dive-part-3-speed-up-edge-queries-3b9eb5ba34f8).
Outbound edges could be large - 100s to 10000s.

-Joe

On 6/22/2022 4:49 PM, Boxuan Li wrote:
When it takes over 3 seconds to return, how many edges does it return?

From: janusgraph-users@... <janusgraph-users@...> on behalf of Joe Obernberger via lists.lfaidata.foundation<joseph.obernberger=gmail.com@...>
Sent: Wednesday, June 22, 2022 1:06 PM
To: janusgraph-users@... <janusgraph-users@...>
Subject: [janusgraph-users] Getting Edges - Performance
 
Hi All - I'm seeing a performance issue with this statement in Java code:

Iterator<Edge> edgeIt = vert.edges(Direction.IN);

In some cases this takes over 3 seconds to return.  What can I do to
improve this performance?
Thank you!

-Joe





Re: Getting Edges - Performance

Boxuan Li
 

Profiler documentation is available here: https://tinkerpop.apache.org/docs/current/reference/#profile-step 

Can you do

traversal.E().has("edgeID", edgeID).profile().next()

and paste the output (when you experience the slowness) here?

Best,
Boxuan

On Jun 23, 2022, at 2:29 PM, Joe Obernberger via lists.lfaidata.foundation <joseph.obernberger=gmail.com@...> wrote:

Looking for documentation on how to do profile() - do you have any?

Queries like this are also slow:
Edge dataSourceToCorrelationEdge = traversal.E().has("edgeID", edgeID).next();

There is an index on edgeID (mgmt.buildIndex("byEdgeIDComposite", Edge.class).addKey(edgeID).unique().buildCompositeIndex();)
I'm trying to cache the actual JanusGraph edgeID string in an outside cache to help with this, but when it's not in the cache, it can take 2 seconds to return.

-Joe

On 6/22/2022 4:53 PM, Boxuan Li wrote:
It's very suspicious. It shouldn't take 3 seconds just to load 3 edges. Can you provide the profile() output here when you see such slowness? It is also worth checking whether other queries (e.g. loading a vertex) experience similar slowness.

From: janusgraph-users@... <janusgraph-users@...> on behalf of Joe Obernberger via lists.lfaidata.foundation<joseph.obernberger=gmail.com@...>
Sent: Wednesday, June 22, 2022 4:51 PM
To: janusgraph-users@... <janusgraph-users@...>
Subject: Re: [janusgraph-users] Getting Edges - Performance
 
It's usually small; typically around 3 edges.
Reading your article (https://li-boxuan.medium.com/janusgraph-deep-dive-part-3-speed-up-edge-queries-3b9eb5ba34f8).
Outbound edges could be large - 100s to 10000s.

-Joe

On 6/22/2022 4:49 PM, Boxuan Li wrote:
When it takes over 3 seconds to return, how many edges does it return?

From: janusgraph-users@... <janusgraph-users@...> on behalf of Joe Obernberger via lists.lfaidata.foundation<joseph.obernberger=gmail.com@...>
Sent: Wednesday, June 22, 2022 1:06 PM
To: janusgraph-users@... <janusgraph-users@...>
Subject: [janusgraph-users] Getting Edges - Performance
 
Hi All - I'm seeing a performance issue with this statement in Java code:

Iterator<Edge> edgeIt = vert.edges(Direction.IN);

In some cases this takes over 3 seconds to return.  What can I do to
improve this performance?
Thank you!

-Joe





Re: Getting Edges - Performance

Joe Obernberger
 

Looking for documentation on how to do profile() - do you have any?


Queries like this are also slow:

Edge dataSourceToCorrelationEdge = traversal.E().has("edgeID", edgeID).next();


There is an index on edgeID (mgmt.buildIndex("byEdgeIDComposite", Edge.class).addKey(edgeID).unique().buildCompositeIndex();)
I'm trying to cache the actual JanusGraph edgeID string in an outside cache to help with this, but when it's not in the cache, it can take 2 seconds to return.


-Joe


On 6/22/2022 4:53 PM, Boxuan Li wrote:

It's very suspicious. It shouldn't take 3 seconds just to load 3 edges. Can you provide the profile() output here when you see such slowness? It is also worth checking whether other queries (e.g. loading a vertex) experience similar slowness.

From: janusgraph-users@... <janusgraph-users@...> on behalf of Joe Obernberger via lists.lfaidata.foundation <joseph.obernberger=gmail.com@...>
Sent: Wednesday, June 22, 2022 4:51 PM
To: janusgraph-users@... <janusgraph-users@...>
Subject: Re: [janusgraph-users] Getting Edges - Performance
 

It's usually small; typically around 3 edges.
Reading your article (https://li-boxuan.medium.com/janusgraph-deep-dive-part-3-speed-up-edge-queries-3b9eb5ba34f8).

Outbound edges could be large - 100s to 10000s.


-Joe


On 6/22/2022 4:49 PM, Boxuan Li wrote:
When it takes over 3 seconds to return, how many edges does it return?

From: janusgraph-users@... <janusgraph-users@...> on behalf of Joe Obernberger via lists.lfaidata.foundation <joseph.obernberger=gmail.com@...>
Sent: Wednesday, June 22, 2022 1:06 PM
To: janusgraph-users@... <janusgraph-users@...>
Subject: [janusgraph-users] Getting Edges - Performance
 
Hi All - I'm seeing a performance issue with this statement in Java code:

Iterator<Edge> edgeIt = vert.edges(Direction.IN);

In some cases this takes over 3 seconds to return.  What can I do to
improve this performance?
Thank you!

-Joe





Poor load balancing in 0.5.3

Doug Whitfield
 

Hi folks,


From what we are seeing in the gremlin driver code, it is expected that all gremlin server host names are provided and not a VIP. Is that correct?  

For example,

  Cluster cluster = Cluster.build()

    .addContactPoints("localhost")

    .port(server.getPort())

    .create();

In the example, 'localhost' is where we put in a generic VIP that points to 3 servers. Is this a bad approach?

In the end, what we are seeing is that load is severely unbalanced.

 

Despite my many protests to upgrade, we are running JanusGraph v0.5.3 and tinkerpop-3.4.6.

 

Please let us know if there is any additional information that would be helpful.


Best Regards,

 

Douglas Whitfield | Enterprise Architect

 

 





Re: Threaded Operations - Quarkus

Joe Obernberger
 

Sorry for the delay.

When a request comes in via REST, Quarkus creates a thread to handle it; I believe it actually comes from a thread pool.

This code now does:

GraphTraversalSource traversal = StaticInfo.getSingleton().getGraph().buildTransaction().checkExternalVertexExistence(false).consistencyChecks(false).propertyPrefetching(true).start().traversal();

for each incoming request.  It then does a tx().commit and a traversal.close() when complete.


The StaticInfo class contains a JanusGraph object (getGraph()).

What I'm wondering about is that there appear to be two ways to add vertices and edges.  One uses a GraphTraversalSource, and one uses a JanusGraph:

GraphTraversalSource.addV()

Or

JanusGraph.addVertex


-Joe
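The two forms being contrasted look roughly like this (a sketch; the variable names and the "someID" key are illustrative). The traversal form goes through TinkerPop and is the portable option; JanusGraph's native addVertex is available on the graph or on an explicit transaction:

```java
// TinkerPop traversal API (portable across graph databases):
Vertex v1 = traversal.addV("source").property("someID", id).next();

// JanusGraph native API (here on the graph; also available on a
// JanusGraphTransaction obtained from buildTransaction()/newTransaction()):
JanusGraphVertex v2 = graph.addVertex("source");
v2.property("someID", id);
```

Both end up as mutations in the thread's current transaction; the traversal form is generally preferred when the rest of the code already works with a GraphTraversalSource.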


On 6/22/2022 4:51 PM, Boxuan Li wrote:

When you say use JanusGraph.tx().createThreadedTx() directly, what do you mean? Can you give an example?

From: janusgraph-users@... <janusgraph-users@...> on behalf of Joe Obernberger via lists.lfaidata.foundation <joseph.obernberger=gmail.com@...>
Sent: Tuesday, June 21, 2022 4:25 PM
To: janusgraph-users@... <janusgraph-users@...>
Subject: Re: [janusgraph-users] Threaded Operations - Quarkus
 

Thanks for all the help on this.  I'm coming closer to a solution thanks to you all.

Question - I've been using GraphTraversalSource to do all the adding vertices and edges to my graph.  Example:

GraphTraversalSource traversal = JanusGraph.tx().createThreadedTx().traversal();

Is it better to use JanusGraph.tx().createThreadedTx() directly?


-Joe


On 6/17/2022 3:03 PM, Boxuan Li wrote:
Yeah using `newTransaction()` won't make a difference in your use case. Based on your input, there are a couple of things you could try:

  1. As suggested by Kevin, you could use locking. See https://docs.janusgraph.org/advanced-topics/eventual-consistency/#data-consistency. It is slow but it will hopefully solve most race conditions you have. Based on my understanding of Cassandra's nature, I think you could still see such inconsistencies but the chance is much lower for sure.
  2. You could periodically identify and remove the inconsistencies using an offline pipeline.
  3. You could use an external locking service on the client side. For example, using Redis to make sure a conflicting transaction won't start in the first place.
These solutions have their own pros & cons, so it really depends on you.
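The client-side locking idea in option 3 can be sketched within a single JVM with a map of per-key locks (the class and method names here are illustrative). Across several JVMs, the lock map would be replaced by a distributed lock service such as Redis or ZooKeeper:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Hedged sketch: serialize get-or-create per key inside one JVM so two
// threads cannot both miss the lookup and both insert the same "someID".
class KeyedLocks {
    private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    <T> T withLock(String key, Supplier<T> action) {
        ReentrantLock lock = locks.computeIfAbsent(key, k -> new ReentrantLock());
        lock.lock();
        try {
            return action.get();   // e.g. look up the vertex, create it if absent
        } finally {
            lock.unlock();
        }
    }
}
```

This only serializes callers in one process; it reduces, but does not eliminate, duplicates when multiple JVMs write concurrently.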

Best,
Boxuan


From: janusgraph-users@... <janusgraph-users@...> on behalf of Joe Obernberger via lists.lfaidata.foundation <joseph.obernberger=gmail.com@...>
Sent: Friday, June 17, 2022 2:24 PM
To: janusgraph-users@... <janusgraph-users@...>
Subject: Re: [janusgraph-users] Threaded Operations - Quarkus
 

So - unsurprisingly, Boxuan is correct.
Code like this:
GraphTraversalSource traversal = StaticInfo.getGraph().newTransaction().traversal();
try {
    datasourceVertex = traversal.V().has("someID", id).next();
} catch (java.util.NoSuchElementException nse) {
    datasourceVertex = traversal.addV("source").property("someID", id).next();
}

being called from multiple threads results in several vertices with the same 'someID'.

Not sure how to fix this.

-Joe
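One idiom that shrinks this window (though, on an eventually consistent backend, does not fully close it) is to fold the lookup and the insert into a single traversal with fold()/coalesce(), combined with a unique composite index under ConsistencyModifier.LOCK. A sketch, reusing the "someID" and "source" names from the thread:

```java
import static org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.__.addV;
import static org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.__.unfold;

import org.apache.tinkerpop.gremlin.structure.Vertex;

// Hedged sketch: get-or-create in one traversal instead of a separate
// lookup followed by addV() in a catch block.
Vertex datasourceVertex = traversal.V().has("someID", id)
        .fold()
        .coalesce(unfold(),
                  addV("source").property("someID", id))
        .next();
```

The coalesce returns the existing vertex if the lookup matched, and only runs addV() when the fold produced an empty list.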

On 6/17/2022 10:28 AM, Joe Obernberger via lists.lfaidata.foundation wrote:

Good stuff - thank you Boxuan.
Backend is Cassandra running on bare metal on 15 nodes.
Race condition is rare.
When the race condition happens, I'm seeing duplicate nodes/edges; basically the graph becomes invalid.
Yes.  This is a good idea.  I could write a spark job to examine the graph and fix up discrepancies.  Smart.
Not sure what a locking service is. Example?

My current plan (not tested yet) is to use a static class that contains the JanusGraph 'graph'.  On Quarkus when a REST call comes in, a new thread is created.  That thread will use Marc's idea of
GraphTraversalSource traversal = StaticInfo.getGraph().newTransaction().traversal();

Do stuff and then traversal.tx().commit();
That will be done in a loop so that if the commit fails, it will retry X times.

At least that's my current plan.  Not sure if it will work.

-Joe
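The retry-X-times loop described above can be factored into a small helper (the names are illustrative). In the real service, the Supplier would run the mutations and traversal.tx().commit(), and the catch block would also call traversal.tx().rollback() before the next attempt:

```java
import java.util.function.Supplier;

// Hedged sketch of the retry-X-times commit loop described above.
class RetryingCommit {
    static <T> T retry(int maxAttempts, Supplier<T> action) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();   // mutate + commit in the real code
            } catch (RuntimeException e) {
                last = e;              // e.g. lock contention on commit
            }
        }
        throw last;                    // all attempts exhausted
    }
}
```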

On 6/17/2022 8:52 AM, Boxuan Li wrote:
Hi Joe,

Unfortunately the way Marc suggests won’t help with your usecase. Tbh I would have suggested the same answer as Marc before I saw your second post. If one has one JVM thread handling multiple transactions (not familiar with quarkus so not sure if that is possible), then one has to do what Marc suggested. But in your usecase, it won't be any different from your current usage because JanusGraph will automatically create threaded transaction for each thread (using ThreadLocal) when you use the traversal object.

The real issue in your use case is that you want ACID support, which really depends on your backend storage. At least in our officially supported Cassandra, HBase, and BigTable adapters, this is not (yet) supported.

There are a few workarounds, though. Before discussing that further, I would like to ask a few questions:

  1. What is your backend storage and is it distributed?
  2. How often does this “race condition” happen? Is it very rare or it’s fairly common?
  3. What is your end goal? Do you want to reduce the chance of this “race condition”, or do you want to make sure this does not happen at all?
  4. Are you willing to resolve such duplicate vertices/edges at either read time or offline?
  5. Are you willing to introduce a third-party dependency, e.g. a distributed locking service?

Best,
Boxuan

From: janusgraph-users@... <janusgraph-users@...> on behalf of Joe Obernberger via lists.lfaidata.foundation <joseph.obernberger=gmail.com@...>
Sent: Friday, June 17, 2022 8:12:04 AM
To: janusgraph-users@... <janusgraph-users@...>
Subject: Re: [janusgraph-users] Threaded Operations - Quarkus
 

Thank you Marc.  I'm currently doing everything with a traversal, and then doing a traversal.tx().commit()
Sounds like what you suggested is what I want, but just to be clear:
Here's what I'm trying to do.

Thread 1/JVM1 gets a request that requires adding new vertices and edges to the graph.
Thread 2/JVM1 gets a similar request. 
Some of the vertices added in Thread 1 end up having the same attributes/name as vertices from Thread 2, but I only want one vertex if it's going to have the same attributes.
If Thread 1 adds that vertex before it commits, then Thread 2, when it looks up said vertex, won't find it and will also add it.

Code example (traversal is a GraphTraversalSource gotten from JanusGraphFactory.traversal())

try {
    correlationVertex = traversal.V().has("correlationID", correlationID).next();
} catch (java.util.NoSuchElementException nse) {
    correlationVertex = null;
}

...

if (correlationVertex == null) {
    correlationVertex = traversal.addV("correlation").property("correlationID", correlationID).next();
    correlationVertex.property("a", blah1);
    correlationVertex.property("b", blah2);
}

I do similar things with edges:

try {
    dataSourceToCorrelationEdge = traversal.E().has("edgeID", edgeID).next();
} catch (NoSuchElementException nse) {
    dataSourceToCorrelationEdge = null;
}

Ultimately, I'd like to have several JVMs handling these requests, each of which runs multiple threads.
I'll look at using a new transaction per call.  Thank you!

-Joe

On 6/17/2022 8:01 AM, hadoopmarc@... wrote:
Hi Joe,

Do you mean with threadsafe transactions that requests from different client threads should be handled independently, that is in different JanusGraph Transactions?

In that case, I think you want to use a GraphTraversalSource per request like this:

g = graph.newTransaction().traversal()

Best wishes,    Marc






Re: Getting Edges - Performance

Boxuan Li
 

It's very suspicious. It shouldn't take 3 seconds just to load 3 edges. Can you provide the profile() output here when you see such slowness? It is also worth checking whether other queries (e.g. loading a vertex) experience similar slowness.


From: janusgraph-users@... <janusgraph-users@...> on behalf of Joe Obernberger via lists.lfaidata.foundation <joseph.obernberger=gmail.com@...>
Sent: Wednesday, June 22, 2022 4:51 PM
To: janusgraph-users@... <janusgraph-users@...>
Subject: Re: [janusgraph-users] Getting Edges - Performance
 

It's usually small; typically around 3 edges.
Reading your article (https://li-boxuan.medium.com/janusgraph-deep-dive-part-3-speed-up-edge-queries-3b9eb5ba34f8).

Outbound edges could be large - 100s to 10000s.


-Joe


On 6/22/2022 4:49 PM, Boxuan Li wrote:
When it takes over 3 seconds to return, how many edges does it return?

From: janusgraph-users@... <janusgraph-users@...> on behalf of Joe Obernberger via lists.lfaidata.foundation <joseph.obernberger=gmail.com@...>
Sent: Wednesday, June 22, 2022 1:06 PM
To: janusgraph-users@... <janusgraph-users@...>
Subject: [janusgraph-users] Getting Edges - Performance
 
Hi All - I'm seeing a performance issue with this statement in Java code:

Iterator<Edge> edgeIt = vert.edges(Direction.IN);

In some cases this takes over 3 seconds to return. What can I do to
improve this performance?
Thank you!

-Joe





Re: Threaded Operations - Quarkus

Boxuan Li
 

When you say use JanusGraph.tx().createThreadedTx() directly, what do you mean? Can you give an example?


From: janusgraph-users@... <janusgraph-users@...> on behalf of Joe Obernberger via lists.lfaidata.foundation <joseph.obernberger=gmail.com@...>
Sent: Tuesday, June 21, 2022 4:25 PM
To: janusgraph-users@... <janusgraph-users@...>
Subject: Re: [janusgraph-users] Threaded Operations - Quarkus
 

Thanks for all the help on this.  I'm coming closer to a solution thanks to you all.

Question - I've been using GraphTraversalSource to do all the adding vertices and edges to my graph.  Example:

GraphTraversalSource traversal = JanusGraph.tx().createThreadedTx().traversal();

Is it better to use JanusGraph.tx().createThreadedTx() directly?


-Joe


On 6/17/2022 3:03 PM, Boxuan Li wrote:
Yeah using `newTransaction()` won't make a difference in your use case. Based on your input, there are a couple of things you could try:

  1. As suggested by Kevin, you could use locking. See https://docs.janusgraph.org/advanced-topics/eventual-consistency/#data-consistency. It is slow but it will hopefully solve most race conditions you have. Based on my understanding of Cassandra's nature, I think you could still see such inconsistencies but the chance is much lower for sure.
  2. You could periodically identify and remove the inconsistencies using an offline pipeline.
  3. You could use an external locking service on the client side. For example, using Redis to make sure a conflicting transaction won't start in the first place.
These solutions have their own pros & cons, so it really depends on you.

Best,
Boxuan


From: janusgraph-users@... <janusgraph-users@...> on behalf of Joe Obernberger via lists.lfaidata.foundation <joseph.obernberger=gmail.com@...>
Sent: Friday, June 17, 2022 2:24 PM
To: janusgraph-users@... <janusgraph-users@...>
Subject: Re: [janusgraph-users] Threaded Operations - Quarkus
 

So - unsurprisingly, Boxuan is correct.
Code like this:
GraphTraversalSource traversal = StaticInfo.getGraph().newTransaction().traversal();
try {
    datasourceVertex = traversal.V().has("someID", id).next();
} catch (java.util.NoSuchElementException nse) {
    datasourceVertex = traversal.addV("source").property("someID", id).next();
}

being called from multiple threads results in several vertices with the same 'someID'.

Not sure how to fix this.

-Joe

On 6/17/2022 10:28 AM, Joe Obernberger via lists.lfaidata.foundation wrote:

Good stuff - thank you Boxuan.
Backend is Cassandra running on bare metal on 15 nodes.
Race condition is rare.
When the race condition happens, I'm seeing duplicate nodes/edges; basically the graph becomes invalid.
Yes.  This is a good idea.  I could write a spark job to examine the graph and fix up discrepancies.  Smart.
Not sure what a locking service is. Example?

My current plan (not tested yet) is to use a static class that contains the JanusGraph 'graph'.  On Quarkus when a REST call comes in, a new thread is created.  That thread will use Marc's idea of
GraphTraversalSource traversal = StaticInfo.getGraph().newTransaction().traversal();

Do stuff and then traversal.tx().commit();
That will be done in a loop so that if the commit fails, it will retry X times.

At least that's my current plan.  Not sure if it will work.

-Joe

On 6/17/2022 8:52 AM, Boxuan Li wrote:
Hi Joe,

Unfortunately the way Marc suggests won’t help with your usecase. Tbh I would have suggested the same answer as Marc before I saw your second post. If one has one JVM thread handling multiple transactions (not familiar with quarkus so not sure if that is possible), then one has to do what Marc suggested. But in your usecase, it won't be any different from your current usage because JanusGraph will automatically create threaded transaction for each thread (using ThreadLocal) when you use the traversal object.

The real issue in your use case is that you want ACID support, which really depends on your backend storage. At least in our officially supported Cassandra, HBase, and BigTable adapters, this is not (yet) supported.

There are a few workarounds, though. Before discussing that further, I would like to ask a few questions:

  1. What is your backend storage and is it distributed?
  2. How often does this “race condition” happen? Is it very rare or it’s fairly common?
  3. What is your end goal? Do you want to reduce the chance of this “race condition”, or do you want to make sure this does not happen at all?
  4. Are you willing to resolve such duplicate vertices/edges at either read time or offline?
  5. Are you willing to introduce a third-party dependency, e.g. a distributed locking service?

Best,
Boxuan

From: janusgraph-users@... <janusgraph-users@...> on behalf of Joe Obernberger via lists.lfaidata.foundation <joseph.obernberger=gmail.com@...>
Sent: Friday, June 17, 2022 8:12:04 AM
To: janusgraph-users@... <janusgraph-users@...>
Subject: Re: [janusgraph-users] Threaded Operations - Quarkus
 

Thank you Marc.  I'm currently doing everything with a traversal, and then doing a traversal.tx().commit()
Sounds like what you suggested is what I want, but just to be clear:
Here's what I'm trying to do.

Thread 1/JVM1 gets a request that requires adding new vertices and edges to the graph.
Thread 2/JVM1 gets a similar request. 
Some of the vertices added in Thread 1 end up having the same attributes/name as vertices from Thread 2, but I only want to have one vertex if it's going to have the same attributes.
If Thread 1 adds that vertex but has not yet committed, then Thread 2, when it looks up said vertex, won't find it; so it will also add it.

Code example (traversal is a GraphTraversalSource obtained from JanusGraphFactory):

try {
    correlationVertex = traversal.V().has("correlationID", correlationID).next();
} catch (java.util.NoSuchElementException nse) {
    correlationVertex = null;
}

...

if (correlationVertex == null) {
    correlationVertex = traversal.addV("correlation").property("correlationID", correlationID).next();
    correlationVertex.property("a", blah1);
    correlationVertex.property("b", blah2);
}

I do similar things with edges:

try {
    dataSourceToCorrelationEdge = traversal.E().has("edgeID", edgeID).next();
} catch (NoSuchElementException nse) {
    dataSourceToCorrelationEdge = null;
}

Ultimately, I'd like to have several JVMs handling these requests, each of which runs multiple threads.
I'll look at using a new transaction per call.  Thank you!

-Joe

On 6/17/2022 8:01 AM, hadoopmarc@... wrote:
Hi Joe,

Do you mean by thread-safe transactions that requests from different client threads should be handled independently, that is, in different JanusGraph transactions?

In that case, I think you want to use a GraphTraversalSource per request like this:

g = graph.newTransaction().traversal()

Best wishes,    Marc



AVG logo

This email has been checked for viruses by AVG antivirus software.
www.avg.com



Re: Getting Edges - Performance

Joe Obernberger
 

It's usually small; typically around 3 edges.
Reading your article (https://li-boxuan.medium.com/janusgraph-deep-dive-part-3-speed-up-edge-queries-3b9eb5ba34f8).

Outbound edge counts could be large - 100s to 10,000s.


-Joe
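
For out-degrees in the 100s to 10,000s, the usual JanusGraph lever is a vertex-centric index, so an edge query reads only the slice it needs instead of the whole adjacency list. A hedged sketch (the `related` label and `time` key are hypothetical, and this needs an open `graph` before the schema is in use):

```java
JanusGraphManagement mgmt = graph.openManagement();
PropertyKey time = mgmt.makePropertyKey("time").dataType(Long.class).make();
EdgeLabel related = mgmt.getOrCreateEdgeLabel("related");
// Sort incident 'related' edges by 'time' so range/limit queries stay selective.
mgmt.buildEdgeIndex(related, "relatedByTime", Direction.BOTH, Order.desc, time);
mgmt.commit();
```

Note also that `vert.edges(Direction.IN)` materializes every incoming edge; a traversal such as `g.V(vert.id()).inE("related").has("time", P.gt(cutoff))` can then exploit the index, whereas a very high-degree "supernode" will stay slow without one.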


On 6/22/2022 4:49 PM, Boxuan Li wrote:
When it takes over 3 seconds to return, how many edges does it return?




Re: Getting Edges - Performance

Boxuan Li
 

When it takes over 3 seconds to return, how many edges does it return?









Getting Edges - Performance

Joe Obernberger
 

Hi All - I'm seeing a performance issue with this statement in Java code:

Iterator<Edge> edgeIt = vert.edges(Direction.IN);

In some cases this takes over 3 seconds to return. What can I do to improve this performance?
Thank you!

-Joe




Re: Threaded Operations - Quarkus

Joe Obernberger
 

Thanks for all the help on this.  I'm coming closer to a solution thanks to you all.

Question - I've been using a GraphTraversalSource to add all of the vertices and edges to my graph.  Example:

GraphTraversalSource traversal = graph.tx().createThreadedTx().traversal();

Is it better to use graph.tx().createThreadedTx() directly?


-Joe


On 6/17/2022 3:03 PM, Boxuan Li wrote:

Yeah using `newTransaction()` won't make a difference in your use case. Based on your input, there are a couple of things you could try:

  1. As suggested by Kevin, you could use locking. See https://docs.janusgraph.org/advanced-topics/eventual-consistency/#data-consistency. It is slow, but it will hopefully resolve most of the race conditions you have. Based on my understanding of Cassandra's nature, I think you could still see such inconsistencies, but the chance is certainly much lower.
  2. You could periodically identify and remove the inconsistencies using an offline pipeline.
  3. You could use an external locking service on the client side - for example, using Redis to make sure a conflicting transaction won't start in the first place.
These solutions have their own pros & cons, so it really depends on your needs.

Best,
Boxuan





Re: Threaded Operations - Quarkus

Boxuan Li
 

Yeah using `newTransaction()` won't make a difference in your use case. Based on your input, there are a couple of things you could try:

  1. As suggested by Kevin, you could use locking. See https://docs.janusgraph.org/advanced-topics/eventual-consistency/#data-consistency. It is slow, but it will hopefully resolve most of the race conditions you have. Based on my understanding of Cassandra's nature, I think you could still see such inconsistencies, but the chance is certainly much lower.
  2. You could periodically identify and remove the inconsistencies using an offline pipeline.
  3. You could use an external locking service on the client side - for example, using Redis to make sure a conflicting transaction won't start in the first place.
These solutions have their own pros & cons, so it really depends on your needs.
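
Option 1, applied to the `someID` key from this thread, would look like the schema sketch below (mirroring the linked docs; it must run before the key is in use, and the index name is illustrative). With the composite index unique and both key and index marked `ConsistencyModifier.LOCK`, a conflicting transaction fails at commit instead of creating a duplicate:

```java
JanusGraphManagement mgmt = graph.openManagement();
PropertyKey someId = mgmt.makePropertyKey("someID")
        .dataType(String.class).cardinality(Cardinality.SINGLE).make();
JanusGraphIndex byId = mgmt.buildIndex("bySomeIdComposite", Vertex.class)
        .addKey(someId).unique().buildCompositeIndex();
// LOCK makes racing transactions fail at commit instead of both succeeding.
mgmt.setConsistency(someId, ConsistencyModifier.LOCK);
mgmt.setConsistency(byId, ConsistencyModifier.LOCK);
mgmt.commit();
```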

Best,
Boxuan





Re: Threaded Operations - Quarkus

Kevin Schmidt
 

This is from Titan, the predecessor to JanusGraph, but see https://groups.google.com/g/aureliusgraphs/c/z6kyGSlifXE/m/aLc2Zwb_BAAJ for some experience with a similar issue.

You can either do locking and be slow and incur any other downsides of that, particularly if you want to do this across JVMs, or accept that you will have some (small?) risk of duplicates and deal with it in your traversals or do something periodically to identify and remove them.
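
Within a single JVM, the check-then-add race can also be closed by serializing get-or-create per key, e.g. with `ConcurrentHashMap.computeIfAbsent` (a self-contained sketch of the idea only; the map stands in for the graph lookup, and this does nothing for races between different JVMs, which need locking or cleanup as discussed):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class VertexCache {
    static final Map<String, Integer> byId = new ConcurrentHashMap<>();
    static final AtomicInteger creations = new AtomicInteger();

    /** Get-or-create, executed atomically per key within this JVM. */
    static int forId(String someId) {
        return byId.computeIfAbsent(someId, k -> creations.incrementAndGet());
    }
}
```

However many threads race on the same id, the creation function runs at most once for that key.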




Re: Threaded Operations - Quarkus

Joe Obernberger
 

So - unsurprisingly, Boxuan is correct.
Code like this:
GraphTraversalSource traversal = StaticInfo.getGraph().newTransaction().traversal();
try {
    datasourceVertex = traversal.V().has("someID", id).next();
} catch (java.util.NoSuchElementException nse) {
    datasourceVertex = traversal.addV("source").property("someID", id).next();
}

being called from multiple threads results in several vertices with the same 'someID'.

Not sure how to fix this.

-Joe
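
One way to shrink (though not fully close) that window is the TinkerPop get-or-create idiom, which folds the lookup and the conditional `addV` into a single traversal; without a unique, locked index on `someID`, two racing transactions on an eventually consistent backend can still both create the vertex. A sketch against the same `traversal`:

```java
import static org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.__.addV;
import static org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.__.unfold;

Vertex datasourceVertex = traversal.V().has("someID", id)
        .fold()
        .coalesce(unfold(),
                  addV("source").property("someID", id))
        .next();
```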

On 6/17/2022 10:28 AM, Joe Obernberger via lists.lfaidata.foundation wrote:

Good stuff - thank you Boxuan.
Backend is Cassandra running on bare metal on 15 nodes.
Race condition is rare.
When the race condition happens, I'm seeing duplicate nodes/edges; basically the graph becomes invalid.
Yes.  This is a good idea.  I could write a Spark job to examine the graph and fix up discrepancies.  Smart.
Not sure what a locking service is?  Example?

My current plan (not tested yet) is to use a static class that contains the JanusGraph 'graph'.  On Quarkus when a REST call comes in, a new thread is created.  That thread will use Marc's idea of
GraphTraversalSource traversal = StaticInfo.getGraph().newTransaction().traversal();

Do stuff and then traversal.tx().commit();
That will be done in a loop so that if the commit fails, it will retry X times.

At least that's my current plan.  Not sure if it will work.

-Joe
