Re: Getting Edges - Performance

Joe Obernberger

One thing of note is the tablehistogram for the graphindex table:

nodetool tablehistograms graphsource.graphindex
graphsource/graphindex histograms
Percentile      Read Latency     Write Latency          SSTables    Partition Size        Cell Count
                    (micros)          (micros)                             (bytes)
50%                   152.32             35.43              0.00               179                 1
75%                   545.79             51.01              2.00               179                 1
95%                   943.13             61.21              4.00               215                 1
98%                  1131.75             73.46              5.00               215                 1
99%                  1131.75             88.15              5.00               215                 1
Min                    42.51             11.87              0.00                36                 0
Max                  1131.75             88.15              5.00         802187438          25109160


nodetool tablestats graphsource.graphindex
Total number of tables: 95
Keyspace : graphsource
        Read Count: 28106793
        Read Latency: 2.0613708762860283 ms
        Write Count: 54754221
        Write Latency: 0.089838214609975 ms
        Pending Flushes: 0
                Table: graphindex
                SSTable count: 25
                Old SSTable count: 0
                Space used (live): 2858126352
                Space used (total): 2858126352
                Space used by snapshots (total): 2032016596
                Off heap memory used (total): 22737395
                SSTable Compression Ratio: 0.4225597463117299
                Number of partitions (estimate): 10030080
                Memtable cell count: 6363776
                Memtable data size: 217020150
                Memtable off heap memory used: 0
                Memtable switch count: 77
                Local read count: 11659961
                Local read latency: 0.315 ms
                Local write count: 41850997
                Local write latency: 0.039 ms
                Pending flushes: 0
                Percent repaired: 0.0
                Bytes repaired: 0.000KiB
                Bytes unrepaired: 3.628GiB
                Bytes pending repair: 0.000KiB
                Bloom filter false positives: 40827
                Bloom filter false ratio: 0.00000
                Bloom filter space used: 12591976
                Bloom filter off heap memory used: 12591776
                Index summary off heap memory used: 9669843
                Compression metadata off heap memory used: 475776
                Compacted partition minimum bytes: 36
                Compacted partition maximum bytes: 802187438
                Compacted partition mean bytes: 423
                Average live cells per slice (last five minutes): 1.0
                Maximum live cells per slice (last five minutes): 1
                Average tombstones per slice (last five minutes): 1.0
                Maximum tombstones per slice (last five minutes): 1
                Dropped Mutations: 0
                Droppable tombstone ratio: 0.00000

It looks like the index is not well partitioned?


On 6/23/2022 4:47 PM, Joe Obernberger via wrote:

Thank you Boxuan - the code (REST service) that is modifying the graph is being called continuously when running.  A slow example looks like this:

Metrics: Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
JanusGraphStep([],[edgeID.eq(DS_106|geotime|hou...                     1           1        1393.727   100.00
  constructGraphCentricQuery                                                                   0.011
  GraphCentricQuery                                                                         1393.698
    \_condition=(edgeID = DS_106|geotime|hour 1km|2022-6-14T4|18SVK 86 22|Date,Actor2Geo_Long_Actor2Geo_Lat)
    backend-query                                                      1                    1393.128
                                            >TOTAL                     -           -        1393.727        -

Code is:
System.out.println("Metrics: "+traversal.E().has("edgeID", edgeID).profile().next().toString());
dataSourceToCorrelationEdge = traversal.E().has("edgeID", edgeID).next();

If the time to run that exceeds 500mSec, I print the message.


On 6/23/2022 4:15 PM, Boxuan Li wrote:
Profiler documentation is available here: 

Can you do

traversal.E().has("edgeID", edgeID).profile().next()

and paste the output (when you experience the slowness) here?


On Jun 23, 2022, at 2:29 PM, Joe Obernberger via <> wrote:

Looking for documentation on how to do profile() - do you have any?

Queries like this are also slow:
Edge dataSourceToCorrelationEdge = traversal.E().has("edgeID", edgeID).next();

There is an index on edgeID (mgmt.buildIndex("byEdgeIDComposite", Edge.class).addKey(edgeID).unique().buildCompositeIndex();)
I'm trying to cache the actual JanusGraph edgeID string in an outside cache to help with this, but when it's not in the cache, it can take 2 seconds to return.


On 6/22/2022 4:53 PM, Boxuan Li wrote:
It's very suspicious. It shouldn't take 3 seconds just to load 3 edges. Can you provide the profile() output here when you see such slowness? It is also worth trying if other queries (e.g. loading a vertex) experience similar slowness.

From: janusgraph-users@... <janusgraph-users@...> on behalf of Joe Obernberger via<>
Sent: Wednesday, June 22, 2022 4:51 PM
To: janusgraph-users@... <janusgraph-users@...>
Subject: Re: [janusgraph-users] Getting Edges - Performance
It's usually small; typically around 3 edges.
Reading your article (
Outbound edges could be large - 100s to 10000s.


On 6/22/2022 4:49 PM, Boxuan Li wrote:
When it takes over 3 seconds to return, how many edges does it return?

From: janusgraph-users@... <janusgraph-users@...> on behalf of Joe Obernberger via<>
Sent: Wednesday, June 22, 2022 1:06 PM
To: janusgraph-users@... <janusgraph-users@...>
Subject: [janusgraph-users] Getting Edges - Performance
Hi All - I'm seeing a performance issue with this statement in Java code:

Iterator<Edge> edgeIt = vert.edges(Direction.IN);

in some cases this is taking over 3 seconds to return.  What can I do to 
better this performance?
Thank you!


This email has been checked for viruses by AVG.

AVG logo
This email has been checked for viruses by AVG antivirus software.

Join to automatically receive all group messages.