CQL scaling limit?


madams@...
 

Hi all,

We've been trying to scale JanusGraph horizontally to keep up with our data throughput, but we're hitting a scaling limit.
We've tried different things to pinpoint the bottleneck, but we're struggling to find it... Some help would be most welcome :)

Our current setup:

  • 6 ScyllaDB Instances
  • 6 Elasticsearch Instances
  • Our "indexers" running Janusgraph as a lib, they can be scaled up and down
    • They read data from our sources and write it to Janusgraph
    • Each Indexer runs on his own jvm
Our scaling test looks like this:



Each indexer counts the number of records it processes. We start with one indexer, and every 5~10 minutes we double the number of indexers and look at how the overall throughput changes.
From top to bottom, left to right, the panels represent:

  • Total Processed Records: the overall performance of the system
  • Average Processed Records: average per indexer; ideally this should be a flat curve
  • Number of Running Indexers: we scaled at 1, 2, 4, 8, 16, 32, 64
  • Processed Records Per Indexer
  • CPU Usage: CPU usage per indexer
  • Heap: heap usage per indexer. The red line is the maximum heap size; we left a generous margin


As you can see, past 4 indexers the performance per indexer decreases until no additional throughput can be gained. At first we thought this might simply be a resource limitation, but neither ScyllaDB nor Elasticsearch is really struggling. The ScyllaDB load and read/write latencies looked good during this test:



Both ScyllaDB and Elasticsearch run on NVMe drives with 10 GB+ of RAM, and ScyllaDB is deployed with CPU pinning. With a cassandra-stress test we can really max out ScyllaDB.

Our JanusGraph configuration looks like:

storage.backend=cql
storage.hostname=scylla
storage.cql.replication-factor=3
index.search.backend=elasticsearch
index.search.hostname=elasticsearch
index.search.elasticsearch.transport-scheme=http
index.search.elasticsearch.create.ext.number_of_shards=4
index.search.elasticsearch.create.ext.number_of_replicas=2
graph.replace-instance-if-exists=true
schema.default=none
tx.log-tx=true
tx.max-commit-time=170000
ids.block-size=100000
ids.authority.wait-time=1000
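
For reference, each indexer opens the graph from those properties roughly like this (a minimal sketch; the properties file path is just a placeholder):

import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphFactory;

public class GraphProvider {
    // One embedded JanusGraph instance per indexer JVM, opened from the
    // configuration shown above ("janusgraph-cql.properties" is a placeholder path).
    public static JanusGraph open() {
        return JanusGraphFactory.open("janusgraph-cql.properties");
    }
}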


Each input record contains ~20 vertices and ~20 edges. The workflow of the indexer is:

  1. For each vertex, check if it exists in the graph using a composite index. Create it if it does not.
  2. Insert the edges using the vertex IDs returned by step 1

Each transaction inserts ~10 records, and each indexer runs 10 transactions in parallel. A sketch of the per-transaction logic is below.
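
Simplified, the per-transaction logic looks roughly like this (the "identifier" property key, the labels and the values are placeholders; in reality we loop over the ~20 vertices and ~20 edges of each of the ~10 records):

import java.util.Optional;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.structure.Vertex;
import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphTransaction;

public class IndexerSketch {

    // Step 1: look the vertex up via the composite-indexed property
    // ("identifier" is a placeholder key) and create it if it does not exist.
    static Vertex getOrCreate(GraphTraversalSource g, String label, String id) {
        Optional<Vertex> existing = g.V().has("identifier", id).tryNext();
        return existing.orElseGet(
                () -> g.addV(label).property("identifier", id).next());
    }

    // One transaction covering a batch of records: resolve the vertices,
    // then insert the edges between them, then commit.
    static void writeBatch(JanusGraph graph) {
        JanusGraphTransaction tx = graph.newTransaction();
        GraphTraversalSource g = tx.traversal();
        try {
            Vertex a = getOrCreate(g, "device", "device-42");   // placeholder data
            Vertex b = getOrCreate(g, "user", "user-7");        // placeholder data
            g.V(a).addE("ownedBy").to(b).next();                 // step 2: edge insertion
            tx.commit();
        } catch (Exception e) {
            tx.rollback();
            throw e;
        }
    }
}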

We tried different things, but without success (the corresponding config options are listed right after this list):

  • Increasing/decreasing the transaction size
  • Increasing/decreasing storage.cql.batch-statement-size
  • Enabling/disabling batch loading
  • Increasing the ids.block-size to 10 000 000
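
In config terms, these are the knobs we were turning (the values shown are just examples of what we tried, not our final settings):

storage.cql.batch-statement-size=20
storage.batch-loading=true
ids.block-size=10000000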

The most successful test so far was to switch the CQL write consistency level to ANY and the read consistency level to ONE:



This time the indexers scaled nicely up to 16, and the overall throughput still increased when scaling to 32 indexers. Once we reached 64 indexers, though, the performance dropped dramatically. ScyllaDB had a little more load there, but it still doesn't look like it's struggling:

We don't use LOCK consistency. We never update vertices or edges, so FORK consistency doesn't look useful in our case.
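
For reference, the consistency change above boils down to these two options (assuming I have the standard option names right):

storage.cql.read-consistency-level=ONE
storage.cql.write-consistency-level=ANY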

It really looks like something, somewhere, becomes a bottleneck once we start scaling.

I checked out the JanusGraph GitHub repo locally and went through it to try to understand what set of operations JanusGraph performs to insert vertices/edges and how these are bound to transactions, but I'm struggling a little to find that information.

So, any ideas/recommendations?

Cheers,
Marc
