Hi Marc,
I tried rerunning the scaling test on a fresh graph with ids.block-size=10000000, but unfortunately I didn't see any performance gain.
I also tried ids.block-size=10000000 combined with ids.authority.conflict-avoidance-mode=GLOBAL_AUTO, but there was no performance gain there either.
I used GLOBAL_AUTO as it was the easiest to test, and I ran the test twice to make sure the result was not just due to unlucky random tag assignment. I didn't do the math, but I figure I would have to be very unlucky to get a very bad random tag allocation twice in a row!
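In case someone wants to reproduce this, here is a minimal sketch of how those two options can be set programmatically (the storage values and the method name are placeholders for illustration; the same keys also work in a properties file):

    import org.janusgraph.core.JanusGraph;
    import org.janusgraph.core.JanusGraphFactory;

    // Sketch: open a graph with the ID allocation settings from the test.
    // storage.hostname is a placeholder, not our actual cluster.
    static JanusGraph openWithLargeIdBlocks() {
        return JanusGraphFactory.build()
                .set("storage.backend", "cql")
                .set("storage.hostname", "127.0.0.1")
                // larger ID blocks reserved per allocation round
                .set("ids.block-size", 10_000_000)
                // random conflict-avoidance tag per JanusGraph instance
                .set("ids.authority.conflict-avoidance-mode", "GLOBAL_AUTO")
                .open();
    }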
I tried something else which turned out to be very successful:
instead of inserting all the properties into the graph, I tried inserting only the ones needed to feed the composite indexes and vertex-centric indexes. Those indexes are what make the "get element or create it" logic efficient. This test scaled quite nicely up to 64 indexers (instead of 4 before)!
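For context, our "get element or create it" logic is essentially the standard fold/coalesce upsert pattern; a simplified sketch below (the "device" label and "serial" property are made up for illustration, with "serial" assumed to be backed by a composite index):

    import static org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.__.addV;
    import static org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.__.unfold;

    import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
    import org.apache.tinkerpop.gremlin.structure.Vertex;

    // "Get element or create it": the has() lookup is what the composite
    // index serves, so only the indexed property is strictly required at
    // insert time. Label/property names are hypothetical.
    static Vertex getOrCreate(GraphTraversalSource g, String serial) {
        return g.V().has("device", "serial", serial)
                .fold()
                .coalesce(unfold(),
                          addV("device").property("serial", serial))
                .next();
    }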

Out of all the tests I tried so far, the two most successful ones were:
- decreasing the CQL consistency level (from QUORUM to ANY/ONE; see the sketch further down)
- decreasing the number of properties
What's interesting about these two cases is that they didn't significantly increase the performance of a single indexer; they really increased the horizontal scalability we could achieve.
My best guess for why that is: they reduced the amount of work the ScyllaDB coordinators had to do by:
- decreasing the amount of coordination necessary to get a majority answer (QUORUM)
- decreasing the size in bytes of the CQL unlogged batches; some of our properties can be quite big (> 1 KB)
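For completeness, the consistency-level change from the first bullet above comes down to the standard storage.cql.* options; a minimal sketch (storage values and method name are placeholders again):

    import org.janusgraph.core.JanusGraph;
    import org.janusgraph.core.JanusGraphFactory;

    // Sketch: relax the CQL consistency levels.
    // Both options default to QUORUM in JanusGraph.
    static JanusGraph openWithRelaxedConsistency() {
        return JanusGraphFactory.build()
                .set("storage.backend", "cql")
                .set("storage.hostname", "127.0.0.1")
                .set("storage.cql.read-consistency-level", "ONE")
                .set("storage.cql.write-consistency-level", "ANY")
                .open();
    }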
I would happily continue digging into this, but unfortunately other priorities have turned up, so we're putting the testing aside for the moment.
I thought I would post my complete findings/guesses anyway in case they are useful to someone.
Thank you so much for your help!
Cheers,
Marc