JanusGraph - write throughput speedup question
Rami Mankevich <rami.m...@...>
The ingestion is based on vertices with 3-4 properties each, and edges that connect them.
For each event I need to upsert vertices and create edges,
so I traverse a composite index twice to get the vertices (if I don't find them, I create them).
Then I add the edge.
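To make the per-event flow above concrete, here is a minimal sketch in Python. It uses a toy in-memory stand-in, not the JanusGraph or Gremlin API: the `InMemoryGraph` class, its `index` dict (playing the role of the composite index) and the `lookups` counter are all assumptions made for illustration. It only shows that every event costs two index reads before the edge can be written.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryGraph:
    """Toy stand-in for the graph (hypothetical, not the JanusGraph API)."""
    index: dict = field(default_factory=dict)  # key -> vertex, plays the composite index
    edges: list = field(default_factory=list)  # (source key, target key) pairs
    lookups: int = 0                           # index reads performed so far

    def get_or_create_vertex(self, key, props):
        self.lookups += 1                      # one index read per vertex
        vertex = self.index.get(key)
        if vertex is None:                     # not found: create it
            vertex = {"key": key, **props}
            self.index[key] = vertex
        return vertex

    def handle_event(self, src_key, src_props, dst_key, dst_props):
        # Two existence checks, then the edge write.
        src = self.get_or_create_vertex(src_key, src_props)
        dst = self.get_or_create_vertex(dst_key, dst_props)
        self.edges.append((src["key"], dst["key"]))


graph = InMemoryGraph()
graph.handle_event("a", {"name": "A"}, "b", {"name": "B"})
graph.handle_event("a", {"name": "A"}, "c", {"name": "C"})
print(len(graph.index), len(graph.edges), graph.lookups)  # prints "3 2 4"
```

Two events produce four lookups even though vertex "a" already existed for the second one, which is the read cost described above.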
Checking the existence of each vertex, twice for every event (and I have thousands of events per second),
is very problematic: it decreases the throughput from 3000 per core for insert-only to 700-800 per core once every vertex needs an existence check.
To improve this I have several ideas:
1. Search for vertices by id only, with manual id assignment, but this can impact vertex distribution across the Cassandra cluster.
2. Make edge creation an upsert: if a vertex id doesn't exist, create the vertex without any properties and add the edge; later, if the vertex creation occurs, add the properties to the existing vertex id (it is actually an empty vertex instance).
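The two ideas can be sketched together as a toy model in Python. Everything here is an assumption for illustration: `vertex_id` is a hypothetical scheme that derives a stable id from a business key, and `BlindUpsertStore` is an in-memory stand-in whose writes merge by id without a prior read. Whether JanusGraph can offer the same write-without-read behavior, and how such ids would map to its partitions, is exactly what is being asked and is not shown here.

```python
import hashlib


def vertex_id(key: str) -> int:
    """Derive a stable, non-negative 63-bit id from a business key (hypothetical scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") >> 1


class BlindUpsertStore:
    """Toy store where writes are merges keyed by id (not the JanusGraph API)."""

    def __init__(self):
        self.vertices = {}  # id -> properties
        self.edges = set()  # (source id, target id)

    def upsert_vertex(self, vid, props=None):
        # Merge semantics: a missing id becomes an empty placeholder,
        # an existing id just gains the new properties.
        self.vertices.setdefault(vid, {}).update(props or {})

    def add_edge(self, src_key, dst_key):
        # No existence check: endpoints are addressed by their derived ids.
        src, dst = vertex_id(src_key), vertex_id(dst_key)
        self.upsert_vertex(src)
        self.upsert_vertex(dst)
        self.edges.add((src, dst))


store = BlindUpsertStore()
store.add_edge("user:1", "device:9")                         # both endpoints start empty
store.upsert_vertex(vertex_id("user:1"), {"name": "alice"})  # properties arrive later
```

After these calls the "user:1" vertex carries its properties while "device:9" is still an empty placeholder, and the edge exists in both cases.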
What do you think about these ideas, or do you have other options?
The main question is: how can I create edges without existing vertices, so that I don't have to check their existence and suffer write performance degradation?