Jakub Liska <liska...@...>
We've been using ScyllaDB to persist and analyze impressions, and especially cookies, digital fingerprints and other similar internet "identities".
The ultimate goal is finding relationships between these ids, so in the end you'd have many x_by_y tables; in fact, there is a many2many relationship between these ids.
This is a bit troublesome on wide-column databases, as they require designing column families with partitions that are as equally sized
as possible. That is a real challenge in this niche, as you find yourself unable to comply with the "same partition size" rule of thumb because:
1) a certain share of internet users are not humans but machines, and they can generate an unpredictable number of impressions
2) it's a multitenant environment where websites and campaigns can be tiny or huge with respect to the number of impressions
3) it is not true time-series data where you could leverage time for proper partition sizing
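
To make the skew concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the hypothetical fingerprint_by_cookie layout (cookie as partition key, one row per related fingerprint), the ids, and the counts. It only counts rows per partition key and says nothing about how ScyllaDB or JanusGraph store data internally.

```python
from collections import Counter


def partition_sizes(rows):
    """Count rows per partition key.

    Each row is a (partition_key, clustering_value) pair, e.g. one
    (cookie, fingerprint) relation in a hypothetical fingerprint_by_cookie table.
    """
    return Counter(key for key, _value in rows)


def skew_ratio(sizes):
    """Largest partition divided by the smallest (1.0 means perfectly even)."""
    return max(sizes.values()) / min(sizes.values())


# Made-up data: three "human" cookies with one or two fingerprints each...
rows = [
    ("cookie-a", "fp-1"),
    ("cookie-a", "fp-2"),
    ("cookie-b", "fp-3"),
    ("cookie-c", "fp-4"),
]
# ...and one "machine" cookie related to 10,000 fingerprints.
rows += [("cookie-bot", f"fp-bot-{i}") for i in range(10_000)]

sizes = partition_sizes(rows)
print(sizes.most_common(2))  # the two largest partitions
print(skew_ratio(sizes))     # how uneven the partitions are
```

With this toy input, a single machine-generated id ends up in a partition thousands of times larger than the others, which is exactly the situation points 1) to 3) make hard to design around.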
Now, I'm aware that graph databases are a perfect match for niches with complex graphs, which this use case is not:
there would be just a few types of edges and nodes. But am I right in saying that I could leverage JanusGraph for the varying partition size problem?
Can JanusGraph properly deal with many2many relationships whose cardinality ranges from 1 to 1 million, and scale well at the same time?
Is this taken care of under the hood at the C* / ScyllaDB level?