janus cassandra limitations
mirosla...@...
OK, so I got it a bit wrong in my initial assumption.
1. "vertexindex" stores the values of all properties for all vertices. In my case key=0x00 is 'false', and this value is stored in 90% of my vertices. So in theory you could still have as many vertices as the Titan schema allows, but you could not store the same value for any property more than 2^30 times.
2. "edgestore" contains information about every vertex, with references to all of its property values and all of its edges. This means one vertex could in theory have a maximum of 2^30 edges.
3. Request to JanusGraph designers:
On Thursday, August 3, 2017 at 12:58:29 AM UTC+2, Kelvin Lawrence wrote:


Kelvin Lawrence <kelvin....@...>
Hi Mirosław, JanusGraph uses an adjacency list model for storing vertices and edges. A vertex, its properties and all of its adjacent edges are stored in a single Cassandra row. The JanusGraph documentation goes into these issues in some detail: http://docs.janusgraph.org/latest/index.html You are using a very old version of Titan, by the way. It would be worth upgrading if you can. Cheers, Kelvin
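To make the adjacency list point concrete, here is a rough toy sketch in Python. It is not JanusGraph code: the `edgestore` dict, `add_property` and `add_edge` are hypothetical names, and the sketch assumes each edge is recorded in the rows of both of its endpoints. It only illustrates why a "supernode" ends up with one very wide row.

```python
from collections import defaultdict

# Toy stand-in for the edge store: one row per vertex id,
# one column per property or incident edge (illustrative only).
edgestore = defaultdict(list)

def add_property(vertex_id, key, value):
    """Append a property column to the vertex's row."""
    edgestore[vertex_id].append(("property", key, value))

def add_edge(out_vertex, label, in_vertex):
    """Record the edge in the rows of both endpoints (assumed layout)."""
    edgestore[out_vertex].append(("edge-out", label, in_vertex))
    edgestore[in_vertex].append(("edge-in", label, out_vertex))

# One heavily connected vertex ("hub") and 1000 ordinary vertices.
add_property("hub", "active", False)
for i in range(1000):
    add_edge(f"v{i}", "follows", "hub")

row_sizes = {vertex: len(columns) for vertex, columns in edgestore.items()}
print(row_sizes["hub"])  # 1001 columns: 1 property + 1000 incoming edges
print(row_sizes["v0"])   # 1 column: a single outgoing edge
```

In this model the row width of a vertex grows linearly with its degree, so the per-partition limit of the storage backend effectively becomes a per-vertex edge limit.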
On Wednesday, August 2, 2017 at 10:36:39 AM UTC-5, Mirosław Głusiuk wrote:


mirosla...@...
Hi all, from what I know, JanusGraph is a fork of Titan, which means that unless it has a different storage implementation it could have problems with larger data volumes. Compare these two statements: "janusgraph/titan can store up to a quintillion edges (2^60) and half as many vertices." and "The maximum number of cells (rows x columns) in a single partition is 2 billion." 2 billion is about 2^31, and in the Cassandra schema we always have 2 columns per table, so you could store about 2^30 values per key. So, if I am not mistaken, "half as many vertices" does not apply to the Cassandra storage backend? I'm using Titan 0.4.4, and after reaching 50M+ vertices I noticed that Cassandra started to complain about "Compacting large partition titan/vertexindex:00". So my question is: what is the real JanusGraph/Titan limit for the Cassandra backend that will not "kill" Cassandra? By the way, I also noticed that some keys in the "edgestore" table for "supernodes" are already bigger than 1GB with my current graph. Could anyone explain how JanusGraph stores data in Cassandra and how to configure it to prevent storing huge rows?
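The arithmetic in the question can be checked with a few lines of Python. This is only a sketch of the estimate as stated above: the 2 billion cell figure is the quoted limit, and the "2 columns per table" value is the poster's assumption, not a verified property of the schema.

```python
import math

MAX_CELLS_PER_PARTITION = 2_000_000_000  # quoted limit: 2 billion cells
COLUMNS_PER_ENTRY = 2                    # assumption taken from the post

# Entries that fit under one partition key if each entry uses two cells.
max_entries_per_key = MAX_CELLS_PER_PARTITION // COLUMNS_PER_ENTRY

print(max_entries_per_key)                      # 1000000000
print(round(math.log2(MAX_CELLS_PER_PARTITION)))  # 31, i.e. about 2^31 cells
print(round(math.log2(max_entries_per_key)))      # 30, i.e. about 2^30 entries

# Gap between the advertised graph-wide edge count and one partition.
print(2**60 // 2**30)  # 1073741824
```

So the 2^60 figure describes the graph as a whole, while the roughly 2^30 bound applies to whatever ends up under a single key, such as one index value or one vertex.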

