Re: Support for Partitioned Vertices in JanusGraphHadoop for OLAP queries


Florian Hockmann <f...@...>
 

I'm not aware of anyone working on this right now, but supernodes are definitely a big problem for graph databases, including JanusGraph, so any improvements in that area would be a great help for many users.
Regarding graph partitioning as a countermeasure for supernodes, I just want to point out that how much it helps depends on the number of nodes in your storage backend cluster. This blog post by Ted Wilmes explains this in greater detail. It talks about DSE Graph, but essentially the same applies to JanusGraph.
So, you might need to implement something yourself to work around the supernode problem, like some bucketing approach where you split your supernodes up. If you want more information about supernodes and the impact they have on JanusGraph, we had a thread a while back on janusgraph-users on that topic.
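
To make the bucketing idea a bit more concrete, here is a minimal sketch in plain Python. It does not use the JanusGraph API at all: the in-memory `BucketedGraph` class, the `bucket_id` helper, the `#bucket-N` id scheme, and the bucket count of 16 are all assumptions made up for illustration. The only point is that each edge into a supernode is redirected to one of several smaller "bucket" vertices, chosen by a stable hash of the neighbor's id.

```python
import hashlib
from collections import defaultdict

NUM_BUCKETS = 16  # assumption: pick this based on the expected supernode degree


def bucket_id(supernode_id: str, neighbor_id: str, num_buckets: int = NUM_BUCKETS) -> str:
    """Return the (hypothetical) id of the bucket vertex holding the edge to neighbor_id."""
    # Use a stable hash rather than Python's salted hash() so that the
    # neighbor-to-bucket mapping is the same across processes and restarts.
    digest = hashlib.sha256(neighbor_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % num_buckets
    return f"{supernode_id}#bucket-{bucket}"


class BucketedGraph:
    """Toy in-memory adjacency store standing in for the real graph."""

    def __init__(self, num_buckets: int = NUM_BUCKETS):
        self.num_buckets = num_buckets
        self.in_edges = defaultdict(set)  # bucket vertex id -> set of neighbor ids

    def add_edge_to_supernode(self, neighbor_id: str, supernode_id: str) -> None:
        # The edge is attached to a bucket vertex instead of the supernode itself.
        key = bucket_id(supernode_id, neighbor_id, self.num_buckets)
        self.in_edges[key].add(neighbor_id)

    def has_edge(self, neighbor_id: str, supernode_id: str) -> bool:
        # A point lookup only needs to read a single bucket.
        key = bucket_id(supernode_id, neighbor_id, self.num_buckets)
        return neighbor_id in self.in_edges.get(key, set())

    def neighbors(self, supernode_id: str):
        # A full scan has to visit every bucket of the logical supernode.
        for b in range(self.num_buckets):
            yield from self.in_edges.get(f"{supernode_id}#bucket-{b}", ())


graph = BucketedGraph()
for i in range(1000):
    graph.add_edge_to_supernode(f"user-{i}", "celebrity")
```

The trade-off in this sketch is that checking for a single edge reads one small bucket, while enumerating all neighbors has to fan out over all buckets. In a real graph you would also need an edge or property linking each bucket vertex back to the logical supernode.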

On Friday, August 2, 2019 at 19:48:25 UTC+2, kes...@... wrote:

Hey there,

I have been working with JanusGraph recently, and unfortunately the dataset that I am dealing with is susceptible to supernodes (10+ million edges into a single vertex). It seems that partitioning vertices with particular vertex labels is the way to distribute these dense vertices in the storage backend: https://docs.janusgraph.org/latest/graph-partitioning.html but I see that these partitioned vertices must be filtered out for OLAP queries: https://github.com/JanusGraph/janusgraph/blob/master/janusgraph-hadoop-parent/janusgraph-hadoop-core/src/main/java/org/janusgraph/hadoop/config/JanusGraphHadoopConfiguration.java#L51-L57

Are there any plans to remove this restriction anytime soon, or is anyone currently working on this problem?

Thanks,

Joseph
