Re: Low-hanging fruit for JanusGraph

Adam Phelps <a...@...>

On 1/19/17 1:06 PM, Austin Sharp wrote:
> 4. Handle supernodes better, for instance by streaming - i.e., when
> traversing over edges, don't pull all edges in from Cassandra at once.
> This is my personal bugbear - we keep having to change our schema and
> use indices or properties when edges are by far the best fit for the
> model, because Titan can easily blow through the Cassandra frame size
> even if you set up vertex partitioning to split adjacency lists, among
> other issues.

This is a big one for us as well. We have stuck with a schema that allows supernodes, but we've had to build all sorts of workarounds into the Java code that accesses Titan.

In our case we're dealing with HBase underneath, so the limits are somewhat different. However, similar solutions should be applicable to any backend, either by changing the row structure or by having clients page through the results with multiple calls.
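
To make the client-side paging idea concrete, here is a rough sketch in plain Java. It is not Titan or JanusGraph API: PagedEdges and the fetchPage callback are hypothetical names, and the callback stands in for whatever bounded read the backend offers (a limited HBase scan or a Cassandra slice query, for example). It assumes the backend can return a stable (offset, limit) slice of one vertex's adjacency list; a real implementation would more likely resume from the last column key seen rather than from a numeric offset.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.BiFunction;

/**
 * Hypothetical sketch: walks a large adjacency list one page at a time,
 * so that only pageSize edges are held in memory at once.
 */
class PagedEdges<E> implements Iterator<E> {
    // (offset, limit) -> up to 'limit' edges starting at 'offset'.
    // Stand-in for a bounded backend read; not a real Titan/JanusGraph call.
    private final BiFunction<Integer, Integer, List<E>> fetchPage;
    private final int pageSize;

    private List<E> page = new ArrayList<>();
    private int pos = 0;               // next index to return within 'page'
    private int offset = 0;            // number of edges fetched so far
    private boolean exhausted = false; // set once a short page is seen

    PagedEdges(BiFunction<Integer, Integer, List<E>> fetchPage, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.fetchPage = fetchPage;
        this.pageSize = pageSize;
    }

    @Override
    public boolean hasNext() {
        // Fetch the next page only when the current one is used up.
        while (pos >= page.size() && !exhausted) {
            page = fetchPage.apply(offset, pageSize);
            pos = 0;
            offset += page.size();
            // A page shorter than pageSize means there is nothing left.
            if (page.size() < pageSize) {
                exhausted = true;
            }
        }
        return pos < page.size();
    }

    @Override
    public E next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return page.get(pos++);
    }
}
```

The caller just loops with hasNext()/next(), and each backend call returns at most pageSize edges, which is what keeps a supernode from blowing through a frame or RPC size limit in one read. The cost is more round trips, so the page size would need tuning per backend.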

Related to this area, I think the "hands-off-the-backend" approach that the Titan project took should be ditched. For systems like this, installing custom HBase filters, coprocessors, etc. can be hugely beneficial for performance. From talking to the DataStax folks about their new graph product, it sounds like they've done a lot to integrate the graph DB with the Cassandra nodes themselves, and I think a similar approach will be needed to move JanusGraph forward.

- Adam
