
Re: Low-hanging fruit for JanusGraph

Adam Phelps <a...@...>
 

On 1/19/17 1:06 PM, Austin Sharp wrote:
> 4. Handle supernodes better, for instance by streaming - i.e., when
> traversing over edges, don't pull all edges in from Cassandra at once.
> This is my personal bugbear - we keep having to change our schema and
> use indices or properties when edges are by far the best fit for the
> model, because Titan can easily blow through the Cassandra frame size
> even if you set up vertex partitioning to split adjacency lists, among
> other issues.

This is a big one for us as well. While we have stuck with a schema that allows supernodes, we've had to build all sorts of workarounds into the Java code that accesses Titan.

In our case we're dealing with HBase underneath, so the limits are somewhat different. However, similar solutions should be applicable to any backend, either by changing the row structure or by having the clients page through the results with multiple calls.
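
As a minimal sketch of the client-side paging idea, assuming nothing about the actual Titan or HBase APIs: the `EdgePageFetcher` interface and `PagedEdges` class below are hypothetical names, and the in-memory list in `main` only stands in for a backend. The point is that the caller holds at most one page of edges at a time and issues one backend call per page.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical sketch: lazily page through a supernode's edges instead of loading them all. */
final class PagedEdges {

    /** Hypothetical backend call: returns up to `limit` edge ids starting at `offset`. */
    interface EdgePageFetcher {
        List<String> fetch(int offset, int limit);
    }

    /** Returns an iterator that fetches one page per backend call and keeps only that page in memory. */
    static Iterator<String> iterate(EdgePageFetcher fetcher, int pageSize) {
        return new Iterator<String>() {
            private List<String> page = new ArrayList<>();
            private int posInPage = 0;
            private int nextOffset = 0;
            private boolean exhausted = false;

            @Override
            public boolean hasNext() {
                // Fetch the next page only when the current one has been consumed.
                while (posInPage >= page.size() && !exhausted) {
                    page = fetcher.fetch(nextOffset, pageSize);
                    posInPage = 0;
                    nextOffset += page.size();
                    // A short (or empty) page means the backend has no more edges.
                    if (page.size() < pageSize) {
                        exhausted = true;
                    }
                }
                return posInPage < page.size();
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return page.get(posInPage++);
            }
        };
    }

    public static void main(String[] args) {
        // In-memory stand-in for a supernode's adjacency list (an assumption for the demo).
        List<String> edges = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            edges.add("edge-" + i);
        }
        EdgePageFetcher fetcher = (offset, limit) -> {
            int from = Math.min(offset, edges.size());
            int to = Math.min(offset + limit, edges.size());
            return new ArrayList<>(edges.subList(from, to));
        };
        Iterator<String> it = iterate(fetcher, 4);
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

A real store would more likely page by start key or cursor than by numeric offset, since offsets can get expensive on wide rows; the offset here just keeps the sketch short.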

Related to this area, I think the "hands-off-the-backend" approach that the Titan project took should be ditched. For systems like this, installing custom HBase filters, coprocessors, etc. can be hugely beneficial in terms of performance. From talking to the DataStax folks about their new graph product, it sounds like they've done a lot to integrate the graph DB with the Cassandra nodes themselves, and I think a similar approach will be needed to move JanusGraph forward.

- Adam


Low-hanging fruit for JanusGraph

Austin Sharp <austins...@...>
 

Hi all,

Extremely excited that JanusGraph is out in the world! I have been working with Titan since 0.4.x and have been hoping for a long time to see new maintainers so that bug reports, pull requests, etc. don't go unheeded.

There are a few long-standing Titan issues that would be easy for JanusGraph to pick up and run with, to immediately differentiate from the existing Titan codebase and give people like myself the excuse we're looking for to migrate over! I suspect others have their own wishlists, so I'd encourage everyone to chime in.

A few of my personal ones:

1. Update to newer versions of dependencies (Guava 21, Cassandra 3, Elasticsearch, etc.).
2. Provide a well-documented migration path from Titan 1.0.
3. Keep up to date with TinkerPop (I know this has been a stated goal elsewhere).
4. Handle supernodes better, for instance by streaming - i.e., when traversing over edges, don't pull all edges in from Cassandra at once. This is my personal bugbear - we keep having to change our schema and use indices or properties when edges are by far the best fit for the model, because Titan can easily blow through the Cassandra frame size even if you set up vertex partitioning to split adjacency lists, among other issues.

Excited to see where things go, and hopefully I can switch to using JanusGraph ASAP!


JanusGraph project at The Linux Foundation

Jason Plurad <plu...@...>
 

Today, The Linux Foundation announced the creation of the JanusGraph project. We're excited to push forward the open source, collaborative effort on scalable graph databases that was initiated by the Aurelius team with Titan. JanusGraph will continue to be a native Apache TinkerPop implementation. The first effort definitely includes finally upgrading beyond 3.0.1-incubating :)

There are janusgraph-users and janusgraph-dev Google Groups created for public mailing list collaboration, but as is the case with other providers in the TinkerPop ecosystem, I'd expect cross-traffic to continue with the TinkerPop lists.

If you will be in Austin this weekend for Graph Day Texas, there is a great lineup of graph-related talks. Ted Wilmes and I from Apache TinkerPop will be there, and so will others involved in getting JanusGraph established. After the Graph Day happy hour, we will gather for an informal meetup/birds-of-a-feather with anybody interested.


Have a good one,
Jason
