[PROPOSAL] CQL Storage Backend Implementation


Paul Kendall <paul....@...>
 

Hi,
We have mostly completed work on a CQL storage backend for JanusGraph that is table-compatible with the Astyanax implementation.
Some functionality is still missing for a few of the configuration options, including SSL support.

One of our goals in using JanusGraph is to embed it in our application while having it talk to remote Cassandra and Elasticsearch clusters.
The current project structure makes this quite difficult because it pollutes our classpath with cassandra-all and friends. For now, we have duplicated some of the functionality from the cassandra project and re-namespaced it under cql so that it is completely independent.

An alternative approach might be to create a cassandra-core project and split out the backends (Astyanax, Thrift, Embedded and ultimately CQL) as dependents of cassandra-core. This would allow us to share some of the common configuration code and, more importantly, some of the unit test code.
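
To make the proposed split a little more concrete, here is a minimal Java sketch. It is illustrative only and is not code from the JanusGraph repository: the class names (CassandraBackendConfig, CqlConfig, ThriftConfig), the describe() method, the host addresses and the "janusgraph" keyspace value are all assumptions made for the example. The idea is that the shared base would sit in cassandra-core and each backend module would supply only what differs, such as its default port (9042 for the native CQL protocol, 9160 for Thrift).

```java
import java.util.Arrays;
import java.util.List;

public class CassandraCoreSketch {

    // Hypothetical base class that would live in a shared cassandra-core module.
    static abstract class CassandraBackendConfig {
        private final String keyspace;
        private final List<String> hostnames;

        CassandraBackendConfig(String keyspace, List<String> hostnames) {
            this.keyspace = keyspace;
            this.hostnames = hostnames;
        }

        // Each backend module supplies only the parts that differ.
        abstract String backendName();
        abstract int defaultPort();

        // Common code shared by every backend.
        String describe() {
            return backendName() + " -> " + String.join(",", hostnames)
                    + ":" + defaultPort() + "/" + keyspace;
        }
    }

    // Hypothetical CQL backend: depends only on the shared base.
    static final class CqlConfig extends CassandraBackendConfig {
        CqlConfig(String keyspace, List<String> hostnames) {
            super(keyspace, hostnames);
        }
        @Override String backendName() { return "cql"; }
        @Override int defaultPort() { return 9042; } // native protocol default
    }

    // Hypothetical Thrift backend, sharing the same base.
    static final class ThriftConfig extends CassandraBackendConfig {
        ThriftConfig(String keyspace, List<String> hostnames) {
            super(keyspace, hostnames);
        }
        @Override String backendName() { return "cassandrathrift"; }
        @Override int defaultPort() { return 9160; } // Thrift default
    }

    public static void main(String[] args) {
        List<String> hosts = Arrays.asList("10.0.0.1", "10.0.0.2"); // example values
        System.out.println(new CqlConfig("janusgraph", hosts).describe());
        System.out.println(new ThriftConfig("janusgraph", hosts).describe());
    }
}
```

With a layout like this, an application embedding only the CQL backend would not need cassandra-all on its classpath, since nothing in the shared base refers to server-side classes.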

We imagine this would be a good starting point for https://github.com/JanusGraph/janusgraph/issues/35

Cheers,
Paul Kendall & Samant Maharaj


Jason Plurad <plu...@...>
 

Paul, that sounds great. The proposal to create a cassandra-core makes sense to me.
What version of CQL are you coding against, 3.1 or 3.3?


(In reply to Paul Kendall's message of Monday, March 6, 2017 at 8:33:53 PM UTC-5.)


Paul Kendall <paul....@...>
 

Jason, we are currently targeting the 3.1.4 driver. I don't think anything precludes us from using the 3.3 version as well, but I will check on that.
We will look into completing the unit tests today, then we'll comment on https://github.com/JanusGraph/janusgraph/issues/35 with a link to our fork so we can get feedback.
We'll also look into creating a cassandra-core project to hold the common code for all the Cassandra variants.


Samant Maharaj <samant...@...>
 

Hi all,

We've got an initial implementation of the CQL-based backend. At this stage it's been implemented as a completely separate module; ideally, however, we'd like to extract a cassandra-core module for common code.

Here's a link to the branch in our fork: https://github.com/orionhealth/janusgraph/tree/feature/cql-backend

Features left to implement:
  • Pooling configuration
  • Retry configuration
  • Support for per-partition query limits (available in C* since 3.6). We're planning to implement this by switching the query strategy based on the version metadata detected from the C* cluster.
Note: we've introduced dependencies on the DataStax Cassandra driver as well as Javaslang, which makes functional-style programming in Java a lot easier.
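
As a rough illustration of the version-based strategy switch mentioned in the list above, here is a minimal Java sketch. It is not the code in the branch: the class, enum and method names are hypothetical, the table name is only an example, and it assumes the release version has already been read from the cluster metadata as a string such as "3.6.0". The only fact it relies on is that PER PARTITION LIMIT is available from C* 3.6 onwards.

```java
public class QueryStrategySketch {

    // Hypothetical names for the two ways of limiting columns per partition.
    enum Strategy { PER_PARTITION_LIMIT, CLIENT_SIDE_LIMIT }

    // Picks a strategy from a release version string such as "3.6.0" or "4.0-beta1".
    static Strategy choose(String releaseVersion) {
        String[] parts = releaseVersion.split("[.-]");
        int major = Integer.parseInt(parts[0]);
        int minor = parts.length > 1 ? Integer.parseInt(parts[1]) : 0;
        boolean supported = major > 3 || (major == 3 && minor >= 6);
        return supported ? Strategy.PER_PARTITION_LIMIT : Strategy.CLIENT_SIDE_LIMIT;
    }

    // Builds the CQL text for the chosen strategy.
    static String buildQuery(Strategy strategy, String table, int limit) {
        String base = "SELECT * FROM " + table;
        if (strategy == Strategy.PER_PARTITION_LIMIT) {
            // The server trims each partition to `limit` rows.
            return base + " PER PARTITION LIMIT " + limit;
        }
        // Older clusters: the caller has to trim each partition itself.
        return base;
    }

    public static void main(String[] args) {
        for (String version : new String[] {"3.0.9", "3.6.0", "3.11.4"}) {
            Strategy strategy = choose(version);
            // "janusgraph.edgestore" is only an example table name.
            System.out.println(version + ": " + buildQuery(strategy, "janusgraph.edgestore", 10));
        }
    }
}
```

In a mixed-version cluster the lowest version across hosts would presumably have to drive the choice, which this sketch does not attempt.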

It'd be great to get some feedback to ensure we're not going about this the wrong way.

Regards,
Samant Maharaj & Paul Kendall.

(In reply to Paul Kendall's message of Wednesday, 8 March 2017 06:47:05 UTC+13.)


Jason Plurad <plu...@...>
 

Thanks for sharing your branch. I tried out your backend, and it worked fine. I verified that it is storage compatible with the cassandrathrift driver. Good stuff. I opened a PR on your repo to handle a few things.

I'm +0 on Javaslang as a new dependency. The license looks fine, and I assume its performance is reasonably close to plain old Java?

-- Jason


(In reply to Samant Maharaj's message of Wednesday, March 8, 2017 at 7:49:20 PM UTC-5.)


Paul Kendall <paul....@...>
 

Thanks Jason. Samant has merged your PR and changed the section that processes the contact points to a more functional style; you'll see that it's a lot easier to read and understand. That's part of the reason we like using Javaslang: it makes the code simpler and more concise, with less boilerplate. It's similar to Google's Guava library but a lot more functional, and it has a very active community behind it.
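
For readers who have not seen the branch, the following Java sketch gives a flavour of functional-style contact point processing. It is not the code from the fork and it uses plain java.util.stream rather than Javaslang. The class and method names, the comma-separated "host" or "host:port" input format, and the fallback to port 9042 (the native protocol default) are all assumptions made for the example.

```java
import java.net.InetSocketAddress;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class ContactPointsSketch {

    static final int DEFAULT_PORT = 9042; // native protocol default

    // Turns "host1, host2:9043" into a list of unresolved socket addresses.
    static List<InetSocketAddress> parse(String hostnames) {
        return Arrays.stream(hostnames.split(","))
                .map(String::trim)
                .filter(entry -> !entry.isEmpty())
                .map(ContactPointsSketch::toAddress)
                .collect(Collectors.toList());
    }

    // Splits one "host" or "host:port" entry; bare IPv6 literals are not handled.
    private static InetSocketAddress toAddress(String entry) {
        int colon = entry.lastIndexOf(':');
        if (colon < 0) {
            return InetSocketAddress.createUnresolved(entry, DEFAULT_PORT);
        }
        String host = entry.substring(0, colon);
        int port = Integer.parseInt(entry.substring(colon + 1));
        return InetSocketAddress.createUnresolved(host, port);
    }

    public static void main(String[] args) {
        // Example hostnames only.
        parse("cass1.example.com, cass2.example.com:9043")
                .forEach(address -> System.out.println(
                        address.getHostString() + " " + address.getPort()));
    }
}
```

The pipeline reads as a sequence of steps (split, trim, drop empties, convert), which is the readability gain described above; a loop-based version would need a mutable list and explicit branching for the same result.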

(In reply to Jason Plurad's message of Friday, March 10, 2017 at 11:46:59 AM UTC+13.)