JanusGraph and Cassandra-3 (issue 172)
Kedar Mhaswade <ke...@...>
Hello all, Please see Issue-172, whose synopsis is: CassandraBinaryRecordReader not compatible with Cassandra 3.0+. The bug report has a few comments and several useful links that I have been discussing with Ted Wilmes. I have also asked the Cassandra developers for advice. It appears to me that we'll have to make JanusGraph require Cassandra-3 because of this issue. Thoughts, comments? Regards, Kedar
Ted Wilmes <twi...@...>
Hi Kedar, Thanks for bringing this discussion over to the dev list. I think it'll benefit from the wider audience.
If I read things correctly, CqlRecordReader is available in 2.1 [1]. Perhaps we could see if it'll work against Cassandra 3? Looking at the commit history since then, there are a number of additions, but at its heart that code appears to use the Datastax Cassandra driver to communicate with the cluster. The other JanusGraph CQL PR that is in flight [2] brings this driver in at version 3.1.4. According to the compatibility table, that should be compatible with Cassandra 3.0+ [3]. What would you think about basing your work off of that branch to see if you can get something up and running?
Bumping the Janus dependency to 3.0 may be necessary, but that would most likely cause problems for folks who are running Cassandra in embedded mode. I'm not sure if anyone does that (i.e. connecting using the C* APIs rather than the Thrift or Astyanax drivers, with Janus and Cassandra in the same JVM), but if you do, speak up! I think for the moment it would be good to see if we could get this working without the version bump.
Thanks for all the work on this, Ted
On Wednesday, May 3, 2017 at 8:34:51 AM UTC-5, Kedar Mhaswade wrote:
Kedar Mhaswade <ke...@...>
Hi Ted, Thanks for your encouragement. Some responses inline. On Wed, May 3, 2017 at 9:51 AM, Ted Wilmes <twi...@...> wrote:
Correct. This comes via the cassandra-all-2.1.9 dependency that JanusGraph currently has.
It's the path that we are inclined toward taking. In Cassandra 2.x, there was a class CassandraServer with methods such as execute_cql_query and execute_cql3_query that handled the CQL queries (CassandraServer was in a package erroneously named org.apache.cassandra.thrift, a Thrift subpackage class handling CQL queries! -- looking at you, Cassandra devs). In Cassandra 3.x, the CassandraServer class is gone, but hopefully its reincarnation handles the actual CQL queries in a backward-compatible manner. That way, all the unit tests etc. will continue to run against the embedded Cassandra-2.x, and in real life JG can speak with Cassandra-3.x, at least for analytic queries (issued via the janusgraph-hadoop-parent/org.janusgraph.hadoop.formats.cassandra package). Thus, if the CQL server implementation inside Cassandra-3.x is backward compatible with the Cassandra-2.1.9 client used by the hadoop classes in JanusGraph, we should be (barely) covered ;). This also lets JanusGraph not require Cassandra-3.x.
To be a little more concrete, I am planning to take a shot at reimplementing the logic in org.janusgraph.hadoop.formats.cassandra.CassandraBinaryRecordReader and org.janusgraph.hadoop.formats.cassandra.CassandraBinaryInputFormat by wrapping org.apache.cassandra.hadoop.cql3.CqlRecordReader instead of org.apache.cassandra.hadoop.ColumnFamilyRecordReader (a Cassandra-2.x class which really is the problem, because it embodies the incompatible schema change and of course is gone in Cassandra-3.x -- see my email to Cassandra devs). We hope that this change is going to be fairly localized.
Another data point is that org.apache.cassandra.hadoop.cql3.CqlRecordReader is part of Cassandra 3.x, which means that if we were to upgrade JG to use Cassandra-3.x, we'd be in good shape. If anyone on the list is knowledgeable about this (LaRocque, Broecheler from git log), please let me know if this approach will work.
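To make the wrapping idea a bit more tangible, here is a minimal, dependency-free sketch of one shape such an adapter could take. It is not JanusGraph or Cassandra code: CqlRow, KeyedColumns and GroupingReader are hypothetical stand-ins, and the sketch assumes (this is not stated in the thread) that the CQL-based reader hands back one row per (key, column, value) triple in key order, while the downstream code expects one record per key carrying all of its columns.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical sketch: none of these types are JanusGraph or Cassandra classes.
class GroupingReaderSketch {

    /** Stand-in for one row returned by a CQL-based reader (assumed layout). */
    static final class CqlRow {
        final String key;
        final String column;
        final String value;

        CqlRow(String key, String column, String value) {
            this.key = key;
            this.column = column;
            this.value = value;
        }
    }

    /** Stand-in for the per-key record that the downstream code expects. */
    static final class KeyedColumns {
        final String key;
        final Map<String, String> columns;

        KeyedColumns(String key, Map<String, String> columns) {
            this.key = key;
            this.columns = columns;
        }
    }

    /** Wraps a row-at-a-time source and emits one record per run of equal keys. */
    static final class GroupingReader implements Iterator<KeyedColumns> {
        private final Iterator<CqlRow> rows;
        private CqlRow pending; // first row of the next group, or null at end

        GroupingReader(Iterator<CqlRow> rows) {
            this.rows = rows;
            this.pending = rows.hasNext() ? rows.next() : null;
        }

        @Override
        public boolean hasNext() {
            return pending != null;
        }

        @Override
        public KeyedColumns next() {
            if (pending == null) {
                throw new NoSuchElementException();
            }
            String key = pending.key;
            Map<String, String> columns = new LinkedHashMap<>();
            // Consume rows until the key changes or the source is exhausted.
            while (pending != null && pending.key.equals(key)) {
                columns.put(pending.column, pending.value);
                pending = rows.hasNext() ? rows.next() : null;
            }
            return new KeyedColumns(key, columns);
        }
    }

    /** Convenience helper that drains a GroupingReader into a list. */
    static List<KeyedColumns> readAll(List<CqlRow> rows) {
        List<KeyedColumns> out = new ArrayList<>();
        GroupingReader reader = new GroupingReader(rows.iterator());
        while (reader.hasNext()) {
            out.add(reader.next());
        }
        return out;
    }

    public static void main(String[] args) {
        List<CqlRow> rows = new ArrayList<>();
        rows.add(new CqlRow("v1", "name", "a"));
        rows.add(new CqlRow("v1", "age", "1"));
        rows.add(new CqlRow("v2", "name", "b"));
        for (KeyedColumns record : readAll(rows)) {
            System.out.println(record.key + " -> " + record.columns);
        }
    }
}
```

In a real implementation the wrapped source would be the Cassandra reader and the keys and values would be byte buffers rather than strings; whether the rows actually arrive grouped by key would need to be verified against Cassandra itself.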
We have been looking at that contribution, but we feel that it is geared more toward OLTP-like queries, and it adds yet another variable to an already high combinatorial complexity of dependencies. So, for now, we'll not rely on the cql-backend branch changes.
If the plan above works, then we just may have the best of both worlds, because the embedded Cassandra-2.x will continue to work. Makes sense? Regards, Kedar
Ted Wilmes <twi...@...>
That sounds like a good plan to me Kedar. Thanks for the detailed response. Your point on not basing off that other branch makes good sense to me. For some reason, I was thinking (incorrectly) that the CassandraBinaryRecordReader lived in the Cassandra module, which it does not. Looking forward to testing this out! Thanks, Ted On Wed, May 3, 2017 at 4:24 PM, Kedar Mhaswade <ke...@...> wrote:
Kedar Mhaswade <ke...@...>
Thanks Ted. Going forward, I will use the issue page (https://github.com/JanusGraph/janusgraph/issues/172) to post updates. I will come back to this thread if I get stuck, and I am sure you guys will help me out! Regards, Kedar On Wed, May 3, 2017 at 4:10 PM, Ted Wilmes <twi...@...> wrote: