[DISCUSS] Elasticsearch Http using Jest


Keith Lohnes <loh...@...>
 

I started some conversation over at https://github.com/JanusGraph/janusgraph/pull/79#pullrequestreview-24343839, and Jason Plurad suggested I move that over here. 

I have some code that's been used in a Titan deployment using the Apache-licensed Jest ES HTTP client. There was some discussion in that PR about whether to continue to support the Transport/Node client as well.

The key points of that conversation:
1. Versions to support (1.x, 2.x, 5.x)

   With the Jest client, we could support all three pretty easily. Supporting 1.x versus 2.x and 5.x would just mean swapping out the .jar. There are some open PRs in the Jest repo that need to get merged for 5.x support, but once those are in and we update the jar version, we'd be able to support 5.x. Maintaining 1.x support for a little while would be nice for people with production Titan instances, as Adam Phelps pointed out. 1.x and 2.x could use the same code; they just need different jars.

2. HTTP vs Transport/Node

   I think in #92 there's a mention of Transport being deprecated. My first instinct is to say that Janus should mark Transport/Node as deprecated and continue to support Transport/Node clients until a major version release, at which point support could be removed. I have some work done to split out the Transport/Node clients from the HTTP client, which would make for an easy removal once that decision has been made.


Alexander Patrikalakis <amcpatr...@...>
 

resending because I made a typo: hadoop -> hbase

We could do a split for ES much like the split that we do for hbase (098, 10, etc.): have a janusgraph-es/janusgraph-es-core that JanusGraph uses, and shims like janusgraph-es1, janusgraph-es2-transport, and janusgraph-es5-jest. Since we are newly introducing 2.x support in the in-flight PR and the coding is already done, perhaps we only add transport support on the es2 lineage and only support HTTP/Jest on the es5 lineage?


On Friday, March 3, 2017 at 12:40:07 AM UTC+9, Keith Lohnes wrote:
I started some conversation over at https://github.com/JanusGraph/janusgraph/pull/79#pullrequestreview-24343839, and Jason Plurad suggested I move that over here. 



Alexander Patrikalakis <amcpatr...@...>
 

In the scheme from my previous message, if 1.5=transport, 2.x=transport, and 5.x=http/jest, then we don't need to call out protocols in the shim names, so the shims would just be:
janusgraph-es/janusgraph-es1
janusgraph-es/janusgraph-es2
janusgraph-es/janusgraph-es5
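
To make the naming scheme above concrete, here is a minimal sketch of how an Elasticsearch version might map to a shim and its protocol. The module names and protocols are the ones listed in this message; the `SHIMS` table and the `shim_for` function are hypothetical and exist only for illustration, not as part of any JanusGraph build.

```python
# Hypothetical lookup table: module names and protocols are taken from the
# thread, everything else is illustrative only.
SHIMS = {
    1: ("janusgraph-es1", "transport"),
    2: ("janusgraph-es2", "transport"),
    5: ("janusgraph-es5", "http/jest"),
}


def shim_for(es_version):
    """Return (shim module, protocol) for a version string such as '2.4.4'."""
    major = int(es_version.split(".")[0])
    if major not in SHIMS:
        raise ValueError(f"no shim for Elasticsearch {es_version}")
    return SHIMS[major]
```

Because each major version maps to exactly one protocol in this scheme, the protocol does not need to appear in the shim name.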


On Saturday, March 4, 2017 at 12:07:24 AM UTC+9, Alexander Patrikalakis wrote:
resending because I made a typo: hadoop -> hbase



sjudeng <sju...@...>
 

I think JanusGraph should end formal support for 1.x, though it would be great if users could still have the ability to use it (even if unsupported/untested) via your Jest-based implementation. For one, I'm biased as the author of the above PR, which does drop support for 1.x and which I'd really like to see merged before moving forward. More importantly, though, in working through updates to support 5.x I was happy to find the Elasticsearch distribution zip artifacts available for 2.x and 5.x (but not 1.x) in Maven Central. The availability of the ES distribution artifacts for 2.x and 5.x enables their automated use in unit tests and in building JanusGraph releases (e.g. for use in running embedded ES instances). This allows JanusGraph to remove the hacked elasticsearch and elasticsearch.in.sh scripts from janusgraph-dist and also avoids the JarHell issues both during testing and when starting embedded ES instances. This improves maintainability and stability.

The proposed update to support an HTTP ES client via Jest sounds great to me. Personally I think it would be nice to avoid adding compatibility shims to enable the cross-version support. But either way I do want to make sure that testing rigor is not lost in the updates. I think formal/full support for any ES version should require that the full janusgraph-es test suite can be run (automatically) against that version and that corresponding embedded ES instances for that version are supported through janusgraph-dist. I have tested that this works for 2.x and 5.x, but I'm not sure that it would work for 1.x ... at least not without really going backwards/hacking project configuration.

If you're able to make this work to support the full test suite/embedded instances across the three versions cleanly, that'd be great. Otherwise I'd propose that the docs be updated to indicate that JanusGraph fully supports 2.x and 5.x but that 1.x is no longer maintained (read "tested"), though it can still be used if necessary through the relevant Jest jar change. It seems to me this would be sufficient. JanusGraph has already started down the road of making breaking changes in the course of moving beyond Titan, including dropping support for old versions of HBase and updating TinkerPop, which because of the underlying Spark update would require an update to users' compute clusters.




Jason Plurad <plu...@...>
 

It's also similar to the split that we'll have with Cassandra with cassandra-thrift and cassandra-cql.

Since ES is going in the direction of HTTP-only client access (good read here), it might make more sense to use their client API rather than Jest. With ES introducing their own Java REST Client with version 5.0.0, I wonder what Jest's longevity would be, other than perhaps backwards compatibility with pre-5.0 ES versions.


On Friday, March 3, 2017 at 10:08:57 AM UTC-5, Alexander Patrikalakis wrote:
In the scheme below if 1.5=transport and 2.x=transport and 5.x=http/jest, then we don't need to pull out protocols in the shim names, so the shims would just be:
janusgraph-es/janusgraph-es1
janusgraph-es/janusgraph-es2
janusgraph-es/janusgraph-es5




Keith Lohnes <loh...@...>
 

I proposed Jest for a few reasons. The ES REST client is rather low level, as mentioned in #92. Jest is a much higher-level client that adds a lot of niceties for Java. I think that alone will keep Jest around for a while.

WRT versioning, unless there's another reason aside from Jest, 2.x and 5.x could be combined as they'll likely use the same Jest jar version.

Jest also makes it simple to use the same code across 1.x, 2.x, and 5.x versions for constructing ES queries. The code I've written doesn't use anything that would change/break between the two different versions of Jest that would need to be used for 1.x and 2+ compatibility, meaning we could easily introduce HTTP for all 3 versions.

Either way, there's no additional effort to introduce the HTTP interface in 2.x vs 5.x alone.

As I mentioned in #79, there are a couple of PRs that would need to be merged in Jest for 5.x to work completely correctly.
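
As a rough illustration of why one HTTP code path can serve several ES versions, the sketch below builds (but does not send) a search request against the REST port. It uses plain standard-library HTTP rather than Jest, and the function name, index name, and field are all hypothetical. To my knowledge a basic `match` query body like this is accepted by ES 1.x, 2.x, and 5.x alike, but treat that as an assumption to verify against the versions you run.

```python
import json
from urllib.request import Request


def build_search_request(host, index, field, text, port=9200):
    """Build (without sending) a POST to Elasticsearch's _search endpoint."""
    # A simple match query in the JSON query DSL.
    body = {"query": {"match": {field: text}}}
    return Request(
        url=f"http://{host}:{port}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

A higher-level client such as Jest wraps this kind of request construction, which is the "niceties" argument made above.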


On Friday, March 3, 2017 at 10:26:49 AM UTC-5, Jason Plurad wrote:
It's also similar to the split that we'll have with Cassandra with cassandra-thrift and cassandra-cql.



sjudeng <sju...@...>
 

Regarding the compatibility shim approach, I think this should be avoided if at all possible. I don't think using Jest gets away from needing version-specific ES client code to support the other (node/transport) clients in the code base (unless we drop node/transport clients in favor of HTTP-only), and definitely to support running embedded ES in testing/release. Unless I'm wrong about this, if we did want to take the compatibility shim approach I think we'd end up needing to create separate JanusGraph releases tied to specific versions of ES. This is not currently necessary, as one JanusGraph release can service all versions for relevant modules (e.g. hbase), though I don't know if this will come up again with the cassandra-cql work. I really don't think this complexity should be introduced just to continue supporting ES 1.x.


sjudeng <sju...@...>
 

Although the more I think about it, I guess this issue is going to be present no matter what until we can go full HTTP. Just to throw it out there, why not drop node/transport and go full HTTP? It's the future anyway; we can do it right now to support 2.x and 5.x and do it even more cleanly once the Jest PRs are merged. Then we have a single JanusGraph distribution that supports ES 1.x-5.x. Users only have to update their configs to change 9300 to 9200.
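
For illustration, the config change mentioned above (9300 is the default ES transport port, 9200 the default HTTP port) could look roughly like the sketch below. The helper is hypothetical, and the `index.search.port` key is an assumption based on Titan-style index settings; check the key names against your own properties file.

```python
def switch_to_http_port(properties_text, key="index.search.port",
                        transport_port="9300", http_port="9200"):
    """Return the properties text with `key` moved from the transport port to the HTTP port."""
    out = []
    for line in properties_text.splitlines():
        name, sep, value = line.partition("=")
        # Only rewrite the one key, and only if it still points at the transport port.
        if sep and name.strip() == key and value.strip() == transport_port:
            line = f"{name.strip()}={http_port}"
        out.append(line)
    return "\n".join(out)
```

Other settings that happen to use 9300, and deployments already on a custom port, are left untouched.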




Keith Lohnes <loh...@...>
 

I don't see a problem with just going full HTTP; it would definitely make things easier for me. But the Jest jars for 1.x vs 2.x+ are going to need to be different. I'm not sure what the preferred method of dealing with that would be.


On Friday, March 3, 2017 at 11:31:44 AM UTC-5, sjudeng wrote:
Although the more I think about it I guess this issue is going to be present no matter what until we can go full HTTP. Just to throw it out there, why not drop node/transport and go full HTTP?


sjudeng <sju...@...>
 

I think you said the Jest 2.x jars should work with ES 2.x and 5.x, right? Then I'm back to my suggestion to drop formal support for ES 1.x, but documentation could be updated to provide workaround steps (e.g. manually delete jest-2.x.jar and download/add jest-1.x.jar to the classpath) to allow (untested) support for legacy ES 1.x deployments. It's great that Jest gives us this option basically for free, because I really don't think JanusGraph should introduce build/release complexity just to accommodate it. In my opinion, if users have really stable Titan deployments and they're not able to update relevant cluster components (storage, indexing, compute), then they should stay on that baseline until the new capabilities being offered by JanusGraph are compelling enough to warrant the upgrade investment. Otherwise you're just upgrading to change names from Titan to JanusGraph. If this is a step some users want, then I'd recommend JanusGraph create an initial release based on an earlier commit after the name changes but before the potentially breaking updates to HBase, TinkerPop and Elasticsearch.
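
The manual workaround described above (remove the bundled Jest jar, drop in the older one) might be scripted along these lines. This is only a sketch: the helper name is hypothetical, and the lib directory, the glob pattern for the bundled jar, and the path of the replacement jar are all supplied by the caller, since the actual file names are not given in this thread.

```python
import shutil
from pathlib import Path


def swap_client_jar(lib_dir, bundled_pattern, replacement_jar):
    """Delete jars in lib_dir matching a glob pattern, then copy in the replacement jar."""
    lib = Path(lib_dir)
    removed = sorted(p.name for p in lib.glob(bundled_pattern))
    for name in removed:
        (lib / name).unlink()
    # Copy the replacement jar into the lib directory under its own file name.
    shutil.copy(replacement_jar, lib / Path(replacement_jar).name)
    return removed
```

The function returns the names of the jars it removed so the change can be logged or reverted by hand.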


Keith Lohnes <loh...@...>
 

I think you said the Jest 2.x jars should work with ES 2.x and 5.x, right?

Yup. Once those PRs are merged in the Jest project.

documentation could be updated to provide workaround steps (e.g. manually delete jest-2.x.jar and download/add jest-1.x.jar to classpath) to allow (untested) support for legacy ES 1.x deployments

+1

I personally think only supporting HTTP and going the route @sjudeng mentioned makes sense. It allows current Titan users some flexibility in their migration to Janus while not stopping progress on the ES backend.

On Friday, March 3, 2017 at 1:06:15 PM UTC-5, sjudeng wrote:

I think you said the Jest 2.x jars should work with ES 2.x and 5.x, right?


Jason Plurad <plu...@...>
 

How comfortable are we taking Jest on as a dependency? Or rather, if we take on Jest as a dependency, we'll need to be prepared to help push them along with fixes. Jest Issue 409 is for ES 5.0 support, which was released Oct 2016. It doesn't seem like ES 5.x support will be there in time for the initial JanusGraph release, but we'd want it for the next JanusGraph release.

Keith, which versions of ES 2.x have you tested your code against? Have you done any testing against ES 5.x to see if your code is dependent on any of the outstanding PRs on Jest? Wonder if we'd get lucky and find that it already works.

I agree with sjudeng regarding ES 1.x, which is already end of life, so I don't think JanusGraph should support it. ES 2.4.x has an EOL date in Feb 2018 (aligned with ES 6.0 release), so there's plenty of useful life left in that.


On Friday, March 3, 2017 at 1:24:59 PM UTC-5, Keith Lohnes wrote:

I think you said the Jest 2.x jars should work with ES 2.x and 5.x, right?

Yup. Once those PRs are merged in the Jest project.



Adam Phelps <a...@...>
 

(Sorry for the delay, I'm definitely used to lists where reply-to-list is the default, so I sent this to @sjudeng alone last Friday)

Since I was the guy that spoke up on the Github issue regarding dropping ES 1.x support I figured I should speak up here as well.

To start with, I have no issue with JanusGraph adding support for 2.X or 5.X and *eventually* removing support for 1.X. My only gripe is that I strongly feel the initial release of JanusGraph should allow for relatively smooth upgrade of existing Titan 1.0 based production systems, which means at least retaining support for the newest versions of HBase/Cassandra/ElasticSearch/etc that were supported in Titan 1.0.

I don't know how many folks out there are currently running production systems with Titan 1.0, but in my case we have a large mission-critical system built on top of Titan (on HBase 1.2 and ES 1.7) that is constantly growing as we process incoming datastreams and is actively serving both internal and external customer queries. I'd love to upgrade this quickly to JanusGraph, at which point I may be able to spend some cycles fixing some of the downsides we've experienced with Titan+HBase, but if we also have to upgrade ES in order to do so then who knows when that will be able to fit into our roadmap.

If JanusGraph is going to be adopted for production users of Titan, it really needs to not be a breaking upgrade for the first release. Sure, state that ES1.X will be removed from future releases, but stick to clear improvements for the JanusGraph initial release that can be deployed with existing infrastructure.

(I actually don't know enough about ES itself to comment on the technical details of it being brought up in this thread, however I do feel that they should be a lower priority than compatibility with Titan 1.0 systems)

- Adam Phelps

On 3/3/17 7:17 AM, sjudeng wrote:
I think JanusGraph should end formal support for 1.x, though it would be
great if users could still have the ability to use it (even if
unsupported/tested) via your Jest-based implementation.


Jason Plurad <plu...@...>
 

Adam, thanks for chiming in on this thread. I asked to move the conversation to this dev mailing list for broader exposure to the community since there are only a few people following that specific pull request. This is a good time to point out that discussing significant work, such as incrementing the version of a core dependency, should be done on this mailing list (as described in the rather new developer doc) so we can build towards a consensus. Anybody else in the community that has an opinion on the matter should chime in on this thread.

You make a good point on supporting ES 1.7.x. Even though it is end of life, JanusGraph has code that works against it already. If a critical security flaw is found in ES 1.7.x, what would your course of action be? I'd think you'd either have to patch ES 1.7.x privately (EOL, so no more fixes will be released for it) or migrate the cluster to ES 2.x. The former sounds like a lot to ask without company or community support, and the latter would require JanusGraph to support it.



On Monday, March 6, 2017 at 1:00:27 PM UTC-5, Adam Phelps wrote:
(Sorry for the delay, I'm definitely used to lists where reply-to-list
is the default, so I send this to @sjudeng along last Friday)

Since I was the guy that spoke up on the Github issue regarding dropping
ES 1.x support I figured I should speak up here as well.

To start with, I have no issue with JanusGraph adding support for 2.X or
5.X and *eventually* removing support for 1.X.  My only gripe is that I
strongly feel the initial release of JanusGraph should allow for
relatively smooth upgrade of existing Titan 1.0 based production
systems, which means at least retaining support for the newest versions
of HBase/Cassandra/ElasticSearch/etc that were supported in Titan 1.0.

I don't know how many folks out there are currently running production
systems with Titan 1.0, but in my case we have a large mission critical
system built on top of Titan (on HBase 1.2 and ES 1.7) that is
constantly growing as we process incoming datastreams and actively
serving both internal and external customer queries.  I'd love to
upgrade this quickly to JanusGraph, at which point I may be able to
spend some cycles fixing some of the down sides we've experienced with
Titan+HBase, but if we also have to upgrade ES in order to do so then
who knows when that will be able to fit into our roadmap.

If JanusGraph is going to be adopted for production users of Titan, it
really needs to not be a breaking upgrade for the first release.  Sure,
state that ES1.X will be removed from future releases, but stick to
clear improvements for the JanusGraph initial release that can be
deployed with existing infrastructure.

(I actually don't know enough about ES itself to comment on the
technical details being brought up in this thread; however, I do
feel that they should be a lower priority than compatibility with Titan
1.0 systems)

- Adam Phelps

On 3/3/17 7:17 AM, sjudeng wrote:
> I think JanusGraph should end formal support for 1.x, though it would be
> great if users could still have the ability to use it (even if
> unsupported/untested) via your Jest-based implementation. For one, I'm
> biased as the author of the above PR, which does drop support for 1.x
> and which I'd really like to see merged before moving forward. More
> importantly though in working through updates to support 5.x I was happy
> to find the Elasticsearch distribution zip artifacts available for 2.x
> and 5.x (but not 1.x) in Maven central. The availability of the ES
> distribution for 2.x and 5.x artifacts enables their automated use in
> unit tests and in building JanusGraph releases (e.g. for use in running
> embedded ES instances). This allows for JanusGraph to remove the hacked
> elasticsearch and elasticsearch.in.sh scripts from janusgraph-dist and
> also avoids the JarHell issues both during testing and when starting
> embedded ES instances. This improves maintainability and stability.




sjudeng <sju...@...>
 

Regarding an initial release with full compatibility with Titan 1.0, what about creating an issue requesting that a release be cut from an earlier commit, before the potentially breaking changes? To accommodate this use case I'd recommend a commit prior to at least the TinkerPop update, as that would be a breaking change for any production OLAP users. Early on, either accidentally or on purpose, I think JanusGraph began taking steps beyond Titan, especially in terms of updating dependencies. The focus so far hasn't seemed to be on creating an initial "transition" release. But maybe the question was never asked. If there was interest, I'd think a branch other than master would need to be created to accommodate this.

I do think the ongoing development on master, which is focusing on fixing and modernizing the code base, should be allowed to continue. I think there's some developer momentum to finally move beyond Titan and there should be a place for that.

As an example, and to summarize current ES discussions: if we move forward, we could have full ES 2.x and 5.x compatibility by merging the existing PR and then updating to use the ES 5.x REST client (complete, pending PR). With Keith's work we can then update to use Jest instead of the ES REST client, as this should allow easier management until ES releases a higher-level REST client. This would provide great benefits, including the ability to still support ES 1.x as described above, as well as being able to update to Lucene 6. Keith, I still don't know whether you'd want to start with or without the pending PR updates to support the ES 5.x REST client. But if/when the existing PR is merged I'll push it and you can decide.


Adam <a...@...>
 

On 3/6/17 11:18 AM, Jason Plurad wrote:
> You make a good point on supporting ES 1.7.x. Even though it is end of
> life, JanusGraph has /code that works against it already/. If a critical
> security flaw is found in ES 1.7.x, what would your course of action be?
> I'd think you'd either have to patch ES 1.7.x privately (EOL, so no more
> fixes will be released for it) or migrate the cluster to ES 2.x. The
> former sounds like a lot to ask without company or community support,
> and the latter would require JanusGraph to support it.

Really, that would depend on the potential impact of the security flaw. Nothing within our infrastructure is directly accessible from outside, and the API we present to customers is well separated from direct Titan access. As such, it's pretty unlikely that a security flaw in ES itself would present any exterior-facing potential for attack.

Again, I'm not saying that upgrading ES versions isn't something we'll eventually do. What I'm trying to get at is that I view JanusGraph as simply a newer version of Titan (some people may disagree, but functionally that's what it is), and a first JanusGraph release that breaks Titan-based systems running the newest dependencies Titan supported is going to be a hard sell for anyone who has a current production system.

If this change had added a compatibility layer that would support both 1.X and the newer ES versions, I would have had no problems with it. Similarly, an initial compatibility release of JanusGraph, with the understanding that future releases would have breaking changes (preferably announced months in advance), would also be fine.

- Adam


sjudeng <sju...@...>
 

I just pushed the commit to update to support ES REST client to the above PR. I figured the PR merge is on hold anyway pending the outcome here and I didn't want to sit on the 2.x/5.x work as I think there are some useful bits there, especially regarding testing.


Keith Lohnes <loh...@...>
 

sjudeng, I think those commits might be the right way forward. They satisfy the need for HTTP. Tested 1.x support, even with Jest, is difficult, and things get much easier when only supporting 2.x/5.x. I'd suggest taking sjudeng's changes.



sjudeng <sju...@...>
 

Going forward do you think we'd next look at dropping transport client support and then updating to use Jest? If so I'd think you could just delete the `rest` package and replace with `jest`, which I'm sure would remove a lot of the boilerplate object code added to support the low-level REST client. If updating to Jest provides shading of ES settings/query builder objects then that alone might make it worthwhile at least until ES releases a higher level REST client. In particular this would hopefully allow updating Lucene independent of ES version.
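
A rough sketch of what swapping the `rest` package for `jest` could look like, assuming the index backend only talks to the ES client through a small interface. All names here (`EsClientSelection`, `IndexClient`, `RestIndexClient`, `JestIndexClient`, `forInterface`) are hypothetical and are not the actual JanusGraph or Jest API; the implementations are stubs that only identify themselves.

```java
import java.util.Locale;

public class EsClientSelection {

    /** Hypothetical minimal surface the index backend would code against. */
    public interface IndexClient {
        String describe();
    }

    /** Stub standing in for an implementation on the low-level ES REST client. */
    public static final class RestIndexClient implements IndexClient {
        @Override
        public String describe() {
            return "rest";
        }
    }

    /** Stub standing in for an implementation on Jest. */
    public static final class JestIndexClient implements IndexClient {
        @Override
        public String describe() {
            return "jest";
        }
    }

    /**
     * Picks an implementation from a configuration value (assumed here to be
     * a plain string). Dropping one client later means deleting its class
     * and its case below; callers of IndexClient are unaffected.
     */
    public static IndexClient forInterface(String name) {
        switch (name.toLowerCase(Locale.ROOT)) {
            case "rest":
                return new RestIndexClient();
            case "jest":
                return new JestIndexClient();
            default:
                throw new IllegalArgumentException("Unknown ES interface: " + name);
        }
    }

    public static void main(String[] args) {
        System.out.println(forInterface("JEST").describe());
    }
}
```

With this shape, the transport client could be handled the same way: one more implementation behind the interface, marked deprecated until it is removed.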


Keith Lohnes <loh...@...>
 

I see your concern with the Lucene stuff. I think what I might be able to do is

  1. Get rid of the transport client.
  2. Move the responsibility for starting the server onto the test code and use the Elasticsearch integration testing library.
  3. Add Jest.

Let me look into this a little bit (maybe I can wind up just helping you out with it for your PR anyway).


