[DISCUSS] Elasticsearch Http using Jest
sjudeng <sju...@...>
Sounds good. Hopefully the Jest 5.x jar will be backwards compatible with ES 2.x in the same way the ES REST client is. |
|
Keith Lohnes <loh...@...>
sjudeng, thanks for looking things over. I've moved away from using the ESIntegTestCase. I couldn't get it to work. The reason for separating out the was 1.x support, so I'll likely undo that change. I also think retargeting the changes on to 79 is a good idea as well. I've been keeping an eye on the Jest repo and it looks like things are starting to move a little bit on the 5.x support, which is why I started back up on coding on that PR a little. >Even better it'll also remove the ES dependency It won't. There's still a few things in the code that rely on the ES dependency. Here's the list. ``` import org.elasticsearch.Version; import org.elasticsearch.common.unit.DistanceUnit; import org.elasticsearch.common.xcontent.XContentBuilder; import org.elasticsearch.common.xcontent.XContentFactory; import org.elasticsearch.search.builder.SearchSourceBuilder; import org.elasticsearch.search.sort.FieldSortBuilder; import org.elasticsearch.search.sort.SortOrder; import org.elasticsearch.index.query.OrFilterBuilder; import org.elasticsearch.index.query.AndFilterBuilder; import org.elasticsearch.index.query.FilterBuilder; import org.elasticsearch.index.query.FilterBuilders; import org.elasticsearch.index.query.QueryBuilders; ``` These could probably be replaced, but I hadn't really intended to do that work right now. So it will get things closer to being able to remove that ES dependency, but it's not _quite_ there yet. On Fri, Mar 24, 2017 at 7:54 PM sjudeng <sju...@...> wrote:
|
|
sjudeng <sju...@...>
Keith, I looked over the latest commit on your Jest update branch. Do you think the update to support running tests through ESIntegTestCase will be worth the additional complexity of needing to separate out into multiple ES test modules? Personally my preference in this case would be to stay with the approach of testing through a full instance via the ES distribution artifact available since 2.x and added in #79. I think there's value in running tests against an actual (fully unmocked) ES instance, the implementation is cleaner, and it supports testing against any 2.x and 5.x distribution. We already have approval to drop support for 1.x (after the 0.1.0 release). Also since I think we have consensus on the path forward to update ES (merging #79 and then migrating to Jest), would you consider retargeting your Jest update commits onto the branch in #79? My hope is we'll be able to get #79 merged after the 0.1.0 release is out/started. I think your updates to use Jest will improve that implementation by removing all the custom REST request/response objects I added. Even better it'll also remove the ES dependency (right?), which will free us up for the Lucene upgrade. Thanks for your work on this and for making it visible via your WIP PR. However we proceed I think communicating early and often is helpful here. |
|
sjudeng <sju...@...>
I added a comment to the existing release discussion regarding creating an initial compatibility release and started a vote thread. https://groups.google.com/d/msg/janusgraph-dev/8jkMnkKzmC0/nvfgaChmBQAJ https://groups.google.com/forum/#!topic/janusgraph-dev/FLWuQMcW8-4 |
|
Jason Plurad <plu...@...>
sjudeng, I think the ES plan sounds good. This thread has been open for a while, and we have gotten any other input from the community. It sounds fair to start up a [VOTE] thread. Re: creating the initial JanusGraph release prior to breaking changes, that's a separate thread all together. On Wednesday, March 8, 2017 at 2:58:46 PM UTC-5, sjudeng wrote:
|
|
sjudeng <sju...@...>
Keith, Sounds good. Are you going to rebase those commits on mine or keep them separate for now? I want to avoid painful merge/conflict issues for either of us later down the line. Jason, Are we close to moving forward here? I propose we create an issue requesting an initial JanusGraph release based on a commit prior to breaking changes. This would mitigate the concern regarding ES 1.x compatibility. This might be a good idea anyway to address ongoing questions on a timeline for an initial JanusGraph release. If that's separated out from ongoing dependency update work then I think it could get put together relatively quickly on a separate release branch. In parallel to the initial compatibility release work, we could then proceed with (among other things) ES development. On this I propose:
|
|
Keith Lohnes <loh...@...>
>clean up the implementation and still support conclusively testing against any (supported) ES distribution version. That's my main driver for this. I see having that transport client as pretty unnecessary. It's not compiling right now since I'm cleaning things up and working on tests, but if you want to take a look at what I'm doing, I opened up [a PR](https://github.com/JanusGraph/janusgraph/pull/160) On Wed, Mar 8, 2017 at 1:17 PM sjudeng <sju...@...> wrote:
|
|
sjudeng <sju...@...>
Keith, I added a PR comment regarding testing before seeing this. Definitely would welcome your efforts on any of the items you mentioned. Probably need some consensus here regarding dropping transport client (though I'm a +1). I'm a little concerned on the integration test stuff as I think there's value in having a full ES instance for testing. But on the flip side it would be pretty compelling if this would clean up the implementation and still support conclusively testing against any (supported) ES distribution version. |
|
Keith Lohnes <loh...@...>
I see your concern with the Lucene stuff. I think what I might be able to do is
Let me look in to this a little bit (maybe can wind up just helping you out with it for your PR anyway). On Wed, Mar 8, 2017 at 11:16 AM sjudeng <sju...@...> wrote:
|
|
sjudeng <sju...@...>
Going forward do you think we'd next look at dropping transport client support and then updating to use Jest? If so I'd think you could just delete the `rest` package and replace with `jest`, which I'm sure would remove a lot of the boilerplate object code added to support the low-level REST client. If updating to Jest provides shading of ES settings/query builder objects then that alone might make it worthwhile at least until ES releases a higher level REST client. In particular this would hopefully allow updating Lucene independent of ES version. |
|
Keith Lohnes <loh...@...>
sjudeng, I think those commits might be the right way forward. It satisfies the need for http. Tested 1.x support, even with Jest, is difficult, and things get much easier with only supporting 2.x/5.x. I'd suggest taking sjudeng's changes. On Tuesday, March 7, 2017 at 11:18:14 PM UTC-5, sjudeng wrote:
|
|
sjudeng <sju...@...>
I just pushed the commit to update to support ES REST client to the above PR. I figured the PR merge is on hold anyway pending the outcome here and I didn't want to sit on the 2.x/5.x work as I think there are some useful bits there, especially regarding testing. |
|
Adam <a...@...>
On 3/6/17 11:18 AM, Jason Plurad wrote:
You make a good point on supporting ES 1.7.x. Even though it is end ofReally, that would depend on the potential impact of the security flaw. Nothing within our infrastructure is directly accessible to outside access, and the API we present to customers is well separated from direct Titan access. As such its pretty unlikely that a security flaw in ES itself would present any exterior facing potential for attack. Again, I'm not saying that upgrading ES versions isn't something we'll eventually do. What I'm trying to get at is that I view JanusGraph as simply a newer version of Titan (some people may disagree, but functionally that's what it is) and breaking Titan based systems that are using the newest dependencies that Titan supported for the first release of JanusGraph is going to be a hard sell for anyone that has a current production system. If this change had added a compatibility layer that would support both 1.X and the newer ES versions I would have had no problems with it, similarly if there was an initial compatibility release of JanusGraph with the understanding that future releases would have breaking changes (preferably announced months in advance) would also be fine. - Adam |
|
sjudeng <sju...@...>
Regarding an initial release with full compatibility with Titan 1.0, what about creating an issue requesting a release be created based on an earlier commit before potentially breaking changes? In particular to accommodate this use case I'd recommend a commit prior to at least the TinkerPop update, as this would be a breaking change for any production OLAP users. Early on either accidentally or on purpose I think JanusGraph began taking steps beyond Titan, especially in terms of updating dependencies. The initial focus hasn't seemed to be to create an initial "transition" release. But maybe the question was never asked. If there was interest I'd think a different branch other than master would need to be created to accommodate this. I do think the ongoing development on master, which is focusing on fixing and modernizing the code base, should be allowed to continue. I think there's some developer momentum to finally move beyond Titan and there should be a place for that. As an example, and to summarize current ES discussions, if we move forward we could have full ES 2.x and 5.x compatibility by merging the existing PR and then updating to use the ES 5.x REST client (complete, pending PR). With Keith's work we can then update to use Jest instead of the ES REST client as this should allow easier management until ES releases a higher level REST client. This would provide great benefits including the ability to still support ES 1.x as described above, as well as being able to update to Lucene 6. Keith I still don't know whether you'd want to start with/without the pending-new PR updates to support ES 5.x REST client. But if/when the existing PR is merged I'll push it and you can decide. |
|
Jason Plurad <plu...@...>
Adam, thanks for chiming in on this thread. I asked to move the conversation to this dev mailing list for broader exposure to the community since there are only a few people following that specific pull request. This is a good time to point out that discussing significant work, such as incrementing the version of a core dependency, should be done on this mailing list (as described in the rather new developer doc) so we can build towards a consensus. Anybody else in the community that has an opinion on the matter should chime in on this thread. You make a good point on supporting ES 1.7.x. Even though it is end of life, JanusGraph has code that works against it already. If a critical security flaw is found in ES 1.7.x, what would your course of action be? I'd think you'd either have to patch ES 1.7.x privately (EOL, so no more fixes will be released for it) or migrate the cluster to ES 2.x. The former sounds like a lot to ask without company or community support, and the latter would require JanusGraph to support it. On Monday, March 6, 2017 at 1:00:27 PM UTC-5, Adam Phelps wrote: (Sorry for the delay, I'm definitely used to lists where reply-to-list |
|
Adam Phelps <a...@...>
(Sorry for the delay, I'm definitely used to lists where reply-to-list is the default, so I send this to @sjudeng along last Friday)
toggle quoted message
Show quoted text
Since I was the guy that spoke up on the Github issue regarding dropping ES 1.x support I figured I should speak up here as well. To start with, I have no issue with JanusGraph adding support for 2.X or 5.X and *eventually* removing support for 1.X. My only gripe is that I strongly feel the initial release of JanusGraph should allow for relatively smooth upgrade of existing Titan 1.0 based production systems, which means at least retaining support for the newest versions of HBase/Cassandra/ElasticSearch/etc that were supported in Titan 1.0. I don't know how many folks out there are currently running production systems with Titan 1.0, but in my case we have a large mission critical system built on top of Titan (on HBase 1.2 and ES 1.7) that is constantly growing as we process incoming datastreams and actively serving both internal and external customer queries. I'd love to upgrade this quickly to JanusGraph, at which point I may be able to spend some cycles fixing some of the down sides we've experienced with Titan+HBase, but if we also have to upgrade ES in order to do so then who knows when that will be able to fit into our roadmap. If JanusGraph is going to be adopted for production users of Titan, it really needs to not be a breaking upgrade for the first release. Sure, state that ES1.X will be removed from future releases, but stick to clear improvements for the JanusGraph initial release that can be deployed with existing infrastructure. (I actually don't know enough about ES itself to comment on the technical details of it being brought up in this thread, however I do feel that they should be a lower priority than compatibility with Titan 1.0 systems) - Adam Phelps On 3/3/17 7:17 AM, sjudeng wrote:
I think JanusGraph should end formal support for 1.x, though it would be |
|
Jason Plurad <plu...@...>
How comfortable are we taking Jest on as a dependency? Or rather, if we take on Jest as a dependency, we'll need to be prepared to help push them along with fixes. Jest Issue 409 is for ES 5.0 support, which was released Oct 2016. It doesn't seem like ES 5.x support will be there in time for the initial JanusGraph release, but we'd want it for the next JanusGraph release. Keith, which versions of ES 2.x have you tested your code against? Have you done any testing against ES 5.x to see if your code is dependent on any of the outstanding PRs on Jest? Wonder if we'd get lucky and find that it already works. I agree with sjudeng regarding ES 1.x, which is already end of life, so I don't think JanusGraph should support it. ES 2.4.x has an EOL date in Feb 2018 (aligned with ES 6.0 release), so there's plenty of useful life left in that. On Friday, March 3, 2017 at 1:24:59 PM UTC-5, Keith Lohnes wrote:
|
|
Keith Lohnes <loh...@...>
Yup. Once those PRs are merged in the Jest project.
+1 I personally think only supporting http makes sense and going the route @sjudeng mentioned. It allows current titan users some flexibility in their migration to Janus while not stopping progress on the ES backend. On Friday, March 3, 2017 at 1:06:15 PM UTC-5, sjudeng wrote:
|
|
sjudeng <sju...@...>
I think you said the Jest 2.x jars should work with ES 2.x and 5.x, right? Then I'm back to my suggestion to drop formal support for ES 1.x but documentation could be updated to provide workaround steps (e.g. manually delete jest-2.x.jar and download/add jest-1.x.jar to classpath) to allow (untested) support for legacy ES 1.x deployments. It's great Jest gives us this option for basically free because I really don't think JanusGraph should introduce build/release complexity just to accommodate it. In my opinion if users have really stable Titan deployments and they're not able to update relevant cluster components (storage, indexing, compute), then I'd think they should stay on that baseline until the new capabilities being offered by JanusGraph are compelling enough to warrant the upgrade investment. Otherwise you're just upgrading to change names from Titan to JanusGraph. If this is a step some users want then I'd recommend JanusGraph create an initial release based on an earlier commit after name changes but before the potentially breaking updates to hbase, tinkerpop and elasticsearch. |
|
Keith Lohnes <loh...@...>
I don't see a problem with just going full http, it would definitely make things easier for me. But the Jest jars for 1.x vs 2.x + are going to need to be different. I'm not sure what the preferred method of dealing with that would be. On Friday, March 3, 2017 at 11:31:44 AM UTC-5, sjudeng wrote:
|
|