Date   

Re: [DISCUSS] Elasticsearch Http using Jest

Jason Plurad <plu...@...>
 

How comfortable are we taking Jest on as a dependency? Or rather, if we take on Jest as a dependency, we'll need to be prepared to help push them along with fixes. Jest Issue 409 is for ES 5.0 support, which was released Oct 2016. It doesn't seem like ES 5.x support will be there in time for the initial JanusGraph release, but we'd want it for the next JanusGraph release.

Keith, which versions of ES 2.x have you tested your code against? Have you done any testing against ES 5.x to see if your code is dependent on any of the outstanding PRs on Jest? Wonder if we'd get lucky and find that it already works.

I agree with sjudeng regarding ES 1.x, which is already end of life, so I don't think JanusGraph should support it. ES 2.4.x has an EOL date in Feb 2018 (aligned with ES 6.0 release), so there's plenty of useful life left in that.


On Friday, March 3, 2017 at 1:24:59 PM UTC-5, Keith Lohnes wrote:

I think you said the Jest 2.x jars should work with ES 2.x and 5.x, right?

Yup. Once those PRs are merged in the Jest project.

documentation could be updated to provide workaround steps (e.g. manually delete jest-2.x.jar and download/add jest-1.x.jar to classpath) to allow (untested) support for legacy ES 1.x deployments

+1

I personally think only supporting http makes sense and going the route @sjudeng mentioned. It allows current titan users some flexibility in their migration to Janus while not stopping progress on the ES backend.

On Friday, March 3, 2017 at 1:06:15 PM UTC-5, sjudeng wrote:

I think you said the Jest 2.x jars should work with ES 2.x and 5.x, right? Then I'm back to my suggestion to drop formal support for ES 1.x but documentation could be updated to provide workaround steps (e.g. manually delete jest-2.x.jar and download/add jest-1.x.jar to classpath) to allow (untested) support for legacy ES 1.x deployments. It's great Jest gives us this option for basically free because I really don't think JanusGraph should introduce build/release complexity just to accommodate it. In my opinion if users have really stable Titan deployments and they're not able to update relevant cluster components (storage, indexing, compute), then I'd think they should stay on that baseline until the new capabilities being offered by JanusGraph are compelling enough to warrant the upgrade investment. Otherwise you're just upgrading to change names from Titan to JanusGraph. If this is a step some users want then I'd recommend JanusGraph create an initial release based on an earlier commit after name changes but before the potentially breaking updates to hbase, tinkerpop and elasticsearch.


Re: [DISCUSS] Elasticsearch Http using Jest

Adam Phelps <a...@...>
 

(Sorry for the delay, I'm definitely used to lists where reply-to-list is the default, so I send this to @sjudeng along last Friday)

Since I was the guy that spoke up on the Github issue regarding dropping ES 1.x support I figured I should speak up here as well.

To start with, I have no issue with JanusGraph adding support for 2.X or 5.X and *eventually* removing support for 1.X. My only gripe is that I strongly feel the initial release of JanusGraph should allow for relatively smooth upgrade of existing Titan 1.0 based production systems, which means at least retaining support for the newest versions of HBase/Cassandra/ElasticSearch/etc that were supported in Titan 1.0.

I don't know how many folks out there are currently running production systems with Titan 1.0, but in my case we have a large mission critical system built on top of Titan (on HBase 1.2 and ES 1.7) that is constantly growing as we process incoming datastreams and actively serving both internal and external customer queries. I'd love to upgrade this quickly to JanusGraph, at which point I may be able to spend some cycles fixing some of the down sides we've experienced with Titan+HBase, but if we also have to upgrade ES in order to do so then who knows when that will be able to fit into our roadmap.

If JanusGraph is going to be adopted for production users of Titan, it really needs to not be a breaking upgrade for the first release. Sure, state that ES1.X will be removed from future releases, but stick to clear improvements for the JanusGraph initial release that can be deployed with existing infrastructure.

(I actually don't know enough about ES itself to comment on the technical details of it being brought up in this thread, however I do feel that they should be a lower priority than compatibility with Titan 1.0 systems)

- Adam Phelps

On 3/3/17 7:17 AM, sjudeng wrote:
I think JanusGraph should end formal support for 1.x, though it would be
great if users could still have the ability to use it (even if
unsupported/tested) via your Jest-based implementation. For one I'm
biased as the author of the above PR, which does drop support for 1.x
and I'd really like to see get merged before moving forward. More
importantly though in working through updates to support 5.x I was happy
to find the Elasticsearch distribution zip artitfacts available for 2.x
and 5.x (but not 1.x) in Maven central. The availability of the ES
distribution for 2.x and 5.x artifacts enables their automated use in
unit tests and in building JanusGraph releases (e.g. for use in running
embedded ES instances). This allows for JanusGraph to remove the hacked
elasticsearch and elasticsearch.in.sh scripts from janusgraph-dist and
also avoids the JarHell issues both during testing and when starting
embedded ES instances. This improves maintainability and stability.


Re: [DISCUSS] Elasticsearch Http using Jest

Jason Plurad <plu...@...>
 

Adam, thanks for chiming in on this thread. I asked to move the conversation to this dev mailing list for broader exposure to the community since there are only a few people following that specific pull request. This is a good time to point out that discussing significant work, such as incrementing the version of a core dependency, should be done on this mailing list (as described in the rather new developer doc) so we can build towards a consensus. Anybody else in the community that has an opinion on the matter should chime in on this thread.

You make a good point on supporting ES 1.7.x. Even though it is end of life, JanusGraph has code that works against it already. If a critical security flaw is found in ES 1.7.x, what would your course of action be? I'd think you'd either have to patch ES 1.7.x privately (EOL, so no more fixes will be released for it) or migrate the cluster to ES 2.x. The former sounds like a lot to ask without company or community support, and the latter would require JanusGraph to support it.



On Monday, March 6, 2017 at 1:00:27 PM UTC-5, Adam Phelps wrote:
(Sorry for the delay, I'm definitely used to lists where reply-to-list
is the default, so I send this to @sjudeng along last Friday)

Since I was the guy that spoke up on the Github issue regarding dropping
ES 1.x support I figured I should speak up here as well.

To start with, I have no issue with JanusGraph adding support for 2.X or
5.X and *eventually* removing support for 1.X.  My only gripe is that I
strongly feel the initial release of JanusGraph should allow for
relatively smooth upgrade of existing Titan 1.0 based production
systems, which means at least retaining support for the newest versions
of HBase/Cassandra/ElasticSearch/etc that were supported in Titan 1.0.

I don't know how many folks out there are currently running production
systems with Titan 1.0, but in my case we have a large mission critical
system built on top of Titan (on HBase 1.2 and ES 1.7) that is
constantly growing as we process incoming datastreams and actively
serving both internal and external customer queries.  I'd love to
upgrade this quickly to JanusGraph, at which point I may be able to
spend some cycles fixing some of the down sides we've experienced with
Titan+HBase, but if we also have to upgrade ES in order to do so then
who knows when that will be able to fit into our roadmap.

If JanusGraph is going to be adopted for production users of Titan, it
really needs to not be a breaking upgrade for the first release.  Sure,
state that ES1.X will be removed from future releases, but stick to
clear improvements for the JanusGraph initial release that can be
deployed with existing infrastructure.

(I actually don't know enough about ES itself to comment on the
technical details of it being brought up in this thread, however I do
feel that they should be a lower priority than compatibility with Titan
1.0 systems)

- Adam Phelps

On 3/3/17 7:17 AM, sjudeng wrote:
> I think JanusGraph should end formal support for 1.x, though it would be
> great if users could still have the ability to use it (even if
> unsupported/tested) via your Jest-based implementation. For one I'm
> biased as the author of the above PR, which does drop support for 1.x
> and I'd really like to see get merged before moving forward. More
> importantly though in working through updates to support 5.x I was happy
> to find the Elasticsearch distribution zip artitfacts available for 2.x
> and 5.x (but not 1.x) in Maven central. The availability of the ES
> distribution for 2.x and 5.x artifacts enables their automated use in
> unit tests and in building JanusGraph releases (e.g. for use in running
> embedded ES instances). This allows for JanusGraph to remove the hacked
> elasticsearch and elasticsearch.in.sh scripts from janusgraph-dist and
> also avoids the JarHell issues both during testing and when starting
> embedded ES instances. This improves maintainability and stability.




Re: [DISCUSS] Elasticsearch Http using Jest

sjudeng <sju...@...>
 

Regarding an initial release with full compatibility with Titan 1.0, what about creating an issue requesting a release be created based on an earlier commit before potentially breaking changes? In particular to accommodate this use case I'd recommend a commit prior to at least the TinkerPop update, as this would be a breaking change for any production OLAP users. Early on either accidentally or on purpose I think JanusGraph began taking steps beyond Titan, especially in terms of updating dependencies. The initial focus hasn't seemed to be to create an initial "transition" release. But maybe the question was never asked. If there was interest I'd think a different branch other than master would need to be created to accommodate this.

I do think the ongoing development on master, which is focusing on fixing and modernizing the code base, should be allowed to continue. I think there's some developer momentum to finally move beyond Titan and there should be a place for that.

As an example, and to summarize current ES discussions, if we move forward we could have full ES 2.x and 5.x compatibility by merging the existing PR and then updating to use the ES 5.x REST client (complete, pending PR). With Keith's work we can then update to use Jest instead of the ES REST client as this should allow easier management until ES releases a higher level REST client. This would provide great benefits including the ability to still support ES 1.x as described above, as well as being able to update to Lucene 6. Keith I still don't know whether you'd want to start with/without the pending-new PR updates to support ES 5.x REST client. But if/when the existing PR is merged I'll push it and you can decide.


Re: [DISCUSS] Elasticsearch Http using Jest

Adam <a...@...>
 

On 3/6/17 11:18 AM, Jason Plurad wrote:
You make a good point on supporting ES 1.7.x. Even though it is end of
life, JanusGraph has /code that works against it already/. If a critical
security flaw is found in ES 1.7.x, what would your course of action be?
I'd think you'd either have to patch ES 1.7.x privately (EOL, so no more
fixes will be released for it) or migrate the cluster to ES 2.x. The
former sounds like a lot to ask without company or community support,
and the latter would require JanusGraph to support it.
Really, that would depend on the potential impact of the security flaw. Nothing within our infrastructure is directly accessible to outside access, and the API we present to customers is well separated from direct Titan access. As such its pretty unlikely that a security flaw in ES itself would present any exterior facing potential for attack.

Again, I'm not saying that upgrading ES versions isn't something we'll eventually do. What I'm trying to get at is that I view JanusGraph as simply a newer version of Titan (some people may disagree, but functionally that's what it is) and breaking Titan based systems that are using the newest dependencies that Titan supported for the first release of JanusGraph is going to be a hard sell for anyone that has a current production system.

If this change had added a compatibility layer that would support both 1.X and the newer ES versions I would have had no problems with it, similarly if there was an initial compatibility release of JanusGraph with the understanding that future releases would have breaking changes (preferably announced months in advance) would also be fine.

- Adam


[PROPOSAL] CQL Storage Backend Implementation

Paul Kendall <paul....@...>
 

Hi,
We have mostly completed work on a CQL storage backend for JanusGraph which is table compatible with the Astyanax implementation.
There is still some missing functionality for some of the configuration options including SSL support.

One of the goals we have in using JanusGraph is to embed it into our application but for it to talk to remote cassandra and elasticsearch clusters.
The current project structure makes this quite difficult in that it pollutes our classpath with cassandra-all and friends. Currently what we have done is duplicate some of the functionality from the cassandra project and re-namespace it under cql so that it is completely independent.

An alternative approach might be to create a cassandra-core project and split the backends i.e. Astyanax, Thrift, Embedded and ultimately CQL as dependants on cassandra-core. This would allow us to share some of the common configuration code and more importanly, some of unit test code.

We imagine this would be a good starting point for https://github.com/JanusGraph/janusgraph/issues/35

Cheers,
Paul Kendall & Samant Maharaj


Re: [PROPOSAL] CQL Storage Backend Implementation

Jason Plurad <plu...@...>
 

Paul, that sounds great. The proposal to create a cassandra-core makes sense to me.
What version of CQL are you coding against, 3.1 or 3.3?


On Monday, March 6, 2017 at 8:33:53 PM UTC-5, Paul Kendall wrote:
Hi,
We have mostly completed work on a CQL storage backend for JanusGraph which is table compatible with the Astyanax implementation.
There is still some missing functionality for some of the configuration options including SSL support.

One of the goals we have in using JanusGraph is to embed it into our application but for it to talk to remote cassandra and elasticsearch clusters.
The current project structure makes this quite difficult in that it pollutes our classpath with cassandra-all and friends. Currently what we have done is duplicate some of the functionality from the cassandra project and re-namespace it under cql so that it is completely independent.

An alternative approach might be to create a cassandra-core project and split the backends i.e. Astyanax, Thrift, Embedded and ultimately CQL as dependants on cassandra-core. This would allow us to share some of the common configuration code and more importanly, some of unit test code.

We imagine this would be a good starting point for https://github.com/JanusGraph/janusgraph/issues/35

Cheers,
Paul Kendall & Samant Maharaj


Re: [PROPOSAL] CQL Storage Backend Implementation

Paul Kendall <paul....@...>
 

Jason, currently we are targetting the 3.1.4 driver, but I don't think there is anything that precludes us from using the 3.3 version either, but I will check on that.
We will look into completing the unit tests today then we'll comment on https://github.com/JanusGraph/janusgraph/issues/35 with a link to our fork so we can get feedback.
We'll also look into creating a cassandra-core project that can be used to contain common code for all the cassandra variants.


Re: [DISCUSS] Elasticsearch Http using Jest

sjudeng <sju...@...>
 

I just pushed the commit to update to support ES REST client to the above PR. I figured the PR merge is on hold anyway pending the outcome here and I didn't want to sit on the 2.x/5.x work as I think there are some useful bits there, especially regarding testing.


Re: [DISCUSS] Elasticsearch Http using Jest

Keith Lohnes <loh...@...>
 

sjudeng, I think those commits might be the right way forward. It satisfies the need for http. Tested 1.x support, even with Jest, is difficult, and things get much easier with only supporting 2.x/5.x. I'd suggest taking sjudeng's changes.

On Tuesday, March 7, 2017 at 11:18:14 PM UTC-5, sjudeng wrote:
I just pushed the commit to update to support ES REST client to the above PR. I figured the PR merge is on hold anyway pending the outcome here and I didn't want to sit on the 2.x/5.x work as I think there are some useful bits there, especially regarding testing.


Re: [DISCUSS] Elasticsearch Http using Jest

sjudeng <sju...@...>
 

Going forward do you think we'd next look at dropping transport client support and then updating to use Jest? If so I'd think you could just delete the `rest` package and replace with `jest`, which I'm sure would remove a lot of the boilerplate object code added to support the low-level REST client. If updating to Jest provides shading of ES settings/query builder objects then that alone might make it worthwhile at least until ES releases a higher level REST client. In particular this would hopefully allow updating Lucene independent of ES version.


Re: [DISCUSS] Elasticsearch Http using Jest

Keith Lohnes <loh...@...>
 

I see your concern with the Lucene stuff. I think what I might be able to do is

  1. Get rid of the transport client.
  2. Move the starting of the server responsibility on to the test code and use the Elasticsearch Integration testing library.
  3. Add Jest.

Let me look in to this a little bit (maybe can wind up just helping you out with it for your PR anyway). 


On Wed, Mar 8, 2017 at 11:16 AM sjudeng <sju...@...> wrote:
Going forward do you think we'd next look at dropping transport client support and then updating to use Jest? If so I'd think you could just delete the `rest` package and replace with `jest`, which I'm sure would remove a lot of the boilerplate object code added to support the low-level REST client. If updating to Jest provides shading of ES settings/query builder objects then that alone might make it worthwhile at least until ES releases a higher level REST client. In particular this would hopefully allow updating Lucene independent of ES version.

--
You received this message because you are subscribed to the Google Groups "JanusGraph developer list" group.
To unsubscribe from this group and stop receiving emails from it, send an email to janusgr...@....
For more options, visit https://groups.google.com/d/optout.


Re: [DISCUSS] Elasticsearch Http using Jest

sjudeng <sju...@...>
 

Keith,

I added a PR comment regarding testing before seeing this. Definitely would welcome your efforts on any of the items you mentioned. Probably need some consensus here regarding dropping transport client (though I'm a +1). I'm a little concerned on the integration test stuff as I think there's value in having a full ES instance for testing. But on the flip side it would be pretty compelling if this would clean up the implementation and still support conclusively testing against any (supported) ES distribution version.


Re: [DISCUSS] Elasticsearch Http using Jest

Keith Lohnes <loh...@...>
 

>clean up the implementation and still support conclusively testing against any (supported) ES distribution version.

That's my main driver for this. I see having that transport client as pretty unnecessary. It's not compiling right now since I'm cleaning things up and working on tests, but if you want to take a look at what I'm doing, I opened up [a PR](https://github.com/JanusGraph/janusgraph/pull/160)


On Wed, Mar 8, 2017 at 1:17 PM sjudeng <sju...@...> wrote:
Keith,

I added a PR comment regarding testing before seeing this. Definitely would welcome your efforts on any of the items you mentioned. Probably need some consensus here regarding dropping transport client (though I'm a +1). I'm a little concerned on the integration test stuff as I think there's value in having a full ES instance for testing. But on the flip side it would be pretty compelling if this would clean up the implementation and still support conclusively testing against any (supported) ES distribution version.

--
You received this message because you are subscribed to the Google Groups "JanusGraph developer list" group.
To unsubscribe from this group and stop receiving emails from it, send an email to janusgr...@....
For more options, visit https://groups.google.com/d/optout.


Re: [DISCUSS] Elasticsearch Http using Jest

sjudeng <sju...@...>
 

Keith, Sounds good. Are you going to rebase those commits on mine or keep them separate for now? I want to avoid painful merge/conflict issues for either of us later down the line.

Jason, Are we close to moving forward here? I propose we create an issue requesting an initial JanusGraph release based on a commit prior to breaking changes. This would mitigate the concern regarding ES 1.x compatibility. This might be a good idea anyway to address ongoing questions on a timeline for an initial JanusGraph release. If that's separated out from ongoing dependency update work then I think it could get put together relatively quickly on a separate release branch.

In parallel to the initial compatibility release work, we could then proceed with (among other things) ES development. On this I propose:
  1. Update janusgraph-es to support 2.x and 5.x with native ES REST client.
    • Action: Review and merge PR#79
  2. Update janusgraph-es to (at a minimum) remove transport client and replace native ES REST client with the Jest high level ES HTTP client
    • Action: Community consensus regarding removal of transport client support. Complete development/testing. Review and merge PR#160.


Re: [PROPOSAL] CQL Storage Backend Implementation

Samant Maharaj <samant...@...>
 

Hi all,

We've got an initial implementation of the CQL based backend. At this stage it's been implemented as a completely separate module, however ideally we'd like to extract a cassandra-core module for common code.

Here's a link to the branch in our fork: https://github.com/orionhealth/janusgraph/tree/feature/cql-backend

Features left to implement:
  • Pooling configuration
  • Retry configuration
  • Support for per partition query limits (available in C* since 3.6). We're planning to implement this by switching the query strategy based on the version metadata detected from the C* cluster.
Note, we've introduced dependencies on the Datastax Cassandra driver as well as Javaslang which makes functional style programming in Java a lot easier.

It'd be great to get some feedback to ensure we're not going about this the wrong way.

Regards,
Samant Maharaj & Paul Kendall.

On Wednesday, 8 March 2017 06:47:05 UTC+13, Paul Kendall wrote:
Jason, currently we are targetting the 3.1.4 driver, but I don't think there is anything that precludes us from using the 3.3 version either, but I will check on that.
We will look into completing the unit tests today then we'll comment on https://github.com/JanusGraph/janusgraph/issues/35 with a link to our fork so we can get feedback.
We'll also look into creating a cassandra-core project that can be used to contain common code for all the cassandra variants.


Re: [DISCUSS] Elasticsearch Http using Jest

Jason Plurad <plu...@...>
 

sjudeng, I think the ES plan sounds good. This thread has been open for a while, and we have gotten any other input from the community. It sounds fair to start up a [VOTE] thread.

Re: creating the initial JanusGraph release prior to breaking changes, that's a separate thread all together.


On Wednesday, March 8, 2017 at 2:58:46 PM UTC-5, sjudeng wrote:
Keith, Sounds good. Are you going to rebase those commits on mine or keep them separate for now? I want to avoid painful merge/conflict issues for either of us later down the line.

Jason, Are we close to moving forward here? I propose we create an issue requesting an initial JanusGraph release based on a commit prior to breaking changes. This would mitigate the concern regarding ES 1.x compatibility. This might be a good idea anyway to address ongoing questions on a timeline for an initial JanusGraph release. If that's separated out from ongoing dependency update work then I think it could get put together relatively quickly on a separate release branch.

In parallel to the initial compatibility release work, we could then proceed with (among other things) ES development. On this I propose:
  1. Update janusgraph-es to support 2.x and 5.x with native ES REST client.
    • Action: Review and merge PR#79
  2. Update janusgraph-es to (at a minimum) remove transport client and replace native ES REST client with the Jest high level ES HTTP client
    • Action: Community consensus regarding removal of transport client support. Complete development/testing. Review and merge PR#160.


Re: [DISCUSS] Moving toward an initial release

sjudeng <sju...@...>
 

Let me know if this should be moved to another thread but is there any interest in scoping down the initial JanusGraph release to just be a "compatibility" release? There has been concern raised in discussions (for example here) that some users on production Titan systems would benefit from an initial release that has no breaking changes, especially to core storage, indexing and compute system components. There are already potentially breaking changes in master such as the TinkerPop update and others are in work.

Beyond mitigating concerns regarding having an initially easy migration step from Titan to JanusGraph, pursing a release branch without the additional features and dependency updates currently in work would allow for a relatively more rapid (and hopefully more stable) initial JanusGraph release. Any feedback on inevitable compatibility issues could then be addressed to the benefit of the subsequent JanusGraph release.


[VOTE] Elasticsearch upgrade

sjudeng <sju...@...>
 

I'd like to request a vote on the following proposal for proceeding with updates to Elasticsearch in JanusGraph (see this discuss thread for more information):

  1. Update to support ES 2.x and 5.x with native ES REST client. This work is complete and available for review in #79.
  2. Update to remove transport client and replace native ES REST client with the Jest high level ES HTTP client. This work is in progress and being tracked in #160.


Re: [DISCUSS] Elasticsearch Http using Jest

sjudeng <sju...@...>
 

I added a comment to the existing release discussion regarding creating an initial compatibility release and started a vote thread.

https://groups.google.com/d/msg/janusgraph-dev/8jkMnkKzmC0/nvfgaChmBQAJ
https://groups.google.com/forum/#!topic/janusgraph-dev/FLWuQMcW8-4

61 - 80 of 1585