Elasticsearch 8 is of course a good point. But do we need to support that directly in 1.0.0? Elasticsearch 7.17 seems to be still supported roughly until 9.0.0 gets released if I understood their EoL policy correctly [1].
Can’t we also add support for Elasticsearch 8 in a minor update, like 1.1.0 and clearly document how people need to migrate their indices if they want to update their Elasticsearch installation to v8? What I mean is that it’s only a breaking change for users who actually update to Elasticsearch 8 and then they should expect some breaking changes in general.
Or do you expect the change on our side to already be breaking so that all users have to migrate their indices, even if they stay on ES 7?
In general, I think we will end up with a release date for 1.0.0 some time in January irrespective of this if we want to publish a release candidate first and then provide users some time to try it out and provide feedback. (And nobody argued against it, so I already went ahead and created a PR for the release candidate.)
[1]: https://www.elastic.co/support/eol
From: janusgraph-dev@... <janusgraph-dev@...> On Behalf Of Oleksandr Porunov
Sent: Tuesday, November 29, 2022 17:11
To: janusgraph-dev@...
Subject: Re: [janusgraph-dev] [DISCUSS] 1.0.0 / 0.6.3 Releases
Hi,
Upgrading to Elasticsearch 8 will be a breaking change with regard to Geoshape properties because we won't have the Prefix Tree indexing strategy anymore. Users will need to migrate any indexes containing Geoshape properties because the Circle shape isn't compatible with the BKD indexing strategy (the only geoshape indexing strategy available in ES 8). See: https://github.com/JanusGraph/janusgraph/pull/3268
That PR is blocked until Elasticsearch 8.6.0 is released. I would prefer to include such a breaking change in the 1.0.0 release.
In December there are usually many holidays and I suspect not many people will work during them. I think it would make sense to make the `1.0.0` release in early January, so that we have time to finish several other PRs and allow more people to test it, because it will include quite a few breaking changes.
Best regards,
Oleksandr
Yes, a release candidate probably makes sense for the 1.0.0 release. I think we don’t need a formal voting process for a release candidate. So, if there are no objections, I will start preparing such a release candidate, which will also include a PR to update the version numbers on master.
If there are any issues that you think should be included in the release candidate, then please list them. Otherwise, simply everything that is merged into master until the PR that updates the version numbers is merged will be included.
From: janusgraph-dev@... <janusgraph-dev@...> On Behalf Of Boxuan Li
Sent: Monday, November 28, 2022 17:39
To: janusgraph-dev@...
Subject: Re: [janusgraph-dev] [DISCUSS] 1.0.0 / 0.6.3 Releases
That sounds really awesome! Just one thought:
Shall we do a RC release (e.g. 1.0.0 RC1) before the official 1.0.0 release? There are quite some breaking and major changes in 1.0.0, and I feel we should be extra cautious about that.
Cheers,
Boxuan
Sent: Monday, November 28, 2022 10:42:31 AM
To: janusgraph-dev@... <janusgraph-dev@...>
Subject: [janusgraph-dev] [DISCUSS] 1.0.0 / 0.6.3 Releases
Hi,
I would like to start a discussion about our next release(s) as there were multiple requests from the community already.
Our last release (0.6.2) was also already half a year ago (May 31) and our latest major release (0.6.0) was over a year ago in September 2021.
We also already have quite some features ready for 1.0.0 (and I probably even forgot some other important ones):
- Support for Java 11: https://github.com/JanusGraph/janusgraph/issues/2161
- Support for Cassandra 4: https://github.com/JanusGraph/janusgraph/issues/2325
- Cache performance improvements: https://github.com/JanusGraph/janusgraph/issues/3185 and https://github.com/JanusGraph/janusgraph/issues/871
- Support for TinkerPop 3.6: https://github.com/JanusGraph/janusgraph/issues/3069
- Upgrade to Log4j2: https://github.com/JanusGraph/janusgraph/pull/2890
- Use mixed indices for numeric aggregations (min(), max(), mean(), sum()): https://github.com/JanusGraph/janusgraph/issues/3202
- Lots of dependency updates which fix security issues
I think these are really substantial improvements that we should release to our users and therefore at least not wait much longer for a major release. Users have also repeatedly asked for a release that contains some of these improvements, especially the security related dependency updates.
So, are there any important issues / PRs that you definitely want to see included in the next release? Otherwise, I suggest that we set a fixed deadline of when we start the release process and everything that is not merged until then will be moved to a follow-up release.
Are there any issues that should definitely be included in a 1.0.0 release but will not be finished in the next few weeks? I think in that case we could also decide to release 0.7.0 first from master so we don’t block the release just because we have decided that the next major release should be 1.0.0.
But I think it should also not be a problem to postpone any open issues to a later major release. As Oleksandr mentioned in the thread where we decided that 1.0.0 should be next major release, 1.0.0 only indicates that JanusGraph is ready for production usage [1]. It doesn’t mean that there won’t be breaking changes afterwards (as there definitely will be).
Together with 1.0.0 (or 0.7.0 if we decide on that as the next major release), we can also release 0.6.3 from the 0.6 branch. If you see any issues that should be included in that release, then that would of course also be good to know.
Does anyone want to volunteer being the release manager for these two releases? Otherwise, I can also do it.
Regards,
Florian
[1]: https://lists.lfaidata.foundation/g/janusgraph-dev/message/1517
Hi Dustin,
sure, that will be included. I should have mentioned that probably:
0.6.3 will include all issues / PRs tagged with the 0.6.3 milestone:
https://github.com/JanusGraph/janusgraph/milestone/24
1.0.0 will include everything from the 0.6.3 milestone, and additionally everything from the 1.0.0 milestone:
https://github.com/JanusGraph/janusgraph/milestone/21
The PR you linked is included in the 1.0.0 milestone.
From: janusgraph-dev@... <janusgraph-dev@...> On Behalf Of dustinwaguespack@...
Sent: Monday, November 28, 2022 16:58
To: janusgraph-dev@...
Subject: Re: [janusgraph-dev] [DISCUSS] 1.0.0 / 0.6.3 Releases
Hi Florian,
Would this release contain the commits in this PR - https://github.com/JanusGraph/janusgraph/pull/2904?
v/r
Dustin
I would like to start a topic about JanusGraph db-cache we have today and ways to improve it for distributed environments.
I want to split this topic on several issues I see with this cache:
1) Invalidation
2) Performance
3) Sharding
Invalidation
The first main issue with this cache is that it doesn't have good invalidation mechanisms. As of now, there are 3 invalidation mechanisms:
- Enough time passed (`cache.db-cache-time`).
- Evicted due to cache size limitation (`cache.db-cache-size`).
- Evicted on the current JanusGraph instance only, due to being mutated on the current JanusGraph instance.
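To make the three mechanisms concrete, here is a minimal sketch of a local cache that expires entries after a fixed time, evicts the least recently used entry when full, and drops an entry when it is mutated locally. It is not the actual db-cache code: the class `LocalExpiringCache` and its methods are hypothetical, the size limit is an entry count rather than a memory budget, and the caller passes the current time explicitly so the behaviour is easy to follow.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration only; this is NOT JanusGraph's db-cache implementation. */
class LocalExpiringCache<K, V> {

    /** A cached value together with the time it was written. */
    private static final class Timed<T> {
        final T value;
        final long writtenAtMillis;

        Timed(T value, long writtenAtMillis) {
            this.value = value;
            this.writtenAtMillis = writtenAtMillis;
        }
    }

    private final long ttlMillis;
    private final int maxEntries;
    private final LinkedHashMap<K, Timed<V>> entries;

    LocalExpiringCache(long ttlMillis, int maxEntries) {
        this.ttlMillis = ttlMillis;
        this.maxEntries = maxEntries;
        // Access-ordered map: the eldest entry is the least recently used one.
        this.entries = new LinkedHashMap<K, Timed<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, Timed<V>> eldest) {
                // Mechanism 2: evict because of the size limit.
                return size() > LocalExpiringCache.this.maxEntries;
            }
        };
    }

    synchronized V get(K key, long nowMillis) {
        Timed<V> entry = entries.get(key);
        if (entry == null) {
            return null;
        }
        // Mechanism 1: evict because enough time has passed.
        if (nowMillis - entry.writtenAtMillis >= ttlMillis) {
            entries.remove(key);
            return null;
        }
        return entry.value;
    }

    synchronized void put(K key, V value, long nowMillis) {
        entries.put(key, new Timed<>(value, nowMillis));
    }

    /** Mechanism 3: called when the key is mutated on THIS instance only. */
    synchronized void invalidate(K key) {
        entries.remove(key);
    }

    synchronized int size() {
        return entries.size();
    }
}
```

Note that nothing in this sketch tells other instances about a mutation, which is exactly the gap discussed in this section.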
Moreover, I think it would be great if some kind of pattern were implemented in JanusGraph to invalidate data on mutation on all JanusGraph nodes (the issue to track this feature is here: https://github.com/JanusGraph/janusgraph/issues/3155). I don't know how best to implement the latter feature, but I have the following ideas:
- We can reuse JanusGraph messaging mechanism (i.e. the one which works on top of a storage backend). With this feature users don't need any external messaging tools and can enable global db-cache invalidation on mutation easily. The downside is most likely performance, because usually those storage backends are not the best tools for messaging.
- Using external tool for messaging (let's say Kafka, Redis, etc.). The advantage would be performance most likely but the disadvantage is that the user now needs to manage a separate external system.
- Providing an interface for mutated data invalidation. We can make a general interface which accepts a set of keys which need to be evicted in db-cache, and then the implementation can either be developed by the users themselves or they can use existing JanusGraph solutions (let's say we will have 3 options at the beginning: `storage-messaging`, `redis-messaging`, `kafka-messaging`, and we will be able to add more systems if there is interest in them).
That's just brainstorming, so if anyone has any thoughts about it, please share. Maybe this issue should be solved differently and maybe I should look at it from a different angle.
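To illustrate the third idea, a rough sketch of what such an interface could look like follows. All names here (`CacheInvalidationChannel`, `InvalidationListener`, `InMemoryChannel`, `CachingInstance`) are hypothetical and nothing like this exists in JanusGraph today. Keys are modeled as plain strings for brevity, and the in-memory channel only stands in for a storage, Redis, or Kafka backed implementation.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical SPI: announces keys mutated on one instance to all instances. */
interface CacheInvalidationChannel {
    void publish(String sourceInstanceId, Set<String> mutatedKeys);

    void subscribe(InvalidationListener listener);
}

/** Hypothetical callback invoked for every published invalidation. */
interface InvalidationListener {
    void onInvalidate(String sourceInstanceId, Set<String> mutatedKeys);
}

/** In-memory stand-in for a storage, Redis, or Kafka backed channel. */
class InMemoryChannel implements CacheInvalidationChannel {
    private final List<InvalidationListener> listeners = new CopyOnWriteArrayList<>();

    @Override
    public void publish(String sourceInstanceId, Set<String> mutatedKeys) {
        for (InvalidationListener listener : listeners) {
            listener.onInvalidate(sourceInstanceId, mutatedKeys);
        }
    }

    @Override
    public void subscribe(InvalidationListener listener) {
        listeners.add(listener);
    }
}

/** Models one JanusGraph instance with its own local cache. */
class CachingInstance implements InvalidationListener {
    private final String instanceId;
    private final CacheInvalidationChannel channel;
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    private CachingInstance(String instanceId, CacheInvalidationChannel channel) {
        this.instanceId = instanceId;
        this.channel = channel;
    }

    /** Creates an instance and registers it on the channel. */
    static CachingInstance join(String instanceId, CacheInvalidationChannel channel) {
        CachingInstance instance = new CachingInstance(instanceId, channel);
        channel.subscribe(instance);
        return instance;
    }

    void cachePut(String key, String value) {
        cache.put(key, value);
    }

    String cacheGet(String key) {
        return cache.get(key);
    }

    /** Evicts locally (today's behaviour) and then tells everyone else. */
    void mutate(String key) {
        cache.remove(key);
        channel.publish(instanceId, Collections.singleton(key));
    }

    @Override
    public void onInvalidate(String sourceInstanceId, Set<String> mutatedKeys) {
        if (instanceId.equals(sourceInstanceId)) {
            return; // already evicted locally in mutate()
        }
        for (String key : mutatedKeys) {
            cache.remove(key);
        }
    }
}
```

With this shape, swapping the messaging system would only mean providing another `CacheInvalidationChannel` implementation, while the cache itself stays unchanged.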
Performance
For some reason we didn't look at this side too much but as noted here and here the Guava cache we use isn't the best option. I didn't investigate what the best option is for the caches we have (and we have 8 caches as commented here).
An obvious solution is to move from Guava to Caffeine cache. That said, if anyone thinks we need to try another cache or has any thoughts about it, please post them here.
Sharding
As of now, db-cache caches all the data per JanusGraph instance. No cache data is shared between multiple JanusGraph instances. In some use cases this is an advantage, but in some situations it's a disadvantage.
I think that it would make sense to add several options for different db-cache implementations. Some implementations would be local only and some distributed across all participating JanusGraph nodes.
There are several implementations I could think of:
1) Default local only db-cache. This cache would use Caffeine cache implementation and there could be some invalidation strategies available to trigger invalidation on all JanusGraph nodes using some messaging tools (as described in `Invalidation` section).
2) Redis cache - this cache would use external Redis nodes to cache all the data. The advantage is that Redis may be shared and scaled separately. All the data in Redis will be distributed (depending on installation) which may improve cache usage in some cases. Maybe also using client-side caching could improve performance in some cases.
3) Hazelcast cache - this cache would require additional configuration and making sure that all JanusGraph nodes can reach each other. A Hazelcast cache lives inside the JVM but is able to form a cluster across nodes and shard / replicate all the data between the nodes in that cluster. With this cache users won't need to manage external cache nodes, but they will need to make sure that their JanusGraph installation has direct network access to all nodes in the cluster.
Any thoughts about any of the above points (or maybe missing points / issues) are very welcome.
Best regards,
Oleksandr
Here are the results from the August 2022 license scan of the JanusGraph project. The scan was performed using the Linux Foundation Fossology server. Licenses and copyrights were examined.
The key findings (if any) and license summary can be found in the HTML report and the list of files in the spreadsheet; the SPDX file is also listed below:
REPORTS:
lfai/JanusGraph, code pulled 2022-08-08
- report: https://lfscanning.org/reports/lfai/JanusGraph-2022-08-08-128b378d-8656-4907-ad70-b876a161c82d.html
- xlsx: https://lfscanning.org/reports/lfai/JanusGraph-2022-08-08-128b378d-8656-4907-ad70-b876a161c82d.xlsx
- spdx: https://github.com/lfscanning/spdx-lfai/tree/master/JanusGraph/2022-07/JanusGraph-2022-08-08.spdx
Please feel free to contact me with any questions about the scan results. Be sure to reply to me directly as I may not get an email sent directly to the distribution list. Since this is the first license scan for this project, if you have any general questions about the scanning process you can let me know that too.
Thanks, Jeff
Jeff Shapiro
408-910-7792
jshapiro@...
1. Sure, see: https://github.com/JanusGraph/janusgraph/blob/master/LICENSE.txt
2. JanusGraph backends are not columnar in the usual meaning of the word, but rather assume a BigTable like KeyColumnValue store, see https://github.com/JanusGraph/janusgraph/blob/master/janusgraph-core/src/main/java/org/janusgraph/diskstorage/keycolumnvalue/KeyColumnValueStore.java
This provides better horizontal scalability than technologies like Solr/Elasticsearch. A KeyColumnValue store client can see from the partitioned key which backend node to address, while Solr/Elasticsearch involves a server-side routing step to direct a query to the right shard when requesting a specific document by id. This might be different for your in-house store, though, so check whether it can implement the interface above efficiently. You may also want to check https://li-boxuan.medium.com/janusgraph-deep-dive-data-layout-in-janusgraph-cd33c045a495 to see whether your store supports this data layout.
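As a rough illustration of what "the client can see from the partitioned key which backend node to address" means, here is a small sketch of client-side routing over a token ring. It is not code from JanusGraph or from any particular backend: the class `TokenRing`, the node names, and the use of CRC32 as the partitioner are all assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

/** Hypothetical client-side router: maps a row key to the node that owns it. */
class TokenRing {
    // Token -> node name, sorted so the owner of a key can be found by lookup.
    private final TreeMap<Long, String> ring = new TreeMap<>();

    void addNode(String nodeName) {
        ring.put(token(nodeName.getBytes(StandardCharsets.UTF_8)), nodeName);
    }

    /** The owner is the first node whose token is >= the key's token (wrapping around). */
    String ownerOf(byte[] rowKey) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no nodes registered");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(token(rowKey));
        if (entry == null) {
            entry = ring.firstEntry(); // wrap around the ring
        }
        return entry.getValue();
    }

    /** Stand-in partitioner; real stores use their own hash function. */
    static long token(byte[] bytes) {
        CRC32 crc = new CRC32();
        crc.update(bytes);
        return crc.getValue();
    }
}
```

The point is only that the lookup happens entirely in the client from the key bytes, so a request can go straight to the owning node without an intermediate routing hop.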
My company has an in-house DB similar to Elasticsearch, but it only supports index-based search, not columnar storage.
I see that JanusGraph mostly uses columnar DBs as the primary data storage.
1. Can we plug in our own DB? Is that allowed?
2. Why only columnar as the primary DB? Are there any reasons?
Index-based Lucene can also support things like getting all the indexed documents.
Is there any reason for choosing columnar as the primary backend?
thanks,
Kalpa
JanusGraph is an Apache TinkerPop enabled property graph database with support for a variety of storage and indexing backends. Thank you to all of the contributors.
https://github.com/JanusGraph/janusgraph/releases/tag/v0.6.2
A full binary distribution is provided for user convenience:
https://github.com/JanusGraph/janusgraph/releases/download/v0.6.2/janusgraph-full-0.6.2.zip
https://github.com/JanusGraph/janusgraph/releases/download/v0.6.2/janusgraph-0.6.2.zip
The online docs can be found here:
https://docs.janusgraph.org
on behalf of JanusGraph TSC
Thanks for your feedback, Oleksandr.
We now have 3 people here on the dev list in favour of starting a Discord Server and no one against it. Only one user responded on janusgraph-users in the thread you linked, and that user was also in favour of Discord. I also asked people on Gitter for their opinion [1] but didn’t really get a response there. Since also nobody on Gitter voiced concerns about moving to Discord, I don’t really see a reason against starting a Discord Server.
So, I went ahead and created a server. You can join via this link: https://discord.gg/5MnxF82VGw
I’d say we wait a week or so before promoting it also to users, so we have some time to use and test the server ourselves.
Regarding links between the TinkerPop server and our own: We can probably create some kind of “welcome” channel where we briefly explain the channels of our own server and then also link to the TinkerPop server for general Gremlin questions. TinkerPop already has such a welcome channel, and we can later ask to be linked there for JanusGraph specific questions.
[1]: https://gitter.im/janusgraph/janusgraph?at=6284afb0bd487e746b5c790f
From: janusgraph-dev@... <janusgraph-dev@...> On Behalf Of Oleksandr Porunov
Sent: Wednesday, June 1, 2022 12:53
To: janusgraph-dev@...
Subject: Re: [janusgraph-dev] [DISCUSS] Discord Server
Related thread in Users group: https://lists.lfaidata.foundation/g/janusgraph-users/topic/discuss_moving_from_gitter/91181812
I must merge the v0.6 branch into master to update the changelog.md.
Greetings,
Jan
Sent: Wednesday, June 1, 2022 12:18 PM
To: janusgraph-dev@... <janusgraph-dev@...>
Subject: Re: [janusgraph-dev] [VOTE] JanusGraph 0.6.2 release
I downloaded the JanusGraph Full artifacts, started JanusGraph Server with Cassandra and Elasticsearch, and ran the queries from the `Basic Usage` documentation - looks good.
I quickly checked the content of Sonatype staging repository - looks good.
Release tag commit - looks good.
Changelog record - looks good.
My vote is +1.
Best regards,
Oleksandr Porunov
I think it would make sense to start a Discord server for JanusGraph and see how much popularity it gets. I guess we shouldn't abandon Gitter just now because we have 864 people there, but we should prioritize using Discord and wait until it gets the same number of users as Gitter has right now.
As for another aspect - I'm also in favor of using a dedicated server because we will be able to create multiple channels, which should be very useful to structure future discussions.
I didn't research much, but if there is a good way of redirecting users with Gremlin questions to the TinkerPop server and JanusGraph questions to the JanusGraph server, then it would be great. I would imagine some dual-linking between the JanusGraph server and the TinkerPop server, but I'm not sure if that's possible.