[DISCUSS] Dropping HBase 1 support
Jansen, Jan
Hi, I looked into the HBase 1 support after Porunov asked why I want to drop it if the builds are passing: https://github.com/JanusGraph/janusgraph/pull/2213#issuecomment-861620348. It seems that we already stopped testing HBase 1 in our CI back in branch 0.3. The main issue was a wrong combination of Maven flags, which kept the tests from running. I tried to fix the flags and realized that this doesn't work with the Testcontainers solution, so I reverted internally to the commit before Testcontainers, see here: https://github.com/GDATASoftwareAG/janusgraph/tree/test-hbase1. I had to fix some build issues and was then able to execute the tests, which are failing: https://github.com/GDATASoftwareAG/janusgraph/actions/runs/1024212663. (I didn't fix the HBase 2 build.) My idea would be to drop HBase 1 support. (Currently, HBase 1 support already requires a custom build of JG.) Any thoughts? Greetings, Jan
|
|
Re: [DISCUSS] JanusGraph versioning
Boxuan Li
1.0.0 sounds good to me. Maybe we can target 1.0.0 after this 0.6.0 release. I think we'd better not rename the incoming release from 0.6.0 to 1.0.0, because it contains many new changes and may take some time (+ bug fixes, if any) to get stable. I would rather see a stable 1.0.0 release with few new features than an unstable one with many new features.
Best regards, Boxuan Li
|
|
[DISCUSS] JanusGraph versioning
Hi,
I would like to start a discussion about JanusGraph versioning. Right now we use the following versioning scheme: <GA indicator?>.<features and breaking changes>.<patch>
So, for the GA indicator we always have `0`, and as far as I know we were waiting for JanusGraph to be stable before incrementing it to version 1.0.0. That said, I'm not sure how exactly it should be judged stable. JanusGraph has been used in production by many companies for quite some time. We could think about stability as:
- JanusGraph has been used in production for some time.
- There have been no breaking changes for some time.
I think we meet the first criterion, i.e. JanusGraph is used in production, but I'm not sure about the second. We do have breaking changes often, but mostly they are related to dropping support for EOL versions or to driver upgrades. I guess we will have such breaking changes continuously, because support for old drivers will be dropped and new drivers will be supported. Thus, such breaking changes are a natural thing. The bigger breaking changes are those which influence the storage layer (i.e. data). The last time we had such breaking changes was in the 0.3.0 release (that said, they were very small and easy to upgrade through). So, if we count only changes where you can't easily upgrade JanusGraph because you have data in an old format, we meet the second criterion as well.
In case we consider JanusGraph to be stable enough, should we upgrade it to a 1.x.y version? If we upgrade it, should we start following something like semantic versioning for all future versions (https://semver.org/), or should we think about a different versioning scheme / keep the current one? When should we upgrade JanusGraph to 1.x.y? Should the first `1.x.y` version keep the current `x.y` or reset it to `0.0`? I.e. should the first version be `1.7.0` or `1.0.0`?
My thoughts on the above questions are:
- We can consider JanusGraph to be stable enough.
- After the upgrade it would make sense to start using the same version numbering as in semantic versioning (i.e. MAJOR.MINOR.PATCH).
- I guess we could do it in the next release after 0.6.0, but we could potentially rename 0.6.0 to 1.0.0 or 1.6.0 as well. I don't have a firm opinion on that yet.
- Both 1.0.0 and 1.7.0 / 1.6.0 are fine with me. I don't have a firm opinion on that yet either.
To be clear, I'm not insisting on bumping JanusGraph to a 1.x.y version immediately. What I wanted is to start a discussion about it and hear other thoughts. Best regards, Oleksandr Porunov
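For readers unfamiliar with the MAJOR.MINOR.PATCH scheme referenced above, here is a minimal Python sketch of how such versions compare and when an upgrade is allowed to break compatibility under the rules at https://semver.org/. The `Version` class and the `may_break` helper are hypothetical names made up for this illustration; they are not part of JanusGraph, and pre-release or build suffixes are deliberately ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Version:
    """A plain MAJOR.MINOR.PATCH version (no pre-release/build suffixes)."""
    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text: str) -> "Version":
        # Expects exactly three dot-separated integers, e.g. "0.6.0".
        major, minor, patch = (int(part) for part in text.split("."))
        return cls(major, minor, patch)


def may_break(old: Version, new: Version) -> bool:
    """Return True if upgrading old -> new may contain breaking changes.

    Assumes new >= old. Under semantic versioning, 0.y.z is initial
    development where anything may change; from 1.0.0 on, only a MAJOR
    bump may break compatibility.
    """
    if old.major == 0:
        return new != old
    return new.major > old.major


if __name__ == "__main__":
    print(may_break(Version.parse("0.5.3"), Version.parse("0.6.0")))
    print(may_break(Version.parse("1.0.0"), Version.parse("1.7.0")))
```

Under this reading, dropping an EOL driver in a release after 1.0.0 would require a MAJOR bump, which is one of the trade-offs raised in the questions above.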
|
|
JanusGraph Meetup #4 Recording
Ted Wilmes
Hello, Thanks to all who attended the meetup yesterday. If you weren't able to make it, you can find the recording at: https://www.experoinc.com/online-seminar/janusgraph-community-meetup. Thanks to our presenters: Marc, Saurabh, and Bruno, we had a really good set of material presented. Thanks, Ted
|
|
Re: [Meetup] JanusGraph Meetup May 18 covering JG OLAP approaches
Ted Wilmes
Hi Boxuan, Yes, definitely. I'll post this under presentations on janusgraph.org. Also, I hadn't posted meetup 3 on there yet and finally tracked the link down, so that will also be up there shortly. Thanks, Ted
On Sun, May 16, 2021 at 10:22 AM Boxuan Li <liboxuan@...> wrote:
|
|
Re: [Meetup] JanusGraph Meetup May 18 covering JG OLAP approaches
Boxuan Li
Hi Ted,
Thanks for organizing this! Do you have plans to record & release the video after the meetup? 10:30 ET is a bit late for some regions in APAC, so it would be great if there were a video recording. Cheers, Boxuan
|
|
[Meetup] JanusGraph Meetup May 18 covering JG OLAP approaches
Ted Wilmes
Hello, We will be hosting a community meetup next week on Tuesday, May 18th at 9:30 central/10:30 eastern. We have a great set of speakers who will be discussing all things JanusGraph OLAP:
* Hadoop Marc, who has helped many of us on the mailing list and in JG issues
* Saurabh Verma, principal engineer at Zeotap
* Bruno Berriso, engineer at Expero
If you're interested in signing up, here's the link: https://www.experoinc.com/get/janusgraph-user-group. Thanks, Ted
|
|
Re: Potential complexities of making secondary persistence atomic
Hi Florian!
That definitely sounds like a promising idea. You are most likely not the only one requiring consistency between storage and index data. In general, my advice is to start with a test case which explicitly addresses the situation where the primary persistence succeeds and the secondary persistence fails. From there, see what it needs to let the transaction fail. Once you have a solution for that, you can start the discussion by opening a PR. We will then see which issues could arise and try to solve them. The first prototype does not have to be perfect, a proof of concept is fine! Best Regards, Florian
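The kind of test case suggested above could look roughly like the following Python sketch. Everything in it is hypothetical: `InMemoryStore`, `SecondaryPersistenceError`, and `commit` are stand-ins invented for illustration and do not correspond to JanusGraph's storage or index backend APIs. It only shows the shape of the scenario: the primary write succeeds, the secondary write fails, and the test asserts that the whole commit fails and leaves no trace in the primary store.

```python
class SecondaryPersistenceError(Exception):
    """Raised by the (simulated) secondary store when a write fails."""


class InMemoryStore:
    """Hypothetical stand-in for a storage or index backend."""

    def __init__(self, fail_on_write=False):
        self.data = {}
        self.fail_on_write = fail_on_write

    def write(self, key, value):
        if self.fail_on_write:
            raise SecondaryPersistenceError(f"write of {key!r} failed")
        self.data[key] = value


_MISSING = object()  # sentinel: key was absent before the commit


def commit(primary, secondary, key, value):
    """Write to primary, then secondary; restore primary if secondary fails."""
    previous = primary.data.get(key, _MISSING)
    primary.write(key, value)
    try:
        secondary.write(key, value)
    except SecondaryPersistenceError:
        # Compensate: put the primary store back into its prior state.
        if previous is _MISSING:
            primary.data.pop(key, None)
        else:
            primary.data[key] = previous
        raise


def test_commit_fails_when_secondary_fails():
    primary = InMemoryStore()
    index = InMemoryStore(fail_on_write=True)
    try:
        commit(primary, index, "v1", {"name": "alice"})
    except SecondaryPersistenceError:
        pass
    else:
        raise AssertionError("commit should have failed")
    assert primary.data == {}  # no partial write remains


if __name__ == "__main__":
    test_commit_fails_when_secondary_fails()
```

A real backend usually cannot undo a primary write this cheaply, so the compensation step here is an assumption of the sketch rather than a proposed design.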
|
|
Potential complexities of making secondary persistence atomic
florian.caesar <florian.caesar@...>
Hi! What are the potential complexities of pursuing this? How does this mess with JanusGraph's logic & assumptions in other places? If it seems fine, it should be an easy change. If nobody else wants to take it, I would also be happy to start a PR for this feature myself at some point.
|
|
Re: [DISCUSS] JanusGraph 0.6.0 release
As TinkerPop is very close to a release, I'm proposing to delay the JanusGraph 0.6.0 release so that we can ship JanusGraph 0.6.0 with either TinkerPop 3.4.11 or 3.5.0. I assume that, if everything goes well, we should be able to release JanusGraph 0.6.0 in the middle of May.
|
|
Re: [DISCUSS] JanusGraph 0.6.0 release
Jansen, Jan
It sounds like TinkerPop will start its release in one week. https://lists.apache.org/thread.html/r54ceb07f60b246342f5f2d9a09d50034c502012025eb88534fd9fa3f%40%3Cdev.tinkerpop.apache.org%3E
|
|
Re: [Performance Optimization] Optimization around the `system_properties` table interaction
Boxuan Li
Yeah that makes sense. I saw you said “unreplicated” thus wondered. I am not familiar with how `system_properties` is handled, but just want to point out that it is very difficult if not impossible to change the data model while keeping backward compatibility at the same time.
|
|
Re: [Performance Optimization] Optimization around the `system_properties` table interaction
Hi Boxuan Li
|
|
Re: [Performance Optimization] Optimization around the `system_properties` table interaction
Boxuan Li
Hi @sauverma,
I am just curious: I noticed you said "there is only 1 partition for system_properties unreplicated". Do you have storage.cql.replication-factor = 1?
|
|
Re: [Performance Optimization] Optimization around the `system_properties` table interaction
Hi all
Updates on this issue:
- We found that the periodic removal of system_properties (while the ingestion is running) leads to graph corruption (mentioned at a high level at https://docs.janusgraph.org/advanced-topics/recovery/).
- The perf issues we saw were due to the reasons below:
  - improper handling of Dataproc scale-down, which led to connections to JG not getting closed, and thus to an ever-growing system_properties table
  - unbounded (i.e. unthrottled) access to the Scylla caching layer, leading to other queries slowing down because of the single, hot system_properties partition
- In addition to this, the data model for system_properties still needs to be fixed via usage of clustering keys: by design system_properties has only 1 SINGLE partition, and all Spark executors hit it during initialization, leading to query slowdown -> query queuing -> query timeouts.
Thanks
|
|
Re: [DISCUSS] JanusGraph 0.6.0 release
Jansen, Jan
> What do you think?
Sounds great.
> Would such a deadline be OK?
I think it is a good idea to have a deadline.
From: janusgraph-dev@... <janusgraph-dev@...> on behalf of Oleksandr Porunov <alexandr.porunov@...>
Sent: Tuesday, March 30, 2021 18:49:50
To: janusgraph-dev@...
Subject: Re: [janusgraph-dev] [DISCUSS] JanusGraph 0.6.0 release
I am good with waiting for TinkerPop 3.4.11 or 3.5.0 if we expect those releases soon, but I don't want to delay 0.6.0 release too much because the community sometimes asks for the release.
Also, if we release 0.6.0 without TinkerPop 3.5.0, I'm good with shipping a 0.7.0 version even if that release differs by just 1 commit (the TinkerPop upgrade). As I said, I'm good with waiting for TinkerPop 3.4.11 or TinkerPop 3.5.0, but I would prefer to set a date by which we should start the release process (unless there are critical bugs). I would propose to set a deadline of the 1st of May. If there are no new TinkerPop releases by that day (due to some delays), then we start the release process on the 1st of May as is. If there are new TinkerPop releases in April, then we update JanusGraph to the latest TinkerPop release and start the release process after that. What do you think? Would such a deadline be OK? Best regards, Oleksandr
|
|
Re: [DISCUSS] JanusGraph 0.6.0 release
I am good with waiting for TinkerPop 3.4.11 or 3.5.0 if we expect those releases soon, but I don't want to delay 0.6.0 release too much because the community sometimes asks for the release.
Also, if we release 0.6.0 without TinkerPop 3.5.0, I'm good with shipping a 0.7.0 version even if that release differs by just 1 commit (the TinkerPop upgrade). As I said, I'm good with waiting for TinkerPop 3.4.11 or TinkerPop 3.5.0, but I would prefer to set a date by which we should start the release process (unless there are critical bugs). I would propose to set a deadline of the 1st of May. If there are no new TinkerPop releases by that day (due to some delays), then we start the release process on the 1st of May as is. If there are new TinkerPop releases in April, then we update JanusGraph to the latest TinkerPop release and start the release process after that. What do you think? Would such a deadline be OK? Best regards, Oleksandr
|
|
Re: [DISCUSS] JanusGraph 0.6.0 release
If we wait for TinkerPop 3.5.0 (or 3.4.11), we can probably also release the configurable batch sizes in 0.6.0, which should bring a noticeable performance gain if used properly.
Kind regards, Florian
|
|
Re: [DISCUSS] JanusGraph 0.6.0 release
Jansen, Jan
Hi, it would be great to release JanusGraph 0.6.0. I just want to mention that TinkerPop 3.5.0 is planned to be released in April. We don't release new versions frequently, so should we wait with the release or not? Best regards, Jan
|
|
Re: [DISCUSS] JanusGraph 0.6.0 release
Boxuan Li
Thanks for organizing this, Oleksandr! I would like to have "Fix potential ThreadLocal transaction leak" released in 0.6.0. If there is no further comment, I plan to merge it this weekend.
Graph Reindexing Issue Fix: this PR is actually WIP. If I have some time at the weekend, I'll try to see if I can help fix it. Best regards, Boxuan Li
|
|