
Re: [DISCUSS] TinkerPop version for next JanusGraph release

Jerry He <jerr...@...>
 

Are there any other incompatible changes from TinkerPop 3.2.x to 3.3?

Thanks,

Jerry

On Mon, Aug 28, 2017 at 10:40 AM, Robert Dale <rob...@...> wrote:
+1 for TinkerPop 3.3

Robert Dale

On Mon, Aug 28, 2017 at 12:19 PM, Jason Plurad <plu...@...> wrote:

TinkerPop 3.3.0 is released, TinkerPop 3.2.6 also. JanusGraph master is
currently at TinkerPop 3.2.6.

sjudeng has a pull request open for TP 3.3 support, and it is passing
Travis CI.

TinkerPop 3.3 is the latest, greatest release. The most notable part of
it, as far as dependencies are concerned, is that it brings Spark 2.2 (Scala
2.11) support. This is a big jump forward from Spark 1.6.1 (Scala 2.10)
released in March 2016.

Any reason to hold up from moving forward with TP 3.3? Any users in
production out there relying on Spark 1.6.1? The 0.1 branch is still open
for fixes, so that could be the answer for people that want to stick with
the older versions.

I'm +1 for moving to TinkerPop 3.3.

--
You received this message because you are subscribed to the Google Groups
"JanusGraph developers" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to janusgr...@....
For more options, visit https://groups.google.com/d/optout.



Re: [DISCUSS] TinkerPop version for next JanusGraph release

Robert Dale <rob...@...>
 

+1 for TinkerPop 3.3

Robert Dale

On Mon, Aug 28, 2017 at 12:19 PM, Jason Plurad <plu...@...> wrote:
TinkerPop 3.3.0 is released, TinkerPop 3.2.6 also. JanusGraph master is currently at TinkerPop 3.2.6.

sjudeng has a pull request open for TP 3.3 support, and it is passing Travis CI.

TinkerPop 3.3 is the latest, greatest release. The most notable part of it, as far as dependencies are concerned, is that it brings Spark 2.2 (Scala 2.11) support. This is a big jump forward from Spark 1.6.1 (Scala 2.10) released in March 2016.

Any reason to hold up from moving forward with TP 3.3? Any users in production out there relying on Spark 1.6.1? The 0.1 branch is still open for fixes, so that could be the answer for people that want to stick with the older versions.

I'm +1 for moving to TinkerPop 3.3.



[DISCUSS] TinkerPop version for next JanusGraph release

Jason Plurad <plu...@...>
 

TinkerPop 3.3.0 is released, TinkerPop 3.2.6 also. JanusGraph master is currently at TinkerPop 3.2.6.

sjudeng has a pull request open for TP 3.3 support, and it is passing Travis CI.

TinkerPop 3.3 is the latest, greatest release. The most notable part of it, as far as dependencies are concerned, is that it brings Spark 2.2 (Scala 2.11) support. This is a big jump forward from Spark 1.6.1 (Scala 2.10) released in March 2016.

Any reason to hold up from moving forward with TP 3.3? Any users in production out there relying on Spark 1.6.1? The 0.1 branch is still open for fixes, so that could be the answer for people that want to stick with the older versions.

I'm +1 for moving to TinkerPop 3.3.


Gremlin - Sugar plugin

Gene Fojtik <genef...@...>
 

Hello, 

I'm trying to profile my graph using the Sugar plugin; however, both benchmark and profile queries return ==> null.

Any tricks or tips?

The queries I'm running return results when run normally.

-g


Re: [DISCUSS] merge/commit flow for committers

sjudeng <sju...@...>
 

It looks like we're getting pretty close to the release of TinkerPop 3.3.0. Do we have a path forward here? I think Jerry and I have made some good points for and against both of the proposed approaches. Irving and Ted, there has been a lot more discussion since you chimed in; have you changed your opinions? Do any other committers or contributors have a strong preference between the proposed merge/rebase approaches?


On Wednesday, July 12, 2017 at 2:09:12 PM UTC-5, Jason Plurad wrote:
I came across PR-393 the other day as I was looking through the latest commits.

> A number of commits in 0.1 are missing in master. @srosenthal Thanks for catching.

Many thanks to everybody that helped identify and fix that situation!

However I was left wondering a bunch of things:
1. Where would I find the conversation that triggered the PR?
2. How did we miss merging commits into master?
3. How are we tracking which issues get merged into which branches?
4. Do we have the merge/commit process sufficiently documented?

We have this document on pull requests http://docs.janusgraph.org/latest/pull-requests.html but it doesn't seem to cover anything about how to handle merges for multiple branches. For example, TinkerPop is a bit more clear on how it handles this http://tinkerpop.apache.org/docs/3.2.5/dev/developer/#branches

There's some good discussion in PR-393. If we reached a consensus on what the right strategy is, feel free to chime in here amcp, jerryjch, sjudeng, srosenthal, and others so we can get it documented and publicized more widely.

-- Jason


Re: Is Janus graph database mandatorily need a back end database

Jason Plurad <plu...@...>
 

1. Yes, you need to pick a storage backend. There is an inmemory storage backend that is useful for small testing scenarios, if you don't care about persistence.

2. Yes, you can use JanusGraph with Python. This is done through Gremlin Server. You would need to add a configuration for gremlin-python. The Apache TinkerPop documentation covers a lot of this. Keep in mind that you would still be using Gremlin, which is a domain-specific language for traversing graphs, but it would be native in Python (rather than passing a Gremlin query string across the wire to the server).
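
For point 1, a minimal sketch of what an in-memory configuration might look like. The temporary file location is just a placeholder for this example; the two property keys are the ones JanusGraph uses to select the graph factory and the storage backend. Nothing here is persisted once the graph is closed.

```shell
#!/bin/sh
# Write a minimal JanusGraph properties file that selects the
# in-memory storage backend (no persistence, testing only).
set -eu

conf=$(mktemp)   # placeholder location, pick any path you like

cat > "$conf" <<'EOF'
gremlin.graph=org.janusgraph.core.JanusGraphFactory
storage.backend=inmemory
EOF

echo "wrote in-memory JanusGraph config to $conf"
```

From the Gremlin Console, a file like this would typically be opened with JanusGraphFactory.open('/path/to/file.properties').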


On Tuesday, August 22, 2017 at 7:26:56 AM UTC-4, sankeeta kamath wrote:
Hi ,

I am a newbie looking to explore JanusGraph. My primary question is: does JanusGraph require a backend database? My second question is: is there any way I can use JanusGraph with Python (rather than Gremlin)?


Thank You


Re: New committers: Robert Dale, Paul Kendall, Samant Maharaj

Jerry He <jerr...@...>
 

Congratulations and welcome!

On Tue, Aug 22, 2017 at 3:42 PM, sjudeng <sju...@...> wrote:
Robert, Paul and Samant - Thanks for the great work you've put into
JanusGraph and welcome aboard!

On Tuesday, August 22, 2017 at 6:32:26 AM UTC-5, Jason Plurad wrote:

On behalf of the JanusGraph Technical Steering Committee (TSC), I'm
pleased to welcome 3 new committers on the project! Here they are in
alphabetical order by last name.

Robert Dale: Robert has been a solid contributor, and his contributions
are across the board -- triaging issues, submitting/reviewing pull requests,
and answering questions on the Google groups. He's also on the Apache
TinkerPop PMC.

Paul Kendall and Samant Maharaj: Paul and Samant contributed the CQL
storage adapter. This is a pretty big achievement and helps steer JanusGraph
towards future compatibility with Cassandra 4.0. They are continuing work on
cleaning up the Cassandra source code tree that will help make testing it
easier and better.

Congratulations to all!


Re: New committers: Robert Dale, Paul Kendall, Samant Maharaj

sjudeng <sju...@...>
 

Robert, Paul and Samant - Thanks for the great work you've put into JanusGraph and welcome aboard!


On Tuesday, August 22, 2017 at 6:32:26 AM UTC-5, Jason Plurad wrote:
On behalf of the JanusGraph Technical Steering Committee (TSC), I'm pleased to welcome 3 new committers on the project! Here they are in alphabetical order by last name.

Robert Dale: Robert has been a solid contributor, and his contributions are across the board -- triaging issues, submitting/reviewing pull requests, and answering questions on the Google groups. He's also on the Apache TinkerPop PMC.

Paul Kendall and Samant Maharaj: Paul and Samant contributed the CQL storage adapter. This is a pretty big achievement and helps steer JanusGraph towards future compatibility with Cassandra 4.0. They are continuing work on cleaning up the Cassandra source code tree that will help make testing it easier and better.

Congratulations to all!


Re: New committers: Robert Dale, Paul Kendall, Samant Maharaj

Misha Brukman <mbru...@...>
 

Robert, Paul and Samant — thank you for the great work and welcome!


On Tue, Aug 22, 2017 at 7:32 AM, Jason Plurad <plu...@...> wrote:
On behalf of the JanusGraph Technical Steering Committee (TSC), I'm pleased to welcome 3 new committers on the project! Here they are in alphabetical order by last name.

Robert Dale: Robert has been a solid contributor, and his contributions are across the board -- triaging issues, submitting/reviewing pull requests, and answering questions on the Google groups. He's also on the Apache TinkerPop PMC.

Paul Kendall and Samant Maharaj: Paul and Samant contributed the CQL storage adapter. This is a pretty big achievement and helps steer JanusGraph towards future compatibility with Cassandra 4.0. They are continuing work on cleaning up the Cassandra source code tree that will help make testing it easier and better.

Congratulations to all!



Re: New committers: Robert Dale, Paul Kendall, Samant Maharaj

Ted Wilmes <twi...@...>
 

Welcome aboard Robert, Paul, and Samant! Thanks for the excellent contributions.

--Ted


On Tuesday, August 22, 2017 at 6:32:27 AM UTC-5, Jason Plurad wrote:
On behalf of the JanusGraph Technical Steering Committee (TSC), I'm pleased to welcome 3 new committers on the project! Here they are in alphabetical order by last name.

Robert Dale: Robert has been a solid contributor, and his contributions are across the board -- triaging issues, submitting/reviewing pull requests, and answering questions on the Google groups. He's also on the Apache TinkerPop PMC.

Paul Kendall and Samant Maharaj: Paul and Samant contributed the CQL storage adapter. This is a pretty big achievement and helps steer JanusGraph towards future compatibility with Cassandra 4.0. They are continuing work on cleaning up the Cassandra source code tree that will help make testing it easier and better.

Congratulations to all!


New committers: Robert Dale, Paul Kendall, Samant Maharaj

Jason Plurad <plu...@...>
 

On behalf of the JanusGraph Technical Steering Committee (TSC), I'm pleased to welcome 3 new committers on the project! Here they are in alphabetical order by last name.

Robert Dale: Robert has been a solid contributor, and his contributions are across the board -- triaging issues, submitting/reviewing pull requests, and answering questions on the Google groups. He's also on the Apache TinkerPop PMC.

Paul Kendall and Samant Maharaj: Paul and Samant contributed the CQL storage adapter. This is a pretty big achievement and helps steer JanusGraph towards future compatibility with Cassandra 4.0. They are continuing work on cleaning up the Cassandra source code tree that will help make testing it easier and better.

Congratulations to all!


Is Janus graph database mandatorily need a back end database

sankee...@...
 

Hi ,

I am a newbie looking to explore JanusGraph. My primary question is: does JanusGraph require a backend database? My second question is: is there any way I can use JanusGraph with Python (rather than Gremlin)?


Thank You


Re: [DISCUSS] merge/commit flow for committers

sjudeng <sju...@...>
 

Jerry, Thank you for continuing to discuss this. It's hard to find consensus when it comes to whether to follow a merge or rebase workflow.

I do think the TinkerPop flow puts less burden on committers for day-to-day reviews and merges: as long as the PR targets the lowest applicable release branch, the merge up to master (or other release branches) can happen any time later, potentially by another committer, with equal ease. I think validating the branch target for new PRs would be easy for us to implement and monitor.
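
As a rough sketch of that merge-up step, here is the flow in a throwaway repository. The branch name "release", the file names, and the commit messages are made up for the example; the real branches would be something like 0.2 and master.

```shell
#!/bin/sh
# Merge-up flow: fixes land on the release branch, and the release
# branch is merged into master later by any committer.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch "master"
git config user.email "dev@example.com"   # placeholder identity
git config user.name "Example Dev"

echo base > base.txt
git add base.txt
git commit -q -m "base"
git branch release                        # hypothetical release branch

# Master moves on with work that is not meant for the release branch.
echo feature > feature.txt
git add feature.txt
git commit -q -m "master-only feature"

# A PR is merged into the lowest applicable branch (here: release).
git checkout -q release
echo fix > fix.txt
git add fix.txt
git commit -q -m "fix for release"

# Any time later, the release branch is merged up into master.
git checkout -q master
git merge -q --no-edit release

# Cheap check: is everything on release reachable from master?
if git merge-base --is-ancestor release master; then
  merged_up=yes
else
  merged_up=no
fi
echo "release fully merged into master: $merged_up"
```

The ancestor check at the end is what makes the "is master up to date?" question easy to answer in this flow, whoever performs the merge.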

With the Spark flow the committer would really need to, immediately after merging into master, create a new branch off the relevant release branch, cherry-pick the commit from master into it, and then create a new PR (possibly repeating the process in the case of multiple release branches). It would be possible to do this later, but I think it would be easy to lose track of the commit. For example, if a month passed, or another committer wanted to see whether any commits in master were missing from the release branch, all commits would need to be scanned.
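
The backport step described above might look roughly like this, again in a throwaway repository. The branch names "release" and "backport-fix" are hypothetical. Using -x is my own suggestion rather than anything agreed in this thread: it records the original hash in the backported commit message, which helps a little with the tracking concern.

```shell
#!/bin/sh
# Cherry-pick flow: the fix lands on master first, then is
# backported to the release branch through a dedicated branch.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch "master"
git config user.email "dev@example.com"   # placeholder identity
git config user.name "Example Dev"

echo base > base.txt
git add base.txt
git commit -q -m "base"
git branch release                        # hypothetical release branch

# The PR is merged into master first.
echo fix > fix.txt
git add fix.txt
git commit -q -m "fix merged to master"
fix_hash=$(git rev-parse HEAD)

# Backport: branch off the release branch and cherry-pick by hash.
git checkout -q -b backport-fix release
git cherry-pick -x "$fix_hash" >/dev/null

# -x appends a "(cherry picked from commit <hash>)" line to the message.
backport_msg=$(git log -1 --format=%B)
echo "$backport_msg"
```

The backport-fix branch is what would then be pushed and opened as the second PR against the release branch.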

It seems that while the Spark flow may result in a cleaner repository by avoiding merge commits, it does require more diligence from the repository maintainer. I think it would work better in an environment where a single person is responsible for managing the repository, probably with the ability to push directly to branches.

Currently we're almost all merging rather than rebasing PRs. As a compromise we could consider instead rebasing PRs. This way we'd remove the merge commit into the release branch, which would balance out the eventual additional merge commit (which might actually include multiple PRs/commits) from the release branch into master.
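
To illustrate the compromise, here is a sketch of landing a PR by rebasing it and fast-forwarding instead of creating a merge commit. The branch names "release" and "pr-branch" are hypothetical, and this runs in a throwaway repository.

```shell
#!/bin/sh
# Rebase a PR branch onto the release branch, then fast-forward,
# so that no merge commit is created for the PR.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch "master"
git config user.email "dev@example.com"   # placeholder identity
git config user.name "Example Dev"

echo base > base.txt
git add base.txt
git commit -q -m "base"
git branch release                        # hypothetical release branch

# The contributor's PR branch, started from the release branch.
git checkout -q -b pr-branch release
echo fix > fix.txt
git add fix.txt
git commit -q -m "fix from PR"

# Meanwhile the release branch has moved on.
git checkout -q release
echo other > other.txt
git add other.txt
git commit -q -m "unrelated change on release"

# Rebase the PR onto the current release branch, then fast-forward.
git checkout -q pr-branch
git rebase -q release
git checkout -q release
git merge -q --ff-only pr-branch

# The release branch history stays linear: no merge commits.
merge_commits=$(git rev-list --merges --count release)
echo "merge commits on release: $merge_commits"
```

The later merge from the release branch into master would still produce one merge commit, possibly covering several PRs at once.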


On Monday, August 14, 2017 at 12:15:02 AM UTC-5, Jerry He wrote:
Thanks for bringing up this thread again, and I apologize for not answering earlier.  I was trying to wait for others to chime in, but it eventually escaped me.

Some answers to the earlier questions first.
1. When cherry-picking what if the commit isn't the most recent, do you need to identify the commit hash (or should you always cherry-pick by hash)?
Yes, need the commit hash.
2. In the case of multiple commits in a PR do you cherry-pick individually or can you cherry pick the merge commit?
There is always one commit per PR.
3. If a committer wants to check whether master is up-to-date with a release branch how is this done?
Master is the main branch and always the first to consider (except in special cases where the change does not apply), so there is no need to worry about whether it is up to date.

I think my questions and concerns are here:
1.  Do we allow command line direct push of commits to the main repo?
2.  In the commit history, the fewer 'merge' commits that show up, the better.  For example, for one real commit of a code change, we could have 3 commits showing up in history:
  - the commit for the code change
  - the 'merge' commit for the PR from the user branch
  - the branch merge commit
I think that is truly confusing and pollutes the history.
3.  Also, I think the master branch should still be designated as the main dev branch, even if we have future 0.2 and master branches to follow TP 3.2 and 3.3 respectively.
     There may be more activity on the 0.2 branch, but master is still the main dev branch.

I am very much open to anything that comes up nice and clean.

Thanks,

Jerry
   

On Sunday, August 13, 2017 at 2:05:15 PM UTC-7, sjudeng wrote:
Jerry, would you be okay trying out the TinkerPop flow for a period, or do you prefer the Spark flow? If it's the Spark flow, what are your (or anyone else's) thoughts on my previous questions/concerns and/or what are the concerns with the TinkerPop flow?

With the approaching release of TinkerPop 3.3.0 I think we should address this if our intention is to maintain both 0.2 and master branches to follow TinkerPop 3.2.x and 3.3.x, respectively.

On Saturday, July 15, 2017 at 4:37:42 PM UTC-5, sjudeng wrote:
Thanks for your summary. It looks good and I agree the approaches are very similar. Given this, it might just come down to what our preference is as a community. Unlike before, we now have time to hopefully come to a stronger consensus. I think we just want to decide either way by the 0.2.0 release.

A few questions about the Spark flow:

1. When cherry-picking what if the commit isn't the most recent, do you need to identify the commit hash (or should you always cherry-pick by hash)?
2. In the case of multiple commits in a PR do you cherry-pick individually or can you cherry pick the merge commit?
3. If a committer wants to check whether master is up-to-date with a release branch how is this done?

With the TinkerPop flow the merge into master (and the check whether master is up-to-date) is always just "git merge release-branch". I like the simplicity, especially for checking whether master is up-to-date. Right now if I want to confirm all commits in 0.1 are in master I try "git log origin/0.1 ^origin/master". But this shows me 29 commits (16 if I add "--no-merges"). Is this expected or did we do something wrong (or is there a better git call for this)? If we followed the TinkerPop flow I could do a "git merge release-branch" and get a satisfying "Already up-to-date" response from git.
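
On the "is there a better git call" question, one option is to compare by patch content rather than by hash, since a cherry-picked commit gets a new hash on the target branch. A sketch in a throwaway repository follows; the branch name "release" stands in for 0.1, and the commit subjects are made up.

```shell
#!/bin/sh
# Compare "missing commits" by hash versus by patch equivalence.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch "master"
git config user.email "dev@example.com"   # placeholder identity
git config user.name "Example Dev"

echo base > base.txt
git add base.txt
git commit -q -m "base"
git branch release                        # stands in for the 0.1 branch

# A fix that lands only on the release branch.
git checkout -q release
echo b > b.txt
git add b.txt
git commit -q -m "fix B (release only)"

# A fix that lands on master and is cherry-picked to release.
git checkout -q master
echo a > a.txt
git add a.txt
git commit -q -m "fix A"
git checkout -q release
git cherry-pick master >/dev/null

# By hash: the cherry-picked copy of fix A is reported as missing too.
by_hash=$(git log --format=%s --no-merges release ^master)

# By patch: commits with an equivalent change on master are omitted.
by_patch=$(git log --format=%s --no-merges --cherry-pick --right-only master...release)

echo "missing by hash:"
echo "$by_hash"
echo "missing by patch:"
echo "$by_patch"
```

In this toy case the by-hash list contains both fixes, while the by-patch list contains only the fix that never reached master. A cherry-pick that needed conflict resolution changes the patch and would still show up in both lists, so this narrows the list rather than replacing a manual review.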

On Saturday, July 15, 2017 at 2:48:06 PM UTC-5, Jerry He wrote:

Do we allow direct but protected pushes to the main repo?  If not, a PR will be needed to push to each branch under either approach.

The work is the same.  There is no easy way out.  The steps may be slightly different.  The goal is the same too, to let the change go to appropriate branches and maintain a relatively clean commit history.


Let’s look at two examples, Spark and TinkerPop, both of which use the PR model, from my understanding. Please correct me if wrong.


Spark flow:

1. JIRA

2. Pull request against master (contributor)

3. Merge PR (committer)

4. Cherry-pick the commit to an old branch if needed, direct push, no PR (committer)

    In rare cases and if the conflicts are non-trivial, the committer will request a PR from the contributor against the old branch and then merge.


TinkerPop flow:

1. JIRA

2. Pull request against master or an old branch (contributor)

3. Merge PR (committer)

4. Merge branch from an old branch that has the PR to master, direct push, no PR (committer).


Again, the work is similar. There is no big difference. 

Spark has additional requirements on the PR and commit message format and hides all pure 'merge' commits.


Thanks,


Jerry




On Friday, July 14, 2017 at 6:34:38 PM UTC-7, sjudeng wrote:
At the time of our previous discussion on this we were blocking merge of all PRs until resolution, so I know for my part it seemed reasonable to try an approach and move forward. Similarly could we try the alternative merge strategy for the next round of development (e.g. 0.2 release branch) and see if it works better for us? For one it might make sense to align with TinkerPop on this since many are members of both communities. I agree it does depend on how active development is on release branches. Soon I expect we'll have a 0.2 release branch (tied to TinkerPop-3.2/Spark-1.6) and master (tied to TinkerPop-3.3/Spark-2.1). I had been thinking we would be maintaining both of these in parallel to a much greater extent than we tried to do with 0.1/master, where 0.1 mostly only got bug fixes. It seems like the proposed merge strategy would scale better when many PRs (with potentially multiple commits) will go to both branches. I do like your point regarding the contributor resolving conflicts in both branches. But it also seems odd to me to have duplicate PRs with the same feature going into different branches. Is this common in other projects?

On Thursday, July 13, 2017 at 6:54:18 PM UTC-5, Jerry He wrote:
I recall we previously discussed this topic and reached a consensus.

Is it not working for us, so that we need to change direction?

Looking at the history, the 3 or so PRs that were missed in master came during the transition period of the previous discussion. Things after that are working as expected.
But surely we all want to remind ourselves to follow the agreed process when we commit.

I see no difference in terms of the work involved for the different approaches.

We all have to:
1. Decide which version the change will go into.
2. After committing to one branch, merge or open a PR (resolving conflicts) against the other branches.

The current approach (as agreed upon in the previous thread) is to default to master, then selectively go to the other branches at the same time.
The alternative is to go to the other branches first, then merge (resolving conflicts) to master. If we do branch merges, then we have the question of how often we do them: for each commit, or once in a while?

Branches will diverge, and can do so significantly after a while. Then there is the question of who is better placed to do the merge and resolve conflicts. I think that belongs to the originator of the change, not the committer. (It could be the same person.)

Also, as we go, the changes that need to go to the other branches should diminish significantly, and we should make that happen purposefully to reduce our maintenance cost and encourage upgrades.

It does not seem that things are broken, or that the alternative has such an overwhelming advantage that we need to change.


Thanks.

Jerry




On Thursday, July 13, 2017 at 2:22:57 PM UTC-7, Ted Wilmes wrote:
I think the TinkerPop way is a good way to go. It's straightforward and has worked well on that project.

--Ted

On Thursday, July 13, 2017 at 12:04:59 PM UTC-5, Irving Duran wrote:
I like this approach. Does anybody know whether Travis CI will have any problem picking up which branch the tests should run against?


Thank You,

Irving Duran

On Wed, Jul 12, 2017 at 8:22 PM, sjudeng <s...@...> wrote:
Based on our trial with the approach of merging into master and cherry-picking commits into release branches, I'd like to suggest we instead follow the TinkerPop process and merge features into relevant release branches and then merge release branches into master. I'd suggest we make this change after releasing 0.2.0, depending on the separate decision on how branching occurs after that (e.g. if we created a 0.2 branch and bumped master to 0.3.0-SNAPSHOT).

Some of the problems with the current approach have come up in the past weeks, including updates merged into 0.1 but not master (Jason this comment is what started it). There have also been cases where contributors are trying to solve this disconnect on their own by submitting multiple PRs for the same feature, one for the release branch and one for master. This makes it more difficult for us as reviewers and also makes it hard to track all interactions on an issue because comments are spread across multiple PRs.


On Wednesday, July 12, 2017 at 2:09:12 PM UTC-5, Jason Plurad wrote:
I came across PR-393 the other day as I was looking through the latest commits.

> A number of commits in 0.1 are missing in master. @srosenthal Thanks for catching.

Many thanks to everybody that helped identify and fix that situation!

However I was left wondering a bunch of things:
1. Where would I find the conversation that triggered the PR?
2. How did we miss merging commits into master?
3. How are we tracking which issues get merged into which branches?
4. Do we have the merge/commit process sufficiently documented?

We have this document on pull requests http://docs.janusgraph.org/latest/pull-requests.html but it doesn't seem to cover anything about how to handle merges for multiple branches. For example, TinkerPop is a bit more clear on how it handles this http://tinkerpop.apache.org/docs/3.2.5/dev/developer/#branches

There's some good discussion in PR-393. If we reached a consensus on what the right strategy is, feel free to chime in here amcp, jerryjch, sjudeng, srosenthal, and others so we can get it documented and publicized more widely.

-- Jason



Re: [DISCUSS] merge/commit flow for committers

Jerry He <jerr...@...>
 

Thanks for bringing up this thread again, and I apologize for not answering earlier.  I was trying to wait for others to chime in, but it eventually escaped me.

Some answers to the earlier questions first.
1. When cherry-picking what if the commit isn't the most recent, do you need to identify the commit hash (or should you always cherry-pick by hash)?
Yes, need the commit hash.
2. In the case of multiple commits in a PR do you cherry-pick individually or can you cherry pick the merge commit?
There is always one commit per PR.
3. If a committer wants to check whether master is up-to-date with a release branch how is this done?
Master is the main branch and always the first to consider (except in special cases where the change does not apply), so there is no need to worry about whether it is up to date.

I think my questions and concerns are here:
1.  Do we allow command line direct push of commits to the main repo?
2.  In the commit history, the fewer 'merge' commits that show up, the better.  For example, for one real commit of a code change, we could have 3 commits showing up in history:
  - the commit for the code change
  - the 'merge' commit for the PR from the user branch
  - the branch merge commit
I think that is truly confusing and pollutes the history.
3.  Also, I think the master branch should still be designated as the main dev branch, even if we have future 0.2 and master branches to follow TP 3.2 and 3.3 respectively.
     There may be more activity on the 0.2 branch, but master is still the main dev branch.

I am very much open to anything that comes up nice and clean.

Thanks,

Jerry
   

On Sunday, August 13, 2017 at 2:05:15 PM UTC-7, sjudeng wrote:
Jerry, would you be okay trying out the TinkerPop flow for a period, or do you prefer the Spark flow? If it's the Spark flow, what are your (or anyone else's) thoughts on my previous questions/concerns and/or what are the concerns with the TinkerPop flow?

With the approaching release of TinkerPop 3.3.0 I think we should address this if our intention is to maintain both 0.2 and master branches to follow TinkerPop 3.2.x and 3.3.x, respectively.

On Saturday, July 15, 2017 at 4:37:42 PM UTC-5, sjudeng wrote:
Thanks for your summary. It looks good and I agree the approaches are very similar. Given this, it might just come down to what our preference is as a community. Unlike before, we now have time to hopefully come to a stronger consensus. I think we just want to decide either way by the 0.2.0 release.

A few questions about the Spark flow:

1. When cherry-picking what if the commit isn't the most recent, do you need to identify the commit hash (or should you always cherry-pick by hash)?
2. In the case of multiple commits in a PR do you cherry-pick individually or can you cherry pick the merge commit?
3. If a committer wants to check whether master is up-to-date with a release branch how is this done?

With the TinkerPop flow the merge into master (and the check whether master is up-to-date) is always just "git merge release-branch". I like the simplicity, especially for checking whether master is up-to-date. Right now if I want to confirm all commits in 0.1 are in master I try "git log origin/0.1 ^origin/master". But this shows me 29 commits (16 if I add "--no-merges"). Is this expected or did we do something wrong (or is there a better git call for this)? If we followed the TinkerPop flow I could do a "git merge release-branch" and get a satisfying "Already up-to-date" response from git.

On Saturday, July 15, 2017 at 2:48:06 PM UTC-5, Jerry He wrote:

Do we allow direct but protected pushes to the main repo?  If not, a PR will be needed to push to each branch under either approach.

The work is the same.  There is no easy way out.  The steps may be slightly different.  The goal is the same too, to let the change go to appropriate branches and maintain a relatively clean commit history.


Let’s look at two examples, Spark and TinkerPop, both of which use the PR model, from my understanding. Please correct me if wrong.


Spark flow:

1. JIRA

2. Pull request against master (contributor)

3. Merge PR (committer)

4. Cherry-pick the commit to an old branch if needed, direct push, no PR (committer)

    In rare cases and if the conflicts are non-trivial, the committer will request a PR from the contributor against the old branch and then merge.


TinkerPop flow:

1. JIRA

2. Pull request against master or an old branch (contributor)

3. Merge PR (committer)

4. Merge branch from an old branch that has the PR to master, direct push, no PR (committer).


Again, the work is similar. There is no big difference. 

Spark has additional requirements on the PR and commit message format and hides all pure 'merge' commits.


Thanks,


Jerry




On Friday, July 14, 2017 at 6:34:38 PM UTC-7, sjudeng wrote:
At the time of our previous discussion on this we were blocking merge of all PRs until resolution, so I know for my part it seemed reasonable to try an approach and move forward. Similarly could we try the alternative merge strategy for the next round of development (e.g. 0.2 release branch) and see if it works better for us? For one it might make sense to align with TinkerPop on this since many are members of both communities. I agree it does depend on how active development is on release branches. Soon I expect we'll have a 0.2 release branch (tied to TinkerPop-3.2/Spark-1.6) and master (tied to TinkerPop-3.3/Spark-2.1). I had been thinking we would be maintaining both of these in parallel to a much greater extent than we tried to do with 0.1/master, where 0.1 mostly only got bug fixes. It seems like the proposed merge strategy would scale better when many PRs (with potentially multiple commits) will go to both branches. I do like your point regarding the contributor resolving conflicts in both branches. But it also seems odd to me to have duplicate PRs with the same feature going into different branches. Is this common in other projects?

On Thursday, July 13, 2017 at 6:54:18 PM UTC-5, Jerry He wrote:
I recall we previously discussed this topic and reached a consensus.

Is it not working for us, so that we need to change direction?

Looking at the history, the 3 or so PRs that were missed in master came during the transition period of the previous discussion. Things after that are working as expected.
But surely we all want to remind ourselves to follow the agreed process when we commit.

I see no difference in terms of the work involved for the different approaches.

We all have to:
1. Decide which version the change will go into.
2. After committing to one branch, merge or open a PR (resolving conflicts) against the other branches.

The current approach (as agreed upon in the previous thread) is to default to master, then selectively go to the other branches at the same time.
The alternative is to go to the other branches first, then merge (resolving conflicts) to master. If we do branch merges, then we have the question of how often we do them: for each commit, or once in a while?

Branches will diverge, and can do so significantly after a while. Then there is the question of who is better placed to do the merge and resolve conflicts. I think that belongs to the originator of the change, not the committer. (It could be the same person.)

Also, as we go, the changes that need to go to the other branches should diminish significantly, and we should make that happen purposefully to reduce our maintenance cost and encourage upgrades.

It does not seem that things are broken, or that the alternative has such an overwhelming advantage that we need to change.


Thanks.

Jerry




On Thursday, July 13, 2017 at 2:22:57 PM UTC-7, Ted Wilmes wrote:
I think the TinkerPop way is a good way to go. It's straightforward and has worked well on that project.

--Ted

On Thursday, July 13, 2017 at 12:04:59 PM UTC-5, Irving Duran wrote:
I like this approach. Does anybody know if it will be a problem for Travis CI to pick up which branch the testing should run on?


Thank You,

Irving Duran

On Wed, Jul 12, 2017 at 8:22 PM, sjudeng <s...@...> wrote:
Based on our trial with the approach of merging into master and cherry-picking commits into release branches, I'd like to suggest we instead follow the TinkerPop process and merge features into relevant release branches and then merge release branches into master. I'd suggest we make this change after releasing 0.2.0, depending on the separate decision on how branching occurs after that (e.g. if we created a 0.2 branch and bumped master to 0.3.0-SNAPSHOT).

Some of the problems with the current approach have come up in the past weeks, including updates merged into 0.1 but not master (Jason this comment is what started it). There have also been cases where contributors are trying to solve this disconnect on their own by submitting multiple PRs for the same feature, one for the release branch and one for master. This makes it more difficult for us as reviewers and also makes it hard to track all interactions on an issue because comments are spread across multiple PRs.


On Wednesday, July 12, 2017 at 2:09:12 PM UTC-5, Jason Plurad wrote:
I came across PR-393 the other day as I was looking through the latest commits.

> A number of commits in 0.1 are missing in master. @srosenthal Thanks for catching.

Many thanks to everybody that helped identify and fix that situation!

However, I was left wondering about a bunch of things:
1. Where would I find the conversation that triggered the PR?
2. How did we miss merging commits into master?
3. How are we tracking which issues get merged into which branches?
4. Do we have the merge/commit process sufficiently documented?

We have this document on pull requests http://docs.janusgraph.org/latest/pull-requests.html but it doesn't seem to cover anything about how to handle merges for multiple branches. For example, TinkerPop is a bit more clear on how it handles this http://tinkerpop.apache.org/docs/3.2.5/dev/developer/#branches

There's some good discussion in PR-393. If we reached a consensus on what the right strategy is, feel free to chime in here amcp, jerryjch, sjudeng, srosenthal, and others so we can get it documented and publicized more widely.

-- Jason



Re: [DISCUSS] merge/commit flow for committers

sjudeng <sju...@...>
 

Jerry, would you be okay trying out the TinkerPop flow for a period, or do you prefer the Spark flow? If it's the Spark flow, what are your (or anyone else's) thoughts on my previous questions/concerns, and/or what are the concerns with the TinkerPop flow?

With the approaching release of TinkerPop 3.3.0 I think we should address this if our intention is to maintain both 0.2 and master branches to follow TinkerPop 3.2.x and 3.3.x, respectively.


On Saturday, July 15, 2017 at 4:37:42 PM UTC-5, sjudeng wrote:
Thanks for your summary, it looks good and I agree the approaches are very similar. Given this, it might just come down to what our preference is as a community. Unlike before, we now have time to hopefully come to a stronger consensus. I think we just want to decide either way by the 0.2.0 release.

A few questions about the Spark flow:

1. When cherry-picking, what if the commit isn't the most recent? Do you need to identify the commit hash (or should you always cherry-pick by hash)?
2. In the case of multiple commits in a PR, do you cherry-pick individually or can you cherry-pick the merge commit?
3. If a committer wants to check whether master is up-to-date with a release branch, how is this done?
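
As a rough illustration of questions 1 and 2, the sketch below builds a throwaway repository and backports two changes from master to a release branch. The branch name 0.1, the file names, and the commit_file helper are made up for the example; none of this is taken from the Spark or JanusGraph tooling, and it assumes only that git is installed.

```shell
#!/bin/sh
set -e

# Work in a throwaway repository so nothing here touches a real checkout.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # make the branch name explicit
git config user.name "Example Committer"
git config user.email "committer@example.invalid"

commit_file() {   # hypothetical helper: add one file and commit it
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$2"
}

commit_file base "base"
git branch 0.1                            # hypothetical release branch

# PR 1: a single commit on master.
commit_file fix1 "fix 1"
fix1=$(git rev-parse HEAD)

# PR 2: two commits, brought into master with a merge commit.
git checkout -q -b pr2
commit_file fix2a "fix 2, part a"
commit_file fix2b "fix 2, part b"
git checkout -q master
git merge -q --no-ff -m "Merge PR 2" pr2
merge=$(git rev-parse HEAD)

# A later change that should stay on master only.
commit_file feature "master-only feature"

# Backport by hash, so the commit does not have to be the latest on master.
git checkout -q 0.1
git cherry-pick -x "$fix1" > /dev/null
# For a merge commit, -m 1 applies its changes relative to the first parent.
git cherry-pick -x -m 1 "$merge" > /dev/null

backported=$(git rev-list --count master..0.1)
echo "commits on 0.1 that are not on master: $backported"
```

Here -x appends the original commit hash to the backported commit message, which can make it easier to trace a backport to its source later. Picking the merge commit with -m 1 brings over the whole PR as one commit; picking the individual commits keeps them separate.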

With the TinkerPop flow the merge into master (and the check whether master is up-to-date) is always just "git merge release-branch". I like the simplicity, especially for checking whether master is up-to-date. Right now if I want to confirm all commits in 0.1 are in master I try "git log origin/0.1 ^origin/master". But this shows me 29 commits (16 if I add "--no-merges"). Is this expected or did we do something wrong (or is there a better git call for this)? If we followed the TinkerPop flow I could do a "git merge release-branch" and get a satisfying "Already up-to-date" response from git.
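
One possible contributor to the large count, sketched under assumptions: a cherry-picked commit gets a new hash on the other branch, so a hash-based comparison such as "git log origin/0.1 ^origin/master" still lists it as missing. git cherry compares patches instead of hashes. The example below uses a throwaway repository with a hypothetical 0.1 branch and made-up file names; it does not inspect the real JanusGraph history.

```shell
#!/bin/sh
set -e

# Throwaway repository; branch and file names are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Committer"
git config user.email "committer@example.invalid"

commit_file() {   # hypothetical helper: add one file and commit it
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$2"
}

commit_file base "base"
git branch 0.1

commit_file fix1 "fix 1"
fix1=$(git rev-parse HEAD)

git checkout -q 0.1
git cherry-pick -x "$fix1" > /dev/null        # backport: same patch, new hash
commit_file hotfix "hotfix made only on 0.1"  # genuinely missing from master
git checkout -q master

# Hash-based comparison: the backport is counted as missing from master.
by_hash=$(git rev-list --count --no-merges master..0.1)

# Patch-based comparison: "+" marks commits with no equivalent patch on master.
by_patch=$(git cherry master 0.1 | grep -c '^+')

# Merge flow: once the release branch is merged, nothing is left to compare.
git merge -q -m "Merge branch 0.1 into master" 0.1
after_merge=$(git rev-list --count master..0.1)

echo "by hash: $by_hash, by patch: $by_patch, after merge: $after_merge"
```

The patch-based check only recognises backports that applied unchanged; a cherry-pick that needed conflict resolution produces a different patch and would still show up as "+".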

On Saturday, July 15, 2017 at 2:48:06 PM UTC-5, Jerry He wrote:

Do we allow direct but protected push to the main repo? If not, a PR will be needed to push to each branch in all approaches.

The work is the same. There is no easy way out. The steps may be slightly different. The goal is the same too: to let the change go to the appropriate branches and maintain a relatively clean commit history.


Let’s look at two examples, Spark and TinkerPop, both of which use the PR model, from my understanding. Please correct me if wrong.


Spark flow:

1. JIRA
2. Pull request against master (contributor)
3. Merge PR (committer)
4. Cherry-pick the commit to an old branch if needed, direct push, no PR (committer)
    In rare cases and if the conflicts are non-trivial, the committer will request a PR from the contributor against the old branch and then merge.


TinkerPop flow:

1. JIRA
2. Pull request against master or an old branch (contributor)
3. Merge PR (committer)
4. Merge branch from an old branch that has the PR to master, direct push, no PR (committer).
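
To make step 4 of each flow concrete, here is a minimal sketch in a throwaway repository. The 0.1 branch, the file names, and the commit_file helper are assumptions for illustration, and the direct pushes are left out. It shows the one visible difference in history: a cherry-pick creates a new commit with a new hash, while a branch merge keeps the original commit reachable from master.

```shell
#!/bin/sh
set -e

# Throwaway repository; branch and file names are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Committer"
git config user.email "committer@example.invalid"

commit_file() {   # hypothetical helper: add one file and commit it
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$2"
}

commit_file base "base"
git branch 0.1

# Spark-style step 4: the PR went to master, the committer cherry-picks it back.
commit_file spark_fix "fix merged to master first"
spark_fix=$(git rev-parse HEAD)
git checkout -q 0.1
git cherry-pick -x "$spark_fix" > /dev/null
spark_backport=$(git rev-parse HEAD)

# TinkerPop-style step 4: the PR went to the old branch, which is then merged up.
commit_file tp_fix "fix merged to 0.1 first"
tp_fix=$(git rev-parse HEAD)
git checkout -q master
git merge -q -m "Merge branch 0.1 into master" 0.1

# The cherry-picked change lives under two different hashes ...
[ "$spark_fix" != "$spark_backport" ] && echo "cherry-pick: new hash on 0.1"
# ... while the merged change is the very same commit on both branches.
git merge-base --is-ancestor "$tp_fix" master && echo "merge: same commit on master"
```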


Again, the work is similar. There is no big difference. 

Spark has additional requirements on the PR and commit message format and hides all pure ‘merge commits’.


Thanks,


Jerry




On Friday, July 14, 2017 at 6:34:38 PM UTC-7, sjudeng wrote:
At the time of our previous discussion on this, we were blocking merge of all PRs until resolution, so I know for my part it seemed reasonable to try an approach and move forward. Similarly, could we try the alternative merge strategy for the next round of development (e.g. 0.2 release branch) and see if it works better for us? For one, it might make sense to align with TinkerPop on this, since many are members of both communities. I agree it does depend on how active development is on release branches. Soon I expect we'll have a 0.2 release branch (tied to TinkerPop-3.2/Spark-1.6) and master (tied to TinkerPop-3.3/Spark-2.1). I had been thinking we would be maintaining both of these in parallel to a much greater extent than we tried to do with 0.1/master, where 0.1 mostly only got bug fixes. It seems like the proposed merge strategy would scale better when many PRs (with potentially multiple commits) will go to both branches. I do like your point regarding the contributor resolving conflicts in both branches. But it also seems odd to me to have duplicate PRs with the same feature going into different branches. Is this common in other projects?


Re: Should graph traversals be explicitly closed?

Jason Plurad <plu...@...>
 

> Question 1: Should we ignore the warning in (2) or wrap each traversal with try-with-resources?

I'd think you should follow the recommendation. There was another thread recently (I don't remember which forum) where the results of a traversal were remaining in memory on the Gremlin Server because the traversal wasn't fully iterated. You also might want to consider checking hasNext() or using tryNext() on that traversal, otherwise you'll get an exception if there is no matching id.

> Question 2: Is it safe to reuse a single graphTraversalSource for the lifetime of an application?

Yes. Creating the graphTraversalSource could be considered an expensive operation, so it should be reused. One thing you need to be aware of is ensuring that you handle the transactions cleanly. Make sure the traversals are surrounded with try/catch/finally and graph.tx().commit() or graph.tx().rollback() at the end.


On Monday, August 7, 2017 at 8:46:20 AM UTC-4, wojcik.w wrote:
Hi,

See the following code:

    GraphTraversalSource graphTraversalSource = graph.traversal();

    // (1) one-liner: no warnings
    Vertex src = graphTraversalSource.V().has(ID, id).next();

    // (2) assignment to a variable generates a warning:
    //     potential resource leak: GraphTraversal is AutoCloseable
    //     and should be managed by try-with-resources
    GraphTraversal<Vertex, Vertex> g = graphTraversalSource.V().has(ID, id);
    Vertex src = g.next();

Question 1: Should we ignore the warning in (2) or wrap each traversal with try-with-resources?

The TinkerPop documentation (http://tinkerpop.apache.org/docs/current/reference) does not say much about resource management of traversals, only that transactions and graph instances should be explicitly closed.

Question 2: Is it safe to reuse a single graphTraversalSource for the lifetime of an application?

The javadoc for graph.traversal states that instances of GraphTraversalSource are reusable, so following this we are keeping a single source for spawning the traversals for each request that comes to our web-app.
At the same time we are not closing the spawned traversals (question 1). Is this the correct approach, or will this lead to resource/memory leaks?
Should we spawn a source for each traversal or keep one common instance for all of them?

Any help on this topic would be much appreciated.

Thanks





Should graph traversals be explicitly closed?

woj...@...
 

Hi,

See the following code:

    GraphTraversalSource graphTraversalSource = graph.traversal();

    // (1) one-liner: no warnings
    Vertex src = graphTraversalSource.V().has(ID, id).next();

    // (2) assignment to a variable generates a warning:
    //     potential resource leak: GraphTraversal is AutoCloseable
    //     and should be managed by try-with-resources
    GraphTraversal<Vertex, Vertex> g = graphTraversalSource.V().has(ID, id);
    Vertex src = g.next();

Question 1: Should we ignore the warning in (2) or wrap each traversal with try-with-resources?

The TinkerPop documentation (http://tinkerpop.apache.org/docs/current/reference) does not say much about resource management of traversals, only that transactions and graph instances should be explicitly closed.

Question 2: Is it safe to reuse a single graphTraversalSource for the lifetime of an application?

The javadoc for graph.traversal states that instances of GraphTraversalSource are reusable, so following this we are keeping a single source for spawning the traversals for each request that comes to our web-app.
At the same time we are not closing the spawned traversals (question 1). Is this the correct approach, or will this lead to resource/memory leaks?
Should we spawn a source for each traversal or keep one common instance for all of them?

Any help on this topic would be much appreciated.

Thanks





Re: I'm starting a new startup big project, should I use Janus as main database to store all my data?

Jason Plurad <plu...@...>
 

Please follow the thread over on janusgraph-users. Thanks.


On Tuesday, August 1, 2017 at 5:17:41 PM UTC-4, Augusto Will wrote:
I'm thinking about learning Janus to use in my new big project, but I can't understand some things.

Janus can be used like any database and supports "insert", "update", and "delete" operations, so Janus will write data into Cassandra or another database to store this data, right?

Wherever Janus stores the nodes, edges, attributes, etc., it will write these into the database, right?

Will this data be loaded into memory by Janus, or will it be read from Cassandra all the time?

Must the data that Janus reads be loaded into Janus on every query, or will it do selects in the database to retrieve the data I need?

Is the data retrieved from the database only what I need, or will Janus read all records in the database all the time?

Should I use Janus in my project in production or should I wait until it becomes production ready?

I'm developing a kind of social network that needs to store friendships, posts, comments, and user blocks, and do some Elasticsearch too. In this case, what database backend should I use?


Thank you.


I'm starting a new startup big project, should I use Janus as main database to store all my data?

Augusto Will <pw...@...>
 

I'm thinking about learning Janus to use in my new big project, but I can't understand some things.

Janus can be used like any database and supports "insert", "update", and "delete" operations, so Janus will write data into Cassandra or another database to store this data, right?

Wherever Janus stores the nodes, edges, attributes, etc., it will write these into the database, right?

Will this data be loaded into memory by Janus, or will it be read from Cassandra all the time?

Must the data that Janus reads be loaded into Janus on every query, or will it do selects in the database to retrieve the data I need?

Is the data retrieved from the database only what I need, or will Janus read all records in the database all the time?

Should I use Janus in my project in production or should I wait until it becomes production ready?

I'm developing a kind of social network that needs to store friendships, posts, comments, and user blocks, and do some Elasticsearch too. In this case, what database backend should I use?


Thank you.


Re: [DISCUSS] Splitting janusgraph-cassandra

Samant Maharaj <samant...@...>
 

I’ve been putting a significant amount of time into the proposed cassandra split and have a branch here: https://github.com/orionhealth/janusgraph/tree/feature/cassandra-split with the core changes.

As part of this work I’ve improved the unit testing setup so that it’s not driven by Maven. This has had some flow-on effects, as the existing build system has the vast majority of its Maven config in the root POM. I have another branch based on the cassandra-split branch with the more significant Maven changes here: https://github.com/orionhealth/janusgraph/tree/feature/maven-refactoring

In my opinion the existing Maven setup does not follow best practices in a number of areas, and the goal of my refactoring branch is to simplify the build system and make it easier to work with and modify. I’d like to get your opinions on whether this is a worthwhile change, and I’d be very happy to discuss the merits of my approach with you. In any case, my overall goal is to complete the cassandra-split to the point that it can be merged, and it is up for discussion whether that includes the additional Maven refactoring.

It is probably a good idea to move the Maven refactoring discussion to a different thread; alternatively, I could create a room on Gitter for discussion.

Thoughts?

On 4/07/2017, at 11:58 AM, Samant Maharaj <samant...@...> wrote:

I'm now working on bringing the branch up to date against master and will raise the PR as soon as it's ready.

There've been a few changes since I did that initial work so it might take a short while to get it all squared away.

Regards,
Samant 

On Friday, 30 June 2017 14:38:03 UTC+12, sjudeng wrote:
Samant, It's been 5 days and no one has yelled too loud so I think it's reasonable to move forward with your PR on this if you're still up for it.

Ted, I don't think the reorganization here would cause any additional complexities with the eventual deprecation of Thrift. On the surface it seems to me it might make it simpler to carry out the deprecation/removal if the Thrift components are separated into a standalone module like here.

On Friday, June 23, 2017 at 10:45:50 PM UTC-5, sjudeng wrote:
All,

Samant has said he should be able to continue working on this effort. The current proposal is to refactor into the following structure.

janusgraph-cassandra-parent/
├── astyanax
├── cql
├── core
├── embedded
├── test
└── thrift

+1 for me. I think it's a step in the right direction and I'd hate to see the work already done on this go to waste.

Jason, is this something that needs a separate vote thread or can the work/PR just move forward?

Samant, What do you think about Ted's question above?


On Saturday, June 17, 2017 at 12:28:41 PM UTC-5, sjudeng wrote:
Hi Samant,

Are you still able to move forward with getting your work on this submitted? If so do you want to call a vote on the proposed refactoring or do you want me to? One way or another I think the refactoring is definitely needed. It came up in a recent PR where configuration needed to be duplicated in janusgraph-cassandra and janusgraph-cql because of the current state of things.

Thanks!

