Re: [DISCUSS] Individual Versioning decision for GLV libraries


Florian Hockmann <f...@...>
 

Thanks for trying this out with PyPi and yes, you got my suggestion exactly right.

Unfortunately, I think that my suggestion wouldn't even completely work for NuGet as I proposed it earlier, for exactly the reasons you mentioned for PyPi, namely the normalization of version numbers. I didn't know that when I posted my suggestion, but NuGet seems to apply a very similar normalization. The docs state, for example, that "Leading zeroes are removed from version numbers", which would also turn 0.3.001 into 0.3.1 and thus make my suggestion basically useless.

What I also didn't know earlier is that NuGet also supports version numbers with four parts, which means that we could technically use the approach Stephen suggested and use version numbers like 0.3.0.1. However, four-part version numbers are so unusual for NuGet packages that I really had to search before I could find a few packages that use them. I also suspect that NuGet only supports them for some legacy reason and that it isn't the best idea to use such a version number for a new project.
I will try to find out whether using a four-part version number has any disadvantages for a NuGet package or whether it is actively discouraged by the NuGet team. I already created an issue with that question 2 days ago, and since they don't seem to respond that often to issues, I also just asked this on StackOverflow. Maybe someone there has some insights on this topic.

Apart from whether it would be possible to use a four-part version with NuGet or PyPi, I wonder whether it's a good idea in general. It might be possible for PyPi and NuGet, but semantic versioning is becoming the de facto standard, or at least SemVer-style three-part version numbers are. npm, for example, also expects SemVer-style version numbers, as do Rust's package manager Cargo and the package manager for Swift.
If JanusGraph itself followed semantic versioning, then we could copy only the major and minor version and use the third part (the patch version) for the library version. Including the third part of JanusGraph's version wouldn't be important, as JanusGraph 0.3.1 would be JanusGraph 0.3.0 + bug fixes, which wouldn't make any difference for a driver. But since JanusGraph also includes new features in releases like 0.3.1, we can't omit any part of the version number if we want to include that in the version of the libraries.

Sooo, I'm personally coming back to being in favour of simply following SemVer for the drivers which would mean that they start with version 0.0.1. We then only have to communicate to users which version of JanusGraph the driver is compatible with. But we can probably simply include that very prominently in the description of each package which should show up in the web interfaces of the package repositories for most languages.

I don't think that it's important for our decision, but out of interest I checked if and how the .NET drivers of the 10 most popular databases (according to DB-Engines) used the database version in their version:
  • The MySQL driver was the only one that uses exactly the same version (so they also seem to be released together).
  • 3 drivers shared only their major version with their database, namely Oracle, Elasticsearch, and Cassandra.
  • 5 drivers seem to have completely unrelated version numbers.
(The numbers don't add up to 10 as I couldn't find a NuGet package for Microsoft Access.)

On Wednesday, 10 October 2018 16:25:37 UTC+2, Debasish Kanhar wrote:

Everyone, I think Florian's idea can be modified a bit to support Python's PyPi versioning.

If I'm correct, Florian suggested the following structure:

X.Y.ZZAA

Where X: JanusGraph major version number: 0
Y: JanusGraph minor version number: 3 (0.3.x) / 4 (0.4.x) etc.
ZZ: JanusGraph patch version number as 2 digits, padded with a leading 0 if needed: 01/02/03 etc.
AA: Library's patch version number, starting with 00

Since PyPi normalizes each part of the version as a plain number, we can work with it by making sure the versions we use are already valid numbers, as follows:

X.Y.ZZAA then becomes X.Y.ZAA
Meaning, we don't pad the JanusGraph patch version number with an additional 0. We just write the version number as it is. If patch 10 is reached, then .ZAA becomes .ZZAA as follows:

JanusGraph version    Library version, patch 1
0.3.9                 0.3.901
0.3.10                0.3.1001

If the concern is that the version numbers no longer have a fixed number of digits, we can think of the scheme as dynamic enough to accommodate our required use cases.
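The X.Y.ZAA idea can be sketched in a few lines of Python. This is only an illustration of the scheme described above; the function name `library_version` is hypothetical and not part of any existing library.

```python
def library_version(janusgraph_version: str, library_patch: int) -> str:
    """Build X.Y.ZAA: JanusGraph patch Z written as-is, library patch AA as two digits."""
    major, minor, patch = (int(part) for part in janusgraph_version.split("."))
    if not 0 <= library_patch <= 99:
        raise ValueError("library patch must fit in two digits")
    # Z is not zero-padded, AA always is.
    return f"{major}.{minor}.{patch}{library_patch:02d}"


print(library_version("0.3.9", 1))   # 0.3.901
print(library_version("0.3.10", 1))  # 0.3.1001
```

The third element simply grows by one digit once the JanusGraph patch version reaches 10.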

In addition to that, I would like to propose a rule that the library patch version number (AA) starts from 01 and not 00. This is to avoid the following scenario:

JanusGraph version    Library version    PyPi normalized version    NuGet version
0.3.1                 0.3.101            0.3.101                    0.3.101
0.3.1                 0.3.100            0.3.1                      0.3.100

But I can think of one drawback of this restricted approach: how do we handle normalization when JanusGraph is on the first release of a minor version (0.3.0)? In that scenario, the following will happen with PyPi:

JanusGraph version    Library version    PyPi normalized version    NuGet version
0.3.0                 0.3.001            0.3.1                      0.3.001

We lose the information about which JanusGraph version it belongs to.

Any ideas on how we should proceed?


On Monday, 8 October 2018 21:28:37 UTC+5:30, Debasish Kanhar wrote:
Florian, Stephen, thanks for your input. Those were really helpful comments, and I did some quick testing on which version numbers are supported on PyPi (where Python libs are hosted).

1. PyPi doesn't support uploading versions with tags in them. So something like "0.3.0101-beta1" will fail because it contains a string that is invalid.
2. Similarly, I wondered whether they support 4-part versioning, so I tried "0.3.0.1" and it worked like a charm: the library got published to PyPi without any errors. Great news.
3. Then I tried out Florian's idea of a 3-part version whose third part has 4 digits. The analogous "0.3.0001" doesn't get uploaded the way it is expected to. Python normalizes the version number, so 0.3.0001 becomes 0.3.1. Similarly, 0.3.0101 becomes 0.3.101. Due to this normalization, the version numbers aren't reflected properly on PyPi. This looks like a technical drawback.
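The effect in point 3 can be reproduced with a simplified stand-in for the normalization. This sketch only strips leading zeros from purely numeric, dot-separated parts; it is not PyPi's actual implementation, which handles more cases, and the name `normalize` is made up for illustration.

```python
def normalize(version: str) -> str:
    """Strip leading zeros by reading each dot-separated part as an integer."""
    return ".".join(str(int(part)) for part in version.split("."))


for raw in ("0.3.0.1", "0.3.0001", "0.3.0101"):
    print(raw, "->", normalize(raw))
# 0.3.0.1 -> 0.3.0.1
# 0.3.0001 -> 0.3.1
# 0.3.0101 -> 0.3.101
```

The four-part version passes through unchanged, while the zero-padded third element does not.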

But looking at your comments, it looks like NuGet doesn't support 4-part SemVer versioning? If that is the case, I don't know how we reach a consensus here, as it looks like the version numbers that work with PyPi won't work with NuGet, or vice versa.

On Sunday, 7 October 2018 17:22:53 UTC+5:30, Florian Hockmann wrote:

Thanks for your input on this, Stephen! This really got me thinking.


I think in the end it comes down to the question of what is more important for us to communicate to users with the version: Which version of JanusGraph (and therefore implicitly TinkerPop) is supported or whether new features or breaking changes are included in the library itself that are not related to an updated JanusGraph version?

The more I think about this, the more I tend towards primarily communicating the supported JanusGraph version. Once the libraries support all JanusGraph specific predicates and data types, there won’t be many changes to them. The only changes will probably be updated dependencies which is mostly the TinkerPop GLV and those versions will be in line with the TinkerPop version supported by JanusGraph. So, I guess the optimal solution for the library versions would include the supported JanusGraph version. That would also have the advantage that users won’t have to keep track of yet another version and ensure that everything is compatible (the compatibility matrix of JanusGraph is already complicated enough).


Unfortunately, we can’t completely copy the versioning strategy from Ogre as it uses 4 version elements and some package managers (like NuGet) expect SemVer style versions with only 3 elements. We could solve this by taking the JanusGraph version and then extending the third element with a version number for the library itself. So, something like this:


JanusGraph version    Library version
0.3.0                 0.3.0000
0.3.0                 0.3.0001
0.3.1                 0.3.0100
0.4.0                 0.4.0000

The first 2 digits of the third element reflect the third element of the JanusGraph version and the last 2 digits are for the library version. 2 digits are necessary in case JanusGraph reaches a version like 0.3.10 and the same for the library which could reach 0.3.0010.
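As a minimal sketch of this fixed-width X.Y.ZZAA scheme (the helper names `encode` and `decode` are hypothetical), the third element can be built and split like this:

```python
from typing import Tuple


def encode(janusgraph_version: str, library_release: int) -> str:
    """Combine a JanusGraph version and a library release into X.Y.ZZAA."""
    major, minor, patch = (int(part) for part in janusgraph_version.split("."))
    if not (0 <= patch <= 99 and 0 <= library_release <= 99):
        raise ValueError("patch and library release must each fit in two digits")
    return f"{major}.{minor}.{patch:02d}{library_release:02d}"


def decode(library_version: str) -> Tuple[str, int]:
    """Split X.Y.ZZAA back into the JanusGraph version and the library release."""
    major, minor, third = library_version.split(".")
    # The last two digits are the library release, the rest is the JanusGraph patch.
    patch, library_release = divmod(int(third), 100)
    return f"{major}.{minor}.{patch}", library_release


print(encode("0.3.1", 0))   # 0.3.0100
print(decode("0.3.0010"))   # ('0.3.0', 10)
```

The two-digit limit for each half is the assumption stated above; a JanusGraph patch version or library release beyond 99 would not fit.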

The .NET Core SDK uses a similar versioning strategy. The only difference is that they only include the major and minor version of the .NET Core runtime as the patch version really only contains patches which is not the case for JanusGraph and TinkerPop. That allows them to include a minor and patch version together in a 3 digit third element of the version.

We could use this versioning when we agree that we only add breaking changes when a new JanusGraph version is out that allows breaking changes (next would be 0.4.0).

So, what do others say? Should we use versions like 0.3.0000?

On Saturday, 6 October 2018 17:41:18 UTC+2, Stephen Mallette wrote:
I haven't been really following the discussion, so apologies if this has already been mentioned, but I've liked the approach that Ogre and gremlin-scala have taken with respect to TinkerPop - they just bind their releases to TinkerPop release numbers by adding a 4th number to the end. So for JanusGraph 0.4.0 you just have JanusGraph-Py 0.4.0.0. Then you can independently release JanusGraph-Py 0.4.0.1, 0.4.0.2 and so on. The advantage is that users will know exactly which versions work and are tested with which version of JanusGraph. Anyway....just something to consider.
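A rough sketch of that four-part approach, with hypothetical helper names, could look like this in Python:

```python
from typing import Tuple


def four_part_version(janusgraph_version: str, library_release: int) -> str:
    """Append the library's own release counter as a fourth element."""
    return f"{janusgraph_version}.{library_release}"


def supported_janusgraph_version(library_version: str) -> str:
    """Drop the fourth element to recover the JanusGraph version."""
    return library_version.rsplit(".", 1)[0]


def sort_key(library_version: str) -> Tuple[int, ...]:
    """Compare versions numerically so that 0.4.0.10 sorts after 0.4.0.2."""
    return tuple(int(part) for part in library_version.split("."))


print(four_part_version("0.4.0", 1))            # 0.4.0.1
print(supported_janusgraph_version("0.4.0.2"))  # 0.4.0
print(sorted(["0.4.0.10", "0.4.0.2"], key=sort_key))  # ['0.4.0.2', '0.4.0.10']
```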



On Sat, Oct 6, 2018 at 7:21 AM Florian Hockmann <f...@...> wrote:
How do we bump minor and major versions?

If we agree on using SemVer, then that already tells us how to handle minor and major version. Taken from its summary:

  1. MINOR version when you add functionality in a backwards-compatible manner, and
  2. PATCH version when you make backwards-compatible bug fixes.
The only special case is that we can include breaking changes without having to bump to 1.0.0 as 1.0.0 reflects a certain stability of the API. So, I'd say that we wait until we actually have a breaking change and then decide whether it makes sense to bump to 1.0.0 or whether we simply want to bump the minor version and communicate the breaking change to users in the docs / release notes.

One thing that we haven't really discussed yet is how users know which version of JanusGraph is supported by which version of the library. My initial thoughts are that we include this information in the release notes when a newer JanusGraph version is supported and that we probably want to include a section in the docs of the libraries about version compatibility. Maybe it makes also sense to include this information also in the package metadata which is at least for .NET users the most prominent place for documentation.
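One way to keep that compatibility information in a single place is a small lookup table that the docs and the package description are generated from. This is only a sketch: the table entries are the illustrative pairings used as examples elsewhere in this thread, not real releases, and all names are hypothetical.

```python
# Library version -> supported JanusGraph version (illustrative entries only).
COMPATIBILITY = {
    "0.0.1": "0.3.0",
    "0.0.2": "0.3.1",
    "0.1.0": "0.4.0",
}


def package_description(library_version: str) -> str:
    """Build a description line that states the supported JanusGraph version."""
    janusgraph_version = COMPATIBILITY[library_version]
    return (
        f"JanusGraph-Python {library_version} "
        f"(compatible with JanusGraph {janusgraph_version})"
    )


print(package_description("0.0.1"))
# JanusGraph-Python 0.0.1 (compatible with JanusGraph 0.3.0)
```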

On Saturday, 6 October 2018 09:36:06 UTC+2, Debasish Kanhar wrote:
> and users expect exactly that of a library that has a 0.y.z version. I think that makes sense for us.

I agree with you. After going through your points, it seems that going forward with 0.y.z version numbers is the right way to go.

A code-breaking change means anything that might break the user's code or even the library. Just about anything. And there will be minor changes of that kind as a few features are added. It would be better if we start with 0.y.z version numbers.

> New features shouldn't require breaking changes usually. Adding something doesn't break existing code.

My bad. I just wanted to consider every scenario theoretically, and I wasn't able to find a logical scenario where adding a new feature breaks existing features. Maybe because such a scenario simply doesn't exist ;-)

Anyway, thanks Florian for clearing up a lot of points. I guess we now have consensus on following SemVer and starting with 0.y.z.

How do we bump minor and major versions? Would that follow my proposal at the top of the thread?

Cheers
Debasish K

On Thursday, 4 October 2018 21:00:15 UTC+5:30, Florian Hockmann wrote:
> Now, when a breaking change becomes necessary at some point in the future, we can still decide whether it's big enough to require maintaining 2 branches
>
> I'm still not able to imagine a scenario where such a code-breaking change would be introduced. I can't yet picture code-breaking changes for the library itself, but changes to the code of library users are possible if some syntax or classes change in the future.

I just meant any breaking change, which could also simply be that a new GraphSON version is used by default. It doesn't have to be a breaking code change in the library itself.

> I didn't know that having major version 0 implies that the API may change

That's simply part of SemVer (semantic versioning). It states:

"Major version zero (0.y.z) is for initial development. Anything may change at any time. The public API should not be considered stable."

and users expect exactly that of a library that has a 0.y.z version. I think that makes sense for us.

> Do you mean the endpoint API, like the one used to connect to the server? Or some of the syntax?

I mean anything that requires users to change their code. Your PR for example currently contains a JanusGraphClient that has a connect and a get_connection method. What if we want to use just one method that does both in the future? Or what if we decide to unify the APIs of the libraries for the different languages?
It's really hard to figure out a good API right from the start. So, it's good if we have the possibility to change the API in early versions without breaking users' expectations. However, a version 1.y.z implies that the API is relatively stable.

> Well, isn't syntax likely to change if some conflicting or new feature comes in?

New features shouldn't require breaking changes usually. Adding something doesn't break existing code.

On Thursday, 4 October 2018 15:42:35 UTC+2, Debasish Kanhar wrote:
> Adding new features (which requires a minor version bump) doesn't hinder users from upgrading in any way. So, I don't see why we would still maintain an older version without this new feature.

This sounds good, and is a better plan going forward.

> Now, when a breaking change becomes necessary at some point in the future, we can still decide whether it's big enough to require maintaining 2 branches

I'm still not able to imagine a scenario where such a code-breaking change would be introduced. I can't yet picture code-breaking changes for the library itself, but changes to the code of library users are possible if some syntax or classes change in the future.

> as long as you provide upgrade notes to notify users about syntax changes, they would be able to upgrade to the newer version or they can also choose to stick with the previous version despite being out of maintenance

This sounds like the way forward. For example, in my upcoming release, I've changed the syntax for connecting to the JanusGraph server. We can provide upgrade notes on what changes to the user's code will be required, or else users can decide to stick with the older version if they don't need the upgraded library's features.

Then the only scenario I can think of where we might need to maintain 2 different versions is either when JanusGraph's actively maintained versions contain more than 2 GraphSON versions (like GraphSON 3 & 4) or when 2 separate TinkerPop versions are supported together (though the latter might require less maintenance).

I didn't know that having major version 0 implies that the API may change. Do you mean the endpoint API, like the one used to connect to the server? Or some of the syntax? Well, isn't syntax likely to change if some conflicting or new feature comes in? Like the way to connect to a running instance of JanusGraph Server, for example.


On Wednesday, 3 October 2018 16:44:12 UTC+5:30, Florian Hockmann wrote:
I also think that we should keep the number of maintained branches / versions as low as possible, ideally only 1 branch will be actively maintained. Adding new features (which requires a minor version bump) doesn't hinder users from upgrading in any way. So, I don't see why we would still maintain an older version without this new feature. Now, when a breaking change becomes necessary at some point in the future, we can still decide whether it's big enough to require maintaining 2 branches or whether it's so small that we can expect users to just upgrade to the new major version in which case we would only have to maintain that version.

But it seems as if we have reached consensus on using semantic versioning for the libraries. We just need to decide whether we want to start at version 1.0.0 or 0.0.1.
I would be ok with both versions, but I have a slight preference towards 0.0.1 as a major version of 0 implies that the API may change at any time which allows us to still make some general adjustments to the API.

On Tuesday, 2 October 2018 22:20:25 UTC+2, Jason Plurad wrote:
> I'm still concerned about the number of releases which will need to be actively maintained as that keeps on increasing.

I would think as long as the drivers are largely backwards compatible, you'd be able to keep it to 2 active branches. Adding new features like schema management or geoshapes wouldn't break the previous functionality. And even if there were breaking changes, as long as you provide upgrade notes to notify users about syntax changes, they would be able to upgrade to the newer version or they can also choose to stick with the previous version despite being out of maintenance.


On Tuesday, October 2, 2018 at 2:20:36 PM UTC-4, Debasish Kanhar wrote:
I meant it in the following sense. We go forward with releasing the first version, which will be tagged as 1.0.0.

This would be our current set of features, and as bugs are found, a new patch release will be planned. Let's say some bug fixes lead us to release 1.0.1, then 1.0.2, then 1.0.3, and so on. Thus we now have a line of patch releases which needs to be maintained. I would suggest maintaining only the latest in line, i.e. if 1.0.2 is out, then 1.0.1 and 1.0.0 won't be actively maintained.

As for a minor version bump, let us say that we add a new feature named Schema Management. Since it is a huge feature in itself, not everything will be added in a single go. Let us say we release the feature "Schema Management" between the 1.0.2 and 1.0.3 releases. Since it is a new feature, the release that contains it will involve a bump in the minor version. Since the last stable patch release was 1.0.2, the new minor release will be tagged 1.1.0 and will include all patch fixes of 1.0.2.
Now we have 2 separate stable releases: one is 1.1.0 and the other is 1.0.2. All the patches related to Schema Management will go into the 1.1.x series, i.e. new releases named 1.1.1, 1.1.2, 1.1.3, and so on. All patches for the library without schema management will go into the 1.0.x series, i.e. 1.0.3, 1.0.4, 1.0.5, and so on. In the end, we have 2 separate versions which need to be actively maintained: the latest in line of the 1.0.x series and the latest in line of the 1.1.x series.
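To make "latest in line" concrete, here is a small sketch that picks the newest release of each x.y series from a list of released versions. The release list is just the example from the paragraphs above, and the function name is hypothetical.

```python
from typing import Dict, List, Tuple


def latest_per_series(releases: List[str]) -> Dict[str, str]:
    """Return the highest patch release for every major.minor series."""
    latest: Dict[Tuple[int, int], Tuple[int, ...]] = {}
    for release in releases:
        version = tuple(int(part) for part in release.split("."))
        series = (version[0], version[1])
        if series not in latest or version > latest[series]:
            latest[series] = version
    return {
        f"{major}.{minor}.x": ".".join(str(part) for part in version)
        for (major, minor), version in sorted(latest.items())
    }


releases = ["1.0.0", "1.0.1", "1.0.2", "1.1.0", "1.0.3", "1.1.1"]
print(latest_per_series(releases))
# {'1.0.x': '1.0.3', '1.1.x': '1.1.1'}
```

Every key in the result is one line of releases that would need active maintenance under this model.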

Once another feature is introduced which brings a bump in the minor version, let us say new Geoshapes, we will move to the 1.2.x series of versions with their corresponding patch releases. The 1.2.x series will contain the latest stable version of the 1.1.x series, and new features will be added on top of it. Once the 1.2.x series is out, we will stop actively maintaining the 1.1.x series, and all corresponding patch releases move to the 1.2.x series.

As for huge code-breaking changes, like a TP version change or a GraphSON change, we will just bump the major version number, which would be in sync with the latest in line of the 1.x.y series and the 1.0.x series.

Thus, at the end of the day, we are actively maintaining 3 versions in parallel.

One would be the latest in line of the 1.0.x versions, which would be the bare library with the fewest features and their bugs fixed.
The second would be the latest in line of the 1.x.y versions, which contains the most recently added feature, like Schema Management or Geoshapes, along with the corresponding bug fixes.
The third, though not required to be actively maintained, would be those versions which had a change in TP or GraphSON, so that when we merge the feature releases upstream (from the patch and minor branches to the major branch), we can easily check for conflicts and will need to maintain only those where conflicts arise.

Let us say 1.2.2 contains the most recent library with all new features like Schema Management and Geoshapes added. We will just need to bump the TP version for the new JanusGraph 0.4.0, and that goes into 2.0.0. All the bug fixes due to conflicts will go into the 2.0.x line of releases, while any new feature would require first creating it in a minor release for the older stable TP, like a 1.3.0 release, and then those changes are merged upstream into 2.0.0 to form the 2.1.0 release. This way, once a new feature is added, it automatically gets added to older versions of the library too, and thus works with older versions of JanusGraph.

I hope I was clear on the maintainability aspect too, but I'm still concerned about the number of releases which will need to be actively maintained, as that keeps on increasing.

Cheers


On Tuesday, 2 October 2018 18:39:49 UTC+5:30, Florian Hockmann wrote:
> 3. But if the JanusGraph version is upgraded, we don't necessarily need to upgrade the client unless a TinkerPop change happens.

> The major version bump will be the latest version of the minor bump series and patch series + new dependency changes. (E.g.: if JanusGraph 0.4.0 comes with TP 3.4, then due to code breaks we will need to bump the major version number. Considering that the latest release of the minor version is 1.1.1 and the latest patch release is 1.0.2, then 2.0.0 will contain all features of 1.1.1 and 1.0.2 along with the dependency changes.)

I don't get what you mean with this. How can the latest minor version be 1.1.1? Minor version just means the second element of the version number which would be '2' in the version 1.2.3. My proposal was simply to employ semantic versioning. But you're right in general that an update of TinkerPop to 3.4.0 could break things for the client libraries and therefore require a major version bump instead of just a minor version bump.

In general, I can't quite follow what you propose regarding actively maintained branches. How is the versioning related to the number of branches we have to maintain?

On Tuesday, 2 October 2018 12:34:54 UTC+2, Debasish Kanhar wrote:
Hi Florian,

Okay, thanks for pointing that out. I do see a few issues with my idea, in that independently bumping the major and minor versions of the clients without an upgrade to JanusGraph is difficult.

At the same time, I like your idea a lot better, but a few issues still remain which add an extra level of complexity and maintainability effort.

Correct me if my understanding is wrong (I'm in favour of starting version numbers from 1.0.0, as that seems a lot cleaner ;-) ):

1. 1.0.0 supports JanusGraph 0.3.0.
2. If a patch is released for the same version, we simply bump the patch version to 1.0.1.
3. But if the JanusGraph version is upgraded, we don't necessarily need to upgrade the client unless a TinkerPop change happens. For example, JanusGraph 0.3.1 will still use TP 3.3.3, and hence clients for JG 0.3.0 will also work with JG 0.3.1. So we don't need to release another version of the client with a bumped patch number just to stay in sync with a JanusGraph patch version upgrade.
4. If a new feature is released for the client, let us say schema management, it leads to a bump in the minor version number and will be in line with the most recent patch, i.e. 1.0.1 (the latest patch in the 1.0 line) gets merged into the new minor version (1.1.0 will have all features of 1.0.1 + the new feature being released), thus reducing the number of versions to be maintained.
5. If any JanusGraph version brings in a major change like a TinkerPop version change, a GraphSON version change, or a similar change which will break the library, then we bump the major version number. The major version bump will be the latest version of the minor bump series and patch series + new dependency changes. (E.g.: if JanusGraph 0.4.0 comes with TP 3.4, then due to code breaks we will need to bump the major version number. Considering that the latest release of the minor version is 1.1.1 and the latest patch release is 1.0.2, then 2.0.0 will contain all features of 1.1.1 and 1.0.2 along with the dependency changes.)

The plus I see with this is that we will need to actively maintain just 2 branches: the latest in line of minor and the latest in line of patch releases, in this case 1.1.1 and 1.0.2. Since a TinkerPop version change doesn't require additional changes except changing the dependency, the latest in the major release line will also be maintained, but requires less work to maintain, which helps in maintaining as many major version branches.

Also, if the major version change did bring any major breaking changes (like in the case of GraphSON), then a new version is released by bumping either the patch number or the minor version, depending on the type of new feature the bump brings in.

What do you think? A mix of both of our ideas I think :-D



On Monday, 1 October 2018 18:10:44 UTC+5:30, Florian Hockmann wrote:
Thanks for getting the discussion on this topic started!

One disadvantage I see with your proposal is that those version numbers imply semantic versioning but they prevent the libraries from making any changes except for patches independently of the main project as they have to wait for a version bump of JanusGraph's minor or major version before they can bump their minor or major version.

Therefore, I would like to propose that the libraries simply use semantic versioning. That would look like this:

1st release of JanusGraph-Python will be 0.0.1 and target JanusGraph 0.3.0.

The version number for the next release depends on the nature of the included changes:
  1. A patch release results in 0.0.2.
  2. Added functionality requires a minor version bump, so: 0.1.0.
  3. A breaking change bumps the major version number: 1.0.0. (Although versions pre 1.0.0 might still only bump the minor version and only go to 1.0.0 to convey a certain stability of the library.)
  4. Compatibility with a newer JanusGraph version bumps the equivalent version number: Supporting JanusGraph 0.4.0 would result in 0.1.0, whereas supporting JanusGraph 0.3.1 only results in 0.0.2.
The only downside I see with semantic versioning for the libraries is that they need to convey the supported JanusGraph version through another way, but that also seems to be necessary with your method as it's not clear just from a version number like 1.0.0 which JanusGraph version is supported. However, with pure semantic versioning we gain the increased flexibility we wanted to get with the independent versioning of those libraries.
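The bump rules listed above can be sketched as a small helper. This is only an illustration of plain semantic versioning; the function name `bump` and the change labels are made up for this example.

```python
def bump(version: str, change: str) -> str:
    """Return the next version for a 'patch', 'minor' or 'major' change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "patch":
        return f"{major}.{minor}.{patch + 1}"
    if change == "minor":
        # A minor bump resets the patch number.
        return f"{major}.{minor + 1}.0"
    if change == "major":
        # A major bump resets both minor and patch numbers.
        return f"{major + 1}.0.0"
    raise ValueError(f"unknown change type: {change}")


print(bump("0.0.1", "patch"))  # 0.0.2
print(bump("0.0.1", "minor"))  # 0.1.0
print(bump("0.0.1", "major"))  # 1.0.0
```

Under rule 4, supporting JanusGraph 0.3.1 would be treated like a "patch" change and supporting JanusGraph 0.4.0 like a "minor" change.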

What do you think?

On Friday, 28 September 2018 22:56:17 UTC+2, Debasish Kanhar wrote:
Hi All,

A few days back, Florian started a Discuss thread on how we should version the upcoming GLV libraries for Python and DotNet. After discussions and lazy consensus, it felt like we should go forward with version numbers for the GLV libraries that are independent of the official JanusGraph version.

Hence, the main question that remains is: even though they are independent, how do we version the GLV libraries?

I've been using the following version semantics and would like to propose the same.

The GLV library will follow an x.y.z version number, where each part is described as follows:
x: Corresponds to a particular JanusGraph major release version.
y: Corresponds to a particular JanusGraph minor release version.
z: Corresponds to the patch release of the GLV library itself.

For example, since we are starting to support JanusGraph beginning with 0.3.0, the following set of versions will be followed according to the rule proposed above.

JanusGraph 0.3.0 : JanusGraph-Py 1.0.0
JanusGraph 0.3.0 : JanusGraph-Py 1.0.z [For all patches which are compatible with drivers for JanusGraph 0.3.0]

JanusGraph 0.3.1 : JanusGraph-Py 1.1.0 [Notice y change, because there was minor version change in JanusGraph]
JanusGraph 0.3.1 : JanusGraph-Py 1.1.z [For all patches for same JanusGraph version]

When JanusGraph bumps its major version, let's say to 0.4.0:

JanusGraph 0.4.0 : JanusGraph-Py 2.0.z
JanusGraph 0.4.1 : JanusGraph-Py 2.1.z

And so on. The above rule helps in maintaining a release plan for the GLV libraries separately from JanusGraph itself, and at the same time brings a bit of commonality in bumps across major/minor version changes of JanusGraph and its clients.
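The mapping above can be written down as a small lookup. This sketch only encodes the four example pairings listed in this mail; the table and the function name `glv_version` are hypothetical.

```python
# JanusGraph version -> x.y series of JanusGraph-Py (examples from this mail only).
SERIES = {
    "0.3.0": "1.0",
    "0.3.1": "1.1",
    "0.4.0": "2.0",
    "0.4.1": "2.1",
}


def glv_version(janusgraph_version: str, glv_patch: int) -> str:
    """Build the x.y.z library version for a supported JanusGraph version."""
    return f"{SERIES[janusgraph_version]}.{glv_patch}"


print(glv_version("0.3.0", 0))  # 1.0.0
print(glv_version("0.4.1", 3))  # 2.1.3
```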

I'm still open to other ideas, so everyone, all suggestions are welcome.

Cheers
Debasish K
