
Re: [DISCUSSION] JanusGraph db-cache in distributed environment

Boxuan Li
 

Thanks for starting this thread! Just want to mention there is an open PR for Redis cache integration: https://github.com/JanusGraph/janusgraph/pull/3100


[DISCUSSION] JanusGraph db-cache in distributed environment

Oleksandr Porunov
 

Hello,

I would like to start a topic about JanusGraph db-cache we have today and ways to improve it for distributed environments.
I want to split this topic into the several issues I see with this cache:
1) Invalidation
2) Performance
3) Sharding

Invalidation

The first main issue with this cache is that it doesn't have good invalidation mechanisms.
Currently there are 3 invalidation mechanisms:
  1. Enough time passed (cache.db-cache-time).
  2. Evicted due to cache size limitation (cache.db-cache-size).
  3. Evicted on current JanusGraph instance only due to being mutated on the current JanusGraph instance.
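
To make the three mechanisms concrete, here is a minimal sketch of a cache that has all three eviction paths. It is only an illustration: `ToyDbCache`, its methods and the injected clock are hypothetical names, and this is not how the Guava-based db-cache is actually implemented.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Toy cache showing the three eviction paths; hypothetical, not JanusGraph code. */
class ToyDbCache<K, V> {
    /** A cached value together with the time it was written. */
    private static final class Timestamped<T> {
        final T value;
        final long writtenAtMillis;

        Timestamped(T value, long writtenAtMillis) {
            this.value = value;
            this.writtenAtMillis = writtenAtMillis;
        }
    }

    private final long ttlMillis;
    private final LongSupplier clock;
    private final LinkedHashMap<K, Timestamped<V>> entries;

    ToyDbCache(final int maxEntries, long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
        // Access-ordered map: the eldest entry is the least recently used one.
        this.entries = new LinkedHashMap<K, Timestamped<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, Timestamped<V>> eldest) {
                return size() > maxEntries; // mechanism 2: size limit
            }
        };
    }

    synchronized void put(K key, V value) {
        entries.put(key, new Timestamped<>(value, clock.getAsLong()));
    }

    synchronized V get(K key) {
        Timestamped<V> entry = entries.get(key);
        if (entry == null) {
            return null;
        }
        if (clock.getAsLong() - entry.writtenAtMillis >= ttlMillis) {
            entries.remove(key); // mechanism 1: enough time passed
            return null;
        }
        return entry.value;
    }

    /** Mechanism 3: a mutation on this instance evicts the key locally only. */
    synchronized void onLocalMutation(K key) {
        entries.remove(key);
    }

    synchronized int size() {
        return entries.size();
    }
}
```

Note that none of these paths reacts to a mutation made on another instance, which is the gap discussed in the rest of this section.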
I believe we should give users a way to invalidate this cache whenever they deem it necessary (using their own strategy). Thus, I created the following PR to give users a simple way to invalidate the cache manually: https://github.com/JanusGraph/janusgraph/pull/3184
Moreover, I think it would be great to have some pattern implemented in JanusGraph that invalidates data on mutation on all JanusGraph nodes (the issue tracking this feature is here: https://github.com/JanusGraph/janusgraph/issues/3155). I don't know how best to implement the latter feature, but I have the following ideas:
- We can reuse JanusGraph messaging mechanism (i.e. the one which works on top of a storage backend). With this feature users don't need any external messaging tools and can enable global db-cache invalidation on mutation easily. The downside is most likely performance, because usually those storage backends are not the best tools for messaging.
- Using an external tool for messaging (let's say Kafka, Redis, etc.). The advantage would most likely be performance, but the disadvantage is that the user now needs to manage a separate external system.
- Providing an interface for mutated data invalidation. We can make a general interface which accepts a set of keys that need to be evicted from db-cache, and then the implementation can either be developed by the users themselves or they can use existing JanusGraph solutions (let's say we will have 3 options at the beginning: `storage-messaging`, `redis-messaging`, `kafka-messaging`, and we will be able to add more systems if there is interest in them).
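
To sketch what the interface idea could look like: the example below is only an illustration under my own assumptions. All names (`CacheInvalidationChannel`, `InvalidationListener`, `InMemoryInvalidationChannel`, `CachingInstance`) are hypothetical and do not exist in JanusGraph, and the in-memory channel merely stands in for a real transport such as the storage-backed log, Redis or Kafka.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical callback invoked when some instance reports mutated keys. */
interface InvalidationListener {
    void onInvalidate(String sourceInstanceId, Set<String> mutatedKeys);
}

/** Hypothetical SPI: broadcasts the keys mutated on one instance to all subscribers. */
interface CacheInvalidationChannel {
    void publish(String sourceInstanceId, Set<String> mutatedKeys);

    void subscribe(InvalidationListener listener);
}

/** In-process stand-in for a real messaging transport. */
class InMemoryInvalidationChannel implements CacheInvalidationChannel {
    private final List<InvalidationListener> listeners = new CopyOnWriteArrayList<>();

    @Override
    public void publish(String sourceInstanceId, Set<String> mutatedKeys) {
        for (InvalidationListener listener : listeners) {
            listener.onInvalidate(sourceInstanceId, mutatedKeys);
        }
    }

    @Override
    public void subscribe(InvalidationListener listener) {
        listeners.add(listener);
    }
}

/** Models one graph instance with its own local cache. */
class CachingInstance {
    private final String id;
    private final Map<String, String> localCache = new ConcurrentHashMap<>();
    private final CacheInvalidationChannel channel;

    CachingInstance(String id, CacheInvalidationChannel channel) {
        this.id = id;
        this.channel = channel;
        // Evict keys that were mutated on any *other* instance.
        channel.subscribe((source, keys) -> {
            if (!source.equals(this.id)) {
                keys.forEach(localCache::remove);
            }
        });
    }

    void cache(String key, String value) {
        localCache.put(key, value);
    }

    String cached(String key) {
        return localCache.get(key);
    }

    void mutate(String key) {
        localCache.remove(key);                             // today's behaviour: local eviction
        channel.publish(id, Collections.singleton(key));    // proposed: notify the other instances
    }
}
```

With two `CachingInstance` objects sharing one channel, calling `mutate("v1")` on the first removes `v1` from both local caches while leaving unrelated keys untouched. A real transport would additionally have to deal with delivery delays and lost messages, which this sketch ignores.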

This is just brainstorming, so if anyone has any thoughts about it, please share. Maybe this issue should be solved differently and maybe I should look at it from a different angle.

Performance

For some reason we haven't looked at this side much, but as noted here and here the Guava cache we use isn't the best option. I haven't investigated what the best option is for the caches we have (and we have 8 caches, as commented here).
An obvious solution is to move from Guava to the Caffeine cache. That said, if anyone thinks we need to try another cache or has any thoughts about it, please post them here.

Sharding

Currently db-cache caches all the data per JanusGraph instance. No cache data is shared between multiple JanusGraph instances. In some use cases this is an advantage, but in others it's a disadvantage.
I think it would make sense to add several options for different db-cache implementations. Some implementations would be local only and some distributed across all participating JanusGraph nodes.
There are several implementations I could think of:
1) Default local only db-cache. This cache would use Caffeine cache implementation and there could be some invalidation strategies available to trigger invalidation on all JanusGraph nodes using some messaging tools (as described in `Invalidation` section).
2) Redis cache - this cache would use external Redis nodes to cache all the data. The advantage is that Redis may be shared and scaled separately. All the data in Redis will be distributed (depending on installation) which may improve cache usage in some cases. Maybe also using client-side caching could improve performance in some cases.
3) Hazelcast cache - this cache would require additional configuration and making sure that all JanusGraph nodes can reach the other JanusGraph nodes. A Hazelcast cache can be bound to the JVM while still being able to form a cluster across nodes and shard / replicate all the data between all the nodes in the cluster. With this cache users won't need to manage external cache nodes but users will need to make sure that their JanusGraph installation has direct network access to all nodes in the cluster.

Any thoughts about any of the above points (or maybe missing points / issues) are very welcome.

Best regards,
Oleksandr


LF AI&Data JanusGraph Project License Scan and Findings August 2022

Jeff Shapiro <jshapiro@...>
 

Hi Team,

Here are the results from the August 2022 license scan of the JanusGraph project. The scan was performed using the Linux Foundation Fossology server. Licenses and copyrights were examined.

The key findings (if any) and license summary can be found in the HTML report, the list of files is in the spreadsheet, and the SPDX file is listed below:

REPORTS:

lfai/JanusGraph, code pulled 2022-08-08
- report: https://lfscanning.org/reports/lfai/JanusGraph-2022-08-08-128b378d-8656-4907-ad70-b876a161c82d.html
- xlsx: https://lfscanning.org/reports/lfai/JanusGraph-2022-08-08-128b378d-8656-4907-ad70-b876a161c82d.xlsx
- spdx: https://github.com/lfscanning/spdx-lfai/tree/master/JanusGraph/2022-07/JanusGraph-2022-08-08.spdx

Please feel free to contact me with any questions about the scan results. Be sure to reply to me directly as I may not get an email sent directly to the distribution list. Since this is the first license scan for this project, if you have any general questions about the scanning process you can let me know that too.

Thanks, Jeff

Jeff Shapiro
408-910-7792
jshapiro@...


Re: Regarding DB

Kalpa 1977
 

Thanks for the clarifications.
Kalpa

On Sun, Jul 31, 2022 at 3:46 PM <hadoopmarc@...> wrote:

[Edited Message Follows]

1. Sure, see: https://github.com/JanusGraph/janusgraph/blob/master/LICENSE.txt

2. JanusGraph backends are not columnar in the usual meaning of the word, but rather assume a BigTable-like KeyColumnValue store, see https://github.com/JanusGraph/janusgraph/blob/master/janusgraph-core/src/main/java/org/janusgraph/diskstorage/keycolumnvalue/KeyColumnValueStore.java
This provides better horizontal scalability than technologies like Solr/Elasticsearch. A KeyColumnValue store client can see from the partitioned key which backend node to address, while Solr/Elasticsearch involves a server-side routing step to direct a query to the right shard when requesting a specific document by id. This might be different for your in-house store, though, so check whether it can implement the interface above efficiently. You may also want to check https://li-boxuan.medium.com/janusgraph-deep-dive-data-layout-in-janusgraph-cd33c045a495 to see whether your store supports this data layout.


Re: Regarding DB

hadoopmarc@...
 
Edited

1. Sure, see: https://github.com/JanusGraph/janusgraph/blob/master/LICENSE.txt

2. JanusGraph backends are not columnar in the usual meaning of the word, but rather assume a BigTable-like KeyColumnValue store, see https://github.com/JanusGraph/janusgraph/blob/master/janusgraph-core/src/main/java/org/janusgraph/diskstorage/keycolumnvalue/KeyColumnValueStore.java
This provides better horizontal scalability than technologies like Solr/Elasticsearch. A KeyColumnValue store client can see from the partitioned key which backend node to address, while Solr/Elasticsearch involves a server-side routing step to direct a query to the right shard when requesting a specific document by id. This might be different for your in-house store, though, so check whether it can implement the interface above efficiently. You may also want to check https://li-boxuan.medium.com/janusgraph-deep-dive-data-layout-in-janusgraph-cd33c045a495 to see whether your store supports this data layout.
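
To illustrate what "the client can see from the partitioned key which backend node to address" means, here is a minimal sketch assuming a hash-based token ring. `TokenRingRouter` and its methods are hypothetical names made up for this example, and the CRC32 hash is only a stand-in; real backends use their own partitioners and client drivers.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

/** Hypothetical client-side router: maps a row key to a backend node without a server hop. */
class TokenRingRouter {
    // Token -> node name; each node owns the range ending at its tokens.
    private final TreeMap<Long, String> ring = new TreeMap<>();

    /** Registers a node under several tokens to spread its share of the key space. */
    void addNode(String node, int virtualNodes) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.put(hash(node + "#" + i), node);
        }
    }

    /** Returns the node owning the first token at or after the key's hash, wrapping around. */
    String nodeFor(String rowKey) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no backend nodes registered");
        }
        Map.Entry<Long, String> owner = ring.ceilingEntry(hash(rowKey));
        return (owner != null ? owner : ring.firstEntry()).getValue();
    }

    /** Stand-in hash function (an assumption for this sketch only). */
    static long hash(String value) {
        CRC32 crc = new CRC32();
        crc.update(value.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }
}
```

Because the lookup is a pure function of the key and the ring, every client computes the same owner for a given key and can contact that node directly.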


Regarding DB

Kalpa 1977
 

Hi All,
    My company has an in-house DB like Elasticsearch, but it only supports index-based search and is not columnar.
I see that JanusGraph mostly uses columnar DBs as the primary data storage.
1. Can we plug in our own DB? Is it allowed?
2. Why only columnar as the primary DB? Any reasons?
   An index-based store like Lucene can also support things like getting all the indexed documents.
Any reason why columnar is the primary backend?
thanks,
Kalpa


[ANNOUNCE] JanusGraph 0.6.2 Release

Oleksandr Porunov
 

The JanusGraph Technical Steering Committee is excited to announce the release of JanusGraph 0.6.2.

JanusGraph is an Apache TinkerPop enabled property graph database with support for a variety of storage and indexing backends. Thank you to all of the contributors.

The release artifacts can be found at this location:
    https://github.com/JanusGraph/janusgraph/releases/tag/v0.6.2

A full binary distribution is provided for user convenience:
        https://github.com/JanusGraph/janusgraph/releases/download/v0.6.2/janusgraph-full-0.6.2.zip
 
A truncated binary distribution is provided:
        https://github.com/JanusGraph/janusgraph/releases/download/v0.6.2/janusgraph-0.6.2.zip

The online docs can be found here:
    https://docs.janusgraph.org
 
To view the resolved issues and commits check the milestone here:
    https://github.com/JanusGraph/janusgraph/milestone/23?closed=1

Thank you very much,
Oleksandr Porunov
on behalf of JanusGraph TSC


[RESULT][VOTE] JanusGraph 0.6.2 release

Oleksandr Porunov
 

This vote is now closed with a total of 3 +1s, no +0s and no -1s. The results are:

BINDING VOTES:

+1  (3 -- Florian Hockmann, Oleksandr Porunov, Jan Jansen)
0   (0)
-1  (0)

NON-BINDING VOTES:

+1 (0)
0  (0)
-1 (0)

Thank you very much,
Oleksandr Porunov


Re: [DISCUSS] Discord Server

Florian Hockmann
 

Thanks for your feedback, Oleksandr.

 

We now have 3 people here on the dev list in favour of starting a Discord Server and no one against it. Only one user responded on janusgraph-users in the thread you linked, and that user was also in favour of Discord. I also asked people on Gitter for their opinion [1] but didn’t really get a response there. Since also nobody on Gitter voiced concerns about moving to Discord, I don’t really see a reason against starting a Discord Server.

 

So, I went ahead and created a server. You can join via this link: https://discord.gg/5MnxF82VGw

I’d say we wait a week or so before promoting it also to users, so we have some time to use and test the server ourselves.

 

Regarding links between the TinkerPop server and our own: We can probably create some kind of “welcome” channel where we briefly explain the channels of our own server and then also link to the TinkerPop server for general Gremlin questions. TinkerPop already has such a welcome channel, and we can later ask to be linked there for JanusGraph specific questions.

 

[1]: https://gitter.im/janusgraph/janusgraph?at=6284afb0bd487e746b5c790f

 

From: janusgraph-dev@... <janusgraph-dev@...> On Behalf Of Oleksandr Porunov
Sent: Wednesday, June 1, 2022 12:53
To: janusgraph-dev@...
Subject: Re: [janusgraph-dev] [DISCUSS] Discord Server

 

Related thread in Users group: https://lists.lfaidata.foundation/g/janusgraph-users/topic/discuss_moving_from_gitter/91181812


Re: [VOTE] JanusGraph 0.6.2 release

Jansen, Jan
 

My vote is +1

I must merge the v0.6 branch into master to update the changelog.md.

Greetings,
Jan

From: janusgraph-dev@... <janusgraph-dev@...> on behalf of Oleksandr Porunov via lists.lfaidata.foundation <alexandr.porunov=gmail.com@...>
Sent: Wednesday, June 1, 2022 12:18 PM
To: janusgraph-dev@... <janusgraph-dev@...>
Subject: Re: [janusgraph-dev] [VOTE] JanusGraph 0.6.2 release
 
Thank you Florian for taking care of this release.
I downloaded the JanusGraph Full artifacts, started JanusGraph Server with Cassandra and ElasticSearch, and checked the queries from the `Basic Usage` documentation - looks good.
I quickly checked the content of Sonatype staging repository - looks good.
Release tag commit - looks good.
Changelog record - looks good.

My vote is +1.

Best regards,
Oleksandr Porunov


Re: [DISCUSS] Discord Server

Oleksandr Porunov
 

Related thread in Users group: https://lists.lfaidata.foundation/g/janusgraph-users/topic/discuss_moving_from_gitter/91181812


Re: [DISCUSS] Discord Server

Oleksandr Porunov
 

It looks like Discord stores an unlimited amount of history like Gitter, but it has more features. It's definitely a downside that you need to be registered in Discord even for read-only purposes and that your discussions are not indexed by Google. That said, we do have mailing lists where you don't need to be registered to view messages.
I think it would make sense to start a Discord server for JanusGraph and see how much popularity it gets. I guess we shouldn't abandon Gitter just now because we have 864 people there, but we should prioritize using Discord and wait until it gets the same number of users as Gitter has right now.
As for another aspect - I'm also in favor of using a dedicated server because we will be able to create multiple channels, which should be very useful to structure future discussions.
I didn't research much, but if there is a good way of redirecting users with Gremlin questions to the TinkerPop server and JanusGraph questions to the JanusGraph server, then it would be great. I would imagine some dual-linking between the JanusGraph server and the TinkerPop server, but I'm not sure if that's possible.


Re: [VOTE] JanusGraph 0.6.2 release

Oleksandr Porunov
 

Thank you Florian for taking care of this release.
I downloaded the JanusGraph Full artifacts, started JanusGraph Server with Cassandra and ElasticSearch, and checked the queries from the `Basic Usage` documentation - looks good.
I quickly checked the content of Sonatype staging repository - looks good.
Release tag commit - looks good.
Changelog record - looks good.

My vote is +1.

Best regards,
Oleksandr Porunov


[VOTE] JanusGraph 0.6.2 release

Florian Hockmann
 

Hello,

We are happy to announce that JanusGraph 0.6.2 is ready for release.

The release artifacts can be found at this location:
        https://github.com/JanusGraph/janusgraph/releases/tag/v0.6.2

A full binary distribution is provided for user convenience:
        https://github.com/JanusGraph/janusgraph/releases/download/v0.6.2/janusgraph-full-0.6.2.zip

A truncated binary distribution is provided:
        https://github.com/JanusGraph/janusgraph/releases/download/v0.6.2/janusgraph-0.6.2.zip

The GPG key used to sign the release artifacts is available at:
        https://github.com/JanusGraph/janusgraph/blob/v0.6/KEYS

The docs can be found here:
        https://github.com/JanusGraph/janusgraph/releases/download/v0.6.2/janusgraph-0.6.2-doc.zip

The release tag in Git can be found here:
        https://github.com/JanusGraph/janusgraph/tree/v0.6.2

The release notes are available here:
        https://github.com/JanusGraph/janusgraph/blob/v0.6/docs/changelog.md#version-062-release-date-may-31-2022

This [VOTE] will be open for the next 3 days --- closing Friday, June 3, 2022 at 01:00 PM UTC+2.
All are welcome to review and vote on the release, but only votes from TSC members are binding.
My vote is +1.

Thank you,
Florian Hockmann

 


Re: [DISCUSS] JanusGraph 0.6.2 Release

Florian Hockmann
 

I’ll interpret the lack of replies as lazy consensus and proceed with the release process.

 

From: janusgraph-dev@... <janusgraph-dev@...> On Behalf Of Florian Hockmann
Sent: Wednesday, May 11, 2022 17:21
To: janusgraph-dev@...
Subject: [janusgraph-dev] [DISCUSS] JanusGraph 0.6.2 Release

 

Hi,

 

it has been nearly 4 months now since we released 0.6.1 and I just merged the PR to update TinkerPop to 3.5.3 [1].

 

So, I think that it makes sense to release 0.6.2 soon. Given that the release process is now also partially automated (thanks to Oleksandr’s great work), I’m also more in favour of releasing a maintenance version with only a small number of changes like this one since the release process doesn’t require as much effort anymore.

 

We should of course wait until the distribution tests work again: https://github.com/JanusGraph/janusgraph/pull/3036

 

Are there any other issues or PRs that should be included in this release? I at least don’t see any in the milestone for this version right now [2].

 

Regards,

Florian

 

[1]: https://github.com/JanusGraph/janusgraph/pull/2999

[2]: https://github.com/JanusGraph/janusgraph/milestone/23

 

 


[DISCUSS] JanusGraph 0.6.2 Release

Florian Hockmann
 

Hi,

 

it has been nearly 4 months now since we released 0.6.1 and I just merged the PR to update TinkerPop to 3.5.3 [1].

 

So, I think that it makes sense to release 0.6.2 soon. Given that the release process is now also partially automated (thanks to Oleksandr’s great work), I’m also more in favour of releasing a maintenance version with only a small number of changes like this one since the release process doesn’t require as much effort anymore.

 

We should of course wait until the distribution tests work again: https://github.com/JanusGraph/janusgraph/pull/3036

 

Are there any other issues or PRs that should be included in this release? I at least don’t see any in the milestone for this version right now [2].

 

Regards,

Florian

 

[1]: https://github.com/JanusGraph/janusgraph/pull/2999

[2]: https://github.com/JanusGraph/janusgraph/milestone/23

 

 


Re: [DISCUSS] Discord Server

Florian Hockmann
 

This thread didn’t really see much activity and I’m not sure how to interpret this. Are people here in favour of switching to Discord but don’t have much else to add to the discussion? Or do you just not care much about it?

 

But since we now only have two voices in favour of Discord, I’d next move the discussion to janusgraph-users and to our Gitter channels to ask users for their opinion. If most users support the migration, then I’d say that we go ahead with it.

 

Another aspect of this we might want to discuss is whether we want to create our own Discord server or whether we just want to have a channel on the TinkerPop server which would probably also be an option.

The TinkerPop server would have the advantage that JanusGraph questions are often actually Gremlin questions and discussions about TinkerPop are usually also interesting for JanusGraph.

On the other hand, we might want to stay independent, and we probably also want to be able to create multiple channels (like for dev discussions or backend specific ones).

I think I’m more in favour of a dedicated JanusGraph server, but I still wanted to mention both possibilities in case others see it differently.

 

 

From: janusgraph-dev@... <janusgraph-dev@...> On Behalf Of Boxuan Li
Sent: Wednesday, April 20, 2022 17:35
To: janusgraph-dev@...
Subject: Re: [janusgraph-dev] [DISCUSS] Discord Server

 

Vote for Discord.

 

Many questions on Gitter are not getting answered likely due to its lack of popularity. I personally haven’t been using Gitter for a long time because I don’t use it for any purpose other than answering JanusGraph related questions. Personally, migrating to Discord means I would be more able to help users.

 

The only benefit I like about Gitter is that it is indexed by Google.



On Apr 20, 2022, at 11:27 AM, Florian Hockmann <fh@...> wrote:

 

Hi,

 

we’re currently using Gitter as our chat system where we have two chat rooms, one mostly for users to ask questions and one to discuss development issues.

 

We already discussed moving to a different chat system two years ago as part of a discussion about creating the janusgraph-dev channel where Slack and Discord were mentioned as possible alternatives to Gitter [1].

This discussion about moving to a different chat system didn’t lead to a consensus so we stayed on Gitter. However, in the meantime TinkerPop has started a Discord server [2] which is getting more and more popular (~400 registered users right now, compared to ~250 in January). I’ve also recently noticed more and more JanusGraph questions being asked there so I wanted to bring this topic back up and suggest that we migrate to Discord.

 

Here are some advantages I see in favour of Discord:

  • Same platform that TinkerPop uses -> should make it easier for users
  • Discord seems to be becoming more popular for OSS communities*
  • Built-in support for voice chats

 

A downside of Discord is of course that people need to create an account for it whereas a GitHub/Gitlab/Twitter account is enough for Gitter.

 

Any thoughts on this?

 

* I don’t have numbers to back this up, but Jan mentioned it already in the discussion two years ago and Discord itself lists a few big OSS communities [3].

 

 


Re: [DISCUSS] Discord Server

Boxuan Li
 

Vote for Discord.

Many questions on Gitter are not getting answered likely due to its lack of popularity. I personally haven’t been using Gitter for a long time because I don’t use it for any purpose other than answering JanusGraph related questions. Personally, migrating to Discord means I would be more able to help users.

The only benefit I like about Gitter is that it is indexed by Google.

On Apr 20, 2022, at 11:27 AM, Florian Hockmann <fh@...> wrote:

Hi,
 
we’re currently using Gitter as our chat system where we have two chat rooms, one mostly for users to ask questions and one to discuss development issues.
 
We already discussed moving to a different chat system two years ago as part of a discussion about creating the janusgraph-dev channel where Slack and Discord were mentioned as possible alternatives to Gitter [1].
This discussion about moving to a different chat system didn’t lead to a consensus so we stayed on Gitter. However, in the meantime TinkerPop has started a Discord server [2] which is getting more and more popular (~400 registered users right now, compared to ~250 in January). I’ve also recently noticed more and more JanusGraph questions being asked there so I wanted to bring this topic back up and suggest that we migrate to Discord.
 
Here are some advantages I see in favour of Discord:
  • Same platform that TinkerPop uses -> should make it easier for users
  • Discord seems to be becoming more popular for OSS communities*
  • Built-in support for voice chats
 
A downside of Discord is of course that people need to create an account for it whereas a GitHub/Gitlab/Twitter account is enough for Gitter.
 
Any thoughts on this?
 
* I don’t have numbers to back this up, but Jan mentioned it already in the discussion two years ago and Discord itself lists a few big OSS communities [3].
 


[DISCUSS] Discord Server

Florian Hockmann
 

Hi,

 

we’re currently using Gitter as our chat system where we have two chat rooms, one mostly for users to ask questions and one to discuss development issues.

 

We already discussed moving to a different chat system two years ago as part of a discussion about creating the janusgraph-dev channel where Slack and Discord were mentioned as possible alternatives to Gitter [1].

This discussion about moving to a different chat system didn’t lead to a consensus so we stayed on Gitter. However, in the meantime TinkerPop has started a Discord server [2] which is getting more and more popular (~400 registered users right now, compared to ~250 in January). I’ve also recently noticed more and more JanusGraph questions being asked there so I wanted to bring this topic back up and suggest that we migrate to Discord.

 

Here are some advantages I see in favour of Discord:

  • Same platform that TinkerPop uses -> should make it easier for users
  • Discord seems to be becoming more popular for OSS communities*
  • Built-in support for voice chats

 

A downside of Discord is of course that people need to create an account for it whereas a GitHub/Gitlab/Twitter account is enough for Gitter.

 

Any thoughts on this?

 

* I don’t have numbers to back this up, but Jan mentioned it already in the discussion two years ago and Discord itself lists a few big OSS communities [3].

 

[1]: https://groups.google.com/g/janusgraph-dev/c/5Fp2tQNn_Po/m/WdmLRf3WAgAJ

[2]: https://discord.gg/ndMpKZcBEE

[3]: https://discord.com/open-source


[ANNOUNCE] JanusGraph 0.6.1 Release

Oleksandr Porunov
 

The JanusGraph Technical Steering Committee is excited to announce the release of JanusGraph 0.6.1.

JanusGraph is an Apache TinkerPop enabled property graph database with support for a variety of storage and indexing backends. Thank you to all of the contributors.

The release artifacts can be found at this location:
    https://github.com/JanusGraph/janusgraph/releases/tag/v0.6.1

A full binary distribution is provided for user convenience:
        https://github.com/JanusGraph/janusgraph/releases/download/v0.6.1/janusgraph-full-0.6.1.zip
 
A truncated binary distribution is provided:
        https://github.com/JanusGraph/janusgraph/releases/download/v0.6.1/janusgraph-0.6.1.zip

The online docs can be found here:
    https://docs.janusgraph.org
 
To view the resolved issues and commits check the milestone here:
    https://github.com/JanusGraph/janusgraph/milestone/22?closed=1

Thank you very much,
Oleksandr Porunov
