JanusGraph embedded multi-instance (JVM) data sync issue


Pawan Shriwas
 

Hi All,

I am facing a problem synchronizing data between multiple JanusGraph instances running in embedded mode.

If we create some data in the graph using JVM 1 and commit it, a subsequent read of the same data from JVM 2 does not reflect the change for some time.

I want the same information to be available to all instances after any CRUD operation, as soon as it is committed.

I am using the same graph properties in all instances of embedded JanusGraph.

##############graph.properties#####################
gremlin.graph=org.janusgraph.core.JanusGraphFactory
storage.backend=cql
storage.hostname=cql-dns
storage.cql.keyspace=janusgraphdbks
storage.port=30808
storage.username=user123
storage.password=user12345
schema.default=none
schema.constraints=true

index.search-central-graph.backend=elasticsearch
index.search-central-graph.hostname=api-es-instance1:9200
index.search-central-graph.index-name=search-central-graph
index.search-central-graph.elasticsearch.http.auth.type=basic
index.search-central-graph.elasticsearch.http.auth.basic.username=admin
index.search-central-graph.elasticsearch.http.auth.basic.password=admin 

cache.db-cache = true
cache.db-cache-clean-wait = 20
cache.db-cache-time = 180000
cache.db-cache-size = 0.25
query.batch = true
query.fast-property = true
query.batch-property-prefetch = true
storage.buffer-size=1024
######################property file end############################

Please let me know if anyone has faced this and how to prevent it.

Thanks,
Pawan



Boxuan Li
 

Hi Pawan,

A couple of questions:

1. What do you mean by creating some data? For example, do you mean creating new vertices or just updating existing vertices? If it’s the latter, you could try turning off the cache.db-cache option, as it might lead to stale reads.

2. What is your typical “duration” after which data gets reflected?

3. What is your cql replication factor and read & write consistency levels? Are they default values? Also, how many Cassandra nodes do you have and are they in the same data center?

Best,
Boxuan



Pawan Shriwas
 

The same thing also happens with two or more Gremlin Consoles: when we create or update something on console 1, it is not reflected on the others.


--
Thanks & Regard

PAWAN SHRIWAS


Pawan Shriwas
 

Hi Boxuan,

Please see my inline responses.

1. What do you mean by creating some data? For example, do you mean creating new vertices or just updating existing vertices? If it’s the latter, you could try turning off the cache.db-cache option, as it might lead to stale reads.
[Pawan] - Creating data means vertex/edge creation as well as updates.

2. What is your typical “duration” after which data gets reflected?  
[Pawan] - It seems to be within one or two minutes.

3. What is your cql replication factor and read & write consistency levels? Are they default values? Also, how many Cassandra nodes do you have and are they in the same data center?
[Pawan] - These should be the defaults; I am using only the graph properties listed in my first mail. It is an 8-node cluster (3 master + 5 nodes). All Cassandra nodes are in the same data center.

Thanks,
Pawan


hadoopmarc@...
 

Hi Pawan,

Your requirement for instant synchronization cannot work with JanusGraph caches enabled, because JanusGraph will get data from the cache if available, instead of getting the latest data from the backend. So,

  • cache.db-cache = false
  • be sure to start a new transaction before querying for the latest data (e.g. by executing a g.tx().commit())
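
As a rough illustration of why an instance-local cache conflicts with instant synchronization, the sketch below models two instances sharing one backend. It is a toy model, not JanusGraph code: the class `StaleCacheDemo`, its `Instance` type, and the key `v1.name` are hypothetical, and the expiry of 180000 ms simply mirrors the `cache.db-cache-time` value from the posted properties. The clock is passed in explicitly so the result is deterministic.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: all names are hypothetical, and this is not how
// JanusGraph implements its database-level cache internally.
class StaleCacheDemo {
    // Shared storage backend, standing in for Cassandra.
    static final Map<String, String> backend = new HashMap<>();

    // One embedded instance with an optional, time-limited local cache.
    static class Instance {
        final boolean cacheEnabled;
        final long cacheTimeMs;
        final Map<String, String> cachedValue = new HashMap<>();
        final Map<String, Long> cachedAt = new HashMap<>();

        Instance(boolean cacheEnabled, long cacheTimeMs) {
            this.cacheEnabled = cacheEnabled;
            this.cacheTimeMs = cacheTimeMs;
        }

        String read(String key, long nowMs) {
            if (cacheEnabled && cachedAt.containsKey(key)
                    && nowMs - cachedAt.get(key) < cacheTimeMs) {
                return cachedValue.get(key); // served locally, may be stale
            }
            String fresh = backend.get(key);
            if (cacheEnabled) {
                cachedValue.put(key, fresh);
                cachedAt.put(key, nowMs);
            }
            return fresh;
        }

        void write(String key, String value, long nowMs) {
            backend.put(key, value);
            if (cacheEnabled) {
                // A write refreshes only the writer's own cache.
                cachedValue.put(key, value);
                cachedAt.put(key, nowMs);
            }
        }
    }

    // What JVM 2 reads at readAtMs, after JVM 1 updated the key at t = 1000 ms.
    static String readAfterRemoteUpdate(boolean cacheEnabled, long readAtMs) {
        backend.clear();
        backend.put("v1.name", "old");
        Instance jvm1 = new Instance(cacheEnabled, 180_000);
        Instance jvm2 = new Instance(cacheEnabled, 180_000);
        jvm2.read("v1.name", 0);             // JVM 2 caches "old" if caching is on
        jvm1.write("v1.name", "new", 1_000); // JVM 1 commits an update
        return jvm2.read("v1.name", readAtMs);
    }

    public static void main(String[] args) {
        System.out.println(readAfterRemoteUpdate(true, 60_000));  // prints "old"
        System.out.println(readAfterRemoteUpdate(true, 200_000)); // prints "new"
        System.out.println(readAfterRemoteUpdate(false, 60_000)); // prints "new"
    }
}
```

In this model the second instance keeps returning the old value until its cached entry expires, which is the kind of delay described in the thread; with the cache disabled, every read goes to the backend.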
Best wishes,    Marc


Pawan Shriwas
 

Hi Marc,

I have removed the cache properties from the instances, and we already start a new transaction for each API operation, but we still see stale data in other instances for some time.

Below is the code used to create the new transaction for each operation.

In my embedded JanusGraph service, we always create a new transaction for each API operation using the code below, and commit or roll back at the end of the operation. Sometimes it works and sometimes it does not. Is this a sync issue between graph instances in multiple services (JVMs)?

// Create the graph instance (once, at service start)
  String filePath = ConfigUtils.getString(GraphConstants.GRAPH_FILE_PATH);
  JanusGraph graphinstance = embeddedConnection.open(filePath);

// Create a transaction for each API operation
  JanusGraphTransaction threadedTransaction = graphinstance.getGraphInstance().newTransaction();

// Commit or roll back at the end of each API operation
  threadedTransaction.commit();
  // or
  threadedTransaction.rollback();

Let me know if there is anything in the configuration or code that I should try.

Thanks,
Pawan


hadoopmarc@...
 

Hi Pawan,

OK, let's investigate further. You say that the issue occurs for both vertex creation and modification. Let's take the clearest case first: vertex creation with an indexed property. So, in your system setup, if you have added a new vertex with embedded instance1, it sometimes takes a minute or more before a query for this vertex (based on its property value) on instance2 returns the vertex. This can only mean that the Elasticsearch index sometimes does not return the new property value. This in turn means that an Elasticsearch replica has not yet been synced with the data about the new vertex.

Indeed, the janusgraph-elastic configs have a key index.[X].elasticsearch.bulk-refresh (default: false) which can be set to any of the values in:
https://www.elastic.co/guide/en/elasticsearch/reference/7.16/docs-refresh.html

One can check the correspondence between this janusgraph config item and the elasticsearch API parameter in:
https://github.com/JanusGraph/janusgraph/blob/v0.6.0/janusgraph-es/src/main/java/org/janusgraph/diskstorage/es/rest/RestElasticSearchClient.java

So, can you see what happens with the other possible values for index.[X].elasticsearch.bulk-refresh?
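
To try the other values, one option is to generate one configuration variant per refresh setting. The sketch below does this with plain `java.util.Properties`. It assumes the index name `search-central-graph` from the properties posted earlier and the three values `false`, `true` and `wait_for` from the linked Elasticsearch page; the class and method names are hypothetical, and passing the resulting properties to JanusGraph is left out.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper: builds configuration variants only, it does not open a graph.
class BulkRefreshOverride {
    // "search-central-graph" is the index name used in the posted graph.properties.
    static final String KEY = "index.search-central-graph.elasticsearch.bulk-refresh";

    // Returns a copy of the base settings with bulk-refresh set to the given value.
    static Properties withBulkRefresh(Properties base, String value) {
        Properties copy = new Properties();
        copy.putAll(base);
        copy.setProperty(KEY, value);
        return copy;
    }

    public static void main(String[] args) throws IOException {
        // Shortened stand-in for the real graph.properties file.
        Properties base = new Properties();
        base.load(new StringReader(
                "storage.backend=cql\n"
                + "index.search-central-graph.backend=elasticsearch\n"));

        for (String value : new String[] {"false", "true", "wait_for"}) {
            Properties variant = withBulkRefresh(base, value);
            System.out.println(KEY + "=" + variant.getProperty(KEY));
        }
    }
}
```

Each variant could then be written to its own properties file and tested in turn with the two-instance scenario described above.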

Best wishes,    Marc


Pawan Shriwas
 

Hi Marc,

I don't think the stale data is caused only by Elasticsearch or the mixed index. I have seen this with basic properties and vertices without an index as well. I suggest we work on the basic vertex/property case first, and then look at the mixed index cases.

Any suggestions for the basic case without an index backend?

Thanks,
Pawan



hadoopmarc@...
 

Hi Pawan,

You are right, if issues already arise without index, you should investigate that first, even though a large graph without indices is useless in itself.
See the third question from Boxuan Li above, in particular:
https://docs.datastax.com/en/cassandra-oss/3.0/cassandra/dml/dmlAboutDataConsistency.html

Best wishes,   Marc


Pawan Shriwas
 

Hi Marc,

Thanks for your suggestion.

However, I am testing on a local environment with a replication factor of 1. I believe that with a replication factor of 1, the other instances should see the same data in all cases.

Screenshot 2022-01-23 at 5.11.26 PM.png

See the local property file below:

gremlin.graph=org.janusgraph.core.JanusGraphFactory
storage.backend=cql
storage.hostname=127.0.0.1
storage.cql.keyspace=janusgraph
storage.port=9042
schema.constraints=true
############ CQL Properties ############

storage.cql.read-consistency-level=LOCAL_QUORUM
storage.cql.write-consistency-level=LOCAL_QUORUM
storage.cql.replication-factor=1
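
The reasoning that these settings rule out replica lag can be checked with the usual quorum arithmetic: a quorum is floor(RF / 2) + 1 replicas, and a read is guaranteed to overlap the latest acknowledged write when the number of replicas read plus the number written exceeds the replication factor. The sketch below is only that arithmetic, with hypothetical class and method names, and it assumes a single data center so that LOCAL_QUORUM behaves like QUORUM.

```java
// Arithmetic sketch only: hypothetical names, no Cassandra driver involved.
class QuorumCheck {
    // Quorum size for a given replication factor: floor(RF / 2) + 1.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // True when every read set must intersect every write set (R + W > RF).
    static boolean readsSeeLatestWrite(int readReplicas, int writeReplicas, int replicationFactor) {
        return readReplicas + writeReplicas > replicationFactor;
    }

    public static void main(String[] args) {
        for (int rf : new int[] {1, 3}) {
            int q = quorum(rf);
            System.out.println("RF=" + rf + " quorum=" + q
                    + " overlap=" + readsSeeLatestWrite(q, q, rf));
        }
        // For comparison: reading and writing a single replica with RF=3.
        System.out.println("RF=3 ONE/ONE overlap=" + readsSeeLatestWrite(1, 1, 3));
    }
}
```

With a replication factor of 1 the quorum is a single replica and reads always overlap writes, so under these assumptions the delay seen locally is unlikely to come from replica synchronization.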

Please see the attached API code for create, update, and get in the local sample application. Let me know if something is wrong here, because the data is not refreshed on another embedded instance with the same configuration.

Thanks,
Pawan
 


hadoopmarc@...
 

Hi Pawan,

Interesting, I could not find a JanusGraph unit test for this basic scenario (there is one with two instances and an index, though). This needs more investigation.

Meanwhile, are you sure that you have no hidden caching configs in the Spring Framework REST service?

Best wishes,    Marc


Pawan Shriwas
 

Hi Marc,

All the code and property configuration were shared in my previous mail. I assume that if the cache properties are not provided, they default to false.

Thanks,
Pawan
