"anj...@gmail.com" <anjani...@...>
Hi All,
We are using JanusGraph 0.5.2 with Cassandra and Elasticsearch. Currently we add and update nodes with Gremlin queries from Java.
We have a use case where we need to update multiple nodes for a given piece of metadata. We want to make sure the updates to those nodes are transactional and that, while they are happening, no other thread can update the same nodes.
Through Gremlin queries, do we have an option to:
- achieve transactional updates?
- lock/unlock nodes for updates?
Appreciate your thoughts/inputs.
Thanks, Anjani
|
|
Hi Anjani,
I am not sure I understand your question, or whether it already takes the following into account:
https://docs.janusgraph.org/advanced-topics/eventual-consistency/#data-consistency
What aspect of transactions are you missing? You can choose between tx.commit() after a successful insertion and tx.rollback() in case of exceptions.
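For readers who want the commit/rollback choice spelled out, here is a minimal sketch in plain Java. The `Tx` interface and the `TxTemplate` class are hypothetical stand-ins introduced only for this illustration (they model a handle with `commit()` and `rollback()`, such as the one returned by `g.tx()`); nothing here is JanusGraph's own API.

```java
import java.util.ArrayList;
import java.util.List;

final class TxTemplate {
    /** Hypothetical stand-in for a transaction handle with commit() and rollback(). */
    interface Tx {
        void commit();
        void rollback();
    }

    /**
     * Runs the work, then commits. If the work or the commit throws,
     * rolls back and rethrows so the caller can decide what to do next.
     */
    static void inTransaction(Tx tx, Runnable work) {
        try {
            work.run();
            tx.commit();
        } catch (RuntimeException e) {
            tx.rollback();
            throw e;
        }
    }

    public static void main(String[] args) {
        // A fake transaction that only records which method was called.
        List<String> log = new ArrayList<>();
        Tx tx = new Tx() {
            @Override public void commit() { log.add("commit"); }
            @Override public void rollback() { log.add("rollback"); }
        };
        inTransaction(tx, () -> log.add("update several nodes"));
        System.out.println(log);
    }
}
```

With a real graph, the `work` argument would hold the Gremlin updates, and the two `Tx` methods would delegate to `g.tx().commit()` and `g.tx().rollback()`.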
Please clarify!
Marc
(In reply to anj...@...'s message of Friday, 30 October 2020, 08:15:36 UTC+1.)
|
|
"anj...@gmail.com" <anjani...@...>
Hi Marc,
Thanks for your response. I had looked at the page you shared earlier. My understanding from it is that we can define consistency at the property level: if the same property is modified by two different threads, the backend performs a consistency check, and the transaction either succeeds or throws a locking exception. But this applies to a property of a single node.
In my case I want to add/update a property on multiple nodes based on some condition. For example, based on some rules we see that some nodes are related and we want to group them. For that we want to add/update one property on multiple nodes, say on 5 nodes. In that case we want to lock all 5 nodes, update them and then release the locks.
- If the update to any of the nodes fails, we should roll back the updates to the other nodes as well.
- While the update to the 5 nodes is in progress, no other thread should modify that property.
Thanks, Anjani
(In reply to HadoopMarc's message of Friday, 30 October 2020, 19:26:10 UTC+5:30.)
|
|
Hi Anjani,
Do you mean that there are still (extremely rare) failure situations possible despite the use of locking and of JanusGraph transactions? I am not sure I can think of one, and it would depend on ill-timed failures in the backend (e.g. a power failure). One thing to check, and which you can test properly, is whether all mutations in the JanusGraph transaction are sent to the backend in a single network request (otherwise JanusGraph could persist two of the five nodes and then fail). There are various configuration properties that might influence this:
- query.batch
- storage.cql.atomic-batch-mutate
- storage.cql.batch-statement-size
Also see the comments for the tx.log-tx property.
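As a starting point for such a test, the four option names above can be collected in one place. The sketch below only builds a plain `java.util.Properties` object; the class name `BatchingConfig` is hypothetical, and the values are illustrative assumptions, not recommendations. Check each option against the configuration reference for your JanusGraph version before relying on it.

```java
import java.util.Properties;

final class BatchingConfig {
    /** Collects the options named in this thread; all values are illustrative only. */
    static Properties batchingOptions() {
        Properties p = new Properties();
        // Batching of backend queries issued by traversals.
        p.setProperty("query.batch", "true");
        // Whether the CQL backend sends mutations as an atomic batch.
        p.setProperty("storage.cql.atomic-batch-mutate", "true");
        // Number of statements per CQL batch (assumed value).
        p.setProperty("storage.cql.batch-statement-size", "20");
        // Transaction logging in JanusGraph.
        p.setProperty("tx.log-tx", "true");
        return p;
    }

    public static void main(String[] args) {
        // Prints the entries so they can be copied into a properties file.
        batchingOptions().list(System.out);
    }
}
```

The entries would normally live in the properties file used to open the graph. Varying them one at a time while injecting failures is one way to test whether a five-node update can be persisted only partially.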
HTH, Marc
(In reply to anj...@...'s message of Friday, 30 October 2020, 15:58:39 UTC+1.)
|
|
"anj...@gmail.com" <anjani...@...>
Hi Marc,
Thanks for your detailed response. My understanding is that a node is locked automatically during an operation and released right after it, without waiting for the commit.
Suppose I need to update 3 nodes. I can write it as below. This way, if there is an exception for any of the nodes, nothing is committed, so I can control the outcome.
    try {
        g.V(4104).property("NodeUpdatedDate", new Date()).next();
        g.V(4288).property("NodeUpdatedDate", new Date()).next();
        g.V(4188).property("NodeUpdatedDate", new Date()).next();
        g.tx().commit();
    } catch (Exception e) {
        g.tx().rollback(); // discard the partial updates
        // Recover, retry
    }
With this, the first node V(4104) is locked by the thread while its update is happening, but the lock is released while the updates for the other nodes V(4288) and V(4188) are happening. That means another thread can update V(4104) before the transaction is committed, which might result in data inconsistency.
I was thinking of acquiring a lock on all nodes before doing any operation on them, something like:
    g.V(4288).lock(), g.V(4104).lock(), g.V(4188).lock()
After locking explicitly, perform the operations and unlock as part of the commit.
Thanks, Anjani
(In reply to HadoopMarc's message of Saturday, 31 October 2020, 17:01:05 UTC+5:30.)
|
|
Hi Anjani,
See step 4 in the ref docs link I sent earlier: the locks are not released until the entire transaction is committed or rolled back.
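Because the whole transaction either commits or rolls back, the "Recover, retry" branch of the earlier snippet can simply run the same unit of work again. Below is a minimal sketch of such a retry loop in plain Java. The `Retry` class, its parameters and the linear backoff are assumptions made for this illustration, not part of JanusGraph or TinkerPop.

```java
import java.util.function.Supplier;

final class Retry {
    /**
     * Calls attempt up to maxAttempts times, sleeping backoffMillis * n after
     * the n-th failure. Returns the first successful result, or rethrows the
     * last failure once the attempts are used up or the thread is interrupted.
     */
    static <T> T withRetry(int maxAttempts, long backoffMillis, Supplier<T> attempt) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int i = 1; i <= maxAttempts; i++) {
            try {
                return attempt.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again
            }
            if (i < maxAttempts) {
                try {
                    Thread.sleep(backoffMillis * i);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break; // stop retrying when interrupted
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String outcome = withRetry(3, 10L, () -> {
            if (++calls[0] < 2) {
                throw new IllegalStateException("simulated lock conflict");
            }
            return "committed on attempt " + calls[0];
        });
        System.out.println(outcome);
    }
}
```

In real use, the supplier would contain the three property updates followed by `g.tx().commit()`, with `g.tx().rollback()` called before rethrowing on failure. Retrying on every `RuntimeException` is a simplification; production code would more likely retry only on lock-related failures.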
Marc
(In reply to anj...@...'s message of Monday, 2 November 2020, 13:21:57 UTC+1.)
|
|
"anj...@gmail.com" <anjani...@...>
Thanks Marc, I missed step 4. Thank you very much for pointing it out.
One more question: our graph is already running in production and its properties are defined, but no consistency is set on them. If I add a consistency modifier to existing properties, will it be picked up?
Thanks, Anjani
(In reply to HadoopMarc's message of Monday, 2 November 2020, 21:41:18 UTC+5:30.)
|
|
Hi Anjani,
No, existing schema elements cannot be modified (apart from renames), see:
https://docs.janusgraph.org/basics/schema/#changing-schema-elements
Best wishes, Marc
(In reply to anj...@...'s message of Tuesday, 3 November 2020, 11:08:00 UTC+1.)
|
|
"anj...@gmail.com" <anjani...@...>
Thanks, Marc, for your help and time.
Regards, Anjani
(In reply to HadoopMarc's message of Tuesday, 3 November 2020, 21:45:33 UTC+5:30.)
|
|