Alexandr Porunov <alexand...@...>
Hello,
Does anybody know how ids are generated in a distributed environment? I am trying to figure out how JanusGraph generates identifiers for new vertices. As I understand it, each vertex has a 64-bit identifier that determines which partition the vertex is stored in. The only problem is that I cannot find any information about how those IDs are generated. Is there a possibility of duplicates when several instances create new vertices simultaneously? Is it possible to change the identifier of a vertex once it has been persisted? Do instances talk to each other to synchronize identifiers?
Best regards, Alexandr
|
|
Ankur Goel <ankur...@...>
Multiple instances cannot create duplicate vertex ids, because each instance is assigned its own unique block of ids. If part of a block goes unused, those ids are not reassigned to another instance. Check the configuration property:
ids.block-size
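
To illustrate the block idea described above, here is a minimal in-memory sketch. It is not JanusGraph code: `BlockAuthority`, `Instance`, and `distinctIds` are hypothetical names invented for this example, and the shared authority is simulated with an `AtomicLong` rather than a storage backend. It only shows why instances that draw ids from disjoint blocks cannot collide, and why ids left over in a block are simply never handed to anyone else.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicLong;

public class IdBlockSketch {
    /** Hypothetical stand-in for the shared id authority: hands out disjoint blocks. */
    static final class BlockAuthority {
        private final long blockSize;
        private final AtomicLong nextStart = new AtomicLong(1);

        BlockAuthority(long blockSize) {
            this.blockSize = blockSize;
        }

        /** Returns {startInclusive, endExclusive}; every call yields a fresh range. */
        long[] nextBlock() {
            long start = nextStart.getAndAdd(blockSize);
            return new long[] {start, start + blockSize};
        }
    }

    /** Hypothetical graph instance: assigns ids locally from its current block. */
    static final class Instance {
        private final BlockAuthority authority;
        private long next; // next id to hand out
        private long end;  // exclusive end of the current block

        Instance(BlockAuthority authority) {
            this.authority = authority;
        }

        long newId() {
            if (next >= end) { // no block yet, or block exhausted: fetch another
                long[] block = authority.nextBlock();
                next = block[0];
                end = block[1];
            }
            return next++;
        }
    }

    /** Two instances draw idsPerInstance ids each; returns the number of distinct ids. */
    static int distinctIds(long blockSize, int idsPerInstance) {
        BlockAuthority authority = new BlockAuthority(blockSize);
        Instance a = new Instance(authority);
        Instance b = new Instance(authority);
        Set<Long> seen = new HashSet<>();
        for (int i = 0; i < idsPerInstance; i++) {
            seen.add(a.newId());
            seen.add(b.newId());
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // 2 instances x 10 ids each; every id should be distinct.
        System.out.println(distinctIds(4, 10));
    }
}
```

A larger block size means fewer round trips to the authority, at the cost of more ids being left unused when an instance shuts down.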
|
|
Alexandr Porunov <alexand...@...>
Ankur,
Thank you very much for pointing that out. I tried to dig deeper into the id allocation process, but I am still a little confused. I would be very grateful if you could answer one more question. I still don't understand how different JanusGraph instances allocate id blocks. What if two different JanusGraph instances allocate the same id block? For example, if we are using two multi-datacenter Cassandra clusters (which sync asynchronously), we cannot guarantee consistency, because Cassandra is an eventually consistent database. So there is a chance that two different JanusGraph instances allocate the same id block, each of them believing that it alone allocated the block. Does JanusGraph have any protection against this situation? What would happen in such a scenario?
Best regards, Alexandr
|
|
Hi, are there any updates on this? I have been wondering about this question too.
|
|
Evgeniy Ignatiev <yevgeniy...@...>
Hello.
It utilizes transactions (JanusGraph-level) internally:
* https://github.com/JanusGraph/janusgraph/blob/master/janusgraph-core/src/main/java/org/janusgraph/diskstorage/idmanagement/ConsistentKeyIDAuthority.java#L260
* https://github.com/JanusGraph/janusgraph/blob/master/janusgraph-core/src/main/java/org/janusgraph/diskstorage/idmanagement/ConsistentKeyIDAuthority.java#L285
* https://github.com/JanusGraph/janusgraph/blob/1c864333e709a4445c049b051855d726decb56d8/janusgraph-core/src/main/java/org/janusgraph/diskstorage/util/BackendOperation.java#L142
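
For readers who do not want to open the links, the general shape of a claim-and-verify allocation can be sketched as follows. This is a simplified, hypothetical model and not the actual `ConsistentKeyIDAuthority` logic: `ClaimStore`, `tryClaim`, and `claimBlock` are names made up for this example, the backend is an in-memory map, and "first write applied wins" stands in for whatever ordering the real code uses. In particular, the assumption that a losing instance re-reads the highest claimed id and then goes for the next block is a choice made for this sketch and is not confirmed by the linked code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class ClaimRetrySketch {
    /**
     * Hypothetical in-memory stand-in for the id store. The methods are synchronized
     * to mimic the consistent reads and writes a real backend would have to provide.
     */
    static final class ClaimStore {
        // block end -> claimant ids, in the order their writes were applied
        private final TreeMap<Long, List<String>> claims = new TreeMap<>();

        synchronized long highWaterMark() {
            return claims.isEmpty() ? 0L : claims.lastKey();
        }

        synchronized void writeClaim(long blockEnd, String uid) {
            claims.computeIfAbsent(blockEnd, k -> new ArrayList<>()).add(uid);
        }

        synchronized String firstClaimant(long blockEnd) {
            return claims.get(blockEnd).get(0);
        }

        synchronized void deleteClaim(long blockEnd, String uid) {
            claims.get(blockEnd).remove(uid);
        }
    }

    /** One attempt: write a claim, read the claims back, keep the block only if ours is first. */
    static boolean tryClaim(ClaimStore store, String uid, long blockEnd) {
        store.writeClaim(blockEnd, uid);
        // A real system would wait here so that competing writes become visible.
        if (uid.equals(store.firstClaimant(blockEnd))) {
            return true;
        }
        store.deleteClaim(blockEnd, uid); // lost the race: withdraw the claim
        return false;
    }

    /** Retry loop that re-reads the high-water mark before each attempt. Returns {start, end}. */
    static long[] claimBlock(ClaimStore store, String uid, long blockSize, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            long start = store.highWaterMark();
            long end = start + blockSize;
            if (tryClaim(store, uid, end)) {
                return new long[] {start, end};
            }
        }
        throw new IllegalStateException("no id block claimed after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        ClaimStore store = new ClaimStore();
        // Interleave by hand: both instances read the same high-water mark (0).
        long seenByA = store.highWaterMark();
        long seenByB = store.highWaterMark();
        System.out.println("A claims [0,10): " + tryClaim(store, "A", seenByA + 10));
        System.out.println("B claims [0,10): " + tryClaim(store, "B", seenByB + 10));
        long[] retry = claimBlock(store, "B", 10, 5);
        System.out.println("B retries and gets [" + retry[0] + "," + retry[1] + ")");
    }
}
```

The sketch only works because every read of the claims sees every earlier write, which is why the consistency level of the backend matters for this scheme.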
See configuration options for tuning:
https://github.com/JanusGraph/janusgraph/blob/master/janusgraph-core/src/main/java/org/janusgraph/graphdb/configuration/GraphDatabaseConfiguration.java#L700
It also retries when an id block is already allocated in the
backend. A multi-DC setup with default options is probably
safe (?): as far as I can tell from
https://github.com/JanusGraph/janusgraph/blob/master/janusgraph-cql/src/main/java/org/janusgraph/diskstorage/cql/CQLConfigOptions.java
system operations (transactions too) use QUORUM consistency
across DCs by default. You will have to set
NetworkTopologyStrategy in the JanusGraph configuration
(the Janus keyspaces use SimpleStrategy by default).
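
As a rough illustration of the settings mentioned above, the following sketch collects them in a plain `java.util.Properties` object, the same key/value form a JanusGraph properties file uses. The option names are the ones I believe the `ids` and `storage.cql` namespaces expose, so please check them against the linked sources for your version. The data center names `dc1` and `dc2` and the replica counts are purely hypothetical, and `MultiDcConfigSketch` is just a name for this example.

```java
import java.util.Properties;

public class MultiDcConfigSketch {
    /** Builds an example configuration; all values are illustrative assumptions. */
    static Properties multiDcProperties() {
        Properties p = new Properties();
        p.setProperty("storage.backend", "cql");

        // Replace the SimpleStrategy default with NetworkTopologyStrategy.
        p.setProperty("storage.cql.replication-strategy-class", "NetworkTopologyStrategy");
        // Key/value list: hypothetical data center names, each followed by its replica count.
        p.setProperty("storage.cql.replication-strategy-options", "dc1,3,dc2,3");

        // QUORUM is believed to be the default already; it is spelled out here for clarity.
        p.setProperty("storage.cql.read-consistency-level", "QUORUM");
        p.setProperty("storage.cql.write-consistency-level", "QUORUM");

        // Size of the id block each instance reserves at a time.
        p.setProperty("ids.block-size", "10000");
        return p;
    }

    public static void main(String[] args) {
        Properties p = multiDcProperties();
        System.out.println(p.getProperty("storage.cql.replication-strategy-class"));
        System.out.println(p.getProperty("storage.cql.replication-strategy-options"));
    }
}
```

The resulting key/value pairs could be written to a properties file or passed to the graph factory. As far as I know, the replication settings only take effect when JanusGraph creates the keyspace itself, so an existing keyspace would have to be altered in Cassandra directly.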
Best regards,
Evgenii Ignatev.
--
You received this message because you are subscribed to the Google
Groups "JanusGraph users" group.
To unsubscribe from this group and stop receiving emails from it,
send an email to janusgra...@....
To view this discussion on the web visit https://groups.google.com/d/msgid/janusgraph-users/a1b77ea0-bf53-4303-ba6a-44fe9d1def30o%40googlegroups.com.
|
|
Thank you for your reply. I've also read the code in ConsistentKeyIDAuthority.java. What confuses me now is the following: when one claim (claim1) is beaten by another claim (claim2) on block [start, end], does claim1 retry the claim on the same block [start, end], or does it try to claim another block [start_new, end_new]? I think it should be the latter, but I can't find the code that claims another block. It would be very helpful if you could give some hints, thanks.
|
|
Evgeniy Ignatiev <yevgeniy...@...>
--
You received this message because you are subscribed to the Google
Groups "JanusGraph users" group.
To unsubscribe from this group and stop receiving emails from it,
send an email to janusgra...@....
To view this discussion on the web visit https://groups.google.com/d/msgid/janusgraph-users/0286e8ee-54e1-4a22-94d7-324dea7a7aa0o%40googlegroups.com.
|
|