Do you `mgt.commit()`? Do you `mgt.awaitGraphIndexStatus(graph, 'UniqueURI').call()`?
On Sat, Sep 23, 2017 at 6:36 AM, Ajay Srivastava <Ajay.Sr...@...> wrote:
Hi,
I am using janusgraph-0.1.1 with HBase.
The data is being loaded into the graph by three clients connected to the same Gremlin Server. The clients all execute the same code, which checks whether a vertex is already present in the graph and inserts it only if it is not.
While verifying the data, I found the following problem -
scala> graph.V().hasLabel("Root").toList
15:27:22,361 WARN StandardJanusGraphTx:1273 - Query requires iterating over all vertices [(~label = Root)]. For better performance, use indexes
res11: List[gremlin.scala.Vertex] = List(v[737304], v[4136], v[442432])
The result is three vertices.
scala> graph.V().hasLabel("Root").properties("URI").toList
15:27:52,275 WARN StandardJanusGraphTx:1273 - Query requires iterating over all vertices [(~label = Root)]. For better performance, use indexes
res13: List[gremlin.scala.Property[Any]] = List(vp[URI->Root], vp[URI->Root], vp[URI->Root])
The result is three vertices having the same URI.
scala> val uri = Key[String]("URI")
scala> graph.V().has(uri, "Root").toList
res12: List[gremlin.scala.Vertex] = List(v[442432])
Since vertices are uniquely indexed on URI, this result is correct. JanusGraph should not have allowed vertices with the same URI to be inserted, but it did, as the two outputs above show.
I am new to JanusGraph and have many questions -
1) What am I doing wrong here?
2) Can multiple clients writing to the same Gremlin Server create problems?
3) How can I read back the schema I created?
4) Below is the code for creating the schema. Is it correct?
/* Creating three types of vertices having same properties and indexed on same property URI */
def createVertexSchema : Boolean = {
  val vertexLabels = Array("Root", "Lang", "Cocpt")
  val GUID = mgt.makePropertyKey("GUID").dataType(classOf[String]).make
  val Name = mgt.makePropertyKey("Name").dataType(classOf[String]).make
  val URI = mgt.makePropertyKey("URI").dataType(classOf[String]).make
  vertexLabels.foreach {
    vertexLabel =>
      val vLabel = mgt.makeVertexLabel(vertexLabel).make
  }
  mgt.buildIndex("UniqueURI", classOf[Vertex]).addKey(URI).unique().buildCompositeIndex()
  true
}
Regards,
Ajay
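A minimal sketch of how the check-then-insert pattern described above can produce duplicates, assuming all three clients run their existence check before any of their writes becomes visible. `ToyGraph` and `CheckThenInsertRace` are hypothetical in-memory stand-ins written for illustration; they are not JanusGraph APIs and do not model HBase or index behaviour.

```scala
import scala.collection.mutable

// Toy stand-in for a graph: vertex id -> URI. Purely illustrative, not JanusGraph.
final class ToyGraph {
  private val vertices = mutable.LinkedHashMap.empty[Long, String]
  private var nextId = 0L

  def exists(uri: String): Boolean = vertices.values.exists(_ == uri)

  def insert(uri: String): Long = {
    nextId += 1
    vertices(nextId) = uri
    nextId
  }

  def count(uri: String): Int = vertices.values.count(_ == uri)
}

object CheckThenInsertRace {
  // Interleaved schedule: every client checks first, then every client acts on its check.
  def interleaved(graph: ToyGraph, clients: Int, uri: String): Unit = {
    val sawMissing = (1 to clients).map(_ => !graph.exists(uri)) // all reads happen here
    sawMissing.foreach(missing => if (missing) graph.insert(uri)) // all writes happen after
  }

  // Serialized schedule: each client's check and insert complete before the next client starts.
  def serialized(graph: ToyGraph, clients: Int, uri: String): Unit =
    (1 to clients).foreach(_ => if (!graph.exists(uri)) graph.insert(uri))
}
```

With three clients, the interleaved schedule leaves three vertices for the same URI, which matches the three `Root` vertices in the output above, while the serialized schedule leaves one.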
--
You received this message because you are subscribed to the Google Groups "JanusGraph users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to janusgraph-users+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/janusgraph-users/D8F1F502-3482-4A8E-AB9A-5021273DC697%40guavus.com.
For more options, visit https://groups.google.com/d/optout.
|
|
Ajay Srivastava <Ajay.Sr...@...>
Thanks Robert.
I have commit and await in my code. Here is more information -
scala> val mgt = graph.openManagement()
mgt: org.janusgraph.core.schema.JanusGraphManagement = org.janusgraph.graphdb.database.management.ManagementSystem@46f31564
scala> val index = mgt.getGraphIndexes(classOf[Vertex]).iterator.next
index: org.janusgraph.core.schema.JanusGraphIndex = UniqueURI
scala> val properties = index.getFieldKeys
properties: Array[org.janusgraph.core.PropertyKey] = Array(URI)
scala> properties.toList
res30: List[org.janusgraph.core.PropertyKey] = List(URI)
scala> index.getIndexStatus(properties(0))
res29: org.janusgraph.core.schema.SchemaStatus = ENABLED
So, the status of the index is “ENABLED”. Should it be “REGISTERED”?
I deleted the db and recreated the schema, and the awaitGraphIndexStatus call times out -
14:03:04,435 INFO GraphIndexStatusWatcher:81 - Some key(s) on index UniqueURI do not currently have status REGISTERED: URI=ENABLED
14:03:04,435 INFO GraphIndexStatusWatcher:90 - Timed out (PT1M) while waiting for index UniqueURI to converge on status REGISTERED
I waited for half an hour but the status remains “ENABLED”.
Note that there are no records in the db, and I create all vertex/edge properties and indexes in one transaction. I have also tried creating only the vertex properties, labels and index in one transaction, and that does not work either.
Regards,
Ajay
|
|
On Sun, Sep 24, 2017 at 7:33 AM, Robert Dale <rob...@...> wrote:
Enabled is good.
|
|
Kevin Schmidt <ktsc...@...>
|
|
Ajay Srivastava <Ajay.Sr...@...>
Taking the lock on the property and index solved the consistency problem, but it took 10 minutes to ingest the data from three clients, while the same data can be added from one client in less than 2 minutes.
Our system will eventually be 99% read and 1% write, and a single write client would be enough for that. But I need to ingest millions of files to initialise the graph database. At this speed, it will take years to ingest the data.
Regards,
Ajay
On 24-Sep-2017, at 5:04 PM, Robert Dale < rob...@...> wrote:
|
|
Ajay Srivastava <Ajay.Sr...@...>
Hi Kevin,
This is the exact problem I am facing.
So, how are you handling duplicates? I assume that dedup() will not work, as duplicate vertices will have different IDs. And if indexes are not used, read queries are going to be slower.
Regards,
Ajay
On 24-Sep-2017, at 7:55 PM, Kevin Schmidt < ktsc...@...> wrote:
|
|
Ted Wilmes <twi...@...>
Hi Ajay,
If at all possible, I usually try to remove the need for unique constraints on indexes. You mentioned that eventually you'll have about a 99:1 r/w ratio. Does this mean that you'll do a big bulk load up front? If so, could you structure your load so that you do not need to have the unique index enabled and can instead build it after the load? For example, maybe you could load all of your vertices first, and then load the edges. This would require some preprocessing but would speed things up greatly. This is how the TinkerPop BulkLoaderVertexProgram [1], which can be run against Janus, works. Granted, you must first put your data in one of the supported adjacency list formats or provide a custom reader.
If you can't load the vertices separately, maybe you could partition your input data so that reads and writes for any specific vertex are isolated to the same thread. This would let you safely perform a read before write to check for existence without having to worry about race conditions. Combine this with an in-thread cache of which vertices have already been inserted and their corresponding Janus IDs and you'll speed things up.
--Ted
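One way to sketch the partitioning idea above in Scala, assuming vertices are identified by their URI string. `PartitionedLoad`, `partitionOf`, `split`, `Worker`, and the `lookup`/`create` hooks are hypothetical names introduced for illustration; in a real loader the hooks would wrap whatever graph read and insert calls the client uses, and each partition would be handed to its own thread.

```scala
import scala.collection.mutable

object PartitionedLoad {
  // Route a URI to one of `workers` partitions; equal URIs always land on the same worker.
  def partitionOf(uri: String, workers: Int): Int =
    Math.floorMod(uri.hashCode, workers)

  // Split the input so that each worker receives every occurrence of the URIs it owns.
  def split(uris: Seq[String], workers: Int): Map[Int, Seq[String]] =
    uris.groupBy(partitionOf(_, workers))

  // Per-worker get-or-create with a local cache of URI -> vertex id.
  // `lookup` (read from the graph) and `create` (insert into the graph) are assumed hooks.
  final class Worker(lookup: String => Option[Long], create: String => Long) {
    private val cache = mutable.HashMap.empty[String, Long]

    def getOrCreate(uri: String): Long =
      // The graph is consulted only on a cache miss; an insert happens only if the read finds nothing.
      cache.getOrElseUpdate(uri, lookup(uri).getOrElse(create(uri)))
  }
}
```

Because a given URI is only ever handled by one worker, the read-before-write inside `getOrCreate` cannot race with another worker's write for that URI, and repeated URIs are answered from the cache without touching the graph.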
|
|
Kevin Schmidt <ktsc...@...>
The way we handled it was to not use locks or a unique index but to keep a non-unique index, accept that there may be duplicate vertices, and either construct our traversals to handle them or periodically detect the duplicates and remove or fix them.
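A rough sketch of the periodic duplicate check described above, assuming the vertex ids and URIs have already been read out of the graph into memory. `DuplicateSweep`, `VertexRef`, and `Plan` are hypothetical names, and keeping the lowest id in each group is an arbitrary policy chosen only for illustration.

```scala
object DuplicateSweep {
  final case class VertexRef(id: Long, uri: String)
  final case class Plan(keep: VertexRef, remove: Seq[VertexRef])

  // Group vertices by URI and, for every URI that occurs more than once,
  // choose one vertex to keep (lowest id, by assumption) and list the rest for removal.
  def plan(vertices: Seq[VertexRef]): Seq[Plan] =
    vertices
      .groupBy(_.uri)
      .values
      .toSeq
      .filter(_.size > 1)
      .map { group =>
        val sorted = group.sortBy(_.id)
        Plan(keep = sorted.head, remove = sorted.tail)
      }
      .sortBy(_.keep.uri)
}
```

The plan only identifies duplicates. Actually removing them would presumably also require moving any edges from the removed vertices onto the kept one, which this sketch does not attempt.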
|
|
Ajay Srivastava <Ajay.Sr...@...>
Thanks Ted.
I am working on it.
Regards,
Ajay
On 02-Oct-2017, at 8:00 PM, Ted Wilmes < twi...@...> wrote:
|
|