
Re: How to circumvent transaction cache?

Boxuan Li
 

Hi Timon,

I don’t think you will be able to disable the transaction cache even by using createThreadedTx(), or equivalently, newTransaction()/buildTransaction(). Unfortunately, as long as your transaction is not readOnly(), the effective size of the vertex cache will be Math.max(100, cache.tx-cache-size).

To the best of my knowledge, you can only disable the transaction-level cache completely by modifying the JanusGraph source code. A workaround would be to always start a new transaction to check whether the value has changed.
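A minimal sketch of that workaround in the Gremlin console, assuming a `branchId` value and the `isPublished` property key from Timon's description (the helper name is hypothetical):

```groovy
// Hypothetical helper: check isPublished in a fresh transaction so the
// read bypasses the current thread's transaction-level vertex cache.
def isStillUnpublished(graph, branchId) {
    def tx = graph.newTransaction()   // independent tx with its own cache
    try {
        def g = tx.traversal()
        return !(g.V(branchId).values('isPublished').next() as Boolean)
    } finally {
        tx.rollback()                 // read-only check, nothing to commit
    }
}
```

The main (thread-local) transaction stays open; only the check runs in the short-lived transaction.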

Best regards,
Boxuan

On Wed, Mar 3, 2021 at 9:11 PM <timon.schneider@...> wrote:

Our application has transactions editing many vertices representing elements of a branch. This branch is also represented by a vertex that has boolean property isPublished. Before committing such a transaction, we need to know whether another user set the isPublished property on the branch vertex to true, in which case the transaction should be rolled back.

Here’s the problem:
* User A reads the branch vertex but doesn’t close transaction
* User B changes the isPublished property to true and commits (while A is still making changes)
* User A read locks the vertex with an external locking API
* User A queries the branch vertex again (to make sure isPublished is still false) in the same thread but gets the old values because of the transaction cache.
Now user A can commit data even though the branch isPublished is true.

I know it’s possible to use createThreadedTx() to circumvent the ThreadLocal transaction cache. However, such refreshes will be very common in our application and ideally we would be able to execute a refresh within the main transaction to minimise complexity and workarounds. Is this possible? And if not, are there any possibilities to turn off transaction cache entirely?

Thanks in advance,
Timon


How to delete ghost vertices and ghost edges?

vamsi.lingala@...
 

gremlin> g.V(6389762617560).valueMap()
==>{}
gremlin>
gremlin> g.V().hasLabel("MAID").has("madsfid","sfmsdlk").outE("MAIH1").as("e").inV().as("v").select("e", "v").by(valueMap())
==>{e={}, v={}}
==>{e={}, v={}}
==>{e={}, v={}}
==>{e={}, v={}}
==>{e={}, v={}}
==>{e={}, v={}}
==>{e={}, v={}}
==>{e={}, v={}}
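Empty valueMaps like the ones above are the typical symptom of ghost elements. There is no documented one-call cleanup, but one workaround is to drop known ghost ids directly; a sketch, using the id from the first query:

```groovy
// Sketch: drop a known ghost vertex by id; its incident (ghost) edges
// are removed along with it.
g.V(6389762617560L).drop().iterate()
g.tx().commit()
```

(The JanusGraph code base also contains a GhostVertexRemover scan job, but it is not exposed as a documented management API.)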


Re: Gremlin Query to return count for nodes and edges

Vinayak Bali
 

Hi Marc,

The backend used is Cassandra. I was just wondering whether we can load the data from Cassandra's data store into the in-memory backend to speed up the process.
I tried OLAP by configuring Hadoop and Spark with the help of the references shared in the documentation. A simple query to retrieve 1 node from the graph took around 5 mins.
Based on your experience, could you share the steps to follow to solve the issue?

Thanks & Regards,
Vinayak

On Wed, Feb 24, 2021 at 9:32 PM <hadoopmarc@...> wrote:
Hi Vinayak,

Speeding up your query depends on your setup. 15,000 vertices/second is already fast. Is this the JanusGraph inmemory backend? Or ScyllaDB?

In a perfect world, not there yet, your query would profit from parallelization (OLAP). JanusGraph supports both the withComputer() and withComputer(SparkGraphComputer) start steps, but the former is undocumented and the performance gains of the latter are often disappointing.
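For reference, the two start steps look roughly like this in the Gremlin console (a sketch; 'conf/hadoop-graph.properties' is a placeholder for a HadoopGraph configuration file):

```groovy
// TinkerPop's built-in graph computer (undocumented for JanusGraph):
g = graph.traversal().withComputer()

// Spark-based OLAP, run against a HadoopGraph wrapper:
hgraph = GraphFactory.open('conf/hadoop-graph.properties')
hg = hgraph.traversal().withComputer(SparkGraphComputer)
```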

Best wishes,    Marc


Re: Authentication in JanusGraph Server

hadoopmarc@...
 

Sorry for asking, but you did not state it explicitly: you did modify your sasl-remote.yaml file to reflect the new ('graham', 'sasl-password') credentials, did you?

Marc


Authentication in JanusGraph Server

Graham Wallis <grahamwallis.dev@...>
 

Hi, 

I've been trying to use authentication over a websocket connection to a JanusGraph Server. 

If I configure the server to use a SimpleAuthenticator and a TinkerGraph for the credentials, as described in the Tinkerpop documentation, it works. 

In this mode, my gremlin-server.yaml is configured for authentication as follows: 

authentication: {
  authenticator: org.apache.tinkerpop.gremlin.server.auth.SimpleAuthenticator,
  authenticationHandler: org.apache.tinkerpop.gremlin.server.handler.SaslAuthenticationHandler,
  config: {
    credentialsDb: conf/tinkergraph-credentials.properties
  }
}

where the tinkergraph-credentials.properties file is the same as the example from Tinkerpop:

gremlin.graph=org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph 
gremlin.tinkergraph.vertexIdManager=LONG 
gremlin.tinkergraph.graphLocation=data/credentials.kryo 
gremlin.tinkergraph.graphFormat=gryo 

My gremlin-server.yaml also has the following SSL configuration: 

ssl: {
  enabled: true,
  sslEnabledProtocols: [TLSv1.2],
  keyStore: server.jks,
  keyStorePassword: mykeystore
}

I've created a self-signed certificate for localhost, added it to the server.jks keystore (with the key password the same as the store password). 
Because my client (console) is on the same machine as the server, I used the server.jks keystore as the truststore for the client, and created 
a sasl-remote.yaml file for the client, with the following: 

hosts: [localhost] 
port: 8182 
serializer: { className: org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1, config: { serializeResultToString: true }} 
username: stephen 
password: password 
connectionPool: { 
  enableSsl: true, 
  sslEnabledProtocols: [TLSv1.2], 
  trustStore: server.jks, 
  trustStorePassword: mykeystore 
} 
I can start a gremlin-console and connect to the server, using the credentials ("stephen", "password"). 

:remote connect tinkerpop.server conf/sasl-remote.yaml session 

and subsequent remote operations against my (real) graph succeed. 

The above all works nicely. I can step through the invocation of SimpleAuthenticator's authenticate() method in the server in the debugger and it does exactly what you'd expect. 


If I try to do the same using a JanusGraph DB to store the credentials I can't get the client to authenticate. 

I tried using the following janusgraph-credentials-server.properties file for my credentials store: 

gremlin.graph=org.janusgraph.core.JanusGraphFactory 
storage.backend=berkeleyje 
storage.directory=../cred/berkeley 

And changed my gremlin-server yaml as follows: 

authentication: {
  authenticator: org.janusgraph.graphdb.tinkerpop.gremlin.server.auth.JanusGraphSimpleAuthenticator,
  authenticationHandler: org.apache.tinkerpop.gremlin.server.handler.SaslAuthenticationHandler,
  config: {
    defaultUsername: graham,
    defaultPassword: sasl-password,
    credentialsDb: conf/janusgraph-credentials-server.properties
  }
}

The ../cred/berkeley database is created during start of the gremlin-server. If I subsequently stop the server and open the credentials database using a gremlin-console (locally), I can see that the default user has been added to it, the vertex is correctly labelled (as 'user'), and the username and (hashed) password match. So the credentials store looks OK.

However, if I now create a connection to the server and try to perform a remote operation, it doesn't authenticate and always results in "Username and/or password are incorrect".

Stepping through the server code in the debugger, I noticed that the JanusGraphSimpleAuthenticator authenticate() method is never called, because the handler calls the SimpleAuthenticator's authenticate() method directly. This is probably fine as the former delegates to the latter anyway. But when the SimpleAuthenticator's authenticate() actually performs the credentials traversal, it does not find the user. 

I wondered whether I should be using a JanusGraph-specific authentication handler, but that doesn't look like it would help; for a websocket connection the SaslAndHMACAuthenticationHandler will delegate to the channelRead method of its superclass, i.e. SaslAuthenticationHandler, which is the same as the above. The only difference I can see in the code is that the SimpleAuthenticator is using a TinkerPop generic Graph to create its CredentialTraversalSource, whereas the JanusGraphSimpleAuthenticator uses a JanusGraph.

Can anyone see what I'm doing wrong?


Best regards,
 Graham

Linux Foundation LFAIData
Project: Egeria


How to circumvent transaction cache?

timon.schneider@...
 

Our application has transactions editing many vertices representing elements of a branch. This branch is also represented by a vertex that has boolean property isPublished. Before committing such a transaction, we need to know whether another user set the isPublished property on the branch vertex to true, in which case the transaction should be rolled back.

Here’s the problem:
* User A reads the branch vertex but doesn’t close transaction
* User B changes the isPublished property to true and commits (while A is still making changes)
* User A read locks the vertex with an external locking API
* User A queries the branch vertex again (to make sure isPublished is still false) in the same thread but gets the old values because of the transaction cache.
Now user A can commit data even though the branch isPublished is true.

I know it’s possible to use createThreadedTx() to circumvent the ThreadLocal transaction cache. However, such refreshes will be very common in our application and ideally we would be able to execute a refresh within the main transaction to minimise complexity and workarounds. Is this possible? And if not, are there any possibilities to turn off transaction cache entirely?

Thanks in advance,
Timon


Re: Not able to reindex with bigtable as backend

hadoopmarc@...
 

The vertex centric index is written to the storage backend, so I guess the section on write performance configs should be relevant:
https://docs.janusgraph.org/advanced-topics/bulk-loading/#optimizing-writes-and-reads

I have no idea whether row locking plays a role in writing the vertex-centric index. If so, the config properties you mention are relevant, and maybe also the config for batch loading, which disables locking:
https://docs.janusgraph.org/advanced-topics/bulk-loading/#batch-loading

Id allocation does not seem relevant (it has its own error messages so you would notice).

Marc


Re: Not able to reindex with bigtable as backend

liqingtaobkd@...
 

Thanks a lot for your reply Marc. I browsed through the older threads and didn't find a good solution for this. 

"BigTable cannot keep up with your index repair workers" - could you provide a bit of insight into what an index repair job does, or point to any documentation?
I tried a few storage settings and haven't had any luck yet: storage.write-time/storage.lock.wait-time/storage.lock.expiry-time/etc. Do you think they will make a difference?

As you suggested, I'll try deleting the index and retrying from the start.
For our application, we do need the option of reindexing current data, so I'll need to make sure it works. Do you see similar issues with Cassandra? We deploy on GCP, so we tried Bigtable first.
Do you have any recommendations on backend storage for GCP?


Re: Not able to reindex with bigtable as backend

hadoopmarc@...
 

I checked on the existing issues and the following one looks similar to your issue:
https://github.com/JanusGraph/janusgraph/issues/1803

There are also some older questions in the janusgraph users list. The only workaround seems to be to define the index before adding the data.

Best wishes,     Marc


Re: Not able to reindex with bigtable as backend

hadoopmarc@...
 

The stacktraces you sent are not from reindexing but from an index repair job. TemporaryBackendException is usually an indication of unbalanced distributed system components; apparently BigTable cannot keep up with your index repair workers. Is it still possible to delete the index and retry from the start?
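Deleting the index goes through the management API; a sketch, where 'indexName' is a placeholder (a relation index would use getRelationIndex and awaitRelationIndexStatus analogously):

```groovy
// Sketch: disable a graph index, wait for DISABLED, then remove it.
mgmt = graph.openManagement()
mgmt.updateIndex(mgmt.getGraphIndex('indexName'), SchemaAction.DISABLE_INDEX).get()
mgmt.commit()
ManagementSystem.awaitGraphIndexStatus(graph, 'indexName').status(SchemaStatus.DISABLED).call()
mgmt = graph.openManagement()
mgmt.updateIndex(mgmt.getGraphIndex('indexName'), SchemaAction.REMOVE_INDEX).get()
mgmt.commit()
```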

Otherwise, you could check whether reindexing works on just a small graph. There is little to go on right now.

Best wishes,    Marc


Re: ConfiguredGraphFactory and Authentication not working

Jansen, Jan
 


Re: ConfiguredGraphFactory and Authentication not working

hadoopmarc@...
 

Hi Vinayak,

The information you provide is still a puzzle of little pieces which is hard to make sense of. Do you mean either of the following:

  1. there is some behaviour of janusgraph that is undocumented. Please provide steps to reproduce the issue.
  2. you created a graph with v0.4.0 and now you have trouble reading it with janusgraph v0.5.2. According to the upgrade instructions in the changelog, https://docs.janusgraph.org/changelog/, there are no specific instructions for reading v0.4.x graphs with janusgraph v0.5.x; in other words, this is not expected to be an issue.
Or is it something else? Please be very specific.

You state: "The difference which I saw between the two was when I start 0.4.0 automatically configuredgraphfactory schema was created in Cassandra, but in 0.5.2 janusgraph schema is created by default. This may be the reason for it." Can you elaborate on that? What do the different schemas look like? What are the differences in the yaml and properties config files?

Best wishes,    Marc


Delete label is very difficult

vamsi.lingala@...
 

Deleting all edges with a given label, or all vertices with a given label, from Gremlin is almost impossible:
g.E().hasLabel('MAID-BRAND-012021').drop()
g.V().hasLabel('BRAND').drop()

These queries time out even when you increase the timeout to large limits.
I think JanusGraph scans all vertices/edges and then finds which belong to that label, which is very inefficient.
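A common workaround, not an official recipe, is to drop in small batches so each transaction stays below the timeout; a sketch using the edge label from above:

```groovy
// Sketch: drop edges with a given label 1000 at a time.
while (g.E().hasLabel('MAID-BRAND-012021').limit(1).hasNext()) {
    g.E().hasLabel('MAID-BRAND-012021').limit(1000).drop().iterate()
    g.tx().commit()
}
```

Note that each batch still scans for the label, so this only bounds transaction size, not scan cost.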


Re: Not able to reindex with bigtable as backend

liqingtaobkd@...
 

Sorry for multiple emails. I found the final error from JanusGraph. Not sure whether it's a Bigtable issue or a JanusGraph/Bigtable compatibility issue. Can anybody help take a look?

3035627 [Thread-8] ERROR org.janusgraph.graphdb.olap.job.IndexRepairJob - Transaction commit threw runtime exception:

org.janusgraph.core.JanusGraphException: Could not commit transaction due to exception during persistence
    at org.janusgraph.graphdb.transaction.StandardJanusGraphTx.commit(StandardJanusGraphTx.java:1449)
    at org.janusgraph.graphdb.olap.job.IndexUpdateJob.workerIterationEnd(IndexUpdateJob.java:136)
    at org.janusgraph.graphdb.olap.job.IndexRepairJob.workerIterationEnd(IndexRepairJob.java:208)
    at org.janusgraph.graphdb.olap.VertexJobConverter.workerIterationEnd(VertexJobConverter.java:118)
    at org.janusgraph.diskstorage.keycolumnvalue.scan.StandardScannerExecutor$Processor.run(StandardScannerExecutor.java:285)
Caused by: org.janusgraph.core.JanusGraphException: Could not execute operation due to backend exception
    at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:56)
    at org.janusgraph.diskstorage.keycolumnvalue.cache.CacheTransaction.persist(CacheTransaction.java:91)
    at org.janusgraph.diskstorage.keycolumnvalue.cache.CacheTransaction.flushInternal(CacheTransaction.java:133)
    at org.janusgraph.diskstorage.keycolumnvalue.cache.CacheTransaction.commit(CacheTransaction.java:196)
    at org.janusgraph.diskstorage.BackendTransaction.commit(BackendTransaction.java:150)
    at org.janusgraph.graphdb.transaction.StandardJanusGraphTx.commit(StandardJanusGraphTx.java:1440)
    ... 4 more
Caused by: org.janusgraph.diskstorage.TemporaryBackendException: Could not successfully complete backend operation due to repeated temporary exceptions after PT1M40S
    at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:100)
    at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:54)
    ... 9 more
Caused by: org.janusgraph.diskstorage.TemporaryBackendException: Temporary failure in storage backend
    at org.janusgraph.diskstorage.hbase.HBaseStoreManager.mutateMany(HBaseStoreManager.java:460)
    at org.janusgraph.diskstorage.locking.consistentkey.ExpectedValueCheckingStoreManager.mutateMany(ExpectedValueCheckingStoreManager.java:79)
    at org.janusgraph.diskstorage.keycolumnvalue.cache.CacheTransaction$1.call(CacheTransaction.java:94)
    at org.janusgraph.diskstorage.keycolumnvalue.cache.CacheTransaction$1.call(CacheTransaction.java:91)
    at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:68)
    ... 10 more
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: IllegalStateException: 1 time, servers with issues: bigtable.googleapis.com
    at com.google.cloud.bigtable.hbase.BatchExecutor.batchCallback(BatchExecutor.java:288)
    at com.google.cloud.bigtable.hbase.BatchExecutor.batch(BatchExecutor.java:207)
    at com.google.cloud.bigtable.hbase.AbstractBigtableTable.batch(AbstractBigtableTable.java:185)
    at org.janusgraph.diskstorage.hbase.HTable1_0.batch(HTable1_0.java:51)
    at org.janusgraph.diskstorage.hbase.HBaseStoreManager.mutateMany(HBaseStoreManager.java:455)
    ... 14 more


Re: Not able to reindex with bigtable as backend

liqingtaobkd@...
 

Found the following error in the log. JanusGraph version 0.5.3 with Bigtable as backend. Any suggestions please? I have been stuck on it for a few days...

org.janusgraph.diskstorage.TemporaryBackendException: Temporary failure in storage backend
    at org.janusgraph.diskstorage.hbase.HBaseStoreManager.mutateMany(HBaseStoreManager.java:460)
    at org.janusgraph.diskstorage.locking.consistentkey.ExpectedValueCheckingStoreManager.mutateMany(ExpectedValueCheckingStoreManager.java:79)
    at org.janusgraph.diskstorage.keycolumnvalue.cache.CacheTransaction$1.call(CacheTransaction.java:94)
    at org.janusgraph.diskstorage.keycolumnvalue.cache.CacheTransaction$1.call(CacheTransaction.java:91)
    at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:68)
    at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:54)
    at org.janusgraph.diskstorage.keycolumnvalue.cache.CacheTransaction.persist(CacheTransaction.java:91)
    at org.janusgraph.diskstorage.keycolumnvalue.cache.CacheTransaction.flushInternal(CacheTransaction.java:133)
    at org.janusgraph.diskstorage.keycolumnvalue.cache.CacheTransaction.commit(CacheTransaction.java:196)
    at org.janusgraph.diskstorage.BackendTransaction.commit(BackendTransaction.java:150)
    at org.janusgraph.graphdb.transaction.StandardJanusGraphTx.commit(StandardJanusGraphTx.java:1440)
    at org.janusgraph.graphdb.olap.job.IndexUpdateJob.workerIterationEnd(IndexUpdateJob.java:136)
    at org.janusgraph.graphdb.olap.job.IndexRepairJob.workerIterationEnd(IndexRepairJob.java:208)
    at org.janusgraph.graphdb.olap.VertexJobConverter.workerIterationEnd(VertexJobConverter.java:118)
    at org.janusgraph.diskstorage.keycolumnvalue.scan.StandardScannerExecutor$Processor.run(StandardScannerExecutor.java:285)
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: IllegalStateException: 1 time, servers with issues: bigtable.googleapis.com
    at com.google.cloud.bigtable.hbase.BatchExecutor.batchCallback(BatchExecutor.java:288)
    at com.google.cloud.bigtable.hbase.BatchExecutor.batch(BatchExecutor.java:207)
    at com.google.cloud.bigtable.hbase.AbstractBigtableTable.batch(AbstractBigtableTable.java:185)
    at org.janusgraph.diskstorage.hbase.HTable1_0.batch(HTable1_0.java:51)
    at org.janusgraph.diskstorage.hbase.HBaseStoreManager.mutateMany(HBaseStoreManager.java:455)
    ... 14 more


Re: ConfiguredGraphFactory and Authentication not working

Vinayak Bali
 

Hi Marc,

It was working with 0.4.0. After the update to 0.5.2, it is not working. The difference I saw between the two is that when I start 0.4.0, a configuredgraphfactory schema is automatically created in Cassandra, but in 0.5.2 a janusgraph schema is created by default. This may be the reason for it. Please let me know how to solve the issue.

Thanks & Regards,
Vinayak

On Sun, Feb 28, 2021 at 9:26 PM <hadoopmarc@...> wrote:
Hi Vinayak,

I played around a bit with the ConfigurationManagementGraph myself. The client error you describe occurs when you run ":remote console" twice. It works as a toggle switch, in other words, the command you run was probably interpreted as for a locally embedded janusgraph instance (which was not present).

Best wishes,    Marc


Re: JanusGraphIndex how to retrieve constraint (indexOnly) specified for the global index?

cmilowka
 

Works like a charm, thank you Bo Xuan Li.


Re: Not able to reindex with bigtable as backend

liqingtaobkd@...
 

Thanks for the reply. I carefully followed each step described in the doc. Before the reindex, I closed all the open transactions and management instance. I sent the reindex command from the console and it never returns (at least for 10h+):
mgmt.updateIndex(mgmt.getRelationIndex(flowsTo, "flowsToByTimestamp"), SchemaAction.REINDEX).get()

So I don't have a chance to commit.

But from my monitoring of janusgraph and bigtable, there is no activity after 30min.

Do you have any further suggestions?


Re: ConfiguredGraphFactory and Authentication not working

hadoopmarc@...
 

Hi Vinayak,

I played around a bit with the ConfigurationManagementGraph myself. The client error you describe occurs when you run ":remote console" twice. It works as a toggle switch, in other words, the command you run was probably interpreted as for a locally embedded janusgraph instance (which was not present).

Best wishes,    Marc


Re: Not able to reindex with bigtable as backend

hadoopmarc@...
 

Please be sure to run all the steps (including a graph.tx().rollback() before index creation and a mgmt.commit() after update of the index) from the example in:

https://docs.janusgraph.org/index-management/index-performance/#vertex-centric-indexes
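For the relation index discussed in this thread, the full sequence sketched from that page would look roughly like this ('flowsTo' and 'flowsToByTimestamp' are the names used earlier in the thread):

```groovy
// Sketch: roll back open transactions, then trigger the reindex job.
graph.tx().rollback()
mgmt = graph.openManagement()
flowsTo = mgmt.getEdgeLabel('flowsTo')
idx = mgmt.getRelationIndex(flowsTo, 'flowsToByTimestamp')
mgmt.updateIndex(idx, SchemaAction.REINDEX).get()   // blocks until the job finishes
mgmt.commit()
```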

Best wishes,   Marc
