Re: Authentication in JanusGraph Server
hadoopmarc@...
Hi Graham,
This was certainly one to investigate for the weekend. Where you started investigating from the inside of janusgraph, I started from the user perspective and this is what I did:
Best wishes, Marc |
|
Re: How to circumvent transaction cache?
hadoopmarc@...
Hi Timon,
Adding to Ted's answer, I can imagine that your new data enters your pipeline from a Kafka queue. With a microbatching solution, e.g. Apache Spark Streaming, you could pre-shuffle your data per microbatch to be sure that all data relating to a branch ends up in a single partition. After that, a single thread can handle this partition in one JanusGraph transaction. This approach seems to fit your use case better than trying to circumvent ACID limits in a tricky way. Best wishes, Marc |
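The pre-shuffle idea can be illustrated with a minimal sketch in plain Python. It does not use Spark or JanusGraph; the record layout, the `branch_id` key and the `apply_in_transaction` callback are assumptions made up for illustration, standing in for whatever the real pipeline uses.

```python
from collections import defaultdict


def partition_by_branch(microbatch, key="branch_id"):
    """Group the records of one microbatch so that each branch ends up
    in exactly one partition (hypothetical record layout: dicts)."""
    partitions = defaultdict(list)
    for record in microbatch:
        partitions[record[key]].append(record)
    return dict(partitions)


def process_partitions(partitions, apply_in_transaction):
    """Hand each partition to a callback, one call per branch.

    The callback is a placeholder for "open one transaction, apply all
    records of this branch, commit"."""
    for branch_id, records in partitions.items():
        apply_in_transaction(branch_id, records)
```

Because all records of a branch arrive together, the callback never has to coordinate with another writer working on the same branch within that microbatch.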
|
Re: How to circumvent transaction cache?
Boxuan Li
Hi Timon,
As I mentioned earlier, the only way I can think of (assuming you are not concerned about the consistency of data storage as Ted mentioned) is to modify JanusGraph source code: In CacheVertex class, there is a data structure, protected final Map<SliceQuery, EntryList> queryCache. What you could do is to add a method to that class:
And then you can call refresh before you want to load new value from the storage rather than cache: Hope this helps, Boxuan
|
|
Re: How to circumvent transaction cache?
Ted Wilmes
Hi Timon, Jumping in late on this one, but I wanted to point out that even if you could read it prior to committing to check whether your constraint is maintained, most of the JG storage layers do not provide ACID guarantees. FoundationDB is the one distributed option, and BerkeleyDB can do it for a single-instance setup. Since you do not have ACID guarantees in most cases, I think you could still have a case where another transaction commits prior to your commit even though you saw isPublished = false when you checked it. One possible way around this without ACID would be to process all mutations for a branch on one thread, effectively single-threading access to it, so that you would know that no other user was writing to the branch while you were reading. --Ted On Fri, Mar 5, 2021 at 8:52 AM <timon.schneider@...> wrote: Thanks for your suggestion, but the consistency setting does not solve my problem. |
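A minimal sketch of the "one thread per branch" idea, in plain Python and without any JanusGraph calls. The class name `BranchSerializer` and its methods are hypothetical; the only assumption is that every mutation of a branch goes through this object, so the tasks of one branch run strictly one after another.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class BranchSerializer:
    """Routes every task for a given branch to the same
    single-threaded executor, so tasks of one branch never overlap."""

    def __init__(self):
        self._executors = {}
        self._lock = threading.Lock()

    def submit(self, branch_id, fn, *args):
        # Create the per-branch executor lazily, guarded by a lock so
        # that two callers cannot create two executors for one branch.
        with self._lock:
            executor = self._executors.get(branch_id)
            if executor is None:
                executor = ThreadPoolExecutor(max_workers=1)
                self._executors[branch_id] = executor
        return executor.submit(fn, *args)

    def shutdown(self):
        with self._lock:
            executors = list(self._executors.values())
            self._executors.clear()
        for executor in executors:
            executor.shutdown(wait=True)
```

Different branches still run in parallel, each on its own worker. This only serializes writers inside one process; several application instances would need some routing layer in front to keep the same guarantee.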
|
Re: How to circumvent transaction cache?
timon.schneider@...
Thanks for your suggestion, but the consistency setting does not solve my problem.
|
|
Re: How to circumvent transaction cache?
Nicolas Trangosi <nicolas.trangosi@...>
Hi Timon, It seems that you can force JG to re-read elements just before commit, using the option mgmt.setConsistency. I have never tried it, but it may help you. Regards, Nicolas On Fri, Mar 5, 2021 at 10:20, <timon.schneider@...> wrote:
|
Re: How to circumvent transaction cache?
Thanks for your reply.
The issue is that we need to refresh some vertices mid-transaction. Rolling back is not an option, as that would erase the edits we're making in our transaction. Disabling the transaction cache could be one solution. Using a threaded tx could be an option as well, as that transaction does see edits made by other users, as opposed to the original transaction:
* A reads vertex X, then starts a transaction and makes edits, but does not commit yet.
* B may or may not edit X.
* A continues editing and, before committing, needs to make sure vertex X was not changed by B, or else rolls back.
Again, it is possible to read X by using a ThreadedTx, but I'm interested in whether there's another way to refresh a vertex mid-transaction. Kr, Timon |
|
Re: Authentication in JanusGraph Server
grahamwallis.dev@...
Hi @hadoopmarc,
Thanks for replying and no apology needed - it's a good question. Although I failed to mention it in my question, I did set the credentials to ('graham','sass-password') in the sasl-remote.yaml file when testing with the JanusGraph as credentials store. Setting a breakpoint in the server I could see the correct credentials being received, and the credentials store traversal looked fine; but no vertex is returned. All the best Graham |
|
Re: how to delete Ghost vertices and ghost edges?
Boxuan Li
See if this helps: On Thu, Mar 4, 2021 at 4:42 PM, <vamsi.lingala@...> wrote:
|
|
Re: How to circumvent transaction cache?
Boxuan Li
Hi Timon, I don’t think you will even be able to disable the tx-cache by using createThreadedTx() or, equivalently, newTransaction()/buildTransaction(). Unfortunately, as long as your transaction is not readOnly(), the effective vertex cache size of the transaction will be Math.max(100, cache.tx-cache-size). To the best of my knowledge, you can only modify the JanusGraph source code to completely disable the transaction-level cache. A workaround would be to always start a new transaction to check whether the value has changed. Best regards, Boxuan On Wed, Mar 3, 2021 at 9:11 PM, <timon.schneider@...> wrote: Our application has transactions editing many vertices representing elements of a branch. This branch is also represented by a vertex that has boolean property isPublished. Before committing such a transaction, we need to know whether another user set the isPublished property on the branch vertex to true, in which case the transaction should be rolled back. |
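Why the workaround helps can be shown with a toy model in plain Python. This is not the JanusGraph API: `ToyStore`, `ToyTransaction` and `is_published_fresh` are hypothetical names, and the model only captures one behaviour, namely that a transaction keeps returning the value it read first, while a newly started transaction reads from storage again.

```python
class ToyStore:
    """Stands in for the storage backend: committed property values
    per vertex id."""

    def __init__(self, data):
        self.data = data


class ToyTransaction:
    """Caches every value on first read, like a transaction-level cache."""

    def __init__(self, store):
        self._store = store
        self._cache = {}

    def read(self, vertex_id, key):
        cache_key = (vertex_id, key)
        if cache_key not in self._cache:
            # First read in this transaction: go to storage and remember it.
            self._cache[cache_key] = self._store.data[vertex_id][key]
        return self._cache[cache_key]


def is_published_fresh(store, branch_id):
    """Check the flag in a brand-new transaction, bypassing any value
    cached by a long-running one."""
    return ToyTransaction(store).read(branch_id, "isPublished")
```

A short usage example under the same assumptions:

```python
store = ToyStore({"branch": {"isPublished": False}})
tx_a = ToyTransaction(store)
tx_a.read("branch", "isPublished")          # user A's first read, now cached
store.data["branch"]["isPublished"] = True  # user B commits in the meantime
stale = tx_a.read("branch", "isPublished")  # still the cached value
fresh = is_published_fresh(store, "branch") # sees user B's commit
```

Even with a fresh read, the check and the later commit are two separate steps, so another writer can still commit in between.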
|
how to delete Ghost vertices and ghost edges?
vamsi.lingala@...
gremlin> g.V(6389762617560).valueMap()
==>{}
gremlin>
gremlin> g.V().hasLabel("MAID").has("madsfid","sfmsdlk").outE("MAIH1").as("e").inV().as("v").select("e", "v").by(valueMap())
==>{e={}, v={}}
==>{e={}, v={}}
==>{e={}, v={}}
==>{e={}, v={}}
==>{e={}, v={}}
==>{e={}, v={}}
==>{e={}, v={}}
==>{e={}, v={}} |
|
Re: Gremlin Query to return count for nodes and edges
Vinayak Bali
Hi Marc, The backend used is Cassandra. I was just wondering whether we can load the data from Cassandra's data store into the in-memory backend to speed up the process. I tried OLAP by configuring Hadoop and Spark with the help of the references shared in the documentation. A simple query to retrieve 1 node from the graph took around 5 minutes. Based on your experience, could you share the steps to follow to solve this issue? Thanks & Regards, Vinayak On Wed, Feb 24, 2021 at 9:32 PM <hadoopmarc@...> wrote: Hi Vinayak, |
|
Re: Authentication in JanusGraph Server
hadoopmarc@...
Sorry for asking, but you did not state it explicitly: you did modify your sasl-remote.yaml file to reflect the new ('graham', 'sasl-password') credentials, didn't you?
Marc |
|
Authentication in JanusGraph Server
Graham Wallis <grahamwallis.dev@...>
Hi,
I've been trying to use authentication over a websocket connection to a JanusGraph Server. If I configure the server to use a SimpleAuthenticator and a TinkerGraph for the credentials, as described in the Tinkerpop documentation, it works. In this mode, my gremlin-server.yaml is configured for authentication as follows:
authentication: {
where the tinkergraph-credentials.properties file is the same as the example from Tinkerpop:
gremlin.graph=org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph
My gremlin-server.yaml also has the following SSL configuration:
ssl: {
I've created a self-signed certificate for localhost and added it to the server.jks keystore (with the key password the same as the store password). Because my client (console) is on the same machine as the server, I used the server.jks keystore as the truststore for the client, and created a sasl-remote.yaml file for the client, with the following:
hosts: [localhost]}
I can start a gremlin-console and connect to the server, using the credentials ("stephen", "password"):
:remote connect tinkerpop.server conf/sasl-remote.yaml session
and subsequent remote operations against my (real) graph succeed. The above all works nicely. I can step through the invocation of SimpleAuthenticator's authenticate() method in the server in the debugger and it does exactly what you'd expect.
If I try to do the same using a JanusGraph DB to store the credentials, I can't get the client to authenticate. I tried using the following janusgraph-credentials-server.properties file for my credentials store:
gremlin.graph=org.janusgraph.core.JanusGraphFactory
And changed my gremlin-server yaml as follows:
authentication: {
The ../cred/berkeley database is created during start of the gremlin-server.
If I subsequently stop the server and open the credentials database using a gremlin-console (locally), I can see that the default user has been added to it, the vertex is correctly labelled (as 'user') and the username and (hashed) password match. So the credentials store looks OK. However, if I now create a connection to the server and try to perform a remote operation, it doesn't authenticate and always results in "Username and/or password are incorrect".
Stepping through the server code in the debugger, I noticed that the JanusGraphSimpleAuthenticator authenticate() method is never called, because the handler calls the SimpleAuthenticator's authenticate() method directly. This is probably fine, as the former delegates to the latter anyway. But when the SimpleAuthenticator's authenticate() actually performs the credentials traversal, it does not find the user.
I wondered whether I should be using a JanusGraph-specific authentication handler, but that doesn't look like it would help; for a websocket connection the SaslAndHMACAuthenticationHandler will delegate to the channelRead method of its superclass, i.e. SaslAuthenticationHandler, which is the same as the above. The only difference I can see in the code is that the SimpleAuthenticator is using a Tinkerpop generic Graph to create its CredentialTraversalSource, whereas the JanusGraphSimpleAuthenticator uses a JanusGraph.
Can anyone see what I'm doing wrong? Best regards, Graham Linux Foundation LFAIData Project: Egeria |
|
How to circumvent transaction cache?
timon.schneider@...
Our application has transactions editing many vertices representing elements of a branch. This branch is also represented by a vertex that has boolean property isPublished. Before committing such a transaction, we need to know whether another user set the isPublished property on the branch vertex to true, in which case the transaction should be rolled back.
Here’s the problem:
* User A reads the branch vertex but doesn’t close transaction
* User B changes the isPublished property to true and commits (while A is still making changes)
* User A read locks the vertex with an external locking API
* User A queries the branch vertex again (to make sure isPublished is still false) in the same thread but gets the old values because of the transaction cache.
Now user A can commit data even though the branch isPublished is true. I know it’s possible to use createThreadedTx() to circumvent the ThreadLocal transaction cache. However, such refreshes will be very common in our application and ideally we would be able to execute a refresh within the main transaction to minimise complexity and workarounds. Is this possible? And if not, are there any possibilities to turn off transaction cache entirely? Thanks in advance, Timon |
|
Re: Not able to reindex with bigtable as backend
hadoopmarc@...
The vertex centric index is written to the storage backend, so I guess the section on write performance configs should be relevant:
https://docs.janusgraph.org/advanced-topics/bulk-loading/#optimizing-writes-and-reads I have no idea whether row locking plays a role in writing the vertex centric index. If so, the config properties you mention are relevant and maybe also the config for batch loading, which disables locking: https://docs.janusgraph.org/advanced-topics/bulk-loading/#batch-loading Id allocation does not seem relevant (it has its own error messages so you would notice). Marc |
|
Re: Not able to reindex with bigtable as backend
liqingtaobkd@...
Thanks a lot for your reply Marc. I browsed through the older threads and didn't find a good solution for this.
"BigTable cannot keep up with your index repair workers" - could you provide a bit of insight into what an index repair job does, or point to any documentation? I tried a few storage settings without any luck so far: storage.write-time/storage.lock.wait-time/storage.lock.expiry-time/etc. Do you think they will make a difference? As you suggested, I'll try deleting the index and retrying from the start. For our application, we do need to have the option of reindexing current data, so I'll need to make sure it works. Do you see a similar issue with Cassandra? We deploy on GCP, so we tried Bigtable first. Do you have any recommendation on backend storage for GCP? |
|
Re: Not able to reindex with bigtable as backend
hadoopmarc@...
I checked on the existing issues and the following one looks similar to your issue:
https://github.com/JanusGraph/janusgraph/issues/1803 There are also some older questions in the janusgraph users list. The only workaround seems to be to define the index before adding the data. Best wishes, Marc |
|
Re: Not able to reindex with bigtable as backend
hadoopmarc@...
The stacktraces you sent are not from reindexing but from an index repair job. TemporaryBackendException is usually an indication of unbalanced distributed system components; apparently BigTable cannot keep up with your index repair workers. Is it still possible to delete the index and retry from the start?
Otherwise, you could check whether reindexing works with just a small graph. There is little to go on right now. Best wishes, Marc |
|
Re: ConfiguredGraphFactory and Authentication not working
Jansen, Jan
|
|