Re: Multiple vertices generated for the same index value and vertex properties missing with RF3
hadoopmarc@...
Hi,
You did not answer my questions about the "id" property. TinkerPop uses a token (T.id) that has the value 'id', see: https://github.com/apache/tinkerpop/blob/master/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/structure/T.java
I suspect that you ingested data without schema validation ("automatic schema creation"), that your input data contains an "id" property key, and that JanusGraph/TinkerPop get confused about which id is which. So I strongly suggest that you make sure that this is not the root cause of this issue. To be sure, it would still be an issue but not for you anymore :-)
Best wishes,
Marc
|
|
Re: Multiple vertices generated for the same index value and vertex properties missing with RF3
Another really strange observation
gremlin> g.V().has('id','131594d6a416666b401a9e48e54ebc8f22be75e2593c5d98e2d9ecfd719d5f29').has('type','email_sha256_lowercase').valueMap(true)
==>[dpts_678:[1595548800],label:vertex,id:201523209257056,id:[19df651e-90d5-47f6-af2e-35dcb59bcc0a],type:[id_mid_10],soft_del:[false],country_GBR:[678]]
Could you please have a look?
|
|
Re: Count Query Optimization
Boxuan Li
Have you tried keeping query.batch = true AND query.fast-property = true?
Regards, Boxuan
|
|
Re: Count Query Optimization
Vinayak Bali
Hi All,
Adding these properties to the configuration file affects edge traversal; retrieving a single edge now takes 7 minutes:
1) Turn on query.batch
2) Turn off query.fast-property
The count query is faster, but edge traversal becomes more expensive. Is there any other way to improve count performance without affecting other queries?
Thanks & Regards,
Vinayak
On Fri, Mar 19, 2021 at 1:53 AM AMIYA KUMAR SAHOO <amiyakr.sahoo91@...> wrote:
|
|
Re: Multiple vertices generated for the same index value and vertex properties missing with RF3
Hi
The issue still persists: the vertex metadata is still missing for some vertices, even after enabling the settings from https://docs.janusgraph.org/advanced-topics/eventual-consistency/. Has anyone seen the same issue? The issue is logged at https://github.com/JanusGraph/janusgraph/issues/2515
Thanks
|
|
Re: Janusgraph 0.5.3 potential memory leak
Opened the issue about this potential bug here: https://github.com/JanusGraph/janusgraph/issues/2524
|
|
Re: ScriptExecutor Deprecated but Used in gremlin.bat
hadoopmarc@...
Hi Fredrick,
You are right, this is an issue, so if you want to report this: thanks. Best wishes, Marc
|
|
Re: Count Query Optimization
AMIYA KUMAR SAHOO
Hi Vinayak,
Try the query below. If it works for you, you can add E2 and D similarly.
g.V().has('property1', 'A').
  outE().has('property1', 'E').as('e').
  inV().has('property1', 'B').
  outE().has('property1', 'E1').as('e').
  where(inV().has('property1', 'C')).
  select(all, 'e').fold().
  project('edgeCount', 'vertexCount').
    by(count(local)).
    by(unfold().bothV().dedup().count())
Regards,
Amiya
On Thu, 18 Mar 2021, 15:47 Vinayak Bali, <vinayakbali16@...> wrote:
|
|
Re: Duplicate Vertex
kumkar.dev@...
Hi Boxuan Li,
Hope this helps:
---------------------------------------------------------------------------------------------------
Vertex Index Name | Type      | Unique | Backing       | Key: Status    |
---------------------------------------------------------------------------------------------------
by_prop1          | Composite | false  | internalindex | prop1: ENABLED |
by_prop2          | Composite | false  | internalindex | prop2: ENABLED |
- Dev
|
|
Re: Duplicate Vertex
Boxuan Li
Hi, can you share more details (what indexes do you have related to prop1 and/or prop2), or even minimal code to reproduce?
|
|
Duplicate Vertex
Hello
We are on JanusGraph 0.4.0 and hit a scenario in which duplicate vertices were created. The 2 vertices were created within a span of 9 milliseconds inside a single transaction. We use an index to look up vertices in the graph. A vertex is identified by 2 identifiers/properties, prop1 and prop2, alongside other properties. Two property matches check whether the vertex already exists, and the vertex is then created or updated accordingly.
The first vertex got created with property match, t1
Could this be an issue with reading the in-memory cache? Are there known issues in this area where the index does not return the vertex, resulting in this problem?
Thanks
Dev
|
|
Re: How to circumvent transaction cache?
timon.schneider@...
Thanks for your thoughts.
1) I'm very interested in trying out the PR you made for this issue.
2) I don't think the solution you gave me in that previous thread solves the issue. What if another user sets version_v.published to true between steps 3 and 4? This is allowed even with ConsistencyModifier.LOCK on the vertex and properties of version_v.
1. start_transaction();
2. read_vertex(type_v);
3. read_vertex(version_v); // type_v --hasVersion--> version_v
4. if (version_v.published == true) then abort();
5. update_vertex(type_v);
6. update_vertex(version_v); // set version_v.published = true
7. commit();
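The interleaving described in point 2 can be sketched with a toy model in Python. This is not JanusGraph code: `Store`, `Tx`, and `interleaved_publish` are hypothetical names, and the model assumes only that a transaction caches what it has read and buffers its writes until commit. It says nothing about how ConsistencyModifier.LOCK behaves in a real deployment.

```python
class Store:
    """Toy shared store standing in for the graph backend (hypothetical)."""

    def __init__(self):
        self.data = {"type_v.name": "old", "version_v.published": False}


class Tx:
    """Toy transaction: reads are cached, writes are buffered until commit."""

    def __init__(self, store):
        self.store = store
        self.cache = {}
        self.writes = {}

    def read(self, key):
        # The first read hits the store; later reads come from the cache.
        if key not in self.cache:
            self.cache[key] = self.store.data[key]
        return self.cache[key]

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # No conflict detection in this toy model.
        self.store.data.update(self.writes)


def interleaved_publish(store):
    tx_a = Tx(store)
    tx_b = Tx(store)

    tx_a.read("type_v.name")                       # step 2
    seen_by_a = tx_a.read("version_v.published")   # step 3

    # Another user publishes version_v between steps 3 and 4.
    tx_b.write("version_v.published", True)
    tx_b.commit()

    if seen_by_a:                                  # step 4, uses the stale read
        return "aborted"
    tx_a.write("type_v.name", "new")               # step 5
    tx_a.write("version_v.published", True)        # step 6
    tx_a.commit()                                  # step 7
    return "committed"


store = Store()
outcome = interleaved_publish(store)
```

In this model the first transaction commits its update to type_v although version_v was already published by the time it ran the check in step 4, which is the scenario the question is about.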
|
|
Re: Count Query Optimization
Vinayak Bali
Amiya - I need to check the data, there is some mismatch with the counts. Consider that we have more than one relation to count. How can we modify the query?
For example, the query for A->E->B is as follows:
g.V().has('property1', 'A').
  outE().has('property1','E').
  where(inV().has('property1', 'B')).
  fold().
  project('edgeCount', 'vertexCount').
    by(count(local)).
    by(unfold().bothV().dedup().count())
What changes can be made in the query for A->E->B->E1->C->E2->D?
Thanks
On Thu, Mar 18, 2021 at 1:59 PM AMIYA KUMAR SAHOO <amiyakr.sahoo91@...> wrote: Hi Vinayak,
|
|
Re: Count Query Optimization
AMIYA KUMAR SAHOO
Hi Vinayak,
The correct vertex count is 400332 non-unique, 34693 unique. In your second query, g.V().has('property1', 'A').aggregate('v') is evaluated eagerly, so all vertices with property1 = A might be getting included in the count, regardless of whether they have an outE with property1 = E.
Regards,
Amiya
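A small worked example may make the three numbers easier to compare. The sketch below is plain Python over a made-up five-vertex graph (the vertex names and edges are assumptions for illustration, not data from this thread). It mimics what the traversals count, not how JanusGraph executes them.

```python
# Toy graph: vertex name -> property1 value. a3 has no outgoing edge.
vertices = {"a1": "A", "a2": "A", "a3": "A", "b1": "B", "b2": "B"}
# Edges as (outV, property1 of the edge, inV).
edges = [("a1", "E", "b1"), ("a1", "E", "b2"), ("a2", "E", "b1")]

# Edges matching A -E-> B, as in outE().has(...).where(inV().has(...)).
matching = [
    (out_v, label, in_v)
    for (out_v, label, in_v) in edges
    if vertices[out_v] == "A" and label == "E" and vertices[in_v] == "B"
]
edge_count = len(matching)

# bothV() emits two vertices per edge, so the non-unique count is 2 * edges.
both_v = [v for (out_v, _, in_v) in matching for v in (out_v, in_v)]
non_unique_vertices = len(both_v)
unique_vertices = len(set(both_v))

# aggregate('v') right after the first has() collects every A vertex,
# including a3, before any edge filter is applied.
side_effect_v = [name for name, prop in vertices.items() if prop == "A"]
side_effect_v += [in_v for (_, _, in_v) in matching]
aggregate_unique_vertices = len(set(side_effect_v))
```

Here the non-unique count is exactly twice the edge count, just as 400332 = 2 x 200166 in the reported output, and the aggregate-based count is larger than the deduplicated bothV() count because it also includes the A vertex that has no matching edge.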
|
|
Re: Count Query Optimization
Vinayak Bali
Hi Amiya,
With dedup:
g.V().has('property1', 'A').
  outE().has('property1','E').
  where(inV().has('property1', 'B')).
  fold().
  project('edgeCount', 'vertexCount').
    by(count(local)).
    by(unfold().bothV().dedup().count())
Output: ==>[edgeCount:200166,vertexCount:34693]
Without dedup:
g.V().has('property1', 'A').
  outE().has('property1','E').
  where(inV().has('property1', 'B')).
  fold().
  project('edgeCount', 'vertexCount').
    by(count(local)).
    by(unfold().bothV().count())
Output: ==>[edgeCount:200166,vertexCount:400332]
Both queries take approx. 3 sec to run.
Query:
g.V().has('property1', 'A').aggregate('v').outE().has('property1','E').aggregate('e').inV().has('property1', 'B').aggregate('v').select('v').dedup().as('vetexCount').select('e').dedup().as('edgeCount').select('vetexCount','edgeCount').by(unfold().count())
Output: ==>[vetexCount:383633,edgeCount:200166]
Time: 3.5 mins
The edge count is the same for all the queries, but the vertexCount differs. Which one is the right vertex count?
Thanks & Regards,
Vinayak
On Thu, Mar 18, 2021 at 11:18 AM AMIYA KUMAR SAHOO <amiyakr.sahoo91@...> wrote: Hi Vinayak,
|
|
Re: Count Query Optimization
AMIYA KUMAR SAHOO
Hi Vinayak,
Maybe try the query below.
g.V().has('property1', 'A').
  outE().has('property1','E').
  where(inV().has('property1', 'B')).
  fold().
  project('edgeCount', 'vertexCount').
    by(count(local)).
    by(unfold().bothV().dedup().count()) // I do not think dedup is required for your use case; you can try both with and without dedup
Regards,
Amiya
|
|
Re: Janusgraph - OLAP using Dataproc
kndoan94@...
Hi Claire!
Would you mind sharing the pom.xml file for your build? I'm trying a similar build for AWS and am hitting a mess of dependency errors. Thank you :) Ben
|
|
Re: Caused by: org.janusgraph.core.JanusGraphException: A JanusGraph graph with the same instance id [0a000439355-0b2b58ca5c222] is already open. Might required forced shutdown.
hadoopmarc@...
Hi Srinivas,
In the yaml file that determines class Settings, you use the channelizer key twice. If you use ConfigurationManagementGraph, only the following line should be present:
channelizer: org.janusgraph.channelizers.JanusGraphWebSocketChannelizer
Does that make any difference? Does the part "with one argument of class Settings," still show up in the ERROR message then?
Best wishes,
Marc
|
|
Re: Count Query Optimization
hadoopmarc@...
Hi Vinayak,
Another attempt, this one is very similar to the one that works.
gremlin> graph = JanusGraphFactory.open('conf/janusgraph-inmemory.properties')
==>standardjanusgraph[inmemory:[127.0.0.1]]
gremlin> g = graph.traversal()
==>graphtraversalsource[standardjanusgraph[inmemory:[127.0.0.1]], standard]
gremlin> GraphOfTheGodsFactory.loadWithoutMixedIndex(graph,true)
==>null
gremlin> g.V().as('v1').outE().as('e').inV().as('v2').union(select('v1'), select('v2')).dedup().count()
16:12:39 WARN org.janusgraph.graphdb.transaction.StandardJanusGraphTx - Query requires iterating over all vertices [()]. For better performance, use indexes
==>12
gremlin> g.V().as('v1').outE().as('e').inV().as('v2').select('e').dedup().count()
16:15:30 WARN org.janusgraph.graphdb.transaction.StandardJanusGraphTx - Query requires iterating over all vertices [()]. For better performance, use indexes
==>17
gremlin> g.V().as('v1').outE().as('e').inV().as('v2').union(
......1> union(select('v1'), select('v2')).dedup().count(),
......2> select('e').dedup().count().as('ecount')
......3> )
16:27:42 WARN org.janusgraph.graphdb.transaction.StandardJanusGraphTx - Query requires iterating over all vertices [()]. For better performance, use indexes
==>12
==>17
Best wishes,
Marc
|
|
Re: Count Query Optimization
Nicolas Trangosi
Hi,
You may try denormalization: also set the property1 value of the inV on the edge itself. Then, once the edges are updated, the following query should work:
g.V().has('property1', 'A').aggregate('v').outE().has('property1','E').has('inVproperty1', 'B').aggregate('e').inV().aggregate('v').select('v').dedup().as('vetexCount').select('e').dedup().as('edgeCount').select('vetexCount','edgeCount').by(unfold().count())
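As a rough illustration of the denormalization idea, the Python sketch below copies the incoming vertex's property1 onto each edge of a made-up graph and then filters on the edge alone. All names and data here are hypothetical, apart from the inVproperty1 key taken from the query above. It only shows that both filters select the same edges as long as the copied value is kept in sync.

```python
# Toy data: vertex name -> properties, and edges as dicts (all hypothetical).
vertices = {
    "a1": {"property1": "A"},
    "b1": {"property1": "B"},
    "c1": {"property1": "C"},
}
edges = [
    {"out": "a1", "in": "b1", "property1": "E"},
    {"out": "a1", "in": "c1", "property1": "E"},
]


def denormalize(edges, vertices):
    """Copy property1 of the incoming vertex onto each edge."""
    for edge in edges:
        edge["inVproperty1"] = vertices[edge["in"]]["property1"]


def filter_via_vertex(edges, vertices):
    """Original approach: look up the incoming vertex for every edge."""
    return [
        e for e in edges
        if e["property1"] == "E" and vertices[e["in"]]["property1"] == "B"
    ]


def filter_via_edge(edges):
    """Denormalized approach: the edge alone carries enough to filter on."""
    return [
        e for e in edges
        if e["property1"] == "E" and e.get("inVproperty1") == "B"
    ]


denormalize(edges, vertices)
via_vertex = filter_via_vertex(edges, vertices)
via_edge = filter_via_edge(edges)
```

The trade-off is the usual one for denormalization: whenever property1 changes on a vertex, its incoming edges have to be updated as well.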
On Wed, Mar 17, 2021 at 14:05, Vinayak Bali <vinayakbali16@...> wrote:
|
|