Re: config skip-schema-check=true is not honored for HBase
hadoopmarc@...
Hi Jigar,
Yes, I think it is an issue. I did not fully dive into it, in particular I did not check whether any tests exist for the "disable schema check" configuration option. So, go ahead and create an issue for it. Best wishes, Marc
|
|
Data Loading Script Optimization
Vinayak Bali
Hi All, I have attached a Groovy script that I use to load data into JanusGraph. The script takes 4 minutes to load 1.5 million nodes and 13 minutes to load approximately 3 million edges. The server on which the script runs has a high-end hardware configuration. I am looking for ways to improve the performance of the script. Your feedback will help. Thank you for the responses. Thanks & Regards, Vinayak
|
|
Re: config skip-schema-check=true is not honored for HBase
jigar patel <jigar.9408266552@...>
Hi Marc, Here is the full stack trace: https://gist.github.com/jigs1993/5cc1682a919cfb5e8290bf4636f1c766 A possible fix is here: https://github.com/jigs1993/janusgraph/pull/1/files Let me know if you think this is actually an issue; I can raise the PR against the master branch.
|
|
Re: config skip-schema-check=true is not honored for HBase
hadoopmarc@...
Hi Jigar,
Can you provide the properties file you used for opening the graph, as well as the complete stacktrace for the exception listed above? Best wishes, Marc
|
|
config skip-schema-check=true is not honored for HBase
jigar patel <jigar.9408266552@...>
org.apache.hadoop.hbase.security.AccessDeniedException: org.apache.hadoop.hbase.security.AccessDeniedException: Insufficient permissions (user=<user>, scope=<namespace>:<table>, params=[table=<namespace>:<table>],action=CREATE)
I got the above error while running OLAP as <user>, who has no create permission, at this line: https://github.com/JanusGraph/janusgraph/blob/master/janusgraph-hbase/src/main/java/org/janusgraph/diskstorage/hbase/HBaseStoreManager.java#L732
It looks like it is due to this call https://github.com/JanusGraph/janusgraph/blob/master/janusgraph-hbase/src/main/java/org/janusgraph/diskstorage/hbase/HBaseStoreManager.java#L543 being made regardless of the value of the boolean variable skipSchemaCheck https://github.com/JanusGraph/janusgraph/blob/master/janusgraph-hbase/src/main/java/org/janusgraph/diskstorage/hbase/HBaseStoreManager.java#L258
Is this a bug?
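The behaviour this report expects can be sketched as a small model. This is Python, not JanusGraph's Java code, and `StoreManager`, `FakeAdmin`, `describe_table` and `ensure_table_exists` are hypothetical names chosen for illustration. The only point is that the call needing extra permissions sits behind the flag.

```python
class AccessDenied(Exception):
    """Stands in for HBase's AccessDeniedException."""


class FakeAdmin:
    """Hypothetical admin client for a user without the needed permission."""

    def describe_table(self, name):
        # Every schema-level call is rejected for this user.
        raise AccessDenied(f"Insufficient permissions (table={name}, action=CREATE)")


class StoreManager:
    """Toy model of a store manager that honors a skip-schema-check flag."""

    def __init__(self, admin, table, skip_schema_check):
        self.admin = admin
        self.table = table
        self.skip_schema_check = skip_schema_check

    def ensure_table_exists(self):
        # Schema check: needs permissions a read-only user may lack.
        self.admin.describe_table(self.table)

    def open_database(self):
        # The guard: only touch the schema when the check is not skipped.
        if not self.skip_schema_check:
            self.ensure_table_exists()
        return f"opened {self.table}"
```

With `skip_schema_check=True` the model opens the store without calling the admin client at all; with `False` the same user gets `AccessDenied`, which mirrors the situation described above.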
|
|
Re: Property keys unique per label
hadoopmarc@...
Hi Laura,
Thanks for explaining in more detail. Another example is a "color" property. Different data sources could use different types of color objects. As long as you do not want to query for paints and socks with the same color, there is no real need to harmonize the color data-value types. Also note that an index on a property can already be constrained to a single vertex or edge label. So, if anyone would contribute your idea as a JanusGraph feature, I would guess there would be no objection. Best wishes, Marc
|
|
Re: Property keys unique per label
Laura Morales <lauretas@...>
Janus describes itself like this:
"a scalable graph database optimized for storing and querying graphs containing hundreds of billions of vertices and edges distributed across a multi-machine cluster"
but my feeling when using it is that this definition means "a simple schema with billions of vertices/edges" and not "a graph with a large schema". This limitation with properties is an example. What I mean is a graph big enough that vertices in one corner of the graph represent something entirely different (semantically) from vertices at the far end of the same graph. So, for example, I could use the property "age" with one meaning, but use it with a completely different meaning somewhere else in the graph. Because property names are unique, I must namespace them, for example "contextA.age" and "contextB.age". But if nodes could be grouped by "context", for example, or if properties could be bound to labels, I would not need to namespace them, and their datatype would depend only on their context. I don't know if this makes sense to others, but to me it does.
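The namespacing workaround described in this message can be illustrated with a small sketch. It is a plain-Python model, not the JanusGraph management API: `PropertyKeyRegistry` and `namespaced` are hypothetical names, and the registry only mimics the rule that a property key name maps to a single data type across the whole graph.

```python
class PropertyKeyRegistry:
    """Toy registry enforcing graph-wide unique property key names."""

    def __init__(self):
        self._types = {}

    def define(self, name, data_type):
        existing = self._types.get(name)
        if existing is not None and existing is not data_type:
            # Same name, different data type: rejected, as with unique keys.
            raise ValueError(
                f"property key {name!r} already defined as {existing.__name__}"
            )
        self._types[name] = data_type
        return name


def namespaced(context, name):
    """Build a context-qualified key name such as 'contextA.age'."""
    return f"{context}.{name}"


registry = PropertyKeyRegistry()
# Two "age" properties with different types coexist once they are namespaced.
age_a = registry.define(namespaced("contextA", "age"), int)
age_b = registry.define(namespaced("contextB", "age"), str)
```

Defining a bare `"age"` key twice with different types would raise `ValueError` in this model, which is the limitation the namespacing works around.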
|
|
Re: Property keys unique per label
hadoopmarc@...
Hi Laura,
Indeed, unique property key names are a limitation. But to be honest: if two properties have a different data-value type I would say these are different properties, so why give them the same name? Best wishes, Marc
|
|
Re: How to create users and roles
hadoopmarc@...
Hi Jonathan,
User authorization for Gremlin Server was introduced in TinkerPop 3.5.0, see https://tinkerpop.apache.org/docs/current/reference/#authorization JanusGraph will use TinkerPop 3.5.x in its upcoming 0.6.0 release. If you want, you can already build the 0.6.0-SNAPSHOT distribution archives from master, using:
Best wishes, Marc
|
|
How to create users and roles
jonathan.mercier.fr@...
Dear,
I have not found in the documentation the process to create and manage users and roles in order to control data access. On this page https://docs.janusgraph.org/basics/server/ we can see that there is connection and authentication through HTTP or WebSocket. But I do not see where it is described how to manage users and roles. Thanks
|
|
Property keys unique per label
Laura Morales <lauretas@...>
The documentation says "Property key names must be unique in the graph". Does it mean that it's not possible to have property keys that are unique *per label*? In other words, can I have two distinct properties with the same name but different data-value types, as long as they are applied to vertexes with different labels?
|
|
Re: janusgraph and deeplearning
hadoopmarc@...
Hi Jonathan,
One thing is not yet clear to me: does your graph fit into a single node (regarding memory and GPU) or do you plan to use distributed pytorch? Either way, I guess it would be most efficient to use a two step process:
Cool that you apply janusgraph to this use case, so do not hesitate to ask for more details! Marc
|
|
Re: How to split graph in multiple graphml files and load them separately
hadoopmarc@...
Hi Laura,
Without checking this in the code, it only seems logical that the graph id is ignored, because you have to supply the io readers with an existing Graph instance. Apparently it was chosen to make the user responsible for supplying the Graph that corresponds to the graph id in the xml file. Marc
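If the graph id should decide where a file is loaded, the user has to do that routing themselves. A minimal sketch, using only the Python standard library to read the `id` attribute of each `<graph>` element; `graph_ids`, `SAMPLE` and the idea of a lookup by id are assumptions for illustration, not part of the TinkerPop io API:

```python
import xml.etree.ElementTree as ET

# Standard GraphML namespace used by the <graphml> root element.
GRAPHML_NS = {"g": "http://graphml.graphdrawing.org/xmlns"}


def graph_ids(graphml_text):
    """Return the id attribute of every <graph> element in a GraphML document."""
    root = ET.fromstring(graphml_text)
    return [graph.get("id") for graph in root.findall("g:graph", GRAPHML_NS)]


SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="people" edgedefault="directed">
    <node id="1"/>
    <node id="2"/>
    <edge id="3" source="1" target="2"/>
  </graph>
</graphml>"""

ids = graph_ids(SAMPLE)
```

The returned ids could then be used as keys to pick the already opened Graph instance that is handed to the reader.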
|
|
Re: Performance Improvement
Vinayak Bali
Laura, that is helpful; I will go through it and try to implement it. Also, if there are any configurations that can be tuned for better performance, please share them.
|
|
[ANNOUNCEMENT] JanusGraph enabled donations on LFX Crowdfunding
The JanusGraph Technical Steering Committee is excited to announce that JanusGraph is now accepting donations. As you may know, most JanusGraph contributors are not full-time JanusGraph employees, so we came up with the idea of collecting donations from the community to be able to hire full-time employees for JanusGraph. With your help, JanusGraph will be able to produce releases much more often and we will be able to develop JanusGraph much faster. The JanusGraph Technical Steering Committee guarantees to be fully transparent with the community about every penny spent. We are accepting contributions via LFX Crowdfunding, which has an open ledger where you can check all the transactions made and their descriptions. The LFX Crowdfunding link where JanusGraph accepts donations is: https://crowdfunding.lfx.linuxfoundation.org/projects/janusgraph Best regards, Oleksandr Porunov on behalf of JanusGraph TSC
|
|
Re: Performance Improvement
Laura Morales <lauretas@...>
There's a BUILDING file with instructions in the repo.
|
|
Re: Performance Improvement
Vinayak Bali
Hi Boxuan, Thank you for your response. I am not sure how I can build JanusGraph from the master branch. If you can share the steps/procedure to do so, I can check; otherwise I need to wait for the new release. My use case consists of a single node label and a self-relation between the nodes. You can consider it as a BOM (bill of materials) in a supply chain. The JanusGraph and Cassandra configurations are the defaults set during installation. The data loading script takes the CSV files as input, divides the files into different batches, and loads the batches using multi-threading. If you need more details, I can share a generic script with you and also the metrics. Thanks & Regards, Vinayak
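The loading approach described in this message (read CSV, split into batches, load batches on several threads) can be sketched as follows. This is a Python illustration, not the attached Groovy script: `read_rows`, `make_batches`, `load_batch` and `load_in_parallel` are hypothetical names, and `load_batch` is a placeholder that only counts rows instead of writing to a graph.

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor


def read_rows(csv_text):
    """Parse CSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def make_batches(rows, batch_size):
    """Split rows into consecutive batches of at most batch_size items."""
    return [rows[i:i + batch_size] for i in range(0, len(rows), batch_size)]


def load_batch(batch):
    # Placeholder: a real loader would open one transaction, add the
    # vertices or edges of this batch, and commit once at the end.
    return len(batch)


def load_in_parallel(rows, batch_size, workers):
    """Load all batches on a thread pool and return the number of rows loaded."""
    batches = make_batches(rows, batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(load_batch, batches))


rows = read_rows("id,name\n1,a\n2,b\n3,c\n4,d\n5,e\n")
loaded = load_in_parallel(rows, batch_size=2, workers=4)
```

Batch size and worker count are the two knobs in such a design; committing once per batch rather than once per row is the usual reason for batching in the first place.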
On Mon, Jul 26, 2021 at 1:38 PM Boxuan Li <liboxuan@...> wrote:
|
|
Re: Performance Improvement
Boxuan Li
Hi Vinayak, Would you be able to build JanusGraph from the master branch and try again? The upcoming 0.6.0 release contains many optimizations which might be helpful. Without knowing more details of your use case (your queries, your loading script, your JanusGraph configs, your JanusGraph metrics, your Cassandra metrics), it’s very hard to give any concrete suggestion. Anyway, I would strongly recommend you try out the master version first and see how it goes. Best, Boxuan On Monday, July 26, 2021 at 3:55 PM, Vinayak Bali <vinayakbali16@...> wrote:
|
|
Performance Improvement
Vinayak Bali
Hi All, I have been using JanusGraph for a while. The use case I am working on consists of 1.5 million nodes and 3 million edges. I prepared a batch-loading Groovy script. The performance of the data loading script is as follows:
Nodes: 5 mins
Edges: 13 mins
Total: 18 mins
Also, the count query including edges takes minutes to execute. Both JanusGraph (0.5.2) and Cassandra are installed on the same instance.
Hardware Configuration:
RAM: 92 GB
Cores: 48
I would like expert suggestions/steps that can be followed to improve the performance. Please share your thoughts. Thanks & Regards, Vinayak
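As a baseline for comparing later tuning attempts, the timings reported in this message can be turned into throughput figures. A small Python calculation, using only the numbers given above; the `throughput` helper is a hypothetical name:

```python
def throughput(items, minutes):
    """Items loaded per second for a run that took the given minutes."""
    return items / (minutes * 60)


node_rate = throughput(1_500_000, 5)    # nodes per second
edge_rate = throughput(3_000_000, 13)   # edges per second
overall = throughput(1_500_000 + 3_000_000, 18)  # elements per second

print(f"nodes/s: {node_rate:.0f}, edges/s: {edge_rate:.0f}, overall/s: {overall:.0f}")
```

This works out to about 5000 nodes per second and roughly 3850 edges per second, so any change to the script or configuration can be judged against those two rates.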
|
|
Re: How to split graph in multiple graphml files and load them separately
Laura Morales <lauretas@...>
I've also noticed that graphml files can specify an "id" for the <graph> node, but I guess this has no effect on Janus at all? Like, it's completely ignored? Am I right?
Sent: Monday, July 26, 2021 at 7:50 AM
|
|