Performance Improvement


Vinayak Bali
 

Hi Oleksandr, 

Thank you for the detailed explanation regarding the configuration and indexes. I will dig deeper into it and try to resolve the problem.
But I think the queries I am executing are not efficient.
Could you please share the Gremlin queries for the two cases mentioned in my previous mail? That would help a lot in validating my queries.

Thanks & Regards,
Vinayak


On Mon, Oct 4, 2021 at 1:27 AM Oleksandr Porunov <alexandr.porunov@...> wrote:
Hi Vinayak,

I didn't follow your statements about count, but I just want to add that if you don't use a mixed index for the count query, then the count will require iteratively returning each element and counting them in memory (i.e. very inefficient). To check whether your count query is using a mixed index, you can use the `profile()` step.

I also noticed that you say you need to return all properties for all vertices / edges. If so, you may consider using `multiQuery`, which in certain cases will return properties for your vertices faster than the `valueMap()` step. The only thing you need to consider when using `multiQuery` (actually any query) is the tx-cache size (don't confuse it with the database cache). If your tx-cache is too small to hold all the vertices, then some vertices' properties will be evicted from the cache. Thus, when you try to return values for those vertex properties, new database calls will be made to retrieve them. In the worst case, every property access may lead to a separate database call. To eliminate this downside, make sure that your transaction cache size is at least the number of vertices you are accessing. In that case `multiQuery().addAllVertices(yourVertices).properties()` will return all properties for all vertices and hold them in memory instead of evicting them.
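
The effect of an undersized cache can be illustrated with a toy simulation. This is only a sketch: it assumes a plain LRU eviction policy, which may differ from what JanusGraph's transaction cache actually does, and `LruCache` and `backend_calls_for` are hypothetical names, not JanusGraph APIs. It reads 1000 vertices twice and counts the simulated database calls.

```python
from collections import OrderedDict


class LruCache:
    """Toy LRU cache that counts how often the backend has to be queried."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.backend_calls = 0

    def get(self, vertex_id):
        if vertex_id in self.entries:
            # Cache hit: mark the entry as most recently used.
            self.entries.move_to_end(vertex_id)
            return self.entries[vertex_id]
        # Cache miss: simulate a database round trip.
        self.backend_calls += 1
        value = {"id": vertex_id}
        self.entries[vertex_id] = value
        if len(self.entries) > self.capacity:
            # Evict the least recently used entry.
            self.entries.popitem(last=False)
        return value


def backend_calls_for(capacity, vertex_count, passes=2):
    """Read every vertex `passes` times and return the number of backend calls."""
    cache = LruCache(capacity)
    for _ in range(passes):
        for vertex_id in range(vertex_count):
            cache.get(vertex_id)
    return cache.backend_calls


too_small = backend_calls_for(capacity=999, vertex_count=1000)
large_enough = backend_calls_for(capacity=1000, vertex_count=1000)
print(too_small, large_enough)
```

With a cache just one entry too small, every access in the second pass misses, whereas a cache that holds all vertices serves the second pass entirely from memory.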

Moreover, it looks like your use cases are read-heavy and not write-heavy. You may improve your performance by making sure all your writes use consistency-level=ALL and all your reads use consistency-level=ONE. You may also want to disable consistency checks as well as internal / external checks for your transactions if you are sure about your data. It will make some of your queries faster but less safe.
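
A minimal sketch of the consistency-level suggestion as a JanusGraph properties fragment for the CQL backend. The option names are the ones listed in the JanusGraph configuration reference as far as known; verify them against the version you run before relying on them.

```properties
# Reads: wait for a single replica only.
storage.cql.read-consistency-level=ONE
# Writes: wait for all replicas, so that a read at ONE still sees the latest data.
storage.cql.write-consistency-level=ALL
```

The trade-off is that a write at ALL fails whenever any replica for that key is unavailable.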

You also need to make sure that you have configured your CQL driver throughput optimally for your load. If JanusGraph is embedded in your application, you need to make sure your application has the lowest possible latency to your Cassandra nodes (you may even consider placing your application on the same nodes as Cassandra, or just moving your Gremlin Server to those nodes and using a remote connection).

There are many JanusGraph and CQL driver configurations which you may use to tune performance for your use case. This topic is too broad to give a one-size-fits-all solution; different use cases might need different approaches. I would strongly recommend that you explore all JanusGraph configurations here: https://docs.janusgraph.org/configs/configuration-reference/ . Being aware of all the options will allow you to configure your general JanusGraph settings, your transactions, and your CQL driver much better. For advanced CQL configuration options, see https://docs.datastax.com/en/developer/java-driver/4.13/manual/core/configuration/reference/ (storage.cql.internal in JanusGraph).

You may also try exploring other storage backends which may give you lower latency (hence better performance), like ScyllaDB, Aerospike, etc.

Best regards,
Oleksandr Porunov


Vinayak Bali
 

Hi All, 

I updated JanusGraph to version 0.6.0 and added the parallel execution settings to the configuration files as suggested by Oleksandr. Still, the performance has not improved. I think I am missing something, so I am describing my requirement in detail.

The attachment to this mail contains the data model which I am using to query the schema.

This data model will be visible to the user in the UI. The user can choose any number of nodes and relationships from the UI. Based on the selection, I create the queries to retrieve the data and the count respectively. If the count exceeds a certain limit, the user must add an additional filter. Hence the count query is an important aspect of the implementation, and its performance matters. Let's consider some cases of the count queries:

Case 1: User Selection: Node1, Node3 and Relation3
Count Query Output Required: Node1: 25, Node3: 30, Relation3: 50, i.e. only the nodes that contain the relationship
Data Query: Must return all the data for Node1, Node3 and Relation3, including the properties

Case 2: User Selection: Node1, Node3 and Relation3, plus Node4, Node2 and Relation4
Count Query Output Required: Node1: 25, Node3: 30, Relation3: 50, Node4: 10, Node2: 34, Relation4: 45, i.e. only the nodes that contain the relationship
Data Query: Must return all the data for Node1, Node3, Relation3, Node4, Node2 and Relation4, including the properties

Case 3: Filters based on properties can be added to the above cases.

I have tried using union along with aggregate steps, but the performance is not as required. The hardware configuration of the machine is not an issue.

I request you all to take a look and provide your valuable suggestions, based on your experience, to solve the problem.
If possible, please share both the count and data queries for all cases, along with the configuration required to improve performance.

Thanks & Regards,
Vinayak


On Mon, Sep 13, 2021 at 2:22 AM Oleksandr Porunov <alexandr.porunov@...> wrote:
Hi Vinayak,

JanusGraph 0.6.0 has been released. I posted some quick tips for improving throughput to your CQL storage here:
https://lists.lfaidata.foundation/g/janusgraph-users/message/6148
I also posted on LinkedIn, with links to the relevant parts of the documentation and several better suggestions about internal ExecutorServices usage, here: https://www.linkedin.com/posts/porunov_release-060-janusgraphjanusgraph-activity-6840714301062307840-r6Uw

In 0.6.0 you can improve your CQL throughput drastically with a simple configuration, `storage.cql.executor-service.enabled: false`, which I definitely recommend, but you should also properly configure the throughput-related options.

Best regards,
Oleksandr


Vinayak Bali
 

Laura, that is helpful; I will go through it and try to implement it.

Also, if there are any configurations that can be tuned for better performance, please share them.

On Mon, Jul 26, 2021 at 2:22 PM Laura Morales <lauretas@...> wrote:
There's a BUILDING file with instructions in the repo.
 
 
 

Sent: Monday, July 26, 2021 at 10:31 AM
From: "Vinayak Bali" <vinayakbali16@...>
To: janusgraph-users@...
Subject: Re: [janusgraph-users] Performance Improvement

Hi Boxuan, 
 
Thank you for your response. I am not sure how I can build JanusGraph from the master branch. If you can share the steps/procedure to do so, I can check; otherwise I need to wait for the new release.
 
My use case consists of a single node label with a self-relation. You can consider it as a BOM (bill of materials) in the supply chain.
The JanusGraph and Cassandra configurations are the defaults set at installation.
 
The data loading script takes the CSV files as input, divides the files into different batches, and loads the batches using multi-threading. If you need more details, I can share a generic script and also the metrics.
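
The loading approach described above can be sketched roughly as follows. This is an illustration in Python, not the actual Groovy script from this thread: the sample CSV, the batch size, and the `load_batch` function are hypothetical placeholders, and a real loader would add the rows of each batch to the graph and commit once per batch.

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor


def read_rows(csv_text):
    """Parse CSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def split_into_batches(rows, batch_size):
    """Split rows into consecutive batches of at most batch_size rows."""
    return [rows[i:i + batch_size] for i in range(0, len(rows), batch_size)]


def load_batch(batch):
    """Hypothetical loader: a real one would write the batch in one transaction."""
    return len(batch)


def load_all(rows, batch_size, workers):
    """Load all batches on a thread pool and return the number of rows loaded."""
    batches = split_into_batches(rows, batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(load_batch, batches))


sample_csv = "id,name\n1,a\n2,b\n3,c\n4,d\n5,e\n"
rows = read_rows(sample_csv)
batches = split_into_batches(rows, batch_size=2)
loaded = load_all(rows, batch_size=2, workers=4)
print(len(batches), loaded)
```

Batch size and worker count are the two knobs to experiment with; suitable values depend on the backend and the data.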
 
Thanks & Regards,
Vinayak 






On Mon, Jul 26, 2021 at 1:38 PM Boxuan Li <liboxuan@...> wrote:
Hi Vinayak,

Would you be able to build JanusGraph from master branch and try again? The upcoming 0.6.0 release contains many optimizations which might be helpful. 

Without knowing more details of your use case (your queries, your loading script, your JanusGraph configs, your JanusGraph metrics, your Cassandra metrics), it’s very hard to give any concrete suggestion. Anyway, I would strongly recommend you try out the master version first and see how it goes.

Best,
Boxuan

On Mon, Jul 26, 2021 at 3:55 PM Vinayak Bali <vinayakbali16@...> wrote:
Hi All, 

I have been using JanusGraph for a while. The use case I am working on consists of 1.5 million nodes and 3 million edges. I prepared a batch-loading Groovy script. Its performance is as follows:

Nodes: 5 mins
Edges: 13 mins
Total: 18 mins

Also, the count query including edges takes minutes to execute.
Both JanusGraph (0.5.2) and Cassandra are installed on the same instance.
 
Hardware Configuration:
RAM: 92 GB
Cores: 48 

I would like expert suggestions/steps that can be followed to improve the performance. Please share your thoughts on this.

Thanks & Regards,
Vinayak

