Re: Performance Improvement

Vinayak Bali

Hi All, 

I updated JanusGraph to version 0.6.0 and added the parallel query execution settings to the configuration files, as Oleksandr suggested. Performance has still not improved. I think I am missing something, so I am describing my requirement in detail.

The attachment to this mail contains the data model that I use to query the graph.

This data model will be visible to the user in the UI, where the user can select any number of nodes and relationships. Based on the selection, I generate two queries: one to retrieve the data and one to retrieve the counts. If the count exceeds a certain limit, the user must add an additional filter. The count query is therefore an important part of the implementation, and its performance matters. Consider some cases of the count queries:

Case 1: User Selection: Node1, Node3 and Relation3
Count Query Output Required: Node1: 25, Node3: 30, Relation3: 50, i.e., only the nodes that participate in the relationship are counted
Data Query: Must return all the data for Node1, Node3, and Relation3 including the properties

Case 2: User Selection: Node1, Node3 and Relation3, plus Node4, Node2 and Relation4
Count Query Output Required: Node1: 25, Node3: 30, Relation3: 50, Node4: 10, Node2: 34, Relation4: 45, i.e., only the nodes that participate in the relationship are counted
Data Query: Must return all the data for Node1, Node3, Relation3, Node4, Node2, and Relation4 including the properties

Case 3: Filters can be added to the above cases based on properties.
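
To make the counting rule in Case 1 concrete, here is a minimal sketch in plain Python. It is not a JanusGraph or Gremlin query and says nothing about performance; it only illustrates what "only the nodes that contain the relationship" is taken to mean. The `Edge` record, the `count_selection` helper, the toy data, and the assumption that Relation3 runs from Node1 to Node3 are all hypothetical and not taken from the attached data model.

```python
from collections import namedtuple

# Hypothetical in-memory edge record: edge id, edge label, source vertex, target vertex.
Edge = namedtuple("Edge", ["id", "label", "out_v", "in_v"])


def count_selection(edges, vertex_labels, out_label, edge_label, in_label):
    """Count edges labelled edge_label that run from an out_label vertex to an
    in_label vertex, plus the distinct vertices found at each end of those edges.
    Vertices that have no such edge are never counted."""
    out_vertices, in_vertices, matched = set(), set(), 0
    for e in edges:
        if (e.label == edge_label
                and vertex_labels[e.out_v] == out_label
                and vertex_labels[e.in_v] == in_label):
            matched += 1
            out_vertices.add(e.out_v)
            in_vertices.add(e.in_v)
    return [(out_label, len(out_vertices)),
            (edge_label, matched),
            (in_label, len(in_vertices))]


# Toy data (assumed): vertex a3 is a Node1 without any Relation3 edge.
vertex_labels = {"a1": "Node1", "a2": "Node1", "a3": "Node1",
                 "c1": "Node3", "c2": "Node3"}
edges = [
    Edge("e1", "Relation3", "a1", "c1"),
    Edge("e2", "Relation3", "a1", "c2"),
    Edge("e3", "Relation3", "a2", "c1"),
]

counts = count_selection(edges, vertex_labels, "Node1", "Relation3", "Node3")
print(counts)
```

Under this reading, Case 2 would amount to applying the same rule once per selected (node, relationship, node) triple and concatenating the results.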

I have tried using the union step along with aggregate steps, but the performance is not as required. The hardware configuration of the machine is not a constraint.

I request you all to take a look and share your suggestions, based on your experience, for solving the problem.
If possible, please share both the count and data queries for all cases, along with the configuration required to improve performance.

Thanks & Regards,

On Mon, Sep 13, 2021 at 2:22 AM Oleksandr Porunov <alexandr.porunov@...> wrote:
Hi Vinayak,

JanusGraph 0.6.0 has been released. I posted some quick tips for improving throughput to your CQL storage here:
I also wrote a LinkedIn post with links to the relevant parts of the documentation and several better suggestions about internal ExecutorService usage here:

In 0.6.0 you can drastically improve your CQL throughput with a single configuration option, `storage.cql.executor-service.enabled: false`, which I definitely recommend. You should, however, also configure the throughput-related options properly.

Best regards,
