Very low performance when running PageRank and WCC on Graph500


shepherdkingqsp@...
 

Hi there,

Recently I have been trying to measure JanusGraph performance. When I ran a benchmark against JanusGraph 0.5.3, I found that performance was very low for PageRank and WCC.

The code I used, for reference:
https://github.com/gaolk/graph-database-benchmark/tree/master/benchmark/janusgraph

Data:
Graph500

The environment:
JanusGraph version: 0.5.3 (full release zip downloaded from the JanusGraph GitHub)

The JanusGraph config (default conf/janusgraph-cql.properties):
gremlin.graph=org.janusgraph.core.JanusGraphFactory
storage.backend=cql
storage.batch-loading=true
storage.hostname=127.0.0.1
storage.cql.keyspace=janusgraph
cache.db-cache = true
cache.db-cache-clean-wait = 20
cache.db-cache-time = 180000
cache.db-cache-size = 0.5

To be more specific, I ran K-hop with it and got reasonable results.

| K-Hop | Latency |
|-------|---------|
| 1-Hop | 23.42 |
| 2-Hop | 16628.49 |
| 3-Hop | 1872747.62 (2/10 2h timeout) |
| 4-Hop | 889146.03 (8/10 2h timeout) |
| 5-Hop | 10/10 2h timeout |
| 6-Hop | 10/10 2h timeout |
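
For readers unfamiliar with the term, a K-hop query is usually understood as counting the distinct vertices reachable from a seed vertex within k edges. The following is a minimal in-memory sketch of that idea using breadth-first search. It is an illustration only: the function name `k_hop_count`, the adjacency-dict representation, and the toy graph are assumptions, and the linked benchmark's actual Gremlin query may define the result differently.

```python
from collections import deque


def k_hop_count(adj, start, k):
    """Count distinct vertices reachable from `start` in 1..k hops.

    `adj` maps a vertex to an iterable of its out-neighbours.
    The start vertex itself is not counted.
    """
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        vertex, depth = frontier.popleft()
        if depth == k:
            # Do not expand beyond k hops.
            continue
        for neighbour in adj.get(vertex, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, depth + 1))
    return len(seen) - 1


# Toy directed graph (hypothetical data, not taken from Graph500).
toy = {1: [2, 3], 2: [4], 3: [4, 5], 4: [6], 5: [], 6: []}

if __name__ == "__main__":
    for hops in (1, 2, 3):
        print(hops, k_hop_count(toy, 1, hops))
```

The work per query is proportional to the number of edges touched, which is why latency climbs so steeply with each additional hop on a dense graph.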


But when I ran WCC and PageRank, both hit the 3-hour timeout as well.

Could somebody help me find the reason for the low performance?


Regards,
Shipeng


hadoopmarc@...
 

Hi Shipeng,

Did you use their machine specs: 32 vCPU and 244 GB memory?  The graph is pretty big for in-memory use during OLAP:
marc@antecmarc:~$ curl http://service.tigergraph.com/download/benchmark/dataset/graph500-22/graph500-22_unique_node | wc -l
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 17.6M  100 17.6M    0     0  3221k      0  0:00:05  0:00:05 --:--:-- 4166k
2396019
marc@antecmarc:~$ curl http://service.tigergraph.com/download/benchmark/dataset/graph500-22/graph500-22 | wc -l
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  989M  100  989M    0     0  5427k      0  0:03:06  0:03:06 --:--:-- 6123k
67108864
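
To put those two line counts in perspective, here is a rough back-of-the-envelope sketch. The counts come from the `wc -l` output above; everything else (the function name `naive_frontier_bound` and the uniform-degree assumption) is an illustrative simplification. Graph500 graphs have a heavily skewed degree distribution, so real frontiers can be much larger or smaller than this estimate for a given seed.

```python
# Line counts from the wc -l output above.
VERTICES = 2_396_019   # graph500-22_unique_node
EDGES = 67_108_864     # graph500-22 (equals 2**26)

# Average number of edges per listed vertex.
avg_out_degree = EDGES / VERTICES


def naive_frontier_bound(k, degree=avg_out_degree, cap=VERTICES):
    """Crude estimate of vertices touched at hop k.

    Assumes every vertex has the average degree (not true for
    Graph500) and caps the result at the total vertex count.
    """
    return min(cap, round(degree ** k))


if __name__ == "__main__":
    print(f"average degree: {avg_out_degree:.2f}")
    for k in range(1, 7):
        print(f"{k}-hop estimate: {naive_frontier_bound(k)}")
```

Even under this crude model the estimate reaches the whole vertex set within a handful of hops, so deep K-hop queries, WCC, and PageRank all end up touching essentially the entire graph.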

Best wishes,
Marc


shepherdkingqsp@...
 

Well, the machine I am using has 32 vCPUs and 128 GB of memory. But I am testing JanusGraph with Cassandra as the storage backend.

I don't think this is a memory issue. It may be a configuration issue (as you can see, the K-hop results are reasonable).

Best regards,
Shipeng


Oleksandr Porunov
 

Hi Shipeng,

I didn't check the graph you referred to, but JanusGraph 0.5.3 has some hard limits with the Cassandra backend. I would recommend trying version 0.6.0.
You might want to add some configurations related to your throughput. Something like:
```
storage.cql.read-consistency-level: ONE
query.batch: true
query.smart-limit: false
# query.fast-property: false or true depending on queries
ids.block-size: 1000000
storage.batch-loading: true
storage.cql.local-max-connections-per-host: 5
storage.cql.max-requests-per-connection: 1024
storage.cql.executor-service.enabled: false
storage.parallel-backend-executor-service.core-pool-size: 100
```
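
One way to try these settings is to merge them into the existing properties file instead of editing it by hand. Below is a minimal sketch, assuming a plain `key=value` (or `key: value`) properties format. The helper `apply_overrides` is hypothetical and not part of JanusGraph; the option names and values are copied from the list above, and whether each option is honored depends on the JanusGraph version in use.

```python
import re

# Matches the key of a properties line such as "a.b = c" or "a.b: c".
# Comment lines starting with '#' or '!' do not match.
KEY_RE = re.compile(r"^\s*([^#!\s=:][^\s=:]*)\s*[=:]")


def apply_overrides(properties_text, overrides):
    """Return properties text with `overrides` applied.

    Existing keys are rewritten in place; new keys are appended.
    """
    remaining = dict(overrides)
    out = []
    for line in properties_text.splitlines():
        match = KEY_RE.match(line)
        if match and match.group(1) in remaining:
            key = match.group(1)
            out.append(f"{key}={remaining.pop(key)}")
        else:
            out.append(line)
    # Append any override that was not already present.
    out.extend(f"{key}={value}" for key, value in remaining.items())
    return "\n".join(out) + "\n"


# A few lines from the original poster's config.
base = (
    "storage.backend=cql\n"
    "storage.batch-loading=true\n"
    "cache.db-cache = true\n"
)

# A subset of the suggested settings.
suggested = {
    "query.batch": "true",
    "storage.batch-loading": "true",
    "ids.block-size": "1000000",
}

merged = apply_overrides(base, suggested)

if __name__ == "__main__":
    print(merged, end="")
```

Rewriting existing keys in place avoids ending up with the same option defined twice in one file.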

Best regards,
Oleksandr