Failed to load complete edge data of Graph500 into JanusGraph 0.5.3 with Cassandra CQL as storage backend


shepherdkingqsp@...
 

Hi there,

I am new to JanusGraph. I have some problems loading data into JanusGraph with Cassandra CQL as the storage backend.

When I tried to load Graph500 into JanusGraph, planning to run a benchmark on it, I found that the edges loaded were not complete: 67,107,183 edges were loaded while 67,108,864 were expected. (The vertices loaded were complete.)

The code and config I used are posted below.

The code I used is from a benchmark by TigerGraph:
- load vertex: https://github.com/gaolk/graph-database-benchmark/blob/master/benchmark/janusgraph/multiThreadVertexImporter.java
- load edge: https://github.com/gaolk/graph-database-benchmark/blob/master/benchmark/janusgraph/multiThreadEdgeImporter.java

The config I used is conf/janusgraph-cql.properties from the JanusGraph 0.5.3 full distribution (https://github.com/JanusGraph/janusgraph/releases/download/v0.5.3/janusgraph-full-0.5.3.zip):
cache.db-cache-clean-wait = 20
cache.db-cache-size = 0.5
cache.db-cache-time = 180000
cache.db-cache = true
gremlin.graph=org.janusgraph.core.JanusGraphFactory
storage.backend=cql
storage.batch-loading=true
storage.cql.keyspace=janusgraph 
storage.hostname=127.0.0.1
I got the following exceptions while loading the data.
Exception 1:
Caused by: java.util.concurrent.ExecutionException: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /127.0.0.1:9042 (com.datastax.driver.core.exceptions.OperationTimedOutException: [/127.0.0.1:9042] Timed out waiting for server response))
        at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
        at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
        at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
        at io.vavr.control.Try.of(Try.java:62)
        at io.vavr.concurrent.FutureImpl.lambda$run$2(FutureImpl.java:199)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Exception 2:
Caused by: org.janusgraph.diskstorage.TemporaryBackendException: Could not successfully complete backend operation due to repeated temporary exceptions after PT10S
        at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:100)
        at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:54)
        at org.janusgraph.diskstorage.BackendTransaction.executeRead(BackendTransaction.java:469)
        at org.janusgraph.diskstorage.BackendTransaction.indexQuery(BackendTransaction.java:395)
        at org.janusgraph.graphdb.query.graph.MultiKeySliceQuery.execute(MultiKeySliceQuery.java:52)
        at org.janusgraph.graphdb.database.IndexSerializer.query(IndexSerializer.java:515)
        at org.janusgraph.graphdb.util.SubqueryIterator.<init>(SubqueryIterator.java:66)
        ... 20 more
Caused by: org.janusgraph.diskstorage.TemporaryBackendException: Temporary failure in storage backend
        at io.vavr.API$Match$Case0.apply(API.java:3174)
        at io.vavr.API$Match.of(API.java:3137)
        at org.janusgraph.diskstorage.cql.CQLKeyColumnValueStore.lambda$static$0(CQLKeyColumnValueStore.java:123)
        at io.vavr.control.Try.getOrElseThrow(Try.java:671)
        at org.janusgraph.diskstorage.cql.CQLKeyColumnValueStore.getSlice(CQLKeyColumnValueStore.java:290)
        at org.janusgraph.diskstorage.keycolumnvalue.KCVSProxy.getSlice(KCVSProxy.java:76)
        at org.janusgraph.diskstorage.keycolumnvalue.cache.ExpirationKCVSCache.lambda$getSlice$1(ExpirationKCVSCache.java:91)
        at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4742)
        at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3527)
        at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2319)

I searched for a solution on Google but found little that helped. Could somebody help?


Best Regards,
Shipeng Qi


hadoopmarc@...
 

Hi Shipeng Qi,

The system that you use might be too small for the number of threads in the loading code. You can try decreasing the number of threads from 8 to 4 with:

private static ExecutorService pool = Executors.newFixedThreadPool(4);
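Besides shrinking the pool, you could also bound the work queue so the producer thread is throttled instead of queueing edge batches faster than Cassandra can absorb them. A minimal sketch (the class and method names here are illustrative, not from the benchmark code):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: a fixed-size pool with a bounded queue. When the queue fills up,
// CallerRunsPolicy makes the submitting thread execute the task itself,
// which naturally slows down submission instead of piling up work.
public class BoundedLoaderPool {
    public static ExecutorService newBoundedPool(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(
                threads, threads,
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = newBoundedPool(4, 16);
        AtomicInteger done = new AtomicInteger();
        // Stand-in for submitting edge-loading batches.
        for (int i = 0; i < 100; i++) {
            pool.submit(done::incrementAndGet);
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        System.out.println("completed=" + done.get());
    }
}
```

With a bounded queue the importer cannot outrun the backend by more than `queueCapacity` batches, which should reduce the client-side timeouts.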

Best wishes,
Marc


shepherdkingqsp@...
 

On Tue, Aug 24, 2021 at 06:20 AM, <hadoopmarc@...> wrote:
Got it. I will try it soon. 

Thanks, Marc!

Shipeng


shepherdkingqsp@...
 

Hi Marc,

I have tried it, and this time all the Graph500 vertices and edges were loaded completely.

But there is still a weird thing: I found the same exceptions reported in the log.

Could you please explain this? With exceptions reported, how was the data still loaded completely?

Regards,
Shipeng