Re: Bulk Loading with Spark


Joe Obernberger
 

Should have added - I'm connecting with:

JanusGraph graph = JanusGraphFactory.build()
                .set("storage.backend", "cql")
                .set("storage.hostname", "charon:9042, chaos:9042")
                .set("storage.cql.keyspace", "graph")
                .set("storage.cql.cluster-name", "JoeCluster")
.set("storage.cql.only-use-local-consistency-for-system-operations", "true")
                .set("storage.cql.batch-statement-size", 256)
                .set("storage.cql.local-max-connections-per-host", 8)
                .set("storage.cql.read-consistency-level", "ONE")
                .set("storage.batch-loading", true)
                .set("schema.default", "none")
                .set("ids.block-size", 100000)
                .set("storage.buffer-size", 16384)
                .open();


-Joe

On 5/20/2022 5:28 PM, Joe Obernberger via lists.lfaidata.foundation wrote:
Hi All - I'm trying to use spark to do a bulk load, but it's very slow.
The cassandra cluster I'm connecting to is a bare-metal, 15 node cluster.

I'm using java code to do the loading using:
GraphTraversalSource.addV and Vertex.addEdge in a loop.

Is there a better way?

Thank you!

-Joe

--
This email has been checked for viruses by AVG.
https://www.avg.com

Join janusgraph-users@lists.lfaidata.foundation to automatically receive all group messages.