Re: How to upload rdf bulk data to janus graph
Try to enable batch loading: "storage.batch-loading=true".
Increase your batch mutations buffer: "storage.buffer-size=20480".
Increase ids block size: "ids.block-size=10000000".
I'm not sure whether your flow only adds data or upserts it. If it upserts, you may also set "query.batch=true".
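Collected in one place, the options above might look like this in the JanusGraph properties file. This is a minimal sketch, assuming they are appended to an existing configuration that already defines the storage backend and hostname (not shown here); the values are the ones suggested above, not tuned recommendations.

```properties
# Bulk-loading options from this reply. Add them to the existing
# JanusGraph properties file; backend settings are assumed to be there already.
storage.batch-loading=true
storage.buffer-size=20480
ids.block-size=10000000
# Only relevant if the load upserts (reads before it writes).
query.batch=true
```

Some of these options may not be changeable on a graph that already exists, so it is worth checking each one in the JanusGraph configuration reference before relying on it.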
That said, I haven't used rdf2gremlin, so I can't suggest much more. The configurations above are just the options that come to mind immediately; a proper investigation would be needed to recommend real performance improvements. You may additionally tune ScyllaDB for your use case.
On Thursday, December 24, 2020 at 12:24:10 PM UTC+2 ar...@... wrote:
I have data in RDF (ttl) format, with around 6 million triples. Currently I am using the rdf2gremlin Python script for the conversion, but it is taking too much time: 10k records took around 1 hour. I am using ScyllaDB as the JanusGraph backend. Below is the Python code I am using.