Re: Transaction Recovery and Bulk Loading

Boxuan Li

Hi Marc,

I am not familiar with batch-loading, but from what I understand, the reason is probably performance. Batch-loading aims to persist data as fast as possible, at the expense of features such as consistency checks. A write-ahead log would certainly slow down the bulk-loading process.

Also, technically, when batch-loading is enabled, your data may be persisted to the storage backend in the StandardJanusGraph::prepareCommit method, which runs before the WAL is written. When batch-loading is disabled, your data is always persisted only after the WAL is written. I am not sure whether there is a particular reason for this, but I guess it is by design.
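
For reference, here is a minimal sketch of the two settings involved. It assumes the standard JanusGraph option names storage.batch-loading and tx.log-tx; please check both names and their defaults against the configuration reference for your version. It is not a complete configuration.

```properties
# Assumed option names; verify against your JanusGraph version.
# Keep batch-loading off so that data is persisted only after the WAL is written.
storage.batch-loading=false
# Log transaction mutations to the write-ahead transaction log,
# which the transaction recovery job reads.
tx.log-tx=true
```

With these values you keep the WAL-first ordering described above, at the cost of slower writes.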

There may be other reasons behind the design choice, but performance is what comes to mind when I see your question.


On Wed, Jun 9, 2021 at 10:53 PM, madams via <> wrote:

Hi all,

We've been integrating our pipelines with JanusGraph for some time now, and it's been working great, thanks to the developers!
We use the transaction recovery job and enabled batch-loading for performance, and then we realized that the write-ahead transaction log is not used when batch-loading is enabled.
Out of curiosity, is there any reason for this?
At the moment we have disabled batch-loading and consistency checks. We've thought about replacing the transaction recovery with a reindexing job, but reindexing is quite a heavy operation.

Best Regards,
