Re: Bulk loading into JanusGraph with HBase
Jason Plurad <plu...@...>
Thanks for sharing all that info, because it makes it much easier to have a constructive conversation. Your default batch size of 100,000 between commits looks really large. I dropped that down to 5,000 and ran it on my machine (2015 MacBook Pro, 2.8 GHz Intel Core i7 quad core, 16 GB RAM, 1 TB SSD).
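To be clear about what I mean by batch size between commits, here is a minimal sketch of the pattern, not your actual loader. The names `load_in_batches`, `add`, and `commit` are hypothetical; in a JanusGraph loader I'd assume `commit` ends up calling something like `graph.tx().commit()` and `add` creates one vertex or edge.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def load_in_batches(
    items: Iterable[T],
    add: Callable[[T], None],
    commit: Callable[[], None],
    batch_size: int = 5000,
) -> int:
    """Call add() for every item and commit() after each batch_size items.

    Returns the number of commits issued. `add` and `commit` are
    placeholders for whatever the real loader does (assumption).
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    commits = 0
    pending = 0  # items added since the last commit
    for item in items:
        add(item)
        pending += 1
        if pending == batch_size:
            commit()
            commits += 1
            pending = 0

    # Flush the final partial batch, if any.
    if pending:
        commit()
        commits += 1
    return commits
```

With the batch size as a parameter like this, it is easy to try a few values (say 1,000, 5,000, 20,000) and compare load times.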
Not sure what your machine specs are, but that's already 2x faster. I didn't spend much more time on it, but experimenting with the batch size could get you better results. You mentioned you saw 3h on your local laptop vs 12h on the HBase cluster. That suggests either your cluster is misconfigured or unoptimized, or there is significant latency between your client application and the cluster.

On Friday, October 6, 2017 at 11:51:03 AM UTC-4, Michele Polonioli wrote: