Re: Data Loading Script Optimization
Good to see some progress!
- Is 40% relative to a single core or to all cores (e.g. CPU usage for a java process in top can be 800% if 8 cores are present)?
- Ncore * 100% is not necessarily the maximum CPU load of the groovy process + storage backend: if the loading becomes IO limited, CPU usage will stay below that. Can you find out what the IO usage is?
- Do you use composite indexes on the properties "name" and "e-mail" for the has() filters?
- Regarding the idea from Nicolas, I would rather use a ConcurrentMap that maps ORG ids to vertex ids, but only fill it as you go, for the ORGs that you add or look up. The JanusGraph transaction and database caches should be large enough to hold the vertices that are referenced two or more times, so that the g.V(id) lookups are served from cache.
- On a single system Apache Spark will not help you.
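
To make the ConcurrentMap suggestion a bit more concrete, here is a minimal sketch in plain Java (it can be called from a Groovy loading script as well). The class name OrgVertexCache and the lookupOrCreate function are hypothetical and not part of JanusGraph; the function is assumed to wrap whatever traversal you already use to find or add an ORG vertex and return its vertex id. The map is filled lazily, so only ORGs that are actually touched end up in memory.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/** Lazily filled cache from ORG id to vertex id (hypothetical helper, not a JanusGraph class). */
class OrgVertexCache {
    // ORG id -> vertex id; safe to share between loader threads.
    private final ConcurrentMap<String, Object> vertexIdByOrgId = new ConcurrentHashMap<>();

    // Assumption: looks up the ORG vertex in the graph (or adds it) and returns
    // its vertex id. It must not return null, otherwise nothing is cached.
    private final Function<String, Object> lookupOrCreate;

    OrgVertexCache(Function<String, Object> lookupOrCreate) {
        this.lookupOrCreate = lookupOrCreate;
    }

    /** Returns the cached vertex id, calling lookupOrCreate only on the first request per ORG id. */
    Object vertexIdFor(String orgId) {
        return vertexIdByOrgId.computeIfAbsent(orgId, lookupOrCreate);
    }

    /** Number of ORG ids cached so far. */
    int size() {
        return vertexIdByOrgId.size();
    }
}
```

In the loader you would then call cache.vertexIdFor(orgId) and pass the result to g.V(id) when adding edges. Note that computeIfAbsent on a ConcurrentHashMap runs the function at most once per key, but it can block other threads updating the same part of the map while the function runs, so the lookup should stay short.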
Best wishes, Marc