Re: Backup & Restore of JanusGraph Data with Mixed Index Backend (Elasticsearch)


Awesome, yes, that's very similar to what I was planning!
It's not perfect and definitely needs to be tested thoroughly, but it should be much faster and reasonably scriptable.
I'll let you all know how it goes when I get to setting this up. Hopefully it won't be long, a decade or so at most.


‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐

On Monday, May 3rd, 2021 at 9:06 AM, <hadoopmarc@...> wrote:
In theory (not used in practice) the following should be possible:

  1. make a snapshot of the ScyllaDB keyspace
  2. after the ScyllaDB snapshot is written, make a snapshot of the corresponding ES mixed indices
  3. restore all snapshots on separate temporary clusters (doing this manually on a production cluster is a no-go)
  4. find the latest writetime in the ScyllaDB snapshot
  5. try all ES index items later than this timestamp and remove them if the corresponding vertices cannot be retrieved from ScyllaDB
  6. make a new snapshot of the ES mixed indices
This is rather cumbersome, of course, but it would allow for a fast restore of consistent indices (this does not deal with the other issue, the partially succeeded transactions).
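
For illustration only, steps 4 and 5 could be sketched roughly as below. This is untested against real clusters and rests on several assumptions: the helper names (latest_writetime, find_orphaned_index_items, vertex_exists) are hypothetical, the write times are assumed to have been collected already from the restored ScyllaDB snapshot (for example via CQL WRITETIME(), which returns microseconds since the epoch), and each ES index item is assumed to carry a timestamp in the same unit. The ScyllaDB lookup is passed in as a callable so that no particular driver API is implied.

```python
from typing import Callable, Iterable, List, Tuple


def latest_writetime(writetimes: Iterable[int]) -> int:
    """Step 4: newest write timestamp found in the restored ScyllaDB snapshot."""
    return max(writetimes)


def find_orphaned_index_items(
    index_items: Iterable[Tuple[str, int]],
    cutoff: int,
    vertex_exists: Callable[[str], bool],
) -> List[str]:
    """Step 5: ids of index items newer than the cutoff that have no vertex.

    index_items   -- (document id, timestamp) pairs read from the restored ES index
    cutoff        -- result of latest_writetime(), same unit as the timestamps
    vertex_exists -- hypothetical lookup against the restored ScyllaDB/JanusGraph
    """
    orphans = []
    for doc_id, timestamp in index_items:
        if timestamp <= cutoff:
            # Written before the ScyllaDB snapshot completed; assumed consistent.
            continue
        if not vertex_exists(doc_id):
            # Indexed after the snapshot and missing from storage: remove later.
            orphans.append(doc_id)
    return orphans
```

The returned ids would then be deleted from the temporary ES cluster before taking the new snapshot in step 6. Only items newer than the cutoff are checked, which keeps the number of lookups small.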

Best wishes,   Marc
