Re: Backup & Restore of JanusGraph Data with Mixed Index Backend (Elasticsearch)


Yeah, good point, it's a bit hairy. Potentially inconsistent index backups are much less attractive. Though I guess I could run a reindex job on just the delta between the last Scylla write time and the last ES write time.
As a simpler alternative, how about pausing write transactions for, say, ~1s and initiating simultaneous backups of my Scylla and ES clusters during that time?
From what I can tell, both backup mechanisms guarantee snapshot isolation. A short write pause should ensure that all in-flight writes have propagated to both backends before the snapshots start.
What caveats do you see with this approach?
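
To make the idea concrete, here is a minimal sketch of what I have in mind. Everything in it is an assumption on my side: `WriteGate` is a hypothetical application-level gate that my writers would check before committing, and `snapshot_scylla` / `snapshot_es` are placeholder callables that would wrap whatever actually triggers the backups (for example `nodetool snapshot` on the Scylla nodes and the Elasticsearch create-snapshot API). It only shows the ordering: pause, settle, start both snapshots together, resume.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class WriteGate:
    """Hypothetical gate that application writers check before committing."""

    def __init__(self):
        self._open = threading.Event()
        self._open.set()  # writes are allowed initially

    def pause(self):
        self._open.clear()

    def resume(self):
        self._open.set()

    def wait_until_open(self, timeout=None):
        # Returns True if writes are allowed, False if still paused after timeout.
        return self._open.wait(timeout)


def coordinated_backup(gate, snapshot_scylla, snapshot_es, settle_seconds=1.0):
    """Pause writes, wait for in-flight writes to settle, start both snapshots."""
    gate.pause()
    try:
        # Assumption: settle_seconds is long enough for pending writes to reach
        # both Scylla and ES. This is a guess, not a guarantee.
        time.sleep(settle_seconds)
        with ThreadPoolExecutor(max_workers=2) as pool:
            scylla_future = pool.submit(snapshot_scylla)
            es_future = pool.submit(snapshot_es)
            return scylla_future.result(), es_future.result()
    finally:
        # Always resume writes, even if one of the snapshots failed.
        gate.resume()
```

The callables should return as soon as the snapshot point is established rather than when the upload finishes, otherwise the write pause lasts for the whole backup.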

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐

On Monday, May 3rd, 2021 at 9:49 AM, rngcntr <florian.grieskamp@...> wrote:

Although the solution presented by Marc is also the closest to a consistent backup that I can think of, there are obviously caveats to it. Updates to values that were written after the Scylla snapshot was taken could already be present in ES, corrupting the state of the index. Therefore, checking the mere existence of a vertex in Scylla may not be sophisticated enough to guarantee a consistent state. Verifying the property values explicitly can be helpful here, but that still leaves us with the question of how to handle mismatches of this kind.
Just keep that in mind when using such a backup strategy in your environment.
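
As a rough illustration of what explicit value verification could look like, here is a sketch that compares indexed property values per vertex. It does not use any JanusGraph, Scylla, or Elasticsearch API: `find_mismatches` is a hypothetical helper, and it assumes both sides have already been read into plain dictionaries keyed by vertex ID. It only detects mismatches; how to resolve them remains the open question.

```python
def find_mismatches(scylla_vertices, es_documents, indexed_keys):
    """Return {vertex_id: {key: (scylla_value, es_value)}} for every disagreement.

    A vertex that is missing on one side is treated as having None for all keys.
    """
    mismatches = {}
    for vertex_id in scylla_vertices.keys() | es_documents.keys():
        stored = scylla_vertices.get(vertex_id) or {}
        indexed = es_documents.get(vertex_id) or {}
        diff = {}
        for key in indexed_keys:
            stored_value = stored.get(key)
            indexed_value = indexed.get(key)
            if stored_value != indexed_value:
                diff[key] = (stored_value, indexed_value)
        if diff:
            mismatches[vertex_id] = diff
    return mismatches


# Made-up example data: vertex 1 was updated in ES after the Scylla snapshot,
# vertex 3 exists only in ES.
scylla = {1: {"name": "alice", "age": 30}, 2: {"name": "bob", "age": 41}}
es = {
    1: {"name": "alice", "age": 31},
    2: {"name": "bob", "age": 41},
    3: {"name": "carol", "age": 25},
}
report = find_mismatches(scylla, es, ["name", "age"])
```

With this data, `report` flags vertex 1 (differing `age`) and vertex 3 (present only in ES), while an existence-only check would have missed vertex 1.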

Best regards,
