Hi there,
I am working on ingesting a lot of data and building a specific graph based on this data.
All of the incoming data results in a vertex being created to represent it. However, most of this data carries additional data that gets extracted into its own vertex. This extracted data has an identity that can be matched against later incoming data (think of a Person with a social security number).
As I ingest data with People on it, I first look up to see if the Person exists in the graph. If they do, there are 2 operations that may be performed on the Person:
1. If all the Person data on the incoming data matches what exists in the graph, I simply tie a new data Vertex to the existing Person Vertex. (Let's think of this as a 'read', even though it's really not.)
2. Else, if some of the Person data conflicts with what exists in the graph, I do a much more involved operation on the existing Person that involves deleting the Vertex and creating a new one, or even multiple, to represent the Person. (Let's think of this as a 'write'.)
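To make the two cases concrete, here is a minimal Python sketch of the decision. The names (`Person`, `classify`) are hypothetical, the graph is replaced by a plain object, and "matches" is assumed to mean exact equality of the Person attributes, which may be stricter or looser than the real rule.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Person:
    ssn: str                                    # identity used for the lookup
    attrs: dict                                 # Person data stored in the graph
    linked: list = field(default_factory=list)  # data vertices tied to this Person


def classify(existing: Optional[Person], incoming_attrs: dict) -> str:
    """Decide which operation an incoming record needs.

    'create'  -> no Person with this identity exists yet
    'link'    -> operation 1: data matches, just add an edge
    'rebuild' -> operation 2: data conflicts, delete and recreate
    """
    if existing is None:
        return "create"
    # Assumption: a match means every attribute is identical.
    return "link" if existing.attrs == incoming_attrs else "rebuild"
```

Only the 'rebuild' result needs exclusive access to the Person; 'link' results can safely overlap with each other.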
Currently I ingest this data concurrently across a thread pool and lock the Person vertex for all incoming data, whether it executes operation 1 or 2 above. This isn't ideal, as it serializes all data touching the same Person. A lot of the incoming data performs operation 1, and two pieces of data performing operation 1 don't really conflict with one another (I should be able to tie two edges to the same vertex concurrently and not have to worry about it).
So what I really want is a locking mechanism where any piece of data executing operation 2 locks the Person so that nothing can access it (neither data executing operation 1 nor data executing operation 2) but any piece of data executing operation 1 does not prevent any other piece of data from executing.
If I simply don't lock on operation 1, data executing operation 2 is still blocked while other data is executing operation 2 on the same Person, but data executing operation 1 can then run while operation 2 is in progress, which is exactly what I need to prevent.
Does anyone know how to do what I am describing?
This describes exactly what I want. As I said, act as though operation 1 is a 'Read' and operation 2 is a 'write'.
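The semantics described above (operation 1 shared, operation 2 exclusive) are those of a read-write lock kept per Person. Below is a minimal Python sketch under some assumptions: `ReadWriteLock` and `lock_for` are hypothetical names, the lock is writer-preferring and not reentrant, and the per-Person registry never evicts entries. In Java, `java.util.concurrent.locks.ReentrantReadWriteLock` provides the same shared/exclusive behaviour out of the box.

```python
import threading
from collections import defaultdict
from contextlib import contextmanager


class ReadWriteLock:
    """Many concurrent shared holders or one exclusive holder."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0          # operation 1 holders currently inside
        self._writer = False       # True while an operation 2 holds the lock
        self._writers_waiting = 0  # operation 2 requests waiting to enter

    @contextmanager
    def shared(self):
        with self._cond:
            # Hold back new readers while a writer is active or queued.
            while self._writer or self._writers_waiting:
                self._cond.wait()
            self._readers += 1
        try:
            yield
        finally:
            with self._cond:
                self._readers -= 1
                if self._readers == 0:
                    self._cond.notify_all()

    @contextmanager
    def exclusive(self):
        with self._cond:
            self._writers_waiting += 1
            try:
                # Wait until no reader and no other writer is inside.
                while self._writer or self._readers:
                    self._cond.wait()
            finally:
                self._writers_waiting -= 1
            self._writer = True
        try:
            yield
        finally:
            with self._cond:
                self._writer = False
                self._cond.notify_all()


_locks = defaultdict(ReadWriteLock)
_locks_guard = threading.Lock()


def lock_for(person_id):
    """Return the one lock shared by all work on this Person identity."""
    with _locks_guard:
        return _locks[person_id]
```

Operation 1 would then run inside `with lock_for(ssn).shared():` and operation 2 inside `with lock_for(ssn).exclusive():`. One caveat: whether a record needs operation 1 or 2 is only known after comparing it with the existing Person, and that comparison must itself happen under the lock. A possible approach is to compare under the shared lock, and on a conflict release it, take the exclusive lock, and re-check before rebuilding, since the Person may have changed in between.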