Janus Graph Delete All Phantom/Ghost Vertices


mohanty...@...
 

Does JanusGraph provide any way to clean up ghost vertices? I am aware of the option to detect them at read time using checkInternalVertexExistence().
But I am looking for an approach that would traverse the graph, identify all ghost vertices, and clean them up.


Antriksh Shah <sha...@...>
 

The approach I follow to clean up ghost vertices is to take an OLAP dump of the graph, figure out the vertexId of each unwanted vertex, and delete those vertices by vertexId.

Looking at your question ("traverse the graph, identify all ghost vertices and do the cleanup"), could you please elaborate on what exactly you mean by a ghost vertex?
My understanding of a ghost vertex is along the lines mentioned in the threads below:


mohanty...@...
 

Hi Antriksh,

I am referring to ghost vertices as defined by JanusGraph:
https://docs.janusgraph.org/latest/common-questions.html#_ghost_vertices

In one of my JanusGraph instances, I end up with ghost vertices being created as described in the above link.
So I was looking for a utility that can be run at regular intervals to clean up these vertices. I was going over JanusGraph's management API, which has a bunch of jobs for index management (repair/enable/disable/removal, etc.).
There is also a class, org.janusgraph.graphdb.olap.job.GhostVertexRemover, but I am not sure why it has not been exposed as an API, or maybe I am missing something here.

Let me know if you need any further information here.


On Wednesday, 17 July 2019 11:02:10 UTC+5:30, Antriksh Shah wrote:


Antriksh Shah <sha...@...>
 

Hey,

I am not aware of any utility that can remove ghost vertices, nor have I seen org.janusgraph.graphdb.olap.job.GhostVertexRemover being used.
If you do come across such a utility, please share how it works here with the community. You can also see whether someone else in the community has a better solution to this issue.

What you can do in the meantime (and what I am currently doing) is: take an OLAP dump of the graph, figure out the vertexId of each unwanted vertex, and delete those vertices by vertexId.

The reason I asked for your definition of ghost vertices is that, for me, they were not producing incorrect results in the traversal. I was able to skip past them by using a hasLabel condition to check whether a vertex is actually a ghost or not.
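
To make the "figure out the vertexId" step a bit more concrete, here is a minimal sketch in plain Java. It is only a toy model and does not use any JanusGraph API: the dump is represented as a hypothetical map from vertexId to label, and the class and method names (GhostCandidateFilter, findGhostCandidates) are made up for illustration. It assumes that ghost vertices show up in the dump with no label or with the default label "vertex", and that every legitimate vertex was created with an explicit label. If that does not hold for your graph, the filter needs a different condition.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of "dump the graph, pick out suspicious IDs"; not JanusGraph API. */
class GhostCandidateFilter {

    // Assumption: a vertex without an explicit label appears with the default label "vertex".
    static final String DEFAULT_LABEL = "vertex";

    /** Returns, in ascending order, the IDs whose label is missing or the default label. */
    static List<Long> findGhostCandidates(Map<Long, String> labelById) {
        List<Long> candidates = new ArrayList<>();
        for (Map.Entry<Long, String> entry : labelById.entrySet()) {
            String label = entry.getValue();
            if (label == null || label.equals(DEFAULT_LABEL)) {
                candidates.add(entry.getKey());
            }
        }
        Collections.sort(candidates);
        return candidates;
    }

    public static void main(String[] args) {
        // Hypothetical dump contents: vertexId -> label.
        Map<Long, String> dump = new TreeMap<>();
        dump.put(4104L, "person");
        dump.put(8200L, "vertex");   // default label: ghost candidate
        dump.put(12296L, "company");
        dump.put(16392L, null);      // no label at all: ghost candidate
        System.out.println(findGhostCandidates(dump));
    }
}
```

The resulting IDs are only candidates. I would review them before deleting anything, and then remove each one by ID, for example with a Gremlin traversal along the lines of g.V(id).drop().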