Re: Cassandra crashing after dropping large graph. Error: Scanned over 100001 tombstones...


Michael Kaiser-Cross <mkaise...@...>

Ok, thanks, that is helpful. That solves my problem, I guess, but I still think it would be a good idea to implement the ability to change the gc_grace_seconds of a graph. People might run into this issue as they scale apps in production, especially apps that do a lot of deleting.

On Wednesday, December 26, 2018 at 3:59:20 AM UTC-5, Reinhard wrote:
It is easy to prevent this by increasing a threshold in cassandra.yaml:

tombstone_failure_threshold

Its default is 100000.
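
For reference, the relevant lines in cassandra.yaml look roughly like this (the values shown are the stock defaults; raising the failure threshold only moves the point at which a read gets aborted, it doesn't make the tombstones go away):

    # cassandra.yaml -- tombstone thresholds (stock defaults shown)
    tombstone_warn_threshold: 1000
    tombstone_failure_threshold: 100000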


On Wednesday, December 26, 2018 at 08:06:57 UTC+1, Michael Kaiser-Cross wrote:
After loading a large graph (20k nodes) into JanusGraph a couple of times and then deleting it via g.V().drop(), my Cassandra server crashes with the message below.

data_storage_1     | ERROR [ReadStage-2] 2018-12-26 06:36:20,083 StorageProxy.java:1909 - Scanned over 100001 tombstones during query 'SELECT * FROM ns.edgestore WHERE column1 >= 02 AND column1 <= 03 LIMIT 100' (last scanned row partion key was ((d00000000001e300), 02)); query aborted

I found an article on deleting and tombstones in Cassandra here. If I understand correctly, I am creating too many tombstones too quickly, and the tombstones won't be removed until after the default grace period, which is 10 days. I also found this Stack Overflow question, which suggests lowering the Cassandra gc_grace_seconds table option so that tombstones become eligible for removal sooner.
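
For what it's worth, one way I could try this on an existing graph is to alter the JanusGraph-created tables directly from cqlsh. A minimal sketch, assuming the keyspace and table names from the error above (ns.edgestore); other JanusGraph tables (e.g. graphindex) would presumably need the same change, and a low value is only safe if repairs run at least that often:

    -- assumes the keyspace "ns" seen in the error message; adjust to your own setup
    ALTER TABLE ns.edgestore WITH gc_grace_seconds = 3600;  -- 1 hour instead of the 10-day default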

I wanted to try this to see if it fixes the problem for me, but since JanusGraph creates the tables, how would I customize this value? Is there some way to set Cassandra table options when starting gremlin-server.sh?


Mike
