Re: Cassandra crashing after dropping large graph. Error: Scanned over 100001 tombstones...


willam boss <jcbm...@...>
 

My cluster is based on HBase. When I want to drop all data, I use the HBase shell and run: disable 'tablename', then drop 'tablename'. This is the fastest way to delete the data.

On Wednesday, December 26, 2018 at 3:06:57 PM UTC+8, Michael Kaiser-Cross wrote:

After loading a large graph (20k nodes) into JanusGraph a couple of times and then deleting it via g.V().drop(), my Cassandra server crashes with the message below.

data_storage_1     | ERROR [ReadStage-2] 2018-12-26 06:36:20,083 StorageProxy.java:1909 - Scanned over 100001 tombstones during query 'SELECT * FROM ns.edgestore WHERE column1 >= 02 AND column1 <= 03 LIMIT 100' (last scanned row partion key was ((d00000000001e300), 02)); query aborted

I found an article on deletes and tombstones in Cassandra. If I understand correctly, I am creating too many tombstones too quickly, and the tombstones won't be removed until after the default gc_grace_seconds period, which is 10 days (864000 seconds). I also found a Stack Overflow question which suggests lowering the Cassandra gc_grace_seconds table option, so that tombstones become eligible for removal sooner.
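
For reference, on a table that already exists, the Stack Overflow suggestion would look roughly like the CQL sketch below. The keyspace and table names (ns.edgestore) are taken from the error message above. The value 3600 is only an illustrative assumption, not a recommendation. Tombstones are actually purged during compaction once gc_grace_seconds has elapsed, and lowering the value on a multi-node cluster risks deleted data reappearing if repairs do not complete within that window.

```cql
-- Sketch only: lower the tombstone grace period on the table from the log.
-- 3600 (one hour) is an assumed example value; the default is 864000 (10 days).
ALTER TABLE ns.edgestore WITH gc_grace_seconds = 3600;

-- Check the current setting (system_schema is available in Cassandra 3.0 and later).
SELECT gc_grace_seconds
FROM system_schema.tables
WHERE keyspace_name = 'ns' AND table_name = 'edgestore';
```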

I wanted to try this to see if it fixes the problem for me, but since JanusGraph creates the tables, how would I customize this value? Is there some way to set Cassandra table options when starting gremlin-server.sh?


Mike
