Re: Cassandra/HBase storage backend issues
HadoopMarc <m.c.d...@...>
Hi Mike,
Seeing no expert answers until now, I can only offer a general reply. I see the following lines of thinking for explaining your situation:
- HBase fails to provide row-based consistency: extremely unlikely, given the many applications that rely on this.
- JanusGraph fails to provide consistency between instances (e.g. by using out-of-date caches). Do you use multiple JanusGraph instances? Or multiple threads that access the same JanusGraph instance?
- Your application fails to handle exceptions in the right way (e.g. by ignoring them).
- Your application has logic faults: not so likely, because you have been debugging for a while.
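The second line of thinking (stale caches between instances) can be sketched with a toy model. This is plain Python, not the JanusGraph API; `BackingStore`, `Instance`, and `new_transaction` are hypothetical names used only to show why a reader with a warm cache can miss a writer's committed result until it starts a fresh transaction:

```python
# Toy model of two graph-database instances sharing one backing store,
# each with its own read cache. All names here are hypothetical.

class BackingStore:
    """Shared storage backend (stands in for HBase/Cassandra)."""
    def __init__(self):
        self.data = {}

class Instance:
    """One database instance with a local read cache."""
    def __init__(self, store):
        self.store = store
        self.cache = {}

    def read(self, key):
        # A cached value may be stale relative to the backing store.
        if key in self.cache:
            return self.cache[key]
        value = self.store.data.get(key)
        self.cache[key] = value
        return value

    def write(self, key, value):
        self.store.data[key] = value
        self.cache[key] = value

    def new_transaction(self):
        # Dropping cached reads is analogous to committing and
        # reopening a transaction.
        self.cache.clear()

store = BackingStore()
a, b = Instance(store), Instance(store)

b.read("weight")           # b caches the missing value (None)
a.write("weight", 42)      # a commits a preprocessing result
print(b.read("weight"))    # → None (b still sees its stale cached read)
b.new_transaction()        # start fresh, discarding cached reads
print(b.read("weight"))    # → 42
```

As far as I know, JanusGraph keeps a per-transaction cache plus an optional instance-wide database cache (the `cache.db-cache` setting); in multi-instance setups the latter can serve stale data, so committing and opening a new transaction before each dependent read, or disabling the database cache, is the usual remedy.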
For reference, JanusGraph's own test for partitioned graphs on HBase:
https://github.com/JanusGraph/janusgraph/blob/236dd930a7af35061e393ea8bb1ee6eb65f924b2/janusgraph-hbase-parent/janusgraph-hbase-core/src/test/java/org/janusgraph/graphdb/hbase/HBasePartitionGraphTest.java
Other ideas still welcome!
Marc
Op zondag 18 juni 2017 08:38:02 UTC+2 schreef mi...@...:
Hi! I'm running into an issue and wondering if anyone has tips. I'm using HBase (I also tried Cassandra, with the same issue), and preprocessing our data yields inconsistent results. We run through a query and, for each vertex with a given property, we run a traversal on it and calculate properties or insert edges that weren't inserted on upload, to boost the performance of our eventual traversal.

Our tests run perfectly with a TinkerGraph, but with the HBase or Cassandra backend, sometimes the tests fail, sometimes the calculated properties are completely wrong, and sometimes edges aren't created when needed. A preprocessing task may depend on the output of a previous preprocessing task that took place seconds earlier. I think this is caused by eventual consistency breaking the traversal, but I'm not sure how to get 100% accuracy (where the current preprocessing task can be 100% confident it reads the correct value from a previous preprocessing task).

I create a transaction for each preprocessing operation, then commit it once it succeeds, but this doesn't seem to fix the issues. Any ideas?

Thanks,
Mike