parallelNoncontendedLockStressTest failure?
Demai <nid...@...>
I have run the test suite several times, and it always fails on this test (sometimes under hbase98, sometimes under hbase10):

    HBaseLockStoreTest>LockKeyColumnValueStoreTest.parallelNoncontendedLockStressTest:364 expected:<100> but was:<81>

The surefire report is attached. I suspect that lock contention or a memory issue is causing the failure. Also, I am running on a less powerful MacBook; maybe I need to increase the default timeout so that all 100 threads finish? Thanks for any pointers.

Demai
Demai <nid...@...>
I got a chance to dig into this further. It is a permanent locking failure that causes the test to fail. I am not familiar enough with the JanusGraph logic to judge whether this locking exception is expected behavior in this test scenario, though section 27.1, Data Consistency, of the documentation does warn about its robustness. I have been able to reproduce the failure consistently over the past few days.

LockKeyColumnValueStoreTest#LockStressor.run() tolerates TemporaryLockingException, but not PermanentLockingException:

    ...
    } catch (TemporaryLockingException e) {
        temporaryFailures++;
    } catch (Throwable t) {
        log.error("Unexpected locking-related exception on iteration " + (opIndex + 1) + "/" + opCount, t);
    ...

The error shows either 'permanent locking failure' or 'Local lock contention'; both look like a PermanentLockingException thrown from AbstractLocker#writeLock():

    ...
    } catch (TemporaryBackendException tse) {
        throw new TemporaryLockingException(tse);
    ....
    } catch (Throwable t) {
        throw new PermanentLockingException(t);
    } finally {
        ...
    }
    } else {
        // Fail immediately with no retries on local contention
        throw new PermanentLockingException("Local lock contention");

On Tuesday, September 12, 2017 at 3:03:32 PM UTC-7, Demai wrote:
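
For anyone following along, here is a minimal standalone sketch of why a single permanent failure would lower the final count (for example 81 instead of 100). It is not the JanusGraph code: the two exception classes are local stand-ins, and `LockOp`, `Result`, and `runWorker` are hypothetical names invented for illustration. It also assumes that the worker gives up after logging the unexpected exception, which the quoted snippet elides.

```java
public class LockStressSketch {

    // Local stand-ins for the JanusGraph exceptions, declared here only so
    // the sketch compiles on its own.
    static class TemporaryLockingException extends Exception {
        TemporaryLockingException(String msg) { super(msg); }
    }

    static class PermanentLockingException extends Exception {
        PermanentLockingException(String msg) { super(msg); }
    }

    /** Hypothetical lock operation; the real test calls into the locker. */
    interface LockOp {
        void run(int opIndex) throws TemporaryLockingException, PermanentLockingException;
    }

    /** Outcome counters for one worker thread. */
    static class Result {
        int succeeded;          // operations that completed without an exception
        int temporaryFailures;  // tolerated failures
        boolean finished;       // true only if the loop ran to the end
    }

    static Result runWorker(int opCount, LockOp op) {
        Result r = new Result();
        for (int opIndex = 0; opIndex < opCount; opIndex++) {
            try {
                op.run(opIndex);
                r.succeeded++;
            } catch (TemporaryLockingException e) {
                // Tolerated: count it and move on to the next iteration.
                r.temporaryFailures++;
            } catch (Throwable t) {
                // Assumption: anything else, including a permanent locking
                // failure, ends this worker early, so finished stays false.
                return r;
            }
        }
        r.finished = true;
        return r;
    }
}
```

If the test counts only the workers that ran to completion, each worker that hits a permanent failure reduces the total by one, which would match the "expected:<100> but was:<81>" message.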