Re: Degree-Centrality Filtering & Search – Scalable Strategies for OLTP
"zb...@gmail.com" <zblu...@...>
Hi Marc, Boxuan,

Thank you for the discussion. I have been experimenting with different queries, including the id() step that Marc suggested. In line with Boxuan’s feedback, the where() step performs about the same (maybe slightly slower) when the .id() step is added.

My bigger concern for my use case is that this type of operation appears to scale roughly linearly with sample size, i.e.:

g.V().limit(10).where(inE().count().is(gt(6))).profile() => ~30 ms
g.V().limit(100).where(inE().count().is(gt(6))).profile() => ~147 ms
g.V().limit(1000).where(inE().count().is(gt(6))).profile() => ~1284 ms
g.V().limit(10000).where(inE().count().is(gt(6))).profile() => ~13779 ms
g.V().limit(100000).where(inE().count().is(gt(6))).profile() => ? > 120000 ms (timeout)
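As a rough check on the linear-scaling impression, the timings above can be reduced to a cost per sampled vertex. This is a minimal sketch in Python that uses only the numbers listed above; the names `timings`, `per_vertex_ms`, and `projected_ms` are illustrative, and the last step is a naive extrapolation under the assumption that the per-vertex cost stays constant, not a measurement.

```python
# Timings copied from the profile() runs above: (limit, total ms).
timings = [(10, 30), (100, 147), (1000, 1284), (10000, 13779)]


def per_vertex_ms(samples):
    """Return the cost per sampled vertex (ms) for each (limit, total_ms) pair."""
    return [total / n for n, total in samples]


costs = per_vertex_ms(timings)
for (n, total), cost in zip(timings, costs):
    print(f"limit({n}): {total} ms total, {cost:.3f} ms per vertex")

# Naive extrapolation (assumption: the limit(10000) per-vertex rate still
# holds at limit(100000)). This is a projection, not a measured value.
projected_ms = costs[-1] * 100_000
print(f"projected limit(100000): ~{projected_ms:.0f} ms")
```

Apart from the smallest sample, the per-vertex cost stays in a narrow band of roughly 1.3 to 1.5 ms, and the projection for limit(100000) lands above the 120000 ms timeout, which is consistent with the run that did not finish.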
This behavior makes sense when I think about it, and also when I inspect the profile (an example profile of the limit(10) traversal is included at the end of this message). I know the above traversal seems a bit funky, but I am trying to consistently analyze the effect of sample size on the edge-count portion of the query.

Looking at the profile, it seems that JG needs to perform a SliceQuery on each vertex sequentially, which isn’t well optimized for my use case. I know that if centrality properties were included in a mixed index, the query could be configured for scalable performance. However, going back to the original post, I am not sure that is the best/only way. Are there other configurations that could be tuned to make this operation more scalable without resorting to an additional indexed property?

In case it is relevant, I am using JanusGraph v0.5.2 with the Cassandra-CQL backend v3.11.

Thank you,
Zach

Example Profile

gremlin> g.V().limit(10).where(inE().count().is(gt(6))).profile()
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
JanusGraphStep(vertex,[])                                             10          10           8.684    28.71
    \_condition=()
    \_orders=[]
    \_limit=10
    \_isFitted=false
    \_isOrdered=true
    \_query=[]
  optimization                                                                                 0.005
  optimization                                                                                 0.001
  scan                                                                                         0.000
    \_query=[]
    \_fullscan=true
    \_condition=VERTEX
TraversalFilterStep([JanusGraphVertexStep(IN,ed...                                            21.564    71.29
  JanusGraphVertexStep(IN,edge)                                       13          13          21.350
    \_condition=(EDGE AND visibility:normal)
    \_orders=[]
    \_limit=7
    \_isFitted=false
    \_isOrdered=true
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@9c76d
    \_vertices=1
    optimization                                                                               0.003
    backend-query                                                      3                       4.434
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@9c76d
    \_limit=14
    optimization                                                                               0.001
    backend-query                                                      1                       1.291
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@9c76d
    \_limit=14
    optimization                                                                               0.001
    backend-query                                                      2                       1.311
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@9c76d
    \_limit=14
    optimization                                                                               0.001
    backend-query                                                      1                       2.483
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@9c76d
    \_limit=14
    optimization                                                                               0.001
    backend-query                                                      2                       1.310
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@9c76d
    \_limit=14
    optimization                                                                               0.001
    backend-query                                                      2                       1.313
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@9c76d
    \_limit=14
    optimization                                                                               0.001
    backend-query                                                      2                       1.192
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@9c76d
    \_limit=14
    optimization                                                                               0.001
    backend-query                                                      4                       1.287
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@9c76d
    \_limit=14
    optimization                                                                               0.001
    backend-query                                                      3                       1.231
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@9c76d
    \_limit=14
    optimization                                                                               0.001
    backend-query                                                      2                       3.546
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@9c76d
    \_limit=14
  RangeGlobalStep(0,7)                                                13          13           0.037
  CountGlobalStep                                                     10          10           0.041
  IsStep(gt(6))                                                                                0.022
                                            >TOTAL                     -           -          30.249        -

On Wednesday, December 30, 2020 at 4:59:20 AM UTC-5 li...@... wrote: