Hello all,
I'm curious about best practices for scalable
degree-centrality search filters on large JanusGraph instances (millions to billions of vertices), i.e. something like:
g.V()
.has("someProperty",eq("someValue"))
.where(outE().count().is(gt(10)));
Suppose the has-step narrows the results down to a large number
of vertices (hundreds of thousands); performing that kind of count across that
many vertices leads to timeouts and inefficiencies (at least in my experience). My workaround has been to pre-calculate
the degree in a separate job and write it to a vertex property that can then be included
in a mixed index. So we can do:
g.V()
.has("someProperty",eq("someValue"))
.has("outDegree",gt(10))
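For reference, the setup behind that query looks roughly like this (a sketch, not our exact code; the property/index names are illustrative, and "search" stands for whatever mixed-index backend is configured):

```groovy
// One-time schema setup via the JanusGraph management API.
mgmt = graph.openManagement()
outDegree = mgmt.makePropertyKey('outDegree').dataType(Long.class).make()
mgmt.buildIndex('vertexByOutDegree', Vertex.class)
    .addKey(outDegree)
    .buildMixedIndex('search')   // name of the configured indexing backend
mgmt.commit()

// Periodic recompute job (simplified; assumes a TinkerPop version where
// property() accepts a traversal-valued argument -- our real job runs as
// a separate batch process rather than a single OLTP traversal):
g.V().property('outDegree', __.outE().count()).iterate()
g.tx().commit()
```

At our scale the recompute step has to run as a batch/OLAP job rather than the single traversal shown here, which is exactly the extra pipeline maintenance I'd like to avoid.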
This works, but it is yet another calculation we must maintain
in our pipeline, and while it suffices, it feels more like a workaround than a
great solution. I was hoping there is a more optimal approach or strategy. Please let me know.
Thank you,
Zach