I have some concerns over concurrency and consistency issues, but this might still be a nice feature to have. I think you could open a new discussion on https://github.com/JanusGraph/janusgraph/discussions. That would be a better place for brainstorming. It would be awesome if you could share more context on why you think this is a very common business requirement.
Thank you Boxuan,
I was using the term “job” pretty loosely. Your inference about doing these things within the ingest/deletion process makes sense.
I know there is a lot on the community’s plate now, but if my solution above is truly optimal for the current state, I wonder if a JG feature addition may help tackle this problem more consistently. Something like an additional, third index type (in addition to “graph” and “vertex-centric” indices), i.e. an “edge-connection” or “degree-centrality” index. The feature would require a mixed indexing backend and, at minimum, a mechanism to choose the vertex and edge label combinations for which to count IN, OUT, and/or BOTH degree centrality.
Not sure what the level of effort or implementation details would be, but this is a very common business requirement for graph-based search. If JanusGraph has native/tested support for it, it would make JanusGraph even easier to champion.
On Tuesday, December 29, 2020 at 3:19:46 AM UTC-5 libo...@connect.hku.hk
Personally, I think your workaround is the best option available. JanusGraph does not store the number of edges as metadata in the vertex (there are both pros and cons to doing / not doing this).
Btw do you have to have another job doing centrality calculation separately? If your application is built on top of JanusGraph, then probably you can maintain the “outDegree” property when inserting/deleting edges.
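To illustrate the suggestion of maintaining “outDegree” at write time, here is a minimal sketch in plain Python. It uses an in-memory structure, not the JanusGraph API: the class `DegreeTrackingGraph` and its methods are hypothetical names chosen for illustration, and the toy graph allows at most one edge per (source, label, target) triple. In a real application the same increment and decrement would presumably happen in the same transaction that adds or removes the edge.

```python
from collections import defaultdict


class DegreeTrackingGraph:
    """Toy in-memory graph that maintains an outDegree counter per vertex.

    Hypothetical illustration only; this is not the JanusGraph API.
    """

    def __init__(self):
        self.edges = set()                  # (source, label, target) triples
        self.out_degree = defaultdict(int)  # vertex -> maintained counter

    def add_edge(self, source, label, target):
        edge = (source, label, target)
        if edge not in self.edges:          # count only edges that are actually new
            self.edges.add(edge)
            self.out_degree[source] += 1

    def remove_edge(self, source, label, target):
        edge = (source, label, target)
        if edge in self.edges:              # decrement only for edges that existed
            self.edges.remove(edge)
            self.out_degree[source] -= 1


graph = DegreeTrackingGraph()
graph.add_edge("a", "knows", "b")
graph.add_edge("a", "knows", "c")
graph.remove_edge("a", "knows", "b")
print(graph.out_degree["a"])  # prints 1
```

The counter is only as reliable as the write path: if two writers touch edges of the same vertex concurrently, the update to the counter would need some form of locking or reconciliation to stay consistent.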
Curious about best approaches/practices for scalable degree-centrality search filters on large (millions to billions of nodes) JanusGraphs, i.e. something like:
Suppose the has-step narrows down to a large number of vertices (hundreds of thousands). Performing that form of count on that many vertices then results in timeouts and inefficiencies (at least in my experience). My workaround for this has been pre-calculating centrality in another job and writing it to a vertex property that can subsequently be included in a mixed index. So we can do:
This works, but it is yet another calculation we must maintain in our pipeline, and while it suffices, it seems like more of a workaround than a great solution. I was hoping there was a more optimal approach/strategy. Please let me know.
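The trade-off described in this question can be sketched with a toy example in plain Python. Nothing here is JanusGraph or Gremlin code: the `adjacency` data and all function names are made up for illustration, and a sorted list stands in for the mixed index. The first function counts edges for every vertex at query time; the second and third precompute the degree once and answer the filter with a range lookup.

```python
import bisect

# Hypothetical toy data: vertex -> list of outgoing neighbours.
adjacency = {"a": ["b", "c", "d"], "b": ["c"], "c": [], "d": ["a", "b"]}


def count_at_query_time(min_degree):
    """Query-time approach: touch the edges of every vertex on each query."""
    return sorted(v for v, out in adjacency.items() if len(out) >= min_degree)


def build_degree_index():
    """Batch job: compute outDegree once, keep (degree, vertex) pairs sorted."""
    return sorted((len(out), v) for v, out in adjacency.items())


def lookup_in_index(index, min_degree):
    """Range lookup: binary-search the first entry with degree >= min_degree."""
    start = bisect.bisect_left(index, (min_degree,))
    return sorted(v for _, v in index[start:])


index = build_degree_index()
print(count_at_query_time(2))     # prints ['a', 'd']
print(lookup_in_index(index, 2))  # prints ['a', 'd']
```

Both paths return the same vertices; the difference is that the indexed lookup does not visit any edges at query time, at the cost of keeping the stored degree up to date whenever edges change.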