Re: hasNext() slow for large number of incoming edges

Matthew Nguyen <nguyenm9@...>

Hi Boxuan, thanks for the response. Some background: I'm trying to use JG as a triplestore and import RDF into it.

The triple <microsoft> <rdfs:type> <company> can be modelled as V('microsoft') -> E('rdfs:type') -> V('company') such that:

g.V().has('value', 'microsoft').out().has('value', 'company').inE('rdfs:type').hasNext() = true
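
As a rough illustration of the mapping above, here is a toy in-memory sketch in Python. It is not JanusGraph code; TripleGraph, add_triple, and has_triple are hypothetical names that only mirror how a triple becomes two vertices joined by a labelled edge.

```python
from collections import defaultdict


class TripleGraph:
    """Toy model: each RDF triple becomes subject -[predicate]-> object."""

    def __init__(self):
        # out_edges[subject] -> list of (predicate, object)
        self.out_edges = defaultdict(list)
        # in_edges[object] -> list of (predicate, subject)
        self.in_edges = defaultdict(list)

    def add_triple(self, subject, predicate, obj):
        self.out_edges[subject].append((predicate, obj))
        self.in_edges[obj].append((predicate, subject))

    def has_triple(self, subject, predicate, obj):
        # Scans only the subject's outgoing edges, not the object's incoming ones
        return (predicate, obj) in self.out_edges.get(subject, [])


graph = TripleGraph()
graph.add_triple("microsoft", "rdfs:type", "company")
print(graph.has_triple("microsoft", "rdfs:type", "company"))  # True
```

In this model the 'company' vertex collects one incoming edge per company, which is where the supernode comes from.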

Certainly there can be millions of companies out there that can be modelled similarly. I understand the issue surrounding supernodes, so perhaps this question is more about trying to understand some internals of JG.

Note: again, my use case is not exactly like the above, where everything is known, but is closer to the SPARQL query: select ?company where { ?company rdfs:type <company> }, or "give me all companies of rdfs:type company", which translates to Gremlin:
   g.V().has('value','company').inE() and then a traversal over the incoming edges. But g.V().has('value','company').inE().hasNext() takes a long time on the first call.

1) What is g.V(v).inE(e).hasNext() doing above that makes a call on a supernode take so long? If it's trying to load all incident edges, should either the documentation be updated or the function be renamed to reflect the potential latency? Or maybe the implementation could be broken up along the lines of C++ iteration: traversal.begin(); while (traversal.hasNext()) or something like that. begin() and hasNext() could be implemented via the range(..) function you mentioned to better control perceived latency.
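
To make the distinction in question 1 concrete, here is a toy Python sketch contrasting an eager and a lazy hasNext(). It is a hypothetical model and says nothing about what JG actually does internally; fetch_edges, EagerTraversal, and LazyTraversal are made-up names, and the limit argument only mimics the effect of a range(..)-style cap.

```python
from itertools import islice


def fetch_edges(n, counter):
    """Pretend storage backend: yields n incident edges one at a time."""
    for i in range(n):
        counter["read"] += 1  # count every edge actually read
        yield ("rdfs:type", f"company-{i}")


class EagerTraversal:
    """Materializes every incident edge before answering has_next()."""

    def __init__(self, edges):
        self._edges = list(edges)  # reads everything up front
        self._pos = 0

    def has_next(self):
        return self._pos < len(self._edges)


class LazyTraversal:
    """Reads at most one edge ahead to answer has_next()."""

    _EMPTY = object()  # sentinel meaning "nothing peeked yet / exhausted"

    def __init__(self, edges, limit=None):
        self._it = iter(edges) if limit is None else islice(edges, limit)
        self._peeked = self._EMPTY

    def has_next(self):
        if self._peeked is self._EMPTY:
            self._peeked = next(self._it, self._EMPTY)
        return self._peeked is not self._EMPTY


eager_reads = {"read": 0}
EagerTraversal(fetch_edges(100_000, eager_reads)).has_next()

lazy_reads = {"read": 0}
LazyTraversal(fetch_edges(100_000, lazy_reads)).has_next()

print(eager_reads["read"], lazy_reads["read"])  # 100000 1
```

If hasNext() behaves like the eager variant on a supernode, that would explain the latency I'm seeing; the lazy variant is what the name led me to expect.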

2) When you mention remodelling, I can think of 2 ways to do so off the top of my head (please advise on others).
a. Have multiple types of Companies (TechCompany, FinancialCompany, etc.) to reduce the likelihood of a supernode
b. Add a property to the vertex instead, so that V('microsoft').has('rdfs:type', 'company'). If I do this, and assuming the 'rdfs:type' property is indexed, will V().has('rdfs:type', 'company').hasNext() be fast? If so, why?
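
Option (b) can be pictured with a toy Python sketch. It assumes an index is, conceptually, a map from (key, value) to the vertices carrying that property; this is an illustrative assumption, not a description of JG's index implementation, and PropertyIndex is a hypothetical name.

```python
from collections import defaultdict
from itertools import islice


class PropertyIndex:
    """Toy inverted index: (key, value) -> set of vertex ids."""

    def __init__(self):
        self._index = defaultdict(set)

    def add(self, vertex_id, key, value):
        self._index[(key, value)].add(vertex_id)

    def has_any(self, key, value):
        # "Is there at least one match?" is a single dictionary lookup
        return bool(self._index.get((key, value)))

    def first(self, key, value, n=1):
        # Return up to n matches without copying the whole set
        return list(islice(self._index.get((key, value), ()), n))


index = PropertyIndex()
for name in ("microsoft", "apple", "ibm"):
    index.add(name, "rdfs:type", "company")

print(index.has_any("rdfs:type", "company"))  # True
print(index.has_any("rdfs:type", "person"))   # False
```

In this model the existence check does not depend on how many vertices share the value; whether the same holds for an indexed property in JG is what I'm asking.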

I hope this doesn't come across negatively. I am very interested in trying to bridge the gap between LPG and RDF (triplestores), and I think I have some good use cases that can hopefully help improve JG down the road.

thx, matt


