No results returned with duplicate Has steps in a vertex-search traversal


Patrick Streifel <prstreifel@...>
 

We are running into a JanusGraph bug where a traversal that should return a list of vertices is returning an empty list.

 

Here is some background info:

Using a JanusGraph Server with ConfiguredGraphFactory, running v. 0.5.2.

Storage: Cassandra v. 3.11.9

Index: Elasticsearch v. 6.7.2

Connecting to the server via java gremlin driver.

 

Our use case is this:

 

We are searching for vertices in the graph based on various property filters (e.g. Give me people named "Patrick" with a last name matching the regex "Str.*el"). When we just do this, there are no issues, of course.

 

The tricky part is that we are adding extra filters on a property called DomainGroup, which essentially allows us to filter results per search user based on what they are interested in seeing. The user running the query provides a list of Domains they are interested in, and there has to be some overlap between the user's Domains and the list of DomainGroups on a vertex for that vertex to be returned. In short, we put in extra "has" steps that filter out vertices in certain groups from the results.
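To make the overlap rule concrete, here is a minimal plain-Java sketch of the check we expect the "has" step to perform per vertex. The class and method names are made up for illustration, and it assumes DomainGroup holds several values on a vertex; it does not touch JanusGraph at all.

```java
import java.util.Collections;
import java.util.List;
import java.util.Set;

public class DomainOverlap {
    // Hypothetical helper: a vertex is visible when at least one of the
    // user's domains also appears in the vertex's DomainGroup values.
    static boolean isVisible(Set<String> userDomains, List<String> vertexDomainGroups) {
        return !Collections.disjoint(userDomains, vertexDomainGroups);
    }

    public static void main(String[] args) {
        Set<String> userDomains = Set.of("GROUP_A", "GROUP_B");
        // Shares GROUP_B with the user, so it should be returned.
        System.out.println(isVisible(userDomains, List.of("GROUP_B", "GROUP_C")));
        // Shares nothing with the user, so it should be filtered out.
        System.out.println(isVisible(userDomains, List.of("GROUP_C")));
    }
}
```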

 

Another important note: these "has" steps that filter on Domain are added after each of the other steps in the query. That may not be a great idea for this use case, but we have others where we need it. We have logic that automatically groups together a set of "has" steps based on user requests. Sometimes this automated process duplicates certain property searches when constructing the traversal, and that is hard to avoid in certain cases. We could work to deduplicate, but this still seems like a true bug in JanusGraph, albeit for a weird use case.
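For reference, the deduplication we have in mind could look roughly like the sketch below. It is plain Java with hypothetical names (HasFilter, dedupe) and is not our actual query-building code; it assumes each "has" step can be described by a property key plus a printable predicate. Since consecutive "has" steps are ANDed together, dropping an exact repeat should not change which vertices match.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;

public class HasStepDedup {
    /** Hypothetical description of one has() step: property key plus predicate text. */
    static final class HasFilter {
        final String key;
        final String predicate;

        HasFilter(String key, String predicate) {
            this.key = key;
            this.predicate = predicate;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof HasFilter)) return false;
            HasFilter other = (HasFilter) o;
            return key.equals(other.key) && predicate.equals(other.predicate);
        }

        @Override
        public int hashCode() {
            return Objects.hash(key, predicate);
        }

        @Override
        public String toString() {
            return "has(" + key + ", " + predicate + ")";
        }
    }

    // Drops exact repeats while keeping the first occurrence and the original order.
    static List<HasFilter> dedupe(List<HasFilter> filters) {
        return new ArrayList<>(new LinkedHashSet<>(filters));
    }

    public static void main(String[] args) {
        List<HasFilter> requested = List.of(
            new HasFilter("FirstName", "textRegex(Patric.*)"),
            new HasFilter("DomainGroup", "within([GROUP_A, GROUP_B])"),
            new HasFilter("PersonSurName", "textRegex(Str.*el)"),
            new HasFilter("DomainGroup", "within([GROUP_A, GROUP_B])"));
        // Prints the three distinct filters, one per line.
        dedupe(requested).forEach(System.out::println);
    }
}
```

The deduplicated list would then be turned into "has" steps as before. This only sidesteps the problem on our side; it does not explain why the duplicated form returns nothing.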

 

An example of one of our DomainGroup "has" steps is here:

has(DomainGroup, within([GROUP_A, GROUP_B])),

 

We combed through our DEBUG level logs in the JG Server.

We noticed that JG was querying the Elasticsearch index for results, as expected. Elasticsearch was actually returning the expected vertex(es), but the JG Server was not returning anything after that.

 

Here are some additional conditions we noticed:

  1. This appears only to happen when there are multiple duplicate "has" steps in the traversal.
    1. When we run a traversal with only one property search ( has(FirstName, textRegex(Patric.*)) ) and one DomainGroup filter ( has(DomainGroup , within([GROUP_A, GROUP_B])) ), then we get the expected results.
    2. When we provide two property searches, and thus two (duplicate) DomainGroup filters, we get no results. This leads us to believe there is an issue with having duplicate "has" steps, or specifically duplicate "has" steps with "within" filters.

 

Example of a traversal that we get empty results with:

args={gremlin=[[], [V(),

has(FirstName, textRegex(Patric.*)),

has(DomainGroup , within([GROUP_A, GROUP_B])),

has(PersonSurName, textRegex(Str.*el)),

has(DomainGroup , within([GROUP_A, GROUP_B])),

limit(5), valueMap(), with(~tinkerpop.valueMap.tokens)]], aliases={g=my_graph_traversal}}

 

Logs show the Elasticsearch scroll request returning a document with the correct id, but the logs also show JG ultimately sending an empty response to our API. Something is lost in between.

 

Just wanted to bring this to your attention. We are figuring out workarounds on our side, but this seems like a JG bug.
