Incorrect result when Lucene index is present


inverseintegral42@...
 

When I run the following code with janusgraph-inmemory and janusgraph-lucene, both in version 0.6.2:
// not(...) and eq(...) are static imports from org.apache.tinkerpop.gremlin.process.traversal.P
PropertiesConfiguration conf = ConfigurationUtil.loadPropertiesConfig("conf/test.properties");
JanusGraph graph = JanusGraphFactory.open(conf);

GraphTraversalSource g = graph.traversal();
JanusGraphManagement m = graph.openManagement();

VertexLabel l = m.makeVertexLabel("L").make();
PropertyKey p = m.makePropertyKey("p").dataType(Short.class).make();
PropertyKey q = m.makePropertyKey("q").dataType(UUID.class).make();
m.buildIndex("someName", Vertex.class).addKey(p).addKey(q).indexOnly(l).buildMixedIndex("search");
m.commit();

g.addV("L").property("p", (short) 1).next();
g.tx().commit();

System.out.println(g.V().hasLabel("L").has("q").count().next());
System.out.println(g.V().hasLabel("L").has("q", not(eq(UUID.randomUUID()))).count().next());

I get the output:
0
1

But I would expect the output to be:
0
0

since no vertex with label L has the property q. When I remove the index, the result is correct.
I assume this is because the UUID type is handled incorrectly.
Note that this only happens if the index covers both keys (p and q).

I'm using the following configuration:

gremlin.graph=org.janusgraph.core.JanusGraphFactory
storage.backend=inmemory
index.search.backend=lucene
index.search.directory=data/searchindex
schema.default=none

 


hadoopmarc@...
 

Hi inverseintegral,

You have been quite successful as a test driver, that is, at detecting easily reproducible issues!
The current issue seems more like an issue of undefined behaviour, rather than an issue related to Lucene or UUID objects.
I get the same behaviour when using cql and elasticsearch:

```
graph = JanusGraphFactory.open('conf/janusgraph-cql-es.properties')
g = graph.traversal();
m = graph.openManagement();

l = m.makeVertexLabel("L").make();
p = m.makePropertyKey("p").dataType(Integer.class).make();
q = m.makePropertyKey("q").dataType(Integer.class).make();
m.buildIndex("someName", Vertex.class).addKey(p).addKey(q).indexOnly(l).buildMixedIndex("search");
m.commit();

g.addV("L").property("p", 1).next();
g.tx().commit();

g.V().hasLabel("L").has("q").count()
// ==> 0
g.V().hasLabel("L").has("q", not(eq(2))).count()
// ==> 1
```
A composite index, though, behaves the same as the case without an index (counts 0 and 0). The reference docs do not expand on how an index with multiple property keys is used once the keys have been added.

Apparently, when querying a mixed index with a neq() predicate, it is not checked whether the associated property exists or is non-null in the index for that graph element.
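
The suspected difference can be illustrated with a toy model in plain Java. This is only a sketch of the semantics, not JanusGraph or index-backend code: the class `NeqSemantics` and both method names are made up for illustration, and a vertex is modelled as a plain map of property values.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Toy model (hypothetical, not JanusGraph code): a "vertex" is a map of property values. */
public class NeqSemantics {

    /** Expected has(key, not(eq(value))): the property must exist AND differ from value. */
    static boolean expectedNeq(Map<String, Object> vertex, String key, Object value) {
        return vertex.containsKey(key) && !Objects.equals(vertex.get(key), value);
    }

    /** Suspected mixed-index behaviour: a missing property is treated as "not equal". */
    static boolean neqWithoutExistenceCheck(Map<String, Object> vertex, String key, Object value) {
        return !Objects.equals(vertex.get(key), value);
    }

    public static void main(String[] args) {
        // Same data as the reproduction: a single vertex with p = 1 and no q.
        List<Map<String, Object>> vertices =
                Collections.singletonList(Collections.<String, Object>singletonMap("p", 1));

        long expected = vertices.stream().filter(v -> expectedNeq(v, "q", 2)).count();
        long observed = vertices.stream().filter(v -> neqWithoutExistenceCheck(v, "q", 2)).count();

        System.out.println(expected); // 0
        System.out.println(observed); // 1
    }
}
```

With the data from the reproduction, the first filter matches nothing and the second matches the vertex, which mirrors the 0 versus 1 counts above.
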
You can open an issue again, if you want. Any interest in providing a PR for any of the issues you found? I am sure people on https://lists.lfaidata.foundation/g/janusgraph-dev/topics will want to help you if you get stuck in the PR process. See also https://docs.janusgraph.org/development/ .

I checked for related issues, but only https://github.com/JanusGraph/janusgraph/issues/2588 could be related.

Best wishes,   Marc


inverseintegral42@...
 

Dear Marc,

I will report the issue on GitHub in that case. Currently, I'm busy with a project that automatically identifies these bugs, but in the near future I can definitely start submitting PRs for the issues that I found.


inverseintegral42@...
 

Hey Marc,

I actually just tested your example where p and q are of type Integer with cql and elasticsearch:

PropertiesConfiguration conf = ConfigurationUtil.loadPropertiesConfig("conf/test.properties");
JanusGraph graph = JanusGraphFactory.open(conf);

GraphTraversalSource g = graph.traversal();
JanusGraphManagement m = graph.openManagement();

VertexLabel l = m.makeVertexLabel("L").make();
PropertyKey p = m.makePropertyKey("p").dataType(Integer.class).make();
PropertyKey q = m.makePropertyKey("q").dataType(Integer.class).make();
m.buildIndex("someName", Vertex.class).addKey(p).addKey(q).indexOnly(l).buildMixedIndex("jgex");
m.commit();

g.addV("L").property("p", 1).next();
g.tx().commit();

System.out.println(g.V().hasLabel("L").has("q").count().next());
System.out.println(g.V().hasLabel("L").has("q", not(eq(2))).count().next());

But I get the output 0, 0, which is what I would expect. You wrote that you got the output 0, 1, though. Do you have any idea where this difference could come from?


hadoopmarc@...
 

OK, I repeated the experiment and got "0 1" again, so let me expand on what I did.

I use the janusgraph-0.6.2 full binary distribution and start cassandra and elasticsearch with:

$ bin/janusgraph.sh start

I leave the running JanusGraph Server unused and start a JanusGraph instance in the Gremlin Console with:
graph = JanusGraphFactory.open('conf/janusgraph-cql-es.properties')

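Note that your code loads a private properties file (conf/test.properties), so your configuration may differ from the conf/janusgraph-cql-es.properties that ships with the distribution.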

Best wishes,   Marc