Queries with negated text predicates fail with lucene
toom@...
Hi,
With JanusGraph 0.6.0 and the Lucene index backend, queries fail if they contain a negated text predicate such as textNotPrefix or textNotContains:
java.lang.IllegalArgumentException: Relation is not supported for string value: textNotPrefix
at org.janusgraph.diskstorage.lucene.LuceneIndex.convertQuery(LuceneIndex.java:814)
at org.janusgraph.diskstorage.lucene.LuceneIndex.convertQuery(LuceneIndex.java:864)
at org.janusgraph.diskstorage.lucene.LuceneIndex.query(LuceneIndex.java:593)
at org.janusgraph.diskstorage.indexing.IndexTransaction.queryStream(IndexTransaction.java:110)
If Elasticsearch is used, or if there is no index backend, the same query works. I'm not sure the Lucene index can be used for negated queries, but the queries should not fail. How can I transform my query to make it work?
Regards, Toom.
hadoopmarc@...
Hi Toom,
See https://docs.janusgraph.org/index-backend/text-search/#full-text-search_1
Indeed, the negative text predicates are only available to Elasticsearch (and apparently, as you say, to the CompositeIndex).
Best wishes, Marc
toom@...
Hi Marc,
IMHO, an index should not prevent a query from working. Moreover, the result of a query should not depend on the backends (storage and index). If an index backend cannot process a predicate, the predicate should be executed as if the index weren't present. To clarify, below is a code sample. The same query works without an index (line 13) and fails with an index (line 31).
1 // create schema
2 mgmt = graph.openManagement()
3 mgmt.makePropertyKey('string').dataType(String.class).cardinality(Cardinality.SINGLE).make()
4 mgmt.makeVertexLabel('data').make()
5 mgmt.commit()
6
7 // add data
8 g.addV('data').property('string', 'foo')
9 ==>v[4120]
10 g.addV('data').property('string', 'bar')
11 ==>v[4312]
12
13 g.V().hasLabel('data').has('string', textNotContains('bar'))
14 WARN org.janusgraph.graphdb.transaction.StandardJanusGraphTx - Query requires iterating over all vertices [(~label = data AND string textNotContains bar)]. For better performance, use indexes
15 ==>v[4120]
16
17 // add index with lucene backend
18 mgmt = graph.openManagement()
19 string = mgmt.getPropertyKey("string")
20 mgmt.buildIndex('myindex', Vertex.class).addKey(string, Mapping.TEXTSTRING.asParameter()).buildMixedIndex("search")
21 mgmt.commit()
22
23 // Wait for the index
24 ManagementSystem.awaitGraphIndexStatus(graph, 'myindex').call()
25
26 // Reindex data
27 mgmt = graph.openManagement()
28 mgmt.updateIndex(mgmt.getGraphIndex("myindex"), SchemaAction.REINDEX).get()
29 mgmt.commit()
30
31 g.V().hasLabel('data').has('string', textNotContains('bar'))
32 Could not call index
Regards, Toom.
hadoopmarc@...
Hi Toom,
Yes, you are right, this behavior is not 100% consistent. Also, as noted, the documentation regarding text predicates on properties without an index is incomplete. Use cases are sparse, though, because on a graph of practical size, working without an index is not an option. Finally, improving this in a backward-compatible way might prove impossible.
Best wishes, Marc
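The behavior Toom asks for in the thread (a predicate the index backend cannot process is evaluated as if no index were present) can be illustrated with a minimal Java sketch. Everything in it is hypothetical: `IndexFallbackSketch`, `INDEX_SUPPORTED`, `queryIndex`, `scan` and `query` are illustration-only names, not JanusGraph APIs, and the word matching is a simplification of JanusGraph's text predicates. It only shows the dispatch idea, not how JanusGraph is or should be implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: none of these names are JanusGraph APIs.
class IndexFallbackSketch {

    // Relations that the (assumed) index backend can answer natively.
    static final Set<String> INDEX_SUPPORTED = Collections.singleton("textContains");

    // In-memory evaluation of one relation against one property value.
    // Simplification: a value "contains" a term if one of its words equals it.
    static boolean evaluate(String relation, String value, String term) {
        boolean containsWord = Arrays.asList(value.toLowerCase().split("\\W+"))
                .contains(term.toLowerCase());
        switch (relation) {
            case "textContains":
                return containsWord;
            case "textNotContains":
                return !containsWord;
            default:
                throw new IllegalArgumentException("Unknown relation: " + relation);
        }
    }

    // Full scan: evaluates the relation on every value, like a query without index.
    static List<String> scan(List<String> values, String relation, String term) {
        List<String> result = new ArrayList<>();
        for (String value : values) {
            if (evaluate(relation, value, term)) {
                result.add(value);
            }
        }
        return result;
    }

    // Stand-in for an index lookup: it rejects relations it does not support,
    // mirroring the error message reported in the thread.
    static List<String> queryIndex(List<String> values, String relation, String term) {
        if (!INDEX_SUPPORTED.contains(relation)) {
            throw new IllegalArgumentException(
                    "Relation is not supported for string value: " + relation);
        }
        return scan(values, relation, term); // a real index would not scan
    }

    // Dispatch: use the index only when it supports the relation, else fall back.
    static List<String> query(List<String> values, String relation, String term) {
        if (INDEX_SUPPORTED.contains(relation)) {
            return queryIndex(values, relation, term);
        }
        return scan(values, relation, term);
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("foo", "bar");
        System.out.println(query(data, "textNotContains", "bar")); // prints [foo]
        System.out.println(query(data, "textContains", "bar"));    // prints [bar]
    }
}
```

With this dispatch, the unsupported relation yields the same result as a query without an index, at the cost of a full scan, which is the performance concern Marc raises for graphs of practical size.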