Bulk loading

Laura Morales <lauretas@...>
 

I've read the "Bulk Loading" chapter of the documentation several times but I still don't understand how to create a graph. Everything that I can find online is some Java or Groovy code.
Given:
1. a graph schema (say, in JSON)
2. a bunch of data (say, in CSV)
does Janus have any tool to load this stuff into a database, or to create a new one? Without using any Java/Groovy programming, or 3rd party tools? Or am I expected to write my own Groovy scripts for parsing the CSV and creating the graph?


Re: Very slow performance when opening a new session

hadoopmarc@...
 

Hi Roy,

I can confirm your observation using the standard 'bin/janusgraph.sh start' from the full janusgraph distribution.
I just used the Gremlin Console with:

:remote connect tinkerpop.server conf/remote.yaml session
:remote console
a = 3

Although there is no logical reason for it in hindsight, I checked whether the delay was not due to class loading in gremlin console, using:
export JAVA_OPTIONS='-verbose:class'

I can also confirm that the delay does not happen with a non-sessioned connection.
I can also confirm that the delay occurs for the gremlin server and gremlin console of the Apache TinkerPop distribution (version 3.4.8).

I guess the initial delay is due to the additional overhead of sessions as described in:
https://tinkerpop.apache.org/docs/current/reference/#sessions

Best wishes,     Marc


Janus multiple subgraphs

Laura Morales <lauretas@...>
 

My understanding is that a Janus server can host multiple graphs, but they are isolated and cannot be queried together.
I'd like to know if/how it's possible to split one single graph into multiple subgraphs such that:

- I can query only one subgraph, or the entire graph
- vertex/edge properties (and their indexes) are local to a subgraph
- subgraphs can have links from one another, so I should be able to query multiple subgraphs in the same query

I think what I'm trying to achieve is something akin to Postgres' database schemas. In Postgres, databases are independent, but each can be subdivided into multiple schemas. Each schema has its own tables, constraints, and indexes, but I can query multiple schemas at once by using their fully qualified names ("schema.table").


Very slow performance when opening a new session

Roy Reznik <reznik.roy@...>
 

I'm seeing very slow performance when opening a new session in JanusGraph.
The message I'm sending is this:
{"requestId":"02a58ee3-e4d3-11eb-bd29-04d4c4eaf347","op":"eval","processor":"session","args":{"bindings":{},"evaluationTimeout":120000,"gremlin":"g.V().limit(1).id()","language":"gremlin-groovy","rebindings":{},"session":"50052633-079b-4500-bc29-a3eacb1f0dba"}}

Basically, the inner query doesn't really matter. When I use the session processor with a new session id that's never been used it takes ~1.2s for JanusGraph to respond.
Queries afterwards, with the same session id are much quicker.
Why is the overhead of starting a new session so large? Can it be reduced somehow by configuration?

Thanks,
Roy.


Re: Count Query

owner.mad.epa@...
 

mvn clean -DskipTests -Drat.skip -Pjanusgraph-release -Dgpg.skip source:jar install


Re: JanusGraph System Requirements

Peter Corless
 

I think you need to qualify this a little better.

For example, 100 GB of data is relatively small; it would easily fit on a single AWS i3.large. But what is your analytical load going to be? How many queries per second are you going to run? A few, or a lot? Just limited traversals, or very broad-ranging queries? What sort of latencies are tolerable? Are you looking for millisecond-scale response times (<10 ms), or are multi-second query responses acceptable?

If you want fast response then you're looking at directly-attached NVMe. If you really don't care about the latencies you can use EBS.

-Peter.

On Fri, Jul 9, 2021, 7:22 AM <csconnor257@...> wrote:
Hello, 

What would the system requirements for JanusGraph be with 100 GB of data?

Thanks


Re: Cassandra crashing after dropping large graph. Error: Scanned over 100001 tombstones...

Clement de Groc
 

Next JanusGraph release will allow tuning Cassandra's gc_grace_seconds: https://github.com/JanusGraph/janusgraph/pull/2693


Re: Query failure due to cassandra backend tombstone exception #1675

Clement de Groc
 

Next JanusGraph release will allow tuning Cassandra's gc_grace_seconds: https://github.com/JanusGraph/janusgraph/pull/2693


Re: How to filter out step 3 vertex list based on step 1 vertex

Ronnie
 

Sounds good. Will check in the gremlin-users group. Thanks Boxuan!


JanusGraph System Requirements

csconnor257@...
 

Hello, 

What would the system requirements for JanusGraph be with 100 GB of data?

Thanks


Re: How to filter out step 3 vertex list based on step 1 vertex

Boxuan Li
 

That’s a great question! To be honest I am not sure about the reason. My assumption is the __.out("edgeCB").is("B") is an anonymous child traversal within where() step, and thus it has no access to the label “B” which is defined in the outer traversal.

I am not sure if my understanding is correct. It might be a better idea to ask in the gremlin-users group.

Cheers,
Boxuan

On Fri, Jul 9, 2021 at 3:07 AM, Ronnie via lists.lfaidata.foundation <rputhukkeril=qualys.com@...> wrote:

Thanks Boxuan! I tried the single query and that worked accurately!

On the other hand, I am still trying to figure out why the where() traversal, e.g. where(__.out("edgeCB").is("B")), didn't work. Is it because "B" is treated as a literal instead of a step label?

Also thanks for pointing me to the gremlin-users group.

Thanks!
Ronnie


Re: How to filter out step 3 vertex list based on step 1 vertex

Ronnie
 

Thanks Boxuan! I tried the single query and that worked accurately!

On the other hand, I am still trying to figure out why the where() traversal, e.g. where(__.out("edgeCB").is("B")), didn't work. Is it because "B" is treated as a literal instead of a step label?

Also thanks for pointing me to the gremlin-users group.

Thanks!
Ronnie


Re: How to filter out step 3 vertex list based on step 1 vertex

Boxuan Li
 

Hi Ronnie,

Not sure if it's optimal but this should work:

g.V().hasLabel("VertexB").as("B").in("edgeAB").out("edgeAC").as("C").out("edgeCB").as("B2").where("B", eq("B2")).select("C")

You can also do this in two steps:

b = g.V().hasLabel("VertexB").as("B").next()
g.V(b).in("edgeAB").out("edgeAC").where(out("edgeCB").is(b))
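
To illustrate what both queries are meant to compute, here is a minimal Python sketch on a made-up in-memory graph. The vertex names (a1, b1, c1, ...) and the helper functions are hypothetical and are not part of Gremlin or JanusGraph. For a given VertexB it walks edgeAB backwards, then edgeAC forwards, and keeps only those VertexC that have an edgeCB pointing back to the same starting vertex:

```python
# Toy edge lists (source, target); the data is invented for illustration.
edges = {
    "edgeAB": [("a1", "b1"), ("a2", "b2")],
    "edgeAC": [("a1", "c1"), ("a1", "c2"), ("a2", "c3")],
    "edgeCB": [("c1", "b1"), ("c2", "b2"), ("c3", "b2")],
}


def out(label, v):
    """Vertices reached from v by following `label` edges forwards."""
    return [dst for src, dst in edges[label] if src == v]


def in_(label, v):
    """Vertices reached from v by following `label` edges backwards."""
    return [src for src, dst in edges[label] if dst == v]


def c_connected_back(b):
    """VertexC candidates reachable via B <- A -> C that also link back to b."""
    result = []
    for a in in_("edgeAB", b):
        for c in out("edgeAC", a):
            # Compare against the vertex b itself, not against a string label.
            if b in out("edgeCB", c):
                result.append(c)
    return result


print(c_connected_back("b1"))
print(c_connected_back("b2"))
```

In this toy data, c2 is reachable from b1 through a1 but links back to b2, so it is dropped for b1. The key point is that the filter compares against the starting vertex itself.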

FYI, for general gremlin query questions, you can also ask in the gremlin-users mailing list: https://groups.google.com/g/gremlin-users

Best,
Boxuan


Re: How to filter out step 3 vertex list based on step 1 vertex

Ronnie
 

Sorry, the Gremlin queries should be as below:

g.V().hasLabel("VertexB").as("B").in("edgeAB").out("edgeAC").where(__.out("edgeCB").is("B"))
g.V().hasLabel("VertexB").as("B").in("edgeAB").out("edgeAC").where(__.out("edgeCB").hasId(__.select("B").id()))


How to filter out step 3 vertex list based on step 1 vertex

Ronnie
 

Hi,
Assuming following schema
VertexA--edgeAB-->VertexB
VertexA--edgeAC-->VertexC
VertexC--edgeCB-->VertexB

Traversal
step 1: start with VertexB,
step 2: traverse edgeAB to find connected VertexA,
step 3: traverse edgeAC to find connected VertexC
step 4: how to filter out the VertexC vertices which are not connected to the VertexB from step 1?

Gremlin queries that I tried:
g.V().hasLabel("VertexB").as("B").in("edgeAB").out("edgeAC").where(__.out("edgeCB").is("cert"))
g.V().hasLabel("VertexB").as("B").in("edgeAB").out("edgeAC").where(__.out("edgeCB").hasId(__.select("cert").id()))

Expected result: list of VertexC vertices which are connected back to the same VertexB vertex from step 1
Actual result: empty

Any pointers to why these gremlin queries don't work as expected?

Thanks,
Ronnie


Re: Multiple or-steps are conflated when using the textRegex() predicate

Mladen Marović
 

Hi,

Thanks for the quick answer. That seems to be it. Hopefully the fix will be available soon.

Best regards,

Mladen Marović


Re: Multiple or-steps are conflated when using the textRegex() predicate

Boxuan Li
 

Hi,


Best,
Boxuan

On Jul 7, 2021, at 8:29 PM, Mladen Marović <mladen.marovic@...> wrote:

Hello!

I came upon some unexpected behavior when running queries with multiple or() steps and string searches on mixed indexes and would like some clarification if this is intended or not.

I have a graph with a vertex label of type person and an edge label of type sent-message-to. All edges have the properties sender and receiver, and some others. A mixed index backed by elasticsearch is created for the edge label. Both sender and receiver are indexed (as STRING types), as well as some others.

The query that's causing me problems is:

g.E() \
    .hasLabel('sent-message-to') \
    .or( \
        has('sender', textRegex('.*alice.*')), \
        has('receiver', textRegex('.*alice.*')) \
    ).or( \
        has('sender', textRegex('.*bob.*')), \
        has('receiver', textRegex('.*bob.*')) \
    ).toList()

The query should return (roughly) messages between alices and bobs (and some edge cases where an alice bobowitz talks to an eve, but that's not important here). However, I'm getting some unexpected results where, for example, neither the sender nor the receiver contains the substring bob.

The explain plan for the query is as follows:

gremlin> g.E() \
......1>     .hasLabel('sent-message-to') \
......2>     .or( \
......3>         has('sender', textRegex('.*alice.*')), \
......4>         has('receiver', textRegex('.*alice.*')) \
......5>     ).or( \
......6>         has('sender', textRegex('.*bob.*')), \
......7>         has('receiver', textRegex('.*bob.*')) \
......8>     ).explain()
==>Traversal Explanation
=======================================================================================================================================================================================================================================================================================================
Original Traversal                          [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]

ConnectiveStrategy                    [D]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
IncidentToAdjacentStrategy            [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
RepeatUnrollStrategy                  [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
MatchPredicateStrategy                [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
PathRetractionStrategy                [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
EarlyLimitStrategy                    [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
FilterRankingStrategy                 [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
InlineFilterStrategy                  [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
AdjacentToIncidentStrategy            [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
CountStrategy                         [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
LazyBarrierStrategy                   [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
AdjacentVertexFilterOptimizerStrategy [P]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
AdjacentVertexHasIdOptimizerStrategy  [P]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
AdjacentVertexIsOptimizerStrategy     [P]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
JanusGraphLocalQueryOptimizerStrategy [P]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
JanusGraphStepStrategy                [P]   [JanusGraphStep([],[~label.eq(sent-message-to)]).Or(JanusGraphStep([],[sender.textRegex(.*alice.*)]),JanusGraphStep([],[receiver.textRegex(.*alice.*)]),JanusGraphStep([],[sender.textRegex(.*bob.*)]),JanusGraphStep([],[receiver.textRegex(.*bob.*)]))]
JanusGraphIoRegistrationStrategy      [P]   [JanusGraphStep([],[~label.eq(sent-message-to)]).Or(JanusGraphStep([],[sender.textRegex(.*alice.*)]),JanusGraphStep([],[receiver.textRegex(.*alice.*)]),JanusGraphStep([],[sender.textRegex(.*bob.*)]),JanusGraphStep([],[receiver.textRegex(.*bob.*)]))]
ProfileStrategy                       [F]   [JanusGraphStep([],[~label.eq(sent-message-to)]).Or(JanusGraphStep([],[sender.textRegex(.*alice.*)]),JanusGraphStep([],[receiver.textRegex(.*alice.*)]),JanusGraphStep([],[sender.textRegex(.*bob.*)]),JanusGraphStep([],[receiver.textRegex(.*bob.*)]))]
StandardVerificationStrategy          [V]   [JanusGraphStep([],[~label.eq(sent-message-to)]).Or(JanusGraphStep([],[sender.textRegex(.*alice.*)]),JanusGraphStep([],[receiver.textRegex(.*alice.*)]),JanusGraphStep([],[sender.textRegex(.*bob.*)]),JanusGraphStep([],[receiver.textRegex(.*bob.*)]))]

Final Traversal                             [JanusGraphStep([],[~label.eq(sent-message-to)]).Or(JanusGraphStep([],[sender.textRegex(.*alice.*)]),JanusGraphStep([],[receiver.textRegex(.*alice.*)]),JanusGraphStep([],[sender.textRegex(.*bob.*)]),JanusGraphStep([],[receiver.textRegex(.*bob.*)]))]

The final (seemingly incorrect) traversal explains the results I'm getting. However, what strikes me as odd is that the JanusGraphLocalQueryOptimizerStrategy seems to return a correct traversal with two separate or steps:

[
	GraphStep(edge,[]),
	HasStep([~label.eq(sent-message-to)]),
	OrStep([
		[HasStep([sender.textRegex(.*alice.*)])],
		[HasStep([receiver.textRegex(.*alice.*)])]
	]),
	OrStep([
		[HasStep([sender.textRegex(.*bob.*)])],
		[HasStep([receiver.textRegex(.*bob.*)])]
	])
]

but the following JanusGraphStepStrategy conflates the two or steps into a single one:

[
	JanusGraphStep(
		[],[~label.eq(sent-message-to)]
	)
	.Or(
		JanusGraphStep([],[sender.textRegex(.*alice.*)]),
		JanusGraphStep([],[receiver.textRegex(.*alice.*)]),
		JanusGraphStep([],[sender.textRegex(.*bob.*)]),
		JanusGraphStep([],[receiver.textRegex(.*bob.*)])
	)
]

, which should not be correct because (A or B) and (C or D) is not equal to (A or B or C or D).

What's more confusing is that if I replace the textRegex() predicate with the tinkerpop predicate containing(), I get the proper results, because the explain plan is different:

gremlin> g.E() \
......1>     .hasLabel('sent-message-to') \
......2>     .or( \
......3>         has('sender', containing('alice')), \
......4>         has('receiver', containing('alice')) \
......5>     ).or( \
......6>         has('sender', containing('bob')), \
......7>         has('receiver', containing('bob')) \
......8>     ).explain()
==>Traversal Explanation
=======================================================================================================================================================================================================================================================================================
Original Traversal                          [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]

ConnectiveStrategy                    [D]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
IncidentToAdjacentStrategy            [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
RepeatUnrollStrategy                  [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
MatchPredicateStrategy                [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
PathRetractionStrategy                [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
EarlyLimitStrategy                    [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
FilterRankingStrategy                 [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
InlineFilterStrategy                  [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
AdjacentToIncidentStrategy            [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
CountStrategy                         [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
LazyBarrierStrategy                   [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
AdjacentVertexFilterOptimizerStrategy [P]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
AdjacentVertexHasIdOptimizerStrategy  [P]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
AdjacentVertexIsOptimizerStrategy     [P]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
JanusGraphLocalQueryOptimizerStrategy [P]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
JanusGraphStepStrategy                [P]   [JanusGraphStep([],[~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
JanusGraphIoRegistrationStrategy      [P]   [JanusGraphStep([],[~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
ProfileStrategy                       [F]   [JanusGraphStep([],[~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
StandardVerificationStrategy          [V]   [JanusGraphStep([],[~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]

Final Traversal                             [JanusGraphStep([],[~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]

and the final traversal contains two or steps:

[
	JanusGraphStep([],[~label.eq(sent-message-to)]),
	OrStep([
		[HasStep([sender.containing(alice)])],
		[HasStep([receiver.containing(alice)])]
	]),
	OrStep([
		[HasStep([sender.containing(bob)])],
		[HasStep([receiver.containing(bob)])]
	])]

I'd like to use the underlying mixed index to fetch the results, and preferably only one index query should be performed under the hood.

Is there a way to force this query to properly use the mixed index? Why is the explain plan in these two cases different?

Kind regards,

Mladen Marović



Multiple or-steps are conflated when using the textRegex() predicate

Mladen Marović
 

Hello!

I came upon some unexpected behavior when running queries with multiple or() steps and string searches on mixed indexes and would like some clarification if this is intended or not.

I have a graph with a vertex label of type person and an edge label of type sent-message-to. All edges have the properties sender and receiver, and some others. A mixed index backed by elasticsearch is created for the edge label. Both sender and receiver are indexed (as STRING types), as well as some others.

The query that's causing me problems is:

g.E() \
    .hasLabel('sent-message-to') \
    .or( \
        has('sender', textRegex('.*alice.*')), \
        has('receiver', textRegex('.*alice.*')) \
    ).or( \
        has('sender', textRegex('.*bob.*')), \
        has('receiver', textRegex('.*bob.*')) \
    ).toList()

The query should return (roughly) messages between alices and bobs (and some edge cases where an alice bobowitz talks to an eve, but that's not important here). However, I'm getting some unexpected results where, for example, neither the sender nor the receiver contains the substring bob.

The explain plan for the query is as follows:

gremlin> g.E() \
......1>     .hasLabel('sent-message-to') \
......2>     .or( \
......3>         has('sender', textRegex('.*alice.*')), \
......4>         has('receiver', textRegex('.*alice.*')) \
......5>     ).or( \
......6>         has('sender', textRegex('.*bob.*')), \
......7>         has('receiver', textRegex('.*bob.*')) \
......8>     ).explain()
==>Traversal Explanation
=======================================================================================================================================================================================================================================================================================================
Original Traversal                          [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]

ConnectiveStrategy                    [D]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
IncidentToAdjacentStrategy            [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
RepeatUnrollStrategy                  [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
MatchPredicateStrategy                [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
PathRetractionStrategy                [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
EarlyLimitStrategy                    [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
FilterRankingStrategy                 [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
InlineFilterStrategy                  [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
AdjacentToIncidentStrategy            [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
CountStrategy                         [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
LazyBarrierStrategy                   [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
AdjacentVertexFilterOptimizerStrategy [P]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
AdjacentVertexHasIdOptimizerStrategy  [P]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
AdjacentVertexIsOptimizerStrategy     [P]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
JanusGraphLocalQueryOptimizerStrategy [P]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.textRegex(.*alice.*)])], [HasStep([receiver.textRegex(.*alice.*)])]]), OrStep([[HasStep([sender.textRegex(.*bob.*)])], [HasStep([receiver.textRegex(.*bob.*)])]])]
JanusGraphStepStrategy                [P]   [JanusGraphStep([],[~label.eq(sent-message-to)]).Or(JanusGraphStep([],[sender.textRegex(.*alice.*)]),JanusGraphStep([],[receiver.textRegex(.*alice.*)]),JanusGraphStep([],[sender.textRegex(.*bob.*)]),JanusGraphStep([],[receiver.textRegex(.*bob.*)]))]
JanusGraphIoRegistrationStrategy      [P]   [JanusGraphStep([],[~label.eq(sent-message-to)]).Or(JanusGraphStep([],[sender.textRegex(.*alice.*)]),JanusGraphStep([],[receiver.textRegex(.*alice.*)]),JanusGraphStep([],[sender.textRegex(.*bob.*)]),JanusGraphStep([],[receiver.textRegex(.*bob.*)]))]
ProfileStrategy                       [F]   [JanusGraphStep([],[~label.eq(sent-message-to)]).Or(JanusGraphStep([],[sender.textRegex(.*alice.*)]),JanusGraphStep([],[receiver.textRegex(.*alice.*)]),JanusGraphStep([],[sender.textRegex(.*bob.*)]),JanusGraphStep([],[receiver.textRegex(.*bob.*)]))]
StandardVerificationStrategy          [V]   [JanusGraphStep([],[~label.eq(sent-message-to)]).Or(JanusGraphStep([],[sender.textRegex(.*alice.*)]),JanusGraphStep([],[receiver.textRegex(.*alice.*)]),JanusGraphStep([],[sender.textRegex(.*bob.*)]),JanusGraphStep([],[receiver.textRegex(.*bob.*)]))]

Final Traversal                             [JanusGraphStep([],[~label.eq(sent-message-to)]).Or(JanusGraphStep([],[sender.textRegex(.*alice.*)]),JanusGraphStep([],[receiver.textRegex(.*alice.*)]),JanusGraphStep([],[sender.textRegex(.*bob.*)]),JanusGraphStep([],[receiver.textRegex(.*bob.*)]))]

The final (seemingly incorrect) traversal explains the results I'm getting. However, what strikes me as odd is that the JanusGraphLocalQueryOptimizerStrategy seems to return a correct traversal with two separate or steps:

[
	GraphStep(edge,[]),
	HasStep([~label.eq(sent-message-to)]),
	OrStep([
		[HasStep([sender.textRegex(.*alice.*)])],
		[HasStep([receiver.textRegex(.*alice.*)])]
	]),
	OrStep([
		[HasStep([sender.textRegex(.*bob.*)])],
		[HasStep([receiver.textRegex(.*bob.*)])]
	])
]

but the following JanusGraphStepStrategy conflates the two or steps into a single one:

[
	JanusGraphStep(
		[],[~label.eq(sent-message-to)]
	)
	.Or(
		JanusGraphStep([],[sender.textRegex(.*alice.*)]),
		JanusGraphStep([],[receiver.textRegex(.*alice.*)]),
		JanusGraphStep([],[sender.textRegex(.*bob.*)]),
		JanusGraphStep([],[receiver.textRegex(.*bob.*)])
	)
]

This merged form should not be correct, because (A or B) and (C or D) is not equivalent to (A or B or C or D).
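
To make the difference concrete, here is a minimal sketch in plain Python (not Gremlin, and not run against JanusGraph). The sample edges are made up for illustration; only the property names and the .*alice.* / .*bob.* patterns come from the query above. It assumes the regex is matched against the whole property value.

```python
import re

# Hypothetical sample edges; property names follow the thread, values are invented.
edges = [
    {"sender": "alice", "receiver": "bob"},
    {"sender": "alice", "receiver": "carol"},
    {"sender": "dave", "receiver": "bob"},
    {"sender": "carol", "receiver": "dave"},
]

def matches(edge, key, name):
    # Whole-string match against .*name.*, mirroring the pattern used in the query.
    return re.fullmatch(f".*{name}.*", edge[key]) is not None

def two_or_steps(edge):
    # (A or B) and (C or D): the traversal as written, with two chained or() steps.
    alice_involved = matches(edge, "sender", "alice") or matches(edge, "receiver", "alice")
    bob_involved = matches(edge, "sender", "bob") or matches(edge, "receiver", "bob")
    return alice_involved and bob_involved

def merged_or_step(edge):
    # A or B or C or D: the shape of the final traversal in the explain output.
    return any(
        matches(edge, key, name)
        for name in ("alice", "bob")
        for key in ("sender", "receiver")
    )

intended = [e for e in edges if two_or_steps(e)]
merged = [e for e in edges if merged_or_step(e)]

print(len(intended), len(merged))  # prints "1 3"
```

On this sample data the two chained or steps keep only the alice/bob edge, while the merged single or step also returns every edge that involves just one of the two names.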

What's more confusing is that if I replace the textRegex() predicate with the TinkerPop predicate containing(), I get the proper results, because the explain plan is different:

gremlin> g.E() \
......1>     .hasLabel('sent-message-to') \
......2>     .or( \
......3>         has('sender', containing('alice')), \
......4>         has('receiver', containing('alice')) \
......5>     ).or( \
......6>         has('sender', containing('bob')), \
......7>         has('receiver', containing('bob')) \
......8>     ).explain()
==>Traversal Explanation
=======================================================================================================================================================================================================================================================================================
Original Traversal                          [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]

ConnectiveStrategy                    [D]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
IncidentToAdjacentStrategy            [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
RepeatUnrollStrategy                  [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
MatchPredicateStrategy                [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
PathRetractionStrategy                [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
EarlyLimitStrategy                    [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
FilterRankingStrategy                 [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
InlineFilterStrategy                  [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
AdjacentToIncidentStrategy            [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
CountStrategy                         [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
LazyBarrierStrategy                   [O]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
AdjacentVertexFilterOptimizerStrategy [P]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
AdjacentVertexHasIdOptimizerStrategy  [P]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
AdjacentVertexIsOptimizerStrategy     [P]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
JanusGraphLocalQueryOptimizerStrategy [P]   [GraphStep(edge,[]), HasStep([~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
JanusGraphStepStrategy                [P]   [JanusGraphStep([],[~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
JanusGraphIoRegistrationStrategy      [P]   [JanusGraphStep([],[~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
ProfileStrategy                       [F]   [JanusGraphStep([],[~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]
StandardVerificationStrategy          [V]   [JanusGraphStep([],[~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]

Final Traversal                             [JanusGraphStep([],[~label.eq(sent-message-to)]), OrStep([[HasStep([sender.containing(alice)])], [HasStep([receiver.containing(alice)])]]), OrStep([[HasStep([sender.containing(bob)])], [HasStep([receiver.containing(bob)])]])]

and the final traversal contains two or steps:

[
	JanusGraphStep([],[~label.eq(sent-message-to)]),
	OrStep([
		[HasStep([sender.containing(alice)])],
		[HasStep([receiver.containing(alice)])]
	]),
	OrStep([
		[HasStep([sender.containing(bob)])],
		[HasStep([receiver.containing(bob)])]
	])
]

I'd like to use the underlying mixed index to fetch the results, and preferably only one index query should be performed under the hood.

Is there a way to force this query to properly use the mixed index? Why is the explain plan in these two cases different?

Kind regards,

Mladen Marović


Re: Indexing on sub-attribute of custom data type

Ronnie
 

Hi Marc,
Thanks for confirming the approach of creating an associated vertex that defines this custom data type. In that case I will not be experimenting with custom data types for now. Thanks for the details regarding the serializers for custom attributes; I am sure these will come in handy.

Thanks!
Ronnie
