Re: Bulk loading

Laura Morales <lauretas@...>
 

If I set up an empty graph with persistent storage, for example BerkeleyDB (so not an in-memory graph), can I load a GraphML/GraphSON file and have it all added to the graph?
 
 
 

Sent: Tuesday, July 20, 2021 at 2:52 PM
From: hadoopmarc@...
To: janusgraph-users@...
Subject: Re: [janusgraph-users] Bulk loading
Hi Laura,

JanusGraph support for loading data does not go further than the Traversal API (used in https://tinkerpop.apache.org/docs/current/reference/#addvertex-step ) and the JanusGraphManagement API (used in https://docs.janusgraph.org/basics/schema/ ).

A better resource than the reference docs to get you started is:
http://www.kelvinlawrence.net/book/PracticalGremlin.html

The bulk loading tips will be useful for graphs with millions of vertices and edges.

Best wishes,     Marc


Re: JanusGraph combined with Belief Propagation

hadoopmarc@...
 

Hi,

You can first try to write a custom VertexProgram for belief propagation with Apache TinkerPop. A custom VertexProgram supports the massive message passing needed for belief propagation.
If it works on TinkerPop you can use the same VertexProgram on JanusGraph (if a single TinkerPop machine does not suffice to hold your graph), but you will have the additional complexity of getting JanusGraph to work with Apache Spark.

A Google search on belief propagation VertexProgram TinkerPop does not give any relevant results.

Best wishes,    Marc


Re: Bulk loading

hadoopmarc@...
 

Hi Laura,

JanusGraph support for loading data does not go further than the Traversal API (used in https://tinkerpop.apache.org/docs/current/reference/#addvertex-step ) and the JanusGraphManagement API (used in https://docs.janusgraph.org/basics/schema/ ).

A better resource than the reference docs to get you started is:
http://www.kelvinlawrence.net/book/PracticalGremlin.html

The bulk loading tips will be useful for graphs with millions of vertices and edges.

Best wishes,     Marc


JanusGraph combined with Belief Propagation

ganeshanvinothkumar@...
 

I'm moving from an AWS Neptune architecture to JanusGraph on CentOS.

Has anyone tried implementing Belief Propagation with JanusGraph?


Re: Janus multiple subgraphs

Boxuan Li
 

Hi Laura, unfortunately, you would have to handle this in your application layer. For example, you can use different labels, properties, and indexes for different subgraphs.


Bulk loading

Laura Morales <lauretas@...>
 

I've read the "Bulk Loading" chapter of the documentation several times but I still don't understand how to create a graph. Everything that I can find online is some Java or Groovy code.
Given:
1. a graph schema (say, in JSON)
2. a bunch of data (say, in CSV)
does Janus have any tool to load this stuff into a database, or to create a new one? Without using any Java/Groovy programming, or 3rd party tools? Or am I expected to write my own Groovy scripts for parsing the CSV and creating the graph?
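
For illustration only, here is a minimal Python sketch of what "writing your own script" for the CSV part could look like. It is an assumption, not a JanusGraph tool: the helper name rows_to_add_vertex_requests, the sample columns, and the "person" label are all hypothetical. It only builds parameterized Gremlin scripts with their bindings; submitting them to a Gremlin Server with a client library is left out, and every CSV value is passed as a string.

```python
import csv
import io


def rows_to_add_vertex_requests(csv_text, label):
    """Turn each CSV row into a (gremlin_script, bindings) pair.

    Values travel as bindings rather than being pasted into the script,
    so quoting in the data cannot change the query text.
    """
    requests = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        script = "g.addV(vlabel)"
        bindings = {"vlabel": label}
        # One .property(key, value) step per CSV column.
        for i, (key, value) in enumerate(row.items()):
            script += f".property(k{i}, v{i})"
            bindings[f"k{i}"] = key
            bindings[f"v{i}"] = value  # CSV values are strings; convert types as needed
        requests.append((script, bindings))
    return requests


# Hypothetical sample data standing in for a real CSV file.
sample = "name,age\nalice,30\nbob,25\n"
requests = rows_to_add_vertex_requests(sample, "person")
for script, bindings in requests:
    print(script, bindings)
```

The schema itself (property keys, labels, indexes) would still have to be created separately, for example through the JanusGraphManagement API.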


Re: Very slow performance when opening a new session

hadoopmarc@...
 

Hi Roy,

I can confirm your observation using the standard 'bin/janusgraph.sh start' from the full janusgraph distribution.
I just used the Gremlin Console with:

:remote connect tinkerpop.server conf/remote.yaml session
:remote console
a = 3

Although in hindsight there is no logical reason for it, I checked whether the delay was due to class loading in the Gremlin Console, using:
export JAVA_OPTIONS='-verbose:class'

I can also confirm that the delay does not happen with a non-sessioned connection.
I can also confirm that the delay occurs for the gremlin server and gremlin console of the Apache TinkerPop distribution (version 3.4.8).

I guess the initial delay is due to the additional overhead of sessions as described in:
https://tinkerpop.apache.org/docs/current/reference/#sessions

Best wishes,     Marc


Janus multiple subgraphs

Laura Morales <lauretas@...>
 

My understanding is that a Janus server can host multiple graphs, but they are isolated and cannot be queried together.
I'd like to know if/how it's possible to split one single graph into multiple subgraphs such that:

- I can query only one subgraph, or the entire graph
- vertex/edge properties (and their indexes) are local to a subgraph
- subgraphs can have links from one another, so I should be able to query multiple subgraphs in the same query

I think what I'm trying to achieve is something akin to Postgres' database schemas. In Postgres, databases are independent but they can be sub-divided into multiple schemas. Each schema has its own tables, constraints, and indexes, but I can query multiple schemas at once by using their fully qualified names ("schema.table").


Very slow performance when opening a new session

Roy Reznik <reznik.roy@...>
 

I'm seeing very slow performance when opening a new session in JanusGraph.
The message I'm sending is this:
{"requestId":"02a58ee3-e4d3-11eb-bd29-04d4c4eaf347","op":"eval","processor":"session","args":{"bindings":{},"evaluationTimeout":120000,"gremlin":"g.V().limit(1).id()","language":"gremlin-groovy","rebindings":{},"session":"50052633-079b-4500-bc29-a3eacb1f0dba"}}

Basically, the inner query doesn't really matter. When I use the session processor with a new session id that's never been used it takes ~1.2s for JanusGraph to respond.
Queries afterwards, with the same session id are much quicker.
Why is the overhead of starting a new session so large? Can it be reduced somehow by configuration?

Thanks,
Roy.
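
As a rough illustration of the request shape in the message above, here is a minimal Python sketch that builds the same kind of "eval" request for the session processor. It assumes only the fields shown in the message; the helper name session_eval_request is hypothetical, and sending the JSON over a WebSocket is left out. The point is that the session id is generated once and reused, so only the first request on that id pays the start-up cost described here.

```python
import json
import uuid


def session_eval_request(gremlin, session_id, timeout_ms=120000):
    """Build an 'eval' request for the 'session' processor (fields as in the message above)."""
    return {
        "requestId": str(uuid.uuid4()),  # unique per request
        "op": "eval",
        "processor": "session",
        "args": {
            "bindings": {},
            "evaluationTimeout": timeout_ms,
            "gremlin": gremlin,
            "language": "gremlin-groovy",
            "rebindings": {},
            "session": session_id,  # same id for every request in the session
        },
    }


# Create the session id once and reuse it for later requests.
session_id = str(uuid.uuid4())
first = session_eval_request("g.V().limit(1).id()", session_id)
second = session_eval_request("g.V().limit(1).id()", session_id)
print(json.dumps(first))
```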


Re: Count Query

owner.mad.epa@...
 

mvn clean -DskipTests -Drat.skip -Pjanusgraph-release -Dgpg.skip source:jar install


Re: JanusGraph System Requirements

Peter Corless
 

I think you need to qualify this a little better.

For example, 100 GB of data is relatively small. It would easily fit in a single AWS i3.large. But what's going to be your analytical load? How many queries per second are you going to run? A few, or a lot? Just limited traversals or very broad-ranging queries? What sort of latencies are tolerable? Are you looking for millisecond-scale response times (<10ms), or are multi-second query responses acceptable?

If you want fast response then you're looking at directly-attached NVMe. If you really don't care about the latencies you can use EBS.

-Peter.

On Fri, Jul 9, 2021, 7:22 AM <csconnor257@...> wrote:
Hello, 

What would the system requirements for JanusGraph be with 100 GB of data?

Thanks


Re: Cassandra crashing after dropping large graph. Error: Scanned over 100001 tombstones...

Clement de Groc
 

Next JanusGraph release will allow tuning Cassandra's gc_grace_seconds: https://github.com/JanusGraph/janusgraph/pull/2693


Re: Query failure due to cassandra backend tombstone exception #1675

Clement de Groc
 

Next JanusGraph release will allow tuning Cassandra's gc_grace_seconds: https://github.com/JanusGraph/janusgraph/pull/2693


Re: How to filter out step 3 vertex list based on step 1 vertex

Ronnie
 

Sounds good. Will check in the gremlin-users group. Thanks Boxuan!


JanusGraph System Requirements

csconnor257@...
 

Hello, 

What would the system requirements for JanusGraph be with 100 GB of data?

Thanks


Re: How to filter out step 3 vertex list based on step 1 vertex

Boxuan Li
 

That’s a great question! To be honest I am not sure about the reason. My assumption is the __.out("edgeCB").is("B") is an anonymous child traversal within where() step, and thus it has no access to the label “B” which is defined in the outer traversal.

I am not sure if my understanding is correct. It might be a better idea to ask in the gremlin-users group.

Cheers,
Boxuan

On Fri, Jul 9, 2021 at 3:07 AM, Ronnie via lists.lfaidata.foundation <rputhukkeril=qualys.com@...> wrote:

Thanks Boxuan! I tried the single query and that worked accurately!

On the other hand, I am still trying to figure out why the where traversal, e.g. where(__.out("edgeCB").is("B")), didn't work. Is it because "B" is treated as a literal instead of a step label?

Also thanks for pointing me to the gremlin-users group.

Thanks!
Ronnie


Re: How to filter out step 3 vertex list based on step 1 vertex

Ronnie
 

Thanks Boxuan! I tried the single query and that worked accurately!

On the other hand, I am still trying to figure out why the where traversal, e.g. where(__.out("edgeCB").is("B")), didn't work. Is it because "B" is treated as a literal instead of a step label?

Also thanks for pointing me to the gremlin-users group.

Thanks!
Ronnie


Re: How to filter out step 3 vertex list based on step 1 vertex

Boxuan Li
 

Hi Ronnie,

Not sure if it's optimal but this should work:

g.V().hasLabel("VertexB").as("B").in("edgeAB").out("edgeAC").as("C").out("edgeCB").as("B2").where("B", eq("B2")).select("C")

You can also do this in two steps:

b = g.V().hasLabel("VertexB").as("B").next()
g.V(b).in("edgeAB").out("edgeAC").where(out("edgeCB").is(b))

FYI, for general gremlin query questions, you can also ask in the gremlin-users mailing list: https://groups.google.com/g/gremlin-users

Best,
Boxuan
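
To make the intent of the queries above concrete, here is a small Python simulation on a hypothetical toy graph (the vertex names and the edges dictionary are made up for illustration; this is not Gremlin and not how JanusGraph evaluates the traversal). Starting from a B vertex, it walks in("edgeAB") then out("edgeAC"), and keeps only the C vertices whose out("edgeCB") leads back to the same starting B. Unlike a Gremlin traversal, the result is a set, so duplicates are dropped.

```python
# Hypothetical toy graph: edge label -> {source vertex: set of target vertices}.
edges = {
    "edgeAB": {"a1": {"b1"}, "a2": {"b2"}},
    "edgeAC": {"a1": {"c1", "c2"}, "a2": {"c3"}},
    "edgeCB": {"c1": {"b1"}, "c2": {"b2"}, "c3": {"b2"}},
}


def out(label, v):
    """Targets reachable from v over outgoing edges with this label."""
    return edges[label].get(v, set())


def in_(label, v):
    """Sources that have an edge with this label pointing at v."""
    return {src for src, targets in edges[label].items() if v in targets}


def c_vertices_pointing_back(b):
    """C vertices reached via B <-edgeAB- A -edgeAC-> C that also have C -edgeCB-> B."""
    result = set()
    for a in in_("edgeAB", b):
        for c in out("edgeAC", a):
            # Compare against the starting vertex itself, not against a label string.
            if b in out("edgeCB", c):
                result.add(c)
    return result


print(c_vertices_pointing_back("b1"))
print(c_vertices_pointing_back("b2"))
```

In this toy data, c2 is reachable from b1 but points to b2, so it is filtered out for b1.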


Re: How to filter out step 3 vertex list based on step 1 vertex

Ronnie
 

Sorry, the Gremlin queries should be as follows:

g.V().hasLabel("VertexB").as("B").in("edgeAB").out("edgeAC").where(__.out("edgeCB").is("B"))
g.V().hasLabel("VertexB").as("B").in("edgeAB").out("edgeAC").where(__.out("edgeCB").hasId(__.select("B").id()))
