
Re: Janusgraph Hadoop Spark standalone cluster - Janusgraph job always creates constant number 513 of Spark tasks

marc.d...@...
 

Hi Dimitar,

Your Spark screenshots do not show any parallelism. You state that your Spark cluster only has a single worker. It seems that this worker also has only one core available (the spark.executor.cores property is not specified, so by default all available worker cores would be available to SparkGraphComputer). Without any parallelism, a Spark job will never be faster than an application without Spark.
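For reference, these Spark properties can be set explicitly in the graph's properties file and are passed on to the Spark context by SparkGraphComputer (just a sketch; the values are illustrative, not a recommendation):

```properties
# illustrative values; tune to the cores your workers actually have
spark.executor.cores=2
spark.cores.max=4
spark.executor.memory=1g
```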

That being said, I do not understand why a single task takes 2 seconds. Retrieving 4 rows on average from Cassandra should rather take some 40 ms, so we are two orders of magnitude away from that. Apparently each task has some unexplained setup overhead, though I cannot say for what. I would expect that the Spark worker keeps its JVM alive, and that SparkGraphComputer keeps its classes loaded and its Cassandra connection established between tasks.

What I also do not understand is why the different 23-minute jobs are scheduled with a large delay in between. Is the underlying cloud not available? Would that also mean that the vcores used by the Spark worker have very low performance? I would first try some simple Spark jobs for a test application (no JanusGraph, no Cassandra) and make sure that you have a standalone Spark cluster that behaves as expected: parallelism visible in the Executors tab of the Spark UI and no strange waiting periods between jobs of a single application.
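For example (a hedged sketch; the master URL is a placeholder), a plain spark-shell job with an explicit number of partitions should show its tasks running in parallel in the Executors tab:

$ spark-shell --master spark://spark-master:7077
scala> sc.parallelize(1 to 100000, 8).map(_ * 2).count()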

Cheers,    Marc






On Monday, October 21, 2019 at 11:23:26 AM UTC+2, Dimitar Tenev wrote:

Hi Marc,

The output of nodetool gives: Number of partitions (estimate): 967; the whole output is attached as "nodetool_log.txt".
Regarding the Spark configuration: yes, I have used the guides from the link you provided, and "janus-spark.properties" (attached) is the graph configuration which I use for "og". I have also attached the HTML pages from the Spark UI for the JanusGraph job (Stage, Job, Environment) as spark.zip. Spark is configured with one master and one worker node, and yes, the worker node output shows that the tasks are processed by it. Any help is appreciated!

Thanks,
Dimitar 

On Monday, October 21, 2019 at 10:48:00 AM UTC+3, ma...@... wrote:
Hi Dimitar,

The number 513 is probably the number of Cassandra partitions. You can inspect the number of partitions in the tables of the Cassandra cluster with:
$ nodetool tablestats <your_keyspace>

Involving SparkGraphComputer only helps for a large number of vertices (100,000+) because there is a lot of one-off overhead for instantiating the JVMs for the Spark executors. Even then, the 25 minutes you mention is excessive. Are you sure your k8s Spark cluster was used? The JanusGraph default is to use Spark local inside your JanusGraph container; see the docs for how to configure JanusGraph for a Spark standalone cluster.
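For reference, the relevant lines in the graph's properties file might look something like this (a hedged sketch; host and port are placeholders — by default spark.master is local[*], which keeps Spark inside the JanusGraph JVM):

```properties
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.janusgraph.hadoop.formats.cassandra.Cassandra3InputFormat
# point at the standalone cluster instead of the default local[*]
spark.master=spark://spark-master:7077
```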

HTH,     Marc


On Friday, October 18, 2019 at 4:19:19 PM UTC+2, dim...@... wrote:
Hello,

I have set up JanusGraph 0.4.0 with Hadoop 2.9.0 and Spark 2.4.4 in a K8s cluster.
I connect to JanusGraph from the Gremlin console and execute:
gremlin> og
==>graphtraversalsource[hadoopgraph[cassandra3inputformat->gryooutputformat], sparkgraphcomputer]
gremlin> g.V().count()
==>1889

It takes 25 min to do the count! It took the same time when there were no vertices at all (count 0). The Spark job shows that 513 tasks were run! The number of tasks is always a constant 513, no matter the number of vertices.
I have set "spark.sql.shuffle.partitions=4" in the Spark job's environment, but again the number of Spark tasks was 513! My assumption is that JanusGraph somehow specifies this number of tasks when it submits the job to Spark.
The questions are:
- Why is a JanusGraph job submitted to Spark always parallelized into 513 tasks?
- How to manage the number of tasks which are created for a JanusGraph job?
- How to minimize the execution time of an OLAP query for this small graph (the OLTP query takes less than a second to execute)?

Thanks,
Dimitar


Re: Concurrent TimeoutException on connection to gremlin server remotely

Stephen Mallette <spmal...@...>
 

That seems to be the failure as a result of a request - is that right?  I'm wondering if there is an error at server startup when the script executes that you're missing? Or do the startup logs look clean?


On Mon, Oct 21, 2019 at 2:38 PM <sarthak...@...> wrote:
Hi Stephen,
Below is the error I get for g2

```
globals << [g2:graph2.traversal(),g1:graph1.traversal()] took 894ms
08:14:59.098 [gremlin-server-exec-2] ERROR o.a.t.g.j.DefaultGremlinScriptEngineManager - Could not create GremlinScriptEngine for gremlin-groovy
java.lang.IllegalStateException: javax.script.ScriptException: javax.script.ScriptException: groovy.lang.MissingPropertyException: No such property: g2 for class: Script12902
  at org.apache.tinkerpop.gremlin.jsr223.DefaultGremlinScriptEngineManager.lambda$createGremlinScriptEngine$16(DefaultGremlinScriptEngineManager.java:464) ~[gremlin-core-3.3.3.jar:3.3.3]
  at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183) ~[na:1.8.0_222]
  at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) ~[na:1.8.0_222]
  at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382) ~[na:1.8.0_222]
  at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:647) ~[na:1.8.0_222]
  at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:272) ~[na:1.8.0_222]
  at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382) ~[na:1.8.0_222]
  at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) ~[na:1.8.0_222]
  at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) ~[na:1.8.0_222]
  at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) ~[na:1.8.0_222]
  at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173) ~[na:1.8.0_222]
  at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[na:1.8.0_222]
  at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485) ~[na:1.8.0_222]
  at org.apache.tinkerpop.gremlin.jsr223.DefaultGremlinScriptEngineManager.createGremlinScriptEngine(DefaultGremlinScriptEngineManager.java:450) ~[gremlin-core-3.3.3.jar:3.3.3]
  at org.apache.tinkerpop.gremlin.jsr223.DefaultGremlinScriptEngineManager.getEngineByName(DefaultGremlinScriptEngineManager.java:219) ~[gremlin-core-3.3.3.jar:3.3.3]
  at org.apache.tinkerpop.gremlin.jsr223.CachedGremlinScriptEngineManager.lambda$getEngineByName$0(CachedGremlinScriptEngineManager.java:57) [gremlin-core-3.3.3.jar:3.3.3]
  at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1660) ~[na:1.8.0_222]
  at org.apache.tinkerpop.gremlin.jsr223.CachedGremlinScriptEngineManager.getEngineByName(CachedGremlinScriptEngineManager.java:57) [gremlin-core-3.3.3.jar:3.3.3]
  at org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor.lambda$eval$0(GremlinExecutor.java:263) ~[gremlin-groovy-3.3.3.jar:3.3.3]
  at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_222]
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_222]
  at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_222]
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_222]
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_222]
  at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_222]
Caused by: javax.script.ScriptException: javax.script.ScriptException: groovy.lang.MissingPropertyException: No such property: g2 for class: Script12902
  at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:397) ~[gremlin-groovy-3.3.3.jar:3.3.3]
  at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:264) ~[na:1.8.0_222]
  at org.apache.tinkerpop.gremlin.jsr223.DefaultGremlinScriptEngineManager.lambda$createGremlinScriptEngine$16(DefaultGremlinScriptEngineManager.java:460) ~[gremlin-core-3.3.3.jar:3.3.3]
  ... 24 common frames omitted
Caused by: javax.script.ScriptException: groovy.lang.MissingPropertyException: No such property: g2 for class: Script12902
  at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:713) ~[gremlin-groovy-3.3.3.jar:3.3.3]
  at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:395) ~[gremlin-groovy-3.3.3.jar:3.3.3]
  ... 26 common frames omitted
Caused by: groovy.lang.MissingPropertyException: No such property: g2 for class: Script12902
  at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.unwrap(ScriptBytecodeAdapter.java:66) ~[groovy-2.4.15-indy.jar:2.4.15]
  at org.codehaus.groovy.runtime.callsite.PogoGetPropertySite.getProperty(PogoGetPropertySite.java:51) ~[groovy-2.4.15-indy.jar:2.4.15]
  at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callGroovyObjectGetProperty(AbstractCallSite.java:310) ~[groovy-2.4.15-indy.jar:2.4.15]
  at Script12902.run(Script12902.groovy:40) ~[na:na]
  at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:690) ~[gremlin-groovy-3.3.3.jar:3.3.3]
  ... 27 common frames omitted
08:14:59.099 [gremlin-server-exec-2] WARN o.a.t.g.s.h.HttpGremlinEndpointHandler - Invalid request - responding with 500 Internal Server Error and gremlin-groovy is not an available GremlinScriptEngine
java.lang.IllegalArgumentException: gremlin-groovy is not an available GremlinScriptEngine
  at org.apache.tinkerpop.gremlin.jsr223.CachedGremlinScriptEngineManager.registerLookUpInfo(CachedGremlinScriptEngineManager.java:95) ~[gremlin-core-3.3.3.jar:3.3.3]
  at org.apache.tinkerpop.gremlin.jsr223.CachedGremlinScriptEngineManager.getEngineByName(CachedGremlinScriptEngineManager.java:58) ~[gremlin-core-3.3.3.jar:3.3.3]
  at org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor.lambda$eval$0(GremlinExecutor.java:263) ~[gremlin-groovy-3.3.3.jar:3.3.3]
  at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_222]
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_222]
  at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_222]
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_222]
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_222]
  at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_222]
08:14:59.099 [gremlin-server-worker-1] DEBUG log-aggregator-encoder - [id: 0x2216578e, L:/127.0.0.1:8182 - R:/127.0.0.1:36244] WRITE: 70B
```

gremlin-server.yaml
```
graphs: {
  graph1: conf/janusgraph-hbase.properties,
  graph2: conf/janusgraph-insights-hbase.properties
}
```

--
You received this message because you are subscribed to the Google Groups "JanusGraph users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to janusgra...@....
To view this discussion on the web visit https://groups.google.com/d/msgid/janusgraph-users/4a9dd12d-22b3-46d8-b458-084266c3a493%40googlegroups.com.




Re: Concurrent TimeoutException on connection to gremlin server remotely

sarthak...@...
 

Well, I don't have the error message right now. I'll post it when I run into that scenario again, but the thing is: the Gremlin server starts, but the graph isn't initialised, i.e. g2 isn't available for querying data. And after restarting the service, with the same properties, it works fine.

So I couldn't understand this random behaviour.

Also, should the table named in `storage.hbase.table` in the properties file provided to gremlin-server.yaml already be created in HBase?

We first start the service, then insert the data into HBase (which creates the table), and then query the results.


Re: Concurrent TimeoutException on connection to gremlin server remotely

Stephen Mallette <spmal...@...>
 

I'm reading this as a JanusGraph-HBase-specific sort of question, and I'm not sure what the issue might be there. When you say "g2" doesn't get initialized, do you get an error during server startup, or is it some other kind of error?


On Mon, Oct 21, 2019 at 10:33 AM <sarthak...@...> wrote:
Ok. Got it. Thanks Stephen.

I just have another query. Maybe you can help.

So I have multiple graphs, which I have defined in my `gremlin-server.yaml`:

```
graphs: {
  graph1: conf/janusgraph-hbase.properties,
  graph2: conf/janusgraph-insights-hbase.properties
}
scriptEngines: {
  gremlin-groovy: {
    scripts: [scripts/empty-sample.groovy],
    plugins: { org.janusgraph.graphdb.tinkerpop.plugin.JanusGraphGremlinPlugin: {},
               org.apache.tinkerpop.gremlin.server.jsr223.GremlinServerGremlinPlugin: {},
               org.apache.tinkerpop.gremlin.tinkergraph.jsr223.TinkerGraphGremlinPlugin: {},
               org.apache.tinkerpop.gremlin.jsr223.ImportGremlinPlugin: {classImports: [java.lang.Math], methodImports: [java.lang.Math#*]},
               org.apache.tinkerpop.gremlin.jsr223.ScriptFileGremlinPlugin: {files: [scripts/empty-sample.groovy]}}
  }
}
```

And to initialise them at run time, I have mentioned these graphs in my empty-sample.groovy:

```
// define the default TraversalSource to bind queries to - this one will be named "g".
globals << [g2: graph2.traversal(),g1:graph1.traversal()]
```

The issue here is that sometimes this g2 doesn't get initialised at runtime, and if I restart my server again, it works. The only difference between these properties files is storage.hbase.table.

What could be the reason behind this random behaviour? 





Re: Concurrent TimeoutException on connection to gremlin server remotely

Stephen Mallette <spmal...@...>
 

Yeah....I think the documentation misled you a bit.


You perhaps took that "Default" written there very literally without considering the "Description" which describes that field as:

"The fully qualified classname of the client Channelizer that defines how to connect to the server."

So, "Channelizer.WebSocketChannelizer" is really just the class name, not the FQCN. There wasn't enough room to put that whole thing in that "Default" column. If you'd done:

connectionPool: { channelizer: org.apache.tinkerpop.gremlin.driver.Channelizer$WebSocketChannelizer }

it would be working. What isn't so good is that you didn't get an error message for that configuration problem.




On Mon, Oct 21, 2019 at 8:59 AM <sarthak...@...> wrote:
Hi Stephen,
Thanks for replying. I was able to solve this issue but I'm not certain how this was causing an error.

I'm connecting via the `remote-objects.yaml` file from my Java code. I recently added a property:
```
connectionPool: {
channelizer: Channelizer.WebSocketChannelizer
}
```

After removing this, the system is again back to normal. I got this value from Doc: http://tinkerpop.apache.org/docs/3.2.9/reference/#connecting-via-remotegraph
Topic: Connecting via Java > Configuration

I am not able to understand how this value is making system unavailable even though my gremlin-server is in a working state. It's like the request doesn't even go to gremlin-server.





Re: Concurrent TimeoutException on connection to gremlin server remotely

Stephen Mallette <spmal...@...>
 

It's hard to say what the problem could be given your description. All I can say given the information you've provided is that the driver has marked all of your hosts as "dead" for some reason. That could have happened for any number of reasons. Assuming you knew the server was running and are certain of network stability, then I guess I'd next look at server logs to see what kinds of errors were occurring just prior to the driver returning this error. 

As an aside, I see that you're using 3.3.3 for the TinkerPop driver and I don't remember exactly what conditions triggered a "dead" host back then. I'm pretty sure there have been some refinements to that decision making in more recent releases.
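Also as an aside, the driver's yaml does expose a couple of knobs around dead-host recovery; a hedged sketch (key names from the driver's connectionPool settings, host and values illustrative):

```yaml
hosts: [localhost]
port: 8182
connectionPool:
  # ms to wait for a usable connection before giving up
  maxWaitForConnection: 6000
  # ms between attempts to reconnect to a host marked dead
  reconnectInterval: 1000
```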

On Sat, Oct 19, 2019 at 5:55 AM <sarthak...@...> wrote:
Hi,
I have a Gremlin Server Running v3.3.3
I am connecting to it remotely to run my gremlin queries via Java. But recently I'm bombarded with this error

`org.apache.tinkerpop.gremlin.process.remote.RemoteConnectionException: java.lang.RuntimeException: java.lang.RuntimeException: java.util.concurrent.TimeoutException: Timed out while waiting for an available host - check the client configuration and connectivity to the server if this message persists`

Initially, when I faced this issue, I used to restart the Gremlin service and it would work again, but that doesn't solve the problem anymore. I'm not sure what the issue is here.

Here is my remote-objects.yaml file
```
hosts: [fci-graph-writer-gremlin]
port: 8182
serializer: { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry,org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV3d0] }}
connectionPool: {
  channelizer: Channelizer.WebSocketChannelizer,
  maxContentLength: 81928192
}
```

gremlin-server.yaml
```
host: 0
port: 8182
scriptEvaluationTimeout: 120000
threadPoolWorker: 4
gremlinPool: 16
channelizer: org.apache.tinkerpop.gremlin.server.channel.WebSocketChannelizer
graphs: {
  fci: conf/janusgraph-hbase.properties,
  insights: conf/janusgraph-insights-hbase.properties
}
scriptEngines: {
  gremlin-groovy: {
    plugins: { org.janusgraph.graphdb.tinkerpop.plugin.JanusGraphGremlinPlugin: {},
               org.apache.tinkerpop.gremlin.server.jsr223.GremlinServerGremlinPlugin: {},
               org.apache.tinkerpop.gremlin.tinkergraph.jsr223.TinkerGraphGremlinPlugin: {},
               org.apache.tinkerpop.gremlin.jsr223.ImportGremlinPlugin: {classImports: [java.lang.Math], methodImports: [java.lang.Math#*]},
               org.apache.tinkerpop.gremlin.jsr223.ScriptFileGremlinPlugin: {files: [scripts/empty-sample.groovy]}},
    scripts: [scripts/empty-sample.groovy], 
    staticImports: ['org.opencypher.gremlin.process.traversal.CustomPredicates.*']}}
serializers:
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry,org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV3d0] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { serializeResultToString: true }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  # Older serialization versions for backwards compatibility:
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoLiteMessageSerializerV1d0, config: {ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV2d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
processors:
  - { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }}
  - { className: org.apache.tinkerpop.gremlin.server.op.traversal.TraversalOpProcessor, config: { cacheExpirationTime: 600000, cacheMaxSize: 1000 }}
  - { className: org.opencypher.gremlin.server.op.cypher.CypherOpProcessor, config: { sessionTimeout: 28800000}}
metrics: {
  consoleReporter: {enabled: true, interval: 180000},
  csvReporter: {enabled: false, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv},
  jmxReporter: {enabled: false},
  slf4jReporter: {enabled: false, interval: 180000},
  gangliaReporter: {enabled: false, interval: 180000, addressingMode: MULTICAST},
  graphiteReporter: {enabled: false, interval: 180000}}
threadPoolBoss: 1
maxInitialLineLength: 4096
maxHeaderSize: 8192
maxChunkSize: 8192
maxContentLength: 81928192
maxAccumulationBufferComponents: 1024
resultIterationBatchSize: 64
writeBufferLowWaterMark: 32768
writeBufferHighWaterMark: 65536
ssl: {
enabled: false}
```

pom.xml
```

<dependency>
  <groupId>org.janusgraph</groupId>
  <artifactId>janusgraph-all</artifactId>
  <version>0.3.1</version>
</dependency>
<dependency>
  <groupId>org.apache.tinkerpop</groupId>
  <artifactId>gremlin-driver</artifactId>
  <version>3.3.3</version>
</dependency>
<dependency>
  <groupId>org.apache.tinkerpop</groupId>
  <artifactId>tinkergraph-gremlin</artifactId>
  <version>3.3.3</version>
</dependency>
```

I'm really stuck here. Any help is appreciated. Thanks!!



Re: Janusgraph Hadoop Spark standalone cluster - Janusgraph job always creates constant number 513 of Spark tasks

Dimitar Tenev <dimitar....@...>
 

Hi Marc,

The output of nodetool gives: Number of partitions (estimate): 967 the whole output is attached as "nodetool_log.txt". 
Regarding the Spark configuration - Yes I have used the guides from the link you have provided, and "janus-spark.properties" (attached) is the graph configuration which I use for "og". I have also attached the html pages from Spark UI for the janusgraph job (Stage, Job, Environment) as spark.zip. Spark is configured with one master and one worker node, and yes the worker node output shows that the tasks are processed by it. Any help is appreciated!

Thanks,
Dimitar 

On Monday, October 21, 2019 at 10:48:00 AM UTC+3, ma...@... wrote:
Hi Dimitar,

The number 513 is probably the number of Cassandra partitions. You can inspect the number of partitions in the tables of the Cassandra cluster with:
$ nodetool tablestats <your_keyspace>

Involving SparkGraphComputer only helps for a large number of vertices (100.000+) because there is a lot of one-off overhead for instantiating the JVM's for the Spark executors. Even then, the 25 minutes you mention is excessive. Are you sure your k8s spark cluster was used? The janusgraph default is to use spark local inside your janusgraph container, see the docs for how to configure JanusGraph for a Spark standalone cluster.

HTH,     Marc


Op vrijdag 18 oktober 2019 16:19:19 UTC+2 schreef dim...@...:
Hello,

I have setup Janusgraph 0.4.0 with Hadoop 2.9.0 and Spark 2.4.4 in a K8s cluster.
I connect to Janusgraph from gremlin console and execute: 
gremlin> og
==>graphtraversalsource[hadoopgraph[cassandra3inputformat->gryooutputformat], sparkgraphcomputer]
gremlin> g.V().count()
==>1889

It takes 25 minutes to do the count! It took the same time when there were no vertices at all (count 0). The Spark job shows that 513 tasks were run, and the number of tasks is always 513 regardless of the number of vertices.
I have set "spark.sql.shuffle.partitions=4" in the Spark job's environment, but the number of Spark tasks was still 513. My assumption is that JanusGraph somehow specifies this number of tasks when it submits the job to Spark.
The questions are:
- Why is a JanusGraph job submitted to Spark always parallelized into 513 tasks?
- How can I manage the number of tasks created for a JanusGraph job?
- How can I minimize the execution time of an OLAP query on this small graph (the equivalent OLTP query takes less than a second)?

Thanks,
Dimitar


can't remove an already DISABLED index and can't rebuild index with the same propertykey name

gaming CG <17600...@...>
 

In my work I disabled a composite index in order to remove it. I have confirmed that the index status is DISABLED, but it cannot be removed by a SchemaAction.REMOVE_INDEX operation. When I check the index, it still shows up in the management interface as DISABLED (the official documentation says this is normal), but I cannot rebuild an index with the same property key name.


Re: Janusgraph Hadoop Spark standalone cluster - Janusgraph job always creates constant number 513 of Spark tasks

marc.d...@...
 

Hi Dimitar,

The number 513 is probably the number of Cassandra partitions. You can inspect the number of partitions in the tables of the Cassandra cluster with:
$ nodetool tablestats <your_keyspace>

Involving SparkGraphComputer only helps for a large number of vertices (100.000+) because there is a lot of one-off overhead for instantiating the JVM's for the Spark executors. Even then, the 25 minutes you mention is excessive. Are you sure your k8s spark cluster was used? The janusgraph default is to use spark local inside your janusgraph container, see the docs for how to configure JanusGraph for a Spark standalone cluster.
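For completeness, a minimal sketch of what pointing JanusGraph at a standalone Spark master (rather than spark local) looks like in the HadoopGraph properties file. The exact keys should be checked against the JanusGraph OLAP docs for your version, and the master URL and memory setting here are placeholders:

```
# hadoop-graph properties (sketch; values are placeholders, verify keys against the docs)
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.janusgraph.hadoop.formats.cassandra.Cassandra3InputFormat
spark.master=spark://spark-master:7077
spark.executor.memory=1g
spark.serializer=org.apache.spark.serializer.KryoSerializer
```

If spark.master is left unset, SparkGraphComputer falls back to running Spark locally inside the JanusGraph JVM, which is easy to mistake for a cluster run.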

HTH,     Marc


Op vrijdag 18 oktober 2019 16:19:19 UTC+2 schreef dim...@...:

Hello,

I have setup Janusgraph 0.4.0 with Hadoop 2.9.0 and Spark 2.4.4 in a K8s cluster.
I connect to Janusgraph from gremlin console and execute: 
gremlin> og
==>graphtraversalsource[hadoopgraph[cassandra3inputformat->gryooutputformat], sparkgraphcomputer]
gremlin> g.V().count()
==>1889

It takes 25 minutes to do the count! It took the same time when there were no vertices at all (count 0). The Spark job shows that 513 tasks were run, and the number of tasks is always 513 regardless of the number of vertices.
I have set "spark.sql.shuffle.partitions=4" in the Spark job's environment, but the number of Spark tasks was still 513. My assumption is that JanusGraph somehow specifies this number of tasks when it submits the job to Spark.
The questions are:
- Why is a JanusGraph job submitted to Spark always parallelized into 513 tasks?
- How can I manage the number of tasks created for a JanusGraph job?
- How can I minimize the execution time of an OLAP query on this small graph (the equivalent OLTP query takes less than a second)?

Thanks,
Dimitar


JanusGraph REINDEXING with MapReduce job

Debasish Kanhar <d.k...@...>
 

Thanks @pluradj for helping out. The problem we were facing with JanusGraph is that the data in our system was loaded one time, as a batch job, quite early on. We have a dataset with around 1.8 billion nodes. The data loading was pretty slow (when it was first done) and made use of only a selective set of composite indexes at the time. Thus, when we create microservices around multiple queries, not all of them are quick enough; their response times are huge and unacceptable.
Upon profiling, we can see that the queries didn't make use of a proper index: as we understand it, steps like hasNot() or has(property, null) do not use index lookups, making the responses slower. A way to avoid that is to default missing properties to a sentinel value. For example, instead of has(property, null), you can run a traversal that replaces null with a default value, like g.V().has("nodelabel", "vertex").as("a").property("property", coalesce(select("a").values("property"), constant(-999))).iterate(), and then query with g.V().has("property", -999); such queries make use of the index and reduce the traversal time as well.
Sometimes, if we follow such a process, we need to create additional properties that weren't defined as part of the initial schema. For example, if we want to query the vertices that do not have a particular type of edge, the query falls back to the issue mentioned above and won't be fast. The workaround is to create a property edge_count_on_vertex and store the edge count in it. If it is 0, the vertex doesn't have the edge, which is the same as not(__.bothE("edgeLabel")) in a traversal.
Since this requires creating a new property, the new property needs to be reindexed as well. Ideally we want the schema defined a priori and to avoid reindexing as much as possible, but in our case that wasn't possible, so we had to reindex our data. The issue we faced was related to reindexing, and the above is the background to the problem.
The following is a snippet of the conversation between me and @pluradj:
So, reindexing is an expensive job. Our data size is around 1.8 billion vertices (this is for the Lighthouse project as well). I know the traditional way of reindexing using the following steps:
CREATE INDEX:
// Create an index
mgmt = graph.openManagement()

deletedOn = mgmt.getPropertyKey("deletedOn")
expirationDate = mgmt.getPropertyKey("expirationDate")
vertexLabel = mgmt.getPropertyKey("vertexLabel")
videoMixedIndex = mgmt.buildIndex('byVideoCombo1Mixed_2', Vertex.class).addKey(deletedOn).addKey(expirationDate).addKey(vertexLabel).buildMixedIndex("search")
mgmt.commit()

graph.tx().rollback()

// REINDEX
//Wait for the index to become available
ManagementSystem.awaitGraphIndexStatus(graph, 'byVideoCombo1Mixed_2').call()
//Reindex the existing data
mgmt = graph.openManagement()
mgmt.updateIndex(mgmt.getGraphIndex("byVideoCombo1Mixed_2"), SchemaAction.REINDEX).get()
mgmt.commit()
[1:24 PM]
I think this uses JanusGraphManagement to do the reindexing on a single machine (per the docs it spawns a single-machine OLAP job), so as expected this will be really slow for the scale of data we are talking about.
[1:25 PM]
I think there is also a way to reindex data using a MapReduce job, right? How do we do that? I think this was part of the newer versions. Per the docs (https://docs.janusgraph.org/index-management/index-reindexing/) we can do the following:
mgmt = graph.openManagement()
mr = new MapReduceIndexManagement(graph)
mr.updateIndex(mgmt.getRelationIndex(mgmt.getRelationType("battled"), "battlesByTime"), SchemaAction.REINDEX).get()
mgmt.commit()
[1:27 PM]
But the Gremlin Console throws an exception when I run
```
mr = new MapReduceIndexManagement(graph)
```
I'm using `JanusGraph 0.3.2`:
```
gremlin> mr = new MapReduceIndexManagement(graph)
groovysh_evaluate: 3: unable to resolve class MapReduceIndexManagement
```
JASON:
You need to install the Hadoop-Gremlin plugin into the Gremlin Console: http://tinkerpop.apache.org/docs/3.3.3/reference/#_installing_hadoop_gremlin
:plugin use tinkerpop.hadoop
Well, using that didn't optimize the reindexing time much, but there was a certain reduction in the reindexing step. Also, as a workaround, we restricted the index to a vertexLabel to reduce the scope of reindexing and make it finish in a tangible time.
Hope this helps someone else who needs help with reindexing.


Re: Upgraded Janus Version to 0.4 and tinkerpop gremlin-server/console version to 3.4.1

Oleksandr Porunov <alexand...@...>
 

First of all, version `0.4.0` doesn't support ES version 7. Version 0.5.0 supports ES 7, but it isn't released yet. You may build your own jars from the source code to support ES version 7.
Also, I see that you are upgrading directly from "0.2.0" to "0.4.0". I personally didn't test this and would suggest upgrading to version "0.3.0" first and then to "0.4.0".
See these upgrade instructions for 0.3.0: https://docs.janusgraph.org/changelog#upgrade-instructions_1
See these upgrade instructions for 0.4.0: https://docs.janusgraph.org/changelog#upgrade-instructions
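As a sketch of what those upgrade instructions involve (the exact property and procedure are described in the changelog; treat this as an assumption to verify against your JanusGraph version), the graph properties file temporarily allows JanusGraph to bump the storage version recorded in the backend:

```
# janusgraph.properties - temporarily allow the storage version upgrade
# (remove this setting again once the upgrade has completed)
graph.allow-upgrade=true
```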

On Friday, October 18, 2019 at 5:19:55 PM UTC+3, Baskar Vangili wrote:
I have upgraded the JanusGraph version to 0.4.0 and the TinkerPop version to 3.4.1.
Index Backend ES version: 7.3.2
Cassandra Version: 3.3.0

After the upgrade, I am getting this error. Any idea what's happening here? 

"error":"com.google.common.util.concurrent.UncheckedExecutionException: org.janusgraph.core.JanusGraphException: StorageBackend version is incompatible with current JanusGraph version: storage [0.4.0] vs. runtime [0.2.0]\n\tat com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2203)\n\tat com.google.common.cache.LocalCache.get(LocalCache.java:3937)\n\tat com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3941)\n\tat com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4824)




Re: janusgraph 0.5.0 release date

Oleksandr Porunov <alexand...@...>
 

You can build your own jars from the source code (the master branch is the 0.5.0 branch): "mvn clean install -DskipTests=true". Then you can use the necessary jars in your project; just don't include the CI jars.
Also, "vavr" and "high-scale-lib" may be necessary.
For example, if you put the necessary jars in a "libs" folder in your project and you are using Gradle, then you can use the following dependencies:
compile group: 'io.vavr', name: 'vavr', version: '0.9.2'
compile group: 'com.github.stephenc.high-scale-lib', name: 'high-scale-lib', version: '1.1.1'
compile fileTree(dir: 'libs', include: ['*.jar'])

On Friday, October 18, 2019 at 5:19:55 PM UTC+3, Baskar Vangili wrote:
We are using Elasticsearch 7.3.2 as the index backend. The latest JanusGraph version, 0.4.0, doesn't support it, but 0.5.0 does. When is the release date for 0.5.0? If it is far off, is there any workaround I can make in 0.4.0 to support ES 7.3.2?


Concurrent TimeoutException on connection to gremlin server remotely

sarthak...@...
 

Hi,
I have a Gremlin Server Running v3.3.3
I am connecting to it remotely to run my Gremlin queries via Java, but recently I'm bombarded with this error:

`org.apache.tinkerpop.gremlin.process.remote.RemoteConnectionException: java.lang.RuntimeException: java.lang.RuntimeException: java.util.concurrent.TimeoutException: Timed out while waiting for an available host - check the client configuration and connectivity to the server if this message persists`

Initially, when I faced this issue, I would restart the Gremlin service and it would work again, but that doesn't solve the problem anymore. I'm not sure what the issue is here.

Here is my remote-objects.yaml file
```
hosts: [fci-graph-writer-gremlin]
port: 8182
serializer: { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry,org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV3d0] }}
connectionPool: {
  channelizer: Channelizer.WebSocketChannelizer,
  maxContentLength: 81928192
}
```
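One thing that sometimes helps with "Timed out while waiting for an available host" is giving the driver explicit pool and reconnect settings, so a briefly unreachable host is retried rather than left marked dead. As a sketch only (these key names come from the standard gremlin-driver connection pool settings; verify them against the TinkerPop 3.3.x driver reference before relying on them, and the values below are placeholders):

```
connectionPool: {
  channelizer: Channelizer.WebSocketChannelizer,
  maxContentLength: 81928192,
  maxWaitForConnection: 6000,   # ms to wait for a connection from the pool
  minSize: 2,                   # connections kept open per host
  maxSize: 8,                   # upper bound of connections per host
  reconnectInterval: 1000       # ms between attempts to revive a dead host
}
```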

gremlin-server.yaml
```
host: 0
port: 8182
scriptEvaluationTimeout: 120000
threadPoolWorker: 4
gremlinPool: 16
channelizer: org.apache.tinkerpop.gremlin.server.channel.WebSocketChannelizer
graphs: {
  fci: conf/janusgraph-hbase.properties,
  insights: conf/janusgraph-insights-hbase.properties
}
scriptEngines: {
  gremlin-groovy: {
    plugins: { org.janusgraph.graphdb.tinkerpop.plugin.JanusGraphGremlinPlugin: {},
               org.apache.tinkerpop.gremlin.server.jsr223.GremlinServerGremlinPlugin: {},
               org.apache.tinkerpop.gremlin.tinkergraph.jsr223.TinkerGraphGremlinPlugin: {},
               org.apache.tinkerpop.gremlin.jsr223.ImportGremlinPlugin: {classImports: [java.lang.Math], methodImports: [java.lang.Math#*]},
               org.apache.tinkerpop.gremlin.jsr223.ScriptFileGremlinPlugin: {files: [scripts/empty-sample.groovy]}},
    scripts: [scripts/empty-sample.groovy], 
    staticImports: ['org.opencypher.gremlin.process.traversal.CustomPredicates.*']}}
serializers:
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry,org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV3d0] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { serializeResultToString: true }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  # Older serialization versions for backwards compatibility:
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoLiteMessageSerializerV1d0, config: {ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV2d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
processors:
  - { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }}
  - { className: org.apache.tinkerpop.gremlin.server.op.traversal.TraversalOpProcessor, config: { cacheExpirationTime: 600000, cacheMaxSize: 1000 }}
  - { className: org.opencypher.gremlin.server.op.cypher.CypherOpProcessor, config: { sessionTimeout: 28800000}}
metrics: {
  consoleReporter: {enabled: true, interval: 180000},
  csvReporter: {enabled: false, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv},
  jmxReporter: {enabled: false},
  slf4jReporter: {enabled: false, interval: 180000},
  gangliaReporter: {enabled: false, interval: 180000, addressingMode: MULTICAST},
  graphiteReporter: {enabled: false, interval: 180000}}
threadPoolBoss: 1
maxInitialLineLength: 4096
maxHeaderSize: 8192
maxChunkSize: 8192
maxContentLength: 81928192
maxAccumulationBufferComponents: 1024
resultIterationBatchSize: 64
writeBufferLowWaterMark: 32768
writeBufferHighWaterMark: 65536
ssl: {
enabled: false}
```

pom.xml
```

<dependency>

<groupId>org.janusgraph</groupId>

<artifactId>janusgraph-all</artifactId>

<version>0.3.1</version>

</dependency>

<dependency>

<groupId>org.apache.tinkerpop</groupId>

<artifactId>gremlin-driver</artifactId>

<version>3.3.3</version>

</dependency>

<dependency>

<groupId>org.apache.tinkerpop</groupId>

<artifactId>tinkergraph-gremlin</artifactId>

<version>3.3.3</version>

</dependency>
```

I'm really stuck here. Any help is appreciated. Thanks!!


Option storage.transactions does not work

nicolas...@...
 

Hello,
For my tests, I want to use embedded JanusGraph with inmemory storage but without transaction support. I found the option storage.transactions in https://docs.janusgraph.org/v0.3/basics/configuration-reference/ but it seems to have no effect. Should this option be compatible with inmemory storage?

Note: I do not want transactions because, in production, I use a remote connection and thus auto-commit mode is used.
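For reference, the configuration being described is roughly the following (a sketch; property names taken from the configuration reference linked above, behavior with the inmemory backend is exactly what is in question here):

```
# embedded in-memory graph, attempting to disable transaction support
storage.backend=inmemory
storage.transactions=false
```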

Regards,
Nicolas


Janusgraph Hadoop Spark standalone cluster - Janusgraph job always creates constant number 513 of Spark tasks

dimitar....@...
 

Hello,

I have setup Janusgraph 0.4.0 with Hadoop 2.9.0 and Spark 2.4.4 in a K8s cluster.
I connect to Janusgraph from gremlin console and execute: 
gremlin> og
==>graphtraversalsource[hadoopgraph[cassandra3inputformat->gryooutputformat], sparkgraphcomputer]
gremlin> g.V().count()
==>1889

It takes 25 minutes to do the count! It took the same time when there were no vertices at all (count 0). The Spark job shows that 513 tasks were run, and the number of tasks is always 513 regardless of the number of vertices.
I have set "spark.sql.shuffle.partitions=4" in the Spark job's environment, but the number of Spark tasks was still 513. My assumption is that JanusGraph somehow specifies this number of tasks when it submits the job to Spark.
The questions are:
- Why is a JanusGraph job submitted to Spark always parallelized into 513 tasks?
- How can I manage the number of tasks created for a JanusGraph job?
- How can I minimize the execution time of an OLAP query on this small graph (the equivalent OLTP query takes less than a second)?

Thanks,
Dimitar
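On the task count question: with a Cassandra input format, the number of Spark tasks generally follows the number of input splits of the Cassandra tables, not spark.sql.shuffle.partitions (which only affects Spark SQL shuffles); 513 is suspiciously close to 2 nodes x 256 vnodes + 1, which suggests the tasks track Cassandra token ranges. As a sketch, and assuming the Hadoop Cassandra input format honors this property in the JanusGraph setup (verify against your JanusGraph/Hadoop versions and the Cassandra ConfigHelper docs), the split size could be raised in the HadoopGraph properties file to get fewer, larger splits:

```
# janus-spark.properties (hypothetical fragment)
# larger splits -> fewer Spark tasks; the value is a placeholder
cassandra.input.split.size=1024
```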


janusgraph 0.5.0 release date

Baskar Vangili <vanb...@...>
 

We are using Elasticsearch 7.3.2 as the index backend. The latest JanusGraph version, 0.4.0, doesn't support it, but 0.5.0 does. When is the release date for 0.5.0? If it is far off, is there any workaround I can make in 0.4.0 to support ES 7.3.2?


Upgraded Janus Version to 0.4 and tinkerpop gremlin-server/console version to 3.4.1

Baskar Vangili <vanb...@...>
 

I have upgraded the JanusGraph version to 0.4.0 and the TinkerPop version to 3.4.1.
Index Backend ES version: 7.3.2
Cassandra Version: 3.3.0

After the upgrade, I am getting this error. Any idea what's happening here? 

"error":"com.google.common.util.concurrent.UncheckedExecutionException: org.janusgraph.core.JanusGraphException: StorageBackend version is incompatible with current JanusGraph version: storage [0.4.0] vs. runtime [0.2.0]\n\tat com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2203)\n\tat com.google.common.cache.LocalCache.get(LocalCache.java:3937)\n\tat com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3941)\n\tat com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4824)


