
Re: [DISCUSS] JanusGraph 0.2.3 release

writetoph1lm <writet...@...>
 

Hi All, 

We're currently using JG 0.2.2 in production and would like to push out changes to the next 0.2.x release.
CLA is signed and we're good to go. 
If everybody is OK with that, I can open a PR against the 0.2 branch and then port the changes to master in a separate PR.

Thanks

On Tuesday, April 23, 2019 at 4:40:08 PM UTC-4, Chris Hupman wrote:

Hi all,

This morning I went through and tagged all the issues and pull requests that I feel belong in the 0.2.3 release. Currently there are only 2 open pull requests targeting the 0.2 branch. I can't think of anything else that needs to be in the final 0.2.x release, but would love some additional input about whether or not anything else should be included. Currently I'm targeting May 22nd for the release date.

I also added issues to the milestone for my stretch goal of generating rpm and deb packages as release artifacts.  

Here is a link to the release milestone. Please provide any and all feedback.

Chris


Ports/Backports

writet...@...
 

Hi JanusGraph Team, 

We've just become contributors and would like to push our changes to the public repository.
The thing is, all our changes are based on the 0.2 version, which is the version we use in production.
What would be the preferred way to review and (back)port the changes?
- Start from the 0.2 branch and then port to master?
- (Vice versa) Start from master and then backport to 0.2?

Thanks


Re: Improve JanusGraph main page

Oleksandr Porunov <alexand...@...>
 

Asking for the review of this PR: https://github.com/JanusGraph/janusgraph.org/pull/64


Re: [DISCUSS] JanusGraph 0.2.3 release

Chris Hupman <chris...@...>
 

If anyone has cycles I would really like another set of eyes on PR #1479. I pretty much completely rewrote gremlin-server.sh so that it is fully backwards compatible, but additionally supports start, stop, and being run as a service under SysV init and systemd. 

I've done testing on CentOS 6 and 7 as well as Ubuntu 18 to cover both systemd and SysV init. Even if it's just downloading the gremlin-server.sh script and making sure it works in place the way you're used to, I would greatly appreciate it. 

I started to work on creating a noarch rpm using the changes made in this PR. I'm really looking forward to replacing a ton of the setup in the Dockerfile with a single apt install. 


On Sunday, May 5, 2019 at 7:09:45 AM UTC-7, Oleksandr Porunov wrote:
Thank you Chris for starting this discussion. The milestone and release date look good to me. I am back from vacation, so I will be able to review and test the release artifacts when they are ready.

Oleksandr

On Tuesday, April 23, 2019 at 11:40:08 PM UTC+3, Chris Hupman wrote:
Hi all,

This morning I went through and tagged all the issues and pull requests that I feel belong in the 0.2.3 release. Currently there are only 2 open pull requests targeting the 0.2 branch. I can't think of anything else that needs to be in the final 0.2.x release, but would love some additional input about whether or not anything else should be included. Currently I'm targeting May 22nd for the release date.

I also added issues to the milestone for my stretch goal of generating rpm and deb packages as release artifacts.  

Here is a link to the release milestone. Please provide any and all feedback.

Chris


Re: [DISCUSS] JanusGraph 0.2.3 release

Oleksandr Porunov <alexand...@...>
 

Thank you Chris for starting this discussion. The milestone and release date look good to me. I am back from vacation, so I will be able to review and test the release artifacts when they are ready.

Oleksandr


On Tuesday, April 23, 2019 at 11:40:08 PM UTC+3, Chris Hupman wrote:
Hi all,

This morning I went through and tagged all the issues and pull requests that I feel belong in the 0.2.3 release. Currently there are only 2 open pull requests targeting the 0.2 branch. I can't think of anything else that needs to be in the final 0.2.x release, but would love some additional input about whether or not anything else should be included. Currently I'm targeting May 22nd for the release date.

I also added issues to the milestone for my stretch goal of generating rpm and deb packages as release artifacts.  

Here is a link to the release milestone. Please provide any and all feedback.

Chris


Re: CQL for OLAP issue with Scylla as backend in both Local and Yarn Mode

Chris Hupman <chris...@...>
 

Hello,

This channel is really just to discuss matters relevant to the development of JanusGraph. We mainly discuss things like proposals for new features, release planning, and general administration of the codebase. You'll have much better luck posting this in janusgraph-users.

Cheers,

Chris


On Thursday, May 2, 2019 at 4:37:00 AM UTC-7, rak...@... wrote:
Hi All,

I am unable to run any analytics (OLAP) on JanusGraph with Scylla as the backend.
I tried both Local and Yarn mode on an AWS EMR cluster:
  • In Yarn mode, it throws an exception: 10:07:58 ERROR org.apache.spark.SparkContext - Error initializing SparkContext.
  • In Local mode, it runs about 500 tasks without errors but gives an empty output (I tried this with SparkGraphComputer and it gives the result).
I built the distribution archive from here (from branch Issue_985_spark_via_cql)

Following are the properties given in conf/hadoop-graph/read-cql.properties:

# Copyright 2019 JanusGraph Authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

#
# Hadoop Graph Configuration
#
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.janusgraph.hadoop.formats.cql.CqlInputFormat
gremlin.hadoop.graphWriter=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat
gremlin.hadoop.defaultGraphComputer=org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer

gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.inputLocation=none
gremlin.hadoop.outputLocation=output
gremlin.spark.persistContext=true
#
# JanusGraph Cassandra InputFormat configuration
#
janusgraphmr.ioformat.conf.storage.backend=cql
janusgraphmr.ioformat.conf.storage.hostname=X.0.X.1
janusgraphmr.ioformat.conf.storage.port=9042
janusgraphmr.ioformat.conf.storage.cassandra.keyspace=graph1
storage.cassandra.keyspace=graph1

#
# Apache Cassandra InputFormat configuration
#
cassandra.input.partitioner.class=org.apache.cassandra.dht.Murmur3Partitioner

#
# SparkGraphComputer Configuration
#
#spark.master=spark://X.X.X.X:7077
spark.master=yarn
spark.submit.deployMode=client
spark.yarn.jars=/usr/lib/spark/jars/
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=org.janusgraph.hadoop.serialize.JanusGraphKryoRegistrator




Full stack error while running in yarn mode:

java.lang.IllegalStateException: org.apache.spark.SparkException: Unable to load YARN support
at org.apache.tinkerpop.gremlin.process.computer.traversal.step.map.VertexProgramStep.processNextStart(VertexProgramStep.java:88)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.next(ExpandableStepIterator.java:50)
at org.apache.tinkerpop.gremlin.process.computer.traversal.step.map.ComputerResultStep.processNextStart(ComputerResultStep.java:68)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.hasNext(DefaultTraversal.java:192)
at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:236)
at org.apache.tinkerpop.gremlin.console.Console$_closure3.doCall(Console.groovy:214)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:98)
at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:264)
at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1034)
at org.codehaus.groovy.tools.shell.Groovysh.setLastResult(Groovysh.groovy:460)
at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:236)
at org.codehaus.groovy.tools.shell.Groovysh.execute(Groovysh.groovy:196)
at org.apache.tinkerpop.gremlin.console.GremlinGroovysh.super$3$execute(GremlinGroovysh.groovy)
at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:98)
at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1225)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:145)
at org.apache.tinkerpop.gremlin.console.GremlinGroovysh.execute(GremlinGroovysh.groovy:72)
at org.codehaus.groovy.tools.shell.Shell.leftShift(Shell.groovy:122)
at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$work(InteractiveShellRunner.groovy)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:98)
at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1225)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:145)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:165)
at org.codehaus.groovy.tools.shell.ShellRunner.run(ShellRunner.groovy:59)
at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$run(InteractiveShellRunner.groovy)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:98)
at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1225)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:145)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:165)
at org.codehaus.groovy.tools.shell.InteractiveShellRunner.run(InteractiveShellRunner.groovy:89)
at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:236)
at org.apache.tinkerpop.gremlin.console.Console.<init>(Console.groovy:146)
at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:236)
at org.apache.tinkerpop.gremlin.console.Console.main(Console.groovy:453)
Caused by: java.util.concurrent.ExecutionException: org.apache.spark.SparkException: Unable to load YARN support
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at org.apache.tinkerpop.gremlin.process.computer.traversal.step.map.VertexProgramStep.processNextStart(VertexProgramStep.java:68)
... 56 more
Caused by: org.apache.spark.SparkException: Unable to load YARN support
at org.apache.spark.deploy.SparkHadoopUtil$.liftedTree1$1(SparkHadoopUtil.scala:405)
at org.apache.spark.deploy.SparkHadoopUtil$.yarn$lzycompute(SparkHadoopUtil.scala:400)
at org.apache.spark.deploy.SparkHadoopUtil$.yarn(SparkHadoopUtil.scala:400)
at org.apache.spark.deploy.SparkHadoopUtil$.get(SparkHadoopUtil.scala:425)
at org.apache.spark.util.Utils$.getSparkOrYarnConfig(Utils.scala:2387)
at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:156)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:351)
at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:175)
at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:257)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:432)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2509)
at org.apache.spark.SparkContext.getOrCreate(SparkContext.scala)
at org.apache.tinkerpop.gremlin.spark.structure.Spark.create(Spark.java:52)
at org.apache.tinkerpop.gremlin.spark.structure.Spark.create(Spark.java:60)
at org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer.lambda$submitWithExecutor$1(SparkGraphComputer.java:233)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.deploy.yarn.YarnSparkHadoopUtil
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.util.Utils$.classForName(Utils.scala:230)
at org.apache.spark.deploy.SparkHadoopUtil$.liftedTree1$1(SparkHadoopUtil.scala:401)
... 18 more


Is there anything required on the classpath, or any required jars? Also, what is the problem with local mode?
Do we have any alternative for this purpose (analytics on JanusGraph using Spark)? Currently I am running connected components using GraphFrames.

Your help is appreciated, thanks in advance :)




CQL for OLAP issue with Scylla as backend in both Local and Yarn Mode

rakesh...@...
 

Hi All,

I am unable to run any analytics (OLAP) on JanusGraph with Scylla as the backend.
I tried both Local and Yarn mode on an AWS EMR cluster:
  • In Yarn mode, it throws an exception: 10:07:58 ERROR org.apache.spark.SparkContext - Error initializing SparkContext.
  • In Local mode, it runs about 500 tasks without errors but gives an empty output (I tried this with SparkGraphComputer and it gives the result).
I built the distribution archive from here (from branch Issue_985_spark_via_cql)

Following are the properties given in conf/hadoop-graph/read-cql.properties:

# Copyright 2019 JanusGraph Authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

#
# Hadoop Graph Configuration
#
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.janusgraph.hadoop.formats.cql.CqlInputFormat
gremlin.hadoop.graphWriter=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat
gremlin.hadoop.defaultGraphComputer=org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer

gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.inputLocation=none
gremlin.hadoop.outputLocation=output
gremlin.spark.persistContext=true
#
# JanusGraph Cassandra InputFormat configuration
#
janusgraphmr.ioformat.conf.storage.backend=cql
janusgraphmr.ioformat.conf.storage.hostname=X.0.X.1
janusgraphmr.ioformat.conf.storage.port=9042
janusgraphmr.ioformat.conf.storage.cassandra.keyspace=graph1
storage.cassandra.keyspace=graph1

#
# Apache Cassandra InputFormat configuration
#
cassandra.input.partitioner.class=org.apache.cassandra.dht.Murmur3Partitioner

#
# SparkGraphComputer Configuration
#
#spark.master=spark://X.X.X.X:7077
spark.master=yarn
spark.submit.deployMode=client
spark.yarn.jars=/usr/lib/spark/jars/
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=org.janusgraph.hadoop.serialize.JanusGraphKryoRegistrator




Full stack error while running in yarn mode:

java.lang.IllegalStateException: org.apache.spark.SparkException: Unable to load YARN support
at org.apache.tinkerpop.gremlin.process.computer.traversal.step.map.VertexProgramStep.processNextStart(VertexProgramStep.java:88)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.next(ExpandableStepIterator.java:50)
at org.apache.tinkerpop.gremlin.process.computer.traversal.step.map.ComputerResultStep.processNextStart(ComputerResultStep.java:68)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.hasNext(DefaultTraversal.java:192)
at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:236)
at org.apache.tinkerpop.gremlin.console.Console$_closure3.doCall(Console.groovy:214)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:98)
at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:264)
at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1034)
at org.codehaus.groovy.tools.shell.Groovysh.setLastResult(Groovysh.groovy:460)
at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:236)
at org.codehaus.groovy.tools.shell.Groovysh.execute(Groovysh.groovy:196)
at org.apache.tinkerpop.gremlin.console.GremlinGroovysh.super$3$execute(GremlinGroovysh.groovy)
at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:98)
at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1225)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:145)
at org.apache.tinkerpop.gremlin.console.GremlinGroovysh.execute(GremlinGroovysh.groovy:72)
at org.codehaus.groovy.tools.shell.Shell.leftShift(Shell.groovy:122)
at org.codehaus.groovy.tools.shell.ShellRunner.work(ShellRunner.groovy:95)
at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$work(InteractiveShellRunner.groovy)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:98)
at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1225)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:145)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:165)
at org.codehaus.groovy.tools.shell.InteractiveShellRunner.work(InteractiveShellRunner.groovy:130)
at org.codehaus.groovy.tools.shell.ShellRunner.run(ShellRunner.groovy:59)
at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$run(InteractiveShellRunner.groovy)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:98)
at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1225)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:145)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:165)
at org.codehaus.groovy.tools.shell.InteractiveShellRunner.run(InteractiveShellRunner.groovy:89)
at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:236)
at org.apache.tinkerpop.gremlin.console.Console.<init>(Console.groovy:146)
at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:236)
at org.apache.tinkerpop.gremlin.console.Console.main(Console.groovy:453)
Caused by: java.util.concurrent.ExecutionException: org.apache.spark.SparkException: Unable to load YARN support
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at org.apache.tinkerpop.gremlin.process.computer.traversal.step.map.VertexProgramStep.processNextStart(VertexProgramStep.java:68)
... 56 more
Caused by: org.apache.spark.SparkException: Unable to load YARN support
at org.apache.spark.deploy.SparkHadoopUtil$.liftedTree1$1(SparkHadoopUtil.scala:405)
at org.apache.spark.deploy.SparkHadoopUtil$.yarn$lzycompute(SparkHadoopUtil.scala:400)
at org.apache.spark.deploy.SparkHadoopUtil$.yarn(SparkHadoopUtil.scala:400)
at org.apache.spark.deploy.SparkHadoopUtil$.get(SparkHadoopUtil.scala:425)
at org.apache.spark.util.Utils$.getSparkOrYarnConfig(Utils.scala:2387)
at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:156)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:351)
at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:175)
at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:257)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:432)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2509)
at org.apache.spark.SparkContext.getOrCreate(SparkContext.scala)
at org.apache.tinkerpop.gremlin.spark.structure.Spark.create(Spark.java:52)
at org.apache.tinkerpop.gremlin.spark.structure.Spark.create(Spark.java:60)
at org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer.lambda$submitWithExecutor$1(SparkGraphComputer.java:233)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.deploy.yarn.YarnSparkHadoopUtil
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.util.Utils$.classForName(Utils.scala:230)
at org.apache.spark.deploy.SparkHadoopUtil$.liftedTree1$1(SparkHadoopUtil.scala:401)
... 18 more


Is there anything required on the classpath, or any required jars? Also, what is the problem with local mode?
Do we have any alternative for this purpose (analytics on JanusGraph using Spark)? Currently I am running connected components using GraphFrames.

Your help is appreciated, thanks in advance :)




JanusGraph doesn't remove threadlocal tx after a transaction commit?

huangx...@...
 

Using Java VisualVM, I found some StandardJanusGraphTx instances whose "isOpen" field is false while they still live in memory.

I read the source code. A graph has a ThreadLocal variable named txs, and the reference chain is graph -> txs -> StandardJanusGraphTx.

When a transaction commits, JanusGraph doesn't call txs.remove(). In memory, the worst case is: txNum = threadNum × graphNum.

We usually remove ThreadLocal variables in order to avoid OOM, but JanusGraph doesn't do that. Why?



Thanks.
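
For context, here is a minimal Java sketch of the two transaction styles involved: automatic thread-local transactions versus explicitly scoped ones. It assumes the in-memory backend and only illustrates the lifecycle discussed above; it is not a reproduction of JanusGraph's internal ThreadLocal handling.

import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphFactory;
import org.janusgraph.core.JanusGraphTransaction;

public class TxLifecycleSketch {
    public static void main(String[] args) throws Exception {
        // In-memory backend, chosen only to keep the sketch self-contained.
        JanusGraph graph = JanusGraphFactory.open("inmemory");

        // Automatic (thread-local) transaction: any operation on the graph opens
        // one implicitly for the current thread; commit() ends it.
        graph.addVertex();
        graph.tx().commit();

        // Explicitly scoped transaction: the caller owns the JanusGraphTransaction
        // object, so it becomes unreachable once it goes out of scope after commit().
        JanusGraphTransaction tx = graph.newTransaction();
        try {
            tx.addVertex();
            tx.commit();
        } finally {
            if (tx.isOpen()) {
                tx.rollback();
            }
        }

        graph.close();
    }
}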


Re: [DISCUSS] Don't use Scroll API for ElasticSearch requests

mike....@...
 

I'm no expert in how the Scroll API is currently used in JanusGraph, but given the guidance from the Elastic docs quoted below, it may be advantageous to keep using the Scroll API for OLAP-type queries. Given the difficulty of choosing whether or not to use the Scroll API, I would advocate for functionality that lets the user opt in or out, whichever the community prefers, so the Scroll API can continue to be used. My vote would be to keep the existing API and, if it's not already done, use an abstraction that allows various Elasticsearch query APIs to be used depending on the user's preference/configuration.


"While a search request returns a single “page” of results, the scroll API can be used to retrieve large numbers of results (or even all results) from a single search request, in much the same way as you would use a cursor on a traditional database.

Scrolling is not intended for real time user requests, but rather for processing large amounts of data, e.g. in order to reindex the contents of one index into a new index with a different configuration."


On Monday, April 15, 2019 at 1:30:58 PM UTC-4, Oleksandr Porunov wrote:
Currently we are using the Scroll API for real-time search requests when using ElasticSearch as an index backend. In my experience it often creates more than 500 parallel cursors (sometimes more than 10,000 cursors). Sure, we can decrease the keep-alive parameter "index.[X].elasticsearch.scroll-keep-alive" to keep cursors alive for less than 60 seconds, but I don't think that is a wise solution.

Statements from the ElasticSearch documentation:
>> Scrolling is not intended for real time user requests, but rather for processing large amounts of data, e.g. in order to reindex the contents of one index into a new index with a different configuration.

>> The Scroll api is recommended for efficient deep scrolling but scroll contexts are costly and it is not recommended to use it for real time user requests.

In addition, ElasticSearch 7.0.0 (released on 10 April) by default limits the number of open cursors to 500.

Pros which I see if we remove usage of the Scroll API in JanusGraph:
- All real-time queries will be faster
- Less overhead on the ElasticSearch side (we don't keep open contexts)

Cons which I see if we remove usage of the Scroll API in JanusGraph:
- The user would need to be aware of queries like .limit(1000000) or queries without a limit, because they may hit a lot of results and we may then have problems on the ElasticSearch side.

Even considering the con of removing the Scroll API, I think we should remove its usage, because it is much simpler to write your Gremlin query with an explicit `limit` than to deal with too many open contexts (just my opinion).

Possible solutions to deal with the con:
- Warn users about possible problems for queries which hit many entities in ElasticSearch. Suggest using "limit" and some data-processing techniques.
- Imitate scroll usage by using "search_after" (this one could be hard to implement and is not applicable to queries without sorting by a unique parameter / parameters).

What do other community members think about it?
Do you see any other pros of using the Scroll API which I missed? Are you OK with removing usage of the Scroll API?


[DISCUSS] JanusGraph 0.2.3 release

Chris Hupman <chris...@...>
 

Hi all,

This morning I went through and tagged all the issues and pull requests that I feel belong in the 0.2.3 release. Currently there are only 2 open pull requests targeting the 0.2 branch. I can't think of anything else that needs to be in the final 0.2.x release, but would love some additional input about whether or not anything else should be included. Currently I'm targeting May 22nd for the release date.

I also added issues to the milestone for my stretch goal of generating rpm and deb packages as release artifacts.  

Here is a link to the release milestone. Please provide any and all feedback.

Chris


Re: [DISCUSS] 0.2.3 release and 0.3 branch

Oleksandr Porunov <alexand...@...>
 

Thank you Chris for sharing the checklist! 
I am going on vacation at the end of this week, so I don't think I will be able to review the release artifacts, but other members should be able to review them. If the release isn't finished within the next 2 weeks, I will be able to review it then.

Regards,
Oleksandr

On Tuesday, April 23, 2019 at 9:26:51 PM UTC+3, Chris Hupman wrote:
Hi guys,

I just had a short call with Jason about what is required for releases and wanted to share my notes. I'm back from vacation and plan to get the process started this week.

JanusGraph Release checklist
  • Start up new janusgraph-dev thread on release proposing what should be included in the release and asking for additional feedback and suggestions
  • Make sure all PRs and issues added since last release are associated to the milestone
  • Complete all items associated with milestone or move to a new milestone if necessary
  • Write up a synopsis of changes made in release for release page and vote thread
  • Validate all changes have been merged upstream
  • Create janusgraph-dev vote thread and get required votes
  • Tag release
  • Draft release and upload artifacts
  • Upload to sonatype
Cheers,

Chris

On Thursday, April 11, 2019 at 11:24:42 AM UTC-7, Jan Jansen wrote:
Hi,
It would be cool to get the documentation ready for the release of version 0.3.2.
Therefore, it would be cool if some of the JanusGraph folks checked the documentation.

Afterwards, we can merge it so we can integrate all changes into the 0.3 docs, followed by preparation for 0.4.

Regards,
Jan Jansen


Re: [DISCUSS] 0.2.3 release and 0.3 branch

Chris Hupman <chris...@...>
 

Hi guys,

I just had a short call with Jason about what is required for releases and wanted to share my notes. I'm back from vacation and plan to get the process started this week.

JanusGraph Release checklist
  • Start up new janusgraph-dev thread on release proposing what should be included in the release and asking for additional feedback and suggestions
  • Make sure all PRs and issues added since last release are associated to the milestone
  • Complete all items associated with milestone or move to a new milestone if necessary
  • Write up a synopsis of changes made in release for release page and vote thread
  • Validate all changes have been merged upstream
  • Create janusgraph-dev vote thread and get required votes
  • Tag release
  • Draft release and upload artifacts
  • Upload to sonatype
Cheers,

Chris


On Thursday, April 11, 2019 at 11:24:42 AM UTC-7, Jan Jansen wrote:
Hi,
It would be cool to get the documentation ready for the release of version 0.3.2.
Therefore, it would be cool if some of the JanusGraph folks checked the documentation.

Afterwards, we can merge it so we can integrate all changes into the 0.3 docs, followed by preparation for 0.4.

Regards,
Jan Jansen


Re: [DISCUSS] Rethink JanusGraph Schema Management

Ryan Stauffer <ry...@...>
 

Circling back around on this. I'm going to dig into those two branches/features over the next few days.

Thanks,
Ryan


On Thu, Apr 11, 2019 at 11:30 AM 'Jan Jansen' via JanusGraph developers <janusgr...@...> wrote:
Ryan,

It would be cool to get some help. A good starting point would be to make the graph read-only using a TinkerPop strategy; this implementation should be straightforward.
My current state on tasks 3 and 4 is in the following branches:

Thanks for your offer.

Greetings
Jan
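
For reference, a minimal Java sketch of the read-only idea mentioned above, using TinkerPop's ReadOnlyStrategy. The "inmemory" backend is only there to keep the snippet self-contained; this illustrates the strategy in general, not the implementation in the branches referenced.

import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.process.traversal.strategy.verification.ReadOnlyStrategy;
import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphFactory;

public class ReadOnlyTraversalSketch {
    public static void main(String[] args) throws Exception {
        JanusGraph graph = JanusGraphFactory.open("inmemory");

        // A traversal source with ReadOnlyStrategy attached rejects mutating steps
        // (addV, addE, property, drop) before the traversal is executed.
        GraphTraversalSource g = graph.traversal().withStrategies(ReadOnlyStrategy.instance());

        g.V().count().next();        // reads work as usual
        // g.addV("person").next();  // would fail the read-only verification

        graph.close();
    }
}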



[DISCUSS] Don't use Scroll API for ElasticSearch requests

Oleksandr Porunov <alexand...@...>
 

Currently we are using the Scroll API for real-time search requests when using ElasticSearch as an index backend. In my experience it often creates more than 500 parallel cursors (sometimes more than 10,000 cursors). Sure, we can decrease the keep-alive parameter "index.[X].elasticsearch.scroll-keep-alive" to keep cursors alive for less than 60 seconds, but I don't think that is a wise solution.

Statements from the ElasticSearch documentation:
>> Scrolling is not intended for real time user requests, but rather for processing large amounts of data, e.g. in order to reindex the contents of one index into a new index with a different configuration.

>> The Scroll api is recommended for efficient deep scrolling but scroll contexts are costly and it is not recommended to use it for real time user requests.

In addition, ElasticSearch 7.0.0 (released on 10 April) by default limits the number of open cursors to 500.

Pros which I see if we remove usage of the Scroll API in JanusGraph:
- All real-time queries will be faster
- Less overhead on the ElasticSearch side (we don't keep open contexts)

Cons which I see if we remove usage of the Scroll API in JanusGraph:
- The user would need to be aware of queries like .limit(1000000) or queries without a limit, because they may hit a lot of results and we may then have problems on the ElasticSearch side.

Even considering the con of removing the Scroll API, I think we should remove its usage, because it is much simpler to write your Gremlin query with an explicit `limit` than to deal with too many open contexts (just my opinion).

Possible solutions to deal with the con:
- Warn users about possible problems for queries which hit many entities in ElasticSearch. Suggest using "limit" and some data-processing techniques.
- Imitate scroll usage by using "search_after" (this one could be hard to implement and is not applicable to queries without sorting by a unique parameter / parameters).

What do other community members think about it?
Do you see any other pros of using the Scroll API which I missed? Are you OK with removing usage of the Scroll API?
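
To illustrate the limit-based alternative suggested above, here is a minimal Java sketch. The config file path and the "name" property key are assumptions for the example; any property backed by a mixed (Elasticsearch) index would behave the same way.

import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphFactory;

public class LimitedIndexQuerySketch {
    public static void main(String[] args) throws Exception {
        // Assumes a running Cassandra + Elasticsearch reachable through this
        // (assumed) config file from the distribution's conf directory.
        JanusGraph graph = JanusGraphFactory.open("conf/janusgraph-cql-es.properties");
        GraphTraversalSource g = graph.traversal();

        // An explicit limit bounds the number of hits requested from the index
        // backend, so a single paged search suffices and no long-lived scroll
        // context has to stay open on the Elasticsearch side.
        // "name" is a hypothetical property key backed by a mixed index.
        g.V().has("name", "marko").limit(100).toList();

        graph.close();
    }
}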


Re: JG recovery is not working with a 2-node ScyllaDB cluster as backend

Ryan Stauffer <ry...@...>
 

No problem at all, glad it worked out!

A good reference as well is the Scylla community slack channel - the engineers on there are very helpful in troubleshooting, and anyone using open source is welcome to join the conversation. 

On Fri, Apr 12, 2019 at 4:31 AM Saurabh Verma <saurab...@...> wrote:
Hi Ryan

The above method works successfully for JG recovery. Thank you so much, Ryan, for the timely assistance and for sharing this crucial & fundamental information.

Thanks a lot once again

Saurabh Verma 
PE

 





 


On Fri, 12 Apr 2019 at 02:03, Ryan Stauffer <ry...@...> wrote:
For terminology, I'm going to call your 2 clusters "source" and "replica".  "Source" is the Scylla cluster that you want to backup, and "replica" is the cluster that you want to copy the data to.  For this to work, you need a mapping of "source" node to exactly one "replica" node.

Ex:
Source cluster = 3 Nodes (source1, source2, source3)
Replica cluster = 3 Nodes (replica1, replica2, replica3)

Where we have a mapping:
source1 -> replica1
source2 -> replica2
source3 -> replica3

The end result of this process will be that the replica cluster tokens mirror those of the source cluster. 

1. Start with the replica cluster shutdown. ($ sudo systemctl stop scylla-server)

2. On each node of the source cluster, run the following:
HOST_IP=`grep -e '^listen_address' /etc/scylla/scylla.yaml | awk '{ print $NF }'`
nodetool ring | grep $HOST_IP | awk '{print $NF ","}' | xargs | sed 's/,$//g'

This produces a comma-separated list of tokens for that particular node...

3. Take this list of tokens and plug it into the corresponding replica node's scylla.yaml file under the initial_token: property.

Now finish the normal restore procedure on the replica cluster (replacing the keyspace data with the backed-up data from the source cluster).

4. Start up the replica cluster, run nodetool repair, and once everything's up, you should be good to go...

Good luck!




On Thu, Apr 11, 2019 at 1:06 PM Saurabh Verma <saurab...@...> wrote:
Hi Ryan

Please send me the commands whenever possible

Thanks a lot 

On Thu, 11 Apr 2019 at 11:29 PM, Saurabh Verma <saurab...@...> wrote:
Hey Ryan

It would be really great if you could guide me along these lines you mentioned above.

I would be available if you need any more information.

Saurabh Verma 
PE

 





 


On Thu, 11 Apr 2019 at 23:25, Ryan Stauffer <ry...@...> wrote:
Missed your response about the different cluster. That’s the underlying issue then.

Because of the way Scylla works, you’ll need to make sure that the cluster on which the restore occurs has the same token distribution as the source cluster.  If you need help doing this let me know and I can send over some commands later once I get back to my computer. 

On Thu, Apr 11, 2019 at 10:51 AM Ryan Stauffer <ry...@...> wrote:
So you backed up data on each node of “cluster1”, and then restored data on each node of “cluster1”?

On Thu, Apr 11, 2019 at 10:03 AM Saurabh Verma <saurab...@...> wrote:
Hi Ryan

Thanks for the response.

Below answers to your questions:-

1. Are you backing up and restoring the same keyspace on the same Scylla cluster, or is one of those variables changing? - The keyspace is the same, idgraph1.

2. Are you running the backup and restore on all nodes of the scylla cluster (backup and restore is a per-node operation)? - Yes, I am running this as described at https://docs.scylladb.com/operating-scylla/procedures/backup-restore/

3. Is the data intact? - I compared the names & sizes of the edgestore, graphindex & janusgraph_ids tables on the source and destination machines; the table listings & sizes are exactly the same in both clusters.

I am still getting the same error:

'Could not find type for id: 3597'

When I checked g.V(3597) in the source cluster, it represents the 'node' label there, but after the restore g.V(3597) returns nothing.

Thanks
Saurabh Verma 
PE

 





 


On Thu, 11 Apr 2019 at 19:18, Ryan Stauffer <ry...@...> wrote:
Big picture, my understanding is that you're trying to backup and restore the underlying Scylla keyspace ("idgraph1"), right (procedure described here https://docs.scylladb.com/operating-scylla/procedures/backup-restore/)?  If that restore isn't successful, or incomplete, you can end up with some "interesting" behavior from JG.

The error you're getting implies that the underlying keyspace isn't fully intact, and rows are missing, so let's take a look at the underlying backup/restore procedure.  I'd also make sure that there are no JG instances trying to communicate with Scylla during this process.

Are you backing up and restoring the same keyspace on the same Scylla cluster, or is one of those variables changing?  Are you running the backup and restore on all nodes of the scylla cluster (backup and restore is a per-node operation)?

Thanks,
Ryan  


On Wed, Apr 10, 2019 at 10:11 AM SAURABH VERMA <saurabh...@...> wrote:
Hi all

I am trying to recover a JG cluster backed by scyllaDB using steps at https://groups.google.com/d/msg/aureliusgraphs/WyJpzZ4Wcuw/AW4-1GXRfI0J

I always get the error below:

Could not find type for id: 2313

I am following the steps below:

- systemctl stop scylla-server
- rm -rf /var/lib/scylla/data/*
- rm -rf /var/lib/scylla/commitlog/*
- systemctl start scylla-server
- schema registration
- systemctl stop scylla-server
- data copy
- sudo chown -R scylla:scylla /var/lib/scylla/data/idgraph1
- systemctl start scylla-server
- nodetool repair

Please guide me on the correct sequence of steps for recovery, or any other way to recover JG data?

Thanks,
Saurabh

--
Ryan Stauffer
Founder, Enharmonic, Inc.
415-684-3855

--
Saurabh Verma 
Principal Engineer

m: +917976984604
skype: saurabh.verma-zeotap


--
Ryan Stauffer
Founder, Enharmonic, Inc.
415-684-3855


Re: JG recovery is not working with a 2-node ScyllaDB cluster as backend

Saurabh Verma <saurab...@...>
 

Hi Ryan

The above method works successfully for JG recovery. Thank you so much, Ryan, for the timely assistance and for sharing this crucial & fundamental information.

Thanks a lot once again
Saurabh Verma 
PE

 





 


On Fri, 12 Apr 2019 at 02:03, Ryan Stauffer <ry...@...> wrote:
For terminology, I'm going to call your 2 clusters "source" and "replica".  "Source" is the Scylla cluster that you want to backup, and "replica" is the cluster that you want to copy the data to.  For this to work, you need a mapping of "source" node to exactly one "replica" node.

Ex:
Source cluster = 3 Nodes (source1, source2, source3)
Replica cluster = 3 Nodes (replica1, replica2, replica3)

Where we have a mapping:
source1 -> replica1
source2 -> replica2
source3 -> replica3

The end result of this process will be that the replica cluster tokens mirror those of the source cluster. 

1. Start with the replica cluster shutdown. ($ sudo systemctl stop scylla-server)

2. On each node of the source cluster, run the following:
HOST_IP=`grep -e '^listen_address' /etc/scylla/scylla.yaml | awk '{ print $NF }'`
nodetool ring | grep $HOST_IP | awk '{print $NF ","}' | xargs | sed 's/,$//g'

This produces a comma-separated list of tokens for that particular node...

3. Take this list of tokens and plug it into the corresponding replica node's scylla.yaml file under the initial_token: property.

Now finish the normal restore procedure on the replica cluster (replacing the keyspace data with the backed-up data from the source cluster).

4. Start up the replica cluster, run nodetool repair, and once everything's up, you should be good to go...

Good luck!




On Thu, Apr 11, 2019 at 1:06 PM Saurabh Verma <saurab...@...> wrote:
Hi Ryan

Please send me the commands whenever possible

Thanks a lot 

On Thu, 11 Apr 2019 at 11:29 PM, Saurabh Verma <saurab...@...> wrote:
Hey Ryan

It would be really great if you could guide me along these lines you mentioned above.

I would be available if you need any more information.

Saurabh Verma 
PE

 





 


On Thu, 11 Apr 2019 at 23:25, Ryan Stauffer <ry...@...> wrote:
Missed your response about the different cluster. That’s the underlying issue then.

Because of the way Scylla works, you’ll need to make sure that the cluster on which the restore occurs has the same token distribution as the source cluster.  If you need help doing this let me know and I can send over some commands later once I get back to my computer. 

On Thu, Apr 11, 2019 at 10:51 AM Ryan Stauffer <ry...@...> wrote:
So you backed up data on each node of “cluster1”, and then restored data on each node of “cluster1”?

On Thu, Apr 11, 2019 at 10:03 AM Saurabh Verma <saurab...@...> wrote:
Hi Ryan

Thanks for the response.

Below answers to your questions:-

1. Are you backing up and restoring the same keyspace on the same Scylla cluster, or is one of those variables changing? - The keyspace is the same, idgraph1.

2. Are you running the backup and restore on all nodes of the scylla cluster (backup and restore is a per-node operation)? - Yes, I am running this as described at https://docs.scylladb.com/operating-scylla/procedures/backup-restore/

3. Is the data intact? - I compared the names & sizes of the edgestore, graphindex & janusgraph_ids tables on the source and destination machines; the table listings & sizes are exactly the same in both clusters.

I am still getting the same error:

'Could not find type for id: 3597'

When I checked g.V(3597) in the source cluster, it represents the 'node' label there, but after the restore g.V(3597) returns nothing.

Thanks
Saurabh Verma 
PE

 





 


On Thu, 11 Apr 2019 at 19:18, Ryan Stauffer <ry...@...> wrote:
Big picture, my understanding is that you're trying to backup and restore the underlying Scylla keyspace ("idgraph1"), right (procedure described here https://docs.scylladb.com/operating-scylla/procedures/backup-restore/)?  If that restore isn't successful, or incomplete, you can end up with some "interesting" behavior from JG.

The error you're getting implies that the underlying keyspace isn't fully intact, and rows are missing, so let's take a look at the underlying backup/restore procedure.  I'd also make sure that there are no JG instances trying to communicate with Scylla during this process.

Are you backing up and restoring the same keyspace on the same Scylla cluster, or is one of those variables changing?  Are you running the backup and restore on all nodes of the scylla cluster (backup and restore is a per-node operation)?

Thanks,
Ryan  


On Wed, Apr 10, 2019 at 10:11 AM SAURABH VERMA <saurabh...@...> wrote:
Hi all

I am trying to recover a JG cluster backed by scyllaDB using steps at https://groups.google.com/d/msg/aureliusgraphs/WyJpzZ4Wcuw/AW4-1GXRfI0J

I always get the error below:

Could not find type for id: 2313

I am following the steps below:

- systemctl stop scylla-server
- rm -rf /var/lib/scylla/data/*
- rm -rf /var/lib/scylla/commitlog/*
- systemctl start scylla-server
- schema registration
- systemctl stop scylla-server
- data copy
- sudo chown -R scylla:scylla /var/lib/scylla/data/idgraph1
- systemctl start scylla-server
- nodetool repair

Please guide me on the correct sequence of steps for recovery, or any other way to recover JG data?

Thanks,
Saurabh

--
Ryan Stauffer
Founder, Enharmonic, Inc.
415-684-3855

--
Saurabh Verma 
Principal Engineer

m: +917976984604
skype: saurabh.verma-zeotap



Re: JG recovery is not working with a 2-node ScyllaDB cluster as backend

Ryan Stauffer <ry...@...>
 

For terminology, I'm going to call your 2 clusters "source" and "replica".  "Source" is the Scylla cluster that you want to backup, and "replica" is the cluster that you want to copy the data to.  For this to work, you need a mapping of "source" node to exactly one "replica" node.

Ex:
Source cluster = 3 Nodes (source1, source2, source3)
Replica cluster = 3 Nodes (replica1, replica2, replica3)

Where we have a mapping:
source1 -> replica1
source2 -> replica2
source3 -> replica3

The end result of this process will be that the replica cluster tokens mirror those of the source cluster. 

1. Start with the replica cluster shutdown. ($ sudo systemctl stop scylla-server)

2. On each node of the source cluster, run the following:
HOST_IP=`grep -e '^listen_address' /etc/scylla/scylla.yaml | awk '{ print $NF }'`
nodetool ring | grep $HOST_IP | awk '{print $NF ","}' | xargs | sed 's/,$//g'

This produces a comma-separated list of tokens for that particular node...

3. Take this list of tokens and plug it into the corresponding replica node's scylla.yaml file under the initial_token: property.

Now finish the normal restore procedure on the replica cluster (replacing the keyspace data with the backed-up data from the source cluster).

4. Start up the replica cluster, run nodetool repair, and once everything's up, you should be good to go...

Good luck!




On Thu, Apr 11, 2019 at 1:06 PM Saurabh Verma <saurab...@...> wrote:
Hi Ryan

Please send me the commands whenever possible

Thanks a lot 

On Thu, 11 Apr 2019 at 11:29 PM, Saurabh Verma <saurab...@...> wrote:
Hey Ryan

It would be really great if you could guide me along these lines you mentioned above.

I would be available if you need any more information.

Saurabh Verma 
PE

 





 


On Thu, 11 Apr 2019 at 23:25, Ryan Stauffer <ry...@...> wrote:
Missed your response about the different cluster. That’s the underlying issue then.

Because of the way Scylla works, you’ll need to make sure that the cluster on which the restore occurs has the same token distribution as the source cluster.  If you need help doing this let me know and I can send over some commands later once I get back to my computer. 

On Thu, Apr 11, 2019 at 10:51 AM Ryan Stauffer <ry...@...> wrote:
So you backed up data on each node of “cluster1”, and then restored data on each node of “cluster1”?

On Thu, Apr 11, 2019 at 10:03 AM Saurabh Verma <saurab...@...> wrote:
Hi Ryan

Thanks for the response.

Below answers to your questions:-

1. Are you backing up and restoring the same keyspace on the same Scylla cluster, or is one of those variables changing? - The keyspace is the same, idgraph1.

2. Are you running the backup and restore on all nodes of the scylla cluster (backup and restore is a per-node operation)? - Yes, I am running it on all nodes, as described at https://docs.scylladb.com/operating-scylla/procedures/backup-restore/

3. Is the data intact? - I compared the names & sizes of the edgestore, graphindex & janusgraph_ids tables on the source and destination machines; the table names and sizes are exactly the same in both clusters.

Still getting the same error.

'Could not find type for id: 3597'

When I checked g.V(3597) in the source cluster, it is a vertex with the 'node' label, but in the destination cluster g.V(3597) returns nothing.

Thanks
Saurabh Verma 
PE

 




On Thu, 11 Apr 2019 at 19:18, Ryan Stauffer <ry...@...> wrote:
Big picture, my understanding is that you're trying to back up and restore the underlying Scylla keyspace ("idgraph1"), right (the procedure described here: https://docs.scylladb.com/operating-scylla/procedures/backup-restore/)?  If that restore isn't successful or is incomplete, you can end up with some "interesting" behavior from JG.

The error you're getting implies that the underlying keyspace isn't fully intact, and rows are missing, so let's take a look at the underlying backup/restore procedure.  I'd also make sure that there are no JG instances trying to communicate with Scylla during this process.
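
As a quick sanity check for that last point (just a sketch from the Gremlin console; the graph variable and the instance id below are placeholders, not from your setup), you can list the instances JanusGraph still considers open:

mgmt = graph.openManagement()
mgmt.getOpenInstances().each { println it }   // the entry suffixed with "(current)" is this console session
// a stale instance left behind by a crashed or disconnected JVM can be force-closed, e.g.:
// mgmt.forceCloseInstance('0a0b0c0d1234-examplehost1')   // placeholder instance id
mgmt.commit()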

Are you backing up and restoring the same keyspace on the same Scylla cluster, or is one of those variables changing?  Are you running the backup and restore on all nodes of the scylla cluster (backup and restore is a per-node operation)?

Thanks,
Ryan  


On Wed, Apr 10, 2019 at 10:11 AM SAURABH VERMA <saurabh...@...> wrote:
Hi all

I am trying to recover a JG cluster backed by scyllaDB using steps at https://groups.google.com/d/msg/aureliusgraphs/WyJpzZ4Wcuw/AW4-1GXRfI0J

I always get the error below:

Could not find type for id: 2313

I am following the steps below:

- systemctl stop scylla-server
- rm -rf /var/lib/scylla/data/*
- rm -rf /var/lib/scylla/commitlog/*
- systemctl start scylla-server
- schema registration
- systemctl stop scylla-server
- data copy
- sudo chown -R scylla:scylla /var/lib/scylla/data/idgraph1
- systemctl start scylla-server
- nodetool repair

Please guide me on the correct sequence of steps for recovery, or any other way to recover the JG data?

Thanks,
Saurabh



Re: JG recovery is not working with a 2 node scyllaDB cluster as backend

Saurabh Verma <saurab...@...>
 

Hi Ryan

Please send me the commands whenever possible

Thanks a lot 



Re: [DISCUSS] Rethink JanusGraph Schema Management

Jan Jansen <faro...@...>
 

Ryan,

It would be cool to get some help. A good starting point would be to make the graph read-only using a TinkerPop strategy; that implementation should be straightforward.
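
To illustrate the read-only part of the idea (this is just TinkerPop's existing ReadOnlyStrategy applied from the console, not the schema strategy in the branches below):

import org.apache.tinkerpop.gremlin.process.traversal.strategy.verification.ReadOnlyStrategy

g = graph.traversal().withStrategies(ReadOnlyStrategy.instance())
g.V().count()        // read traversals still work
g.addV('person')     // rejected by the strategy with a VerificationException

A schema-enforcing strategy could presumably be plugged in the same way via withStrategies().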
My current work on tasks 3 and 4 is in the following branches:
  • Enforcement in https://github.com/GDATASoftwareAG/janusgraph/tree/schema-strategy
  • Accessibility in https://github.com/GDATASoftwareAG/janusgraph/tree/schema (very early stage).

Thanks for your offer.

Greetings
Jan


Re: [DISCUSS] 0.2.3 release and 0.3 branch

Jan Jansen <faro...@...>
 

Hi,

It would be cool to get the documentation ready for the release of version 0.3.2. Therefore, it would be great if some of the JanusGraph folks could check the documentation.

Afterwards, we can merge it so that all of the 0.3 docs changes are integrated, followed by preparation for 0.4.

Greetings
Jan Jansen
