Re: The use of vertex-centric index

How can I use the index to get the top 500 edges, sorted by amount in descending order, faster?

You already specified that the vertex-centric index on the amount key is ordered when you created the index. If you explicitly reorder the results in the traversal, the index cannot take effect, because the reordering needs all vertices to be retrieved instead of just the first 500.

HTH,    Marc
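The reasoning in the reply above (an ordered structure can stop after the first 500 entries, while an explicit re-sort has to load everything first) can be sketched in plain Java. This is only an analogy, not JanusGraph code: the `TreeMap` stands in for the ordered index, and the class name, method names, and `edge-N` values are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Analogy only, not JanusGraph code: a TreeMap sorted by descending key
// stands in for an ordered vertex-centric index on 'amount'.
class OrderedIndexSketch {

    // Hypothetical data: amounts 1..edgeCount, each mapped to an edge name.
    static Map<Long, String> unorderedEdges(int edgeCount) {
        Map<Long, String> edges = new HashMap<>();
        for (long amount = 1; amount <= edgeCount; amount++) {
            edges.put(amount, "edge-" + amount);
        }
        return edges;
    }

    // The "index": the same entries, kept sorted by descending amount.
    static TreeMap<Long, String> descendingIndex(Map<Long, String> edges) {
        TreeMap<Long, String> index = new TreeMap<>(Comparator.reverseOrder());
        index.putAll(edges);
        return index;
    }

    // Reading from the index stops as soon as k entries have been collected.
    static List<String> topKFromIndex(TreeMap<Long, String> index, int k) {
        List<String> top = new ArrayList<>();
        for (String edge : index.values()) {
            if (top.size() >= k) {
                break;
            }
            top.add(edge);
        }
        return top;
    }

    // An explicit re-sort loads and sorts every entry before taking k.
    static List<String> topKByResorting(Map<Long, String> edges, int k) {
        List<Long> amounts = new ArrayList<>(edges.keySet());
        amounts.sort(Comparator.reverseOrder());
        List<String> top = new ArrayList<>();
        for (Long amount : amounts.subList(0, Math.min(k, amounts.size()))) {
            top.add(edges.get(amount));
        }
        return top;
    }

    public static void main(String[] args) {
        Map<Long, String> edges = unorderedEdges(100_000);
        TreeMap<Long, String> index = descendingIndex(edges);
        // Both approaches return the same 500 edges; only the work differs.
        System.out.println(topKFromIndex(index, 500).equals(topKByResorting(edges, 500)));
    }
}
```

Both methods return the same result; the difference is that `topKFromIndex` touches only k entries, whereas `topKByResorting` touches all of them. How JanusGraph actually plans a given traversal is best checked with `profile()` on the real graph.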

Hi JanusGraph team,

I have created a vertex-centric index for vertices, as follows. Now I want to use the index to get the top 500 edges in descending order. However, I find that the execution time is the same as without the vertex-centric index. How can I use the index to sort faster and retrieve the first 500 edges more quickly?

Here's the graph I've built:

Here's the data I inserted: one starting vertex, a million edges attached to it, and 100 end vertices.

graph_conf = 'janusgraph-cql-es-server-test2.properties'
graph = JanusGraphFactory.open(graph_conf)
g = graph.traversal()
int start_value = 1
int end_value = 1000000

cloumns = line.split(',', -1)
amount = amount.toBigDecimal()
tx_index = log_index.toInteger()
for (int i = start_value; i <= end_value; i++) {
    Date ts = new Date((timestamp.toLong() - i) * 1000)
    .property('amount', amount + i)
    .property('tx_hash', tx_hash)
    .property('tx_index', tx_index + i)
    .property('created_time', ts)
    .next()

    if (i % 20000 == 0) {
        println("[total:${i}]")
        System.sleep(500)
        g.tx().commit()
        graph.close()
        System.sleep(5000)

        graph = JanusGraphFactory.open(graph_conf)
        g = graph.traversal()
        System.sleep(5000)
    }
    g.tx().commit()
}

graph.close()

Here are my query criteria:

Re: Janusgraph Authentication cannot create users

See the section "Credentials graph DSL" in:
So, you instantiate the CredentialsDB GraphTraversalSource using:

credentials = graph.traversal(CredentialTraversalSource.class)

where graph is the JanusGraph instance holding your CredentialsDb (the TinkerPop ref docs refer to TinkerGraph which is not applicable here).

Also, are there any other ways of creating users?

On Monday, July 20, 2020 at 3:29:54 PM UTC+5:30, sparshneel chanchlani wrote:
Hi,
I am trying to add authentication to JanusGraph. I am referring to the link below.
Below is my credentials DB config:

gremlin.graph=org.janusgraph.core.JanusGraphFactory
storage.backend=cql
storage.hostname= 10.xx.xx.xx
storage.port= 9042
storage.cql.keyspace=creds_db
storage.cql.write-consistency-level=LOCAL_QUORUM
cluster.max-partitions=32
storage.lock.wait-time=5000
storage.lock.retries=5
ids.block-size=100000

When I start the Gremlin Server, creds_db and the graph DB are created successfully. The issue is that I am not able to create the credentials using the Credentials(graph) Groovy script. I am trying through the Gremlin Console; see below.

g1 = JanusGraphFactory.open('conf/gremlin-server/janusgraph-server-credentials.properties')
==>standardjanusgraph[cql:[]]
gremlin> creds = Credentials(g1)
No signature of method: groovysh_evaluate.Credentials() is applicable for argument types: (org.janusgraph.graphdb.database.StandardJanusGraph) values: [standardjanusgraph[cql:[]

It seems the groovysh_evaluate script does not support a standard graph as a parameter. What should my credentials.properties be for Cassandra?

Thanks,
Sparshneel

Re: when i use Janusgraph in Apache Atlas, i found an error

Hi Pavel,

I do not recognize the way you want to register classes for serialization by JanusGraph towards gremlin driver, but this may be due to my limited knowledge on this issue. JanusGraph itself registers the additional classes it has defined in the following way:

So, this would involve defining your own IoRegistry class and configuring it for gremlin server (and optionally for the remote-objects.yaml for gremlin driver).

Hello,

I've got the same issue with the latest version of JanusGraph and Atlas from the master branch. Did you somehow manage to register the appropriate types/serializers to produce GraphSON output? I'd like to visualise the graph via Cytoscape or Graphexp. Thanks for any advice!

I've tried already - gremlin config (using Scylla and ES):
attributes.custom.attribute1.attribute-class=org.apache.atlas.typesystem.types.DataTypes.TypeCategory
attributes.custom.attribute1.serializer-class=org.apache.atlas.repository.graphdb.janus.serializer.TypeCategorySerializer
attributes.custom.attribute2.attribute-class=java.util.ArrayList
attributes.custom.attribute2.serializer-class=org.janusgraph.graphdb.database.serialize.attribute.SerializableSerializer
attributes.custom.attribute3.attribute-class=java.math.BigInteger
attributes.custom.attribute3.serializer-class=org.apache.atlas.repository.graphdb.janus.serializer.BigIntegerSerializer
attributes.custom.attribute4.attribute-class=java.math.BigDecimal
attributes.custom.attribute4.serializer-class=org.apache.atlas.repository.graphdb.janus.serializer.BigDecimalSerializer

then from gremlin cli:
graph.io(IoCore.graphson()).writeGraph("/atlas.json")

resulting in:
org.apache.tinkerpop.shaded.jackson.databind.JsonMappingException: Could not find a type identifier for the class : class org.apache.atlas.typesystem.types.DataTypes$TypeCategory. Make sure the value to serialize has a type identifier registered for its class.

Hi,

See a similar question on:

HTH,   Marc

Hello, I am new to JanusGraph. When I use JanusGraph in Apache Atlas, I get the following error:
`Could not find a type identifier for the class : class org.apache.atlas.typesystem.types.DataTypes$TypeCategory. Make sure the value to serialize has a type identifier registered for its class.
org.apache.tinkerpop.shaded.jackson.databind.JsonMappingException: Could not find a type identifier for the class : class org.apache.atlas.typesystem.types.DataTypes$TypeCategory. Make sure the value to serialize has a type identifier registered for its class.
at org.apache.tinkerpop.shaded.jackson.databind.ser.DefaultSerializerProvider._wrapAsIOE(DefaultSerializerProvider.java:509)
at org.apache.tinkerpop.shaded.jackson.databind.ser.DefaultSerializerProvider._serialize(DefaultSerializerProvider.java:482)
at org.apache.tinkerpop.shaded.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:319)
at org.apache.tinkerpop.shaded.jackson.databind.ObjectMapper._configAndWriteValue(ObjectMapper.java:3893)
at org.apache.tinkerpop.shaded.jackson.databind.ObjectMapper.writeValue(ObjectMapper.java:3164)
at org.apache.tinkerpop.gremlin.structure.io.graphson.GraphSONWriter.writeVertex(GraphSONWriter.java:82)
at org.apache.tinkerpop.gremlin.structure.io.graphson.GraphSONWriter.writeVertices(GraphSONWriter.java:110)
at org.apache.tinkerpop.gremlin.structure.io.graphson.GraphSONWriter.writeGraph(GraphSONWriter.java:71)
at org.apache.tinkerpop.gremlin.structure.io.graphson.GraphSONIo.writeGraph(GraphSONIo.java:83)
at org.apache.tinkerpop.gremlin.structure.io.Io$writeGraph.call(Unknown Source)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:116)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:128)
at groovysh_evaluate.run(groovysh_evaluate:3)
at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:236)
at org.codehaus.groovy.tools.shell.Interpreter.evaluate(Interpreter.groovy:71)
at org.codehaus.groovy.tools.shell.Groovysh.execute(Groovysh.groovy:196)
at org.apache.tinkerpop.gremlin.console.GremlinGroovysh.super$3$execute(GremlinGroovysh.groovy)
at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:98)
at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1225)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:145)
at org.apache.tinkerpop.gremlin.console.GremlinGroovysh.execute(GremlinGroovysh.groovy:72)
at org.codehaus.groovy.tools.shell.Shell.leftShift(Shell.groovy:122)
at org.codehaus.groovy.tools.shell.ShellRunner.work(ShellRunner.groovy:95)
at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$work(InteractiveShellRunner.groovy)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:98)
at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1225)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:145)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:165)
at org.codehaus.groovy.tools.shell.InteractiveShellRunner.work(InteractiveShellRunner.groovy:130)
at org.codehaus.groovy.tools.shell.ShellRunner.run(ShellRunner.groovy:59)
at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$run(InteractiveShellRunner.groovy)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.`

How can I solve it? Thank you very much.

Re: Java APIs for Create/Open/Drop graph using ConfiguredGraphFactory - janusgraph

Hi Anumodh,

Interesting question. Using ConfiguredGraphFactory without gremlin server is relevant when you build your own REST endpoints for a graph application.

While the ref docs may not address this use case, the javadocs for ConfiguredGraphFactory seem pretty self-explanatory. Did you check out the following example graph properties file:
including the line:
gremlin.graph=org.janusgraph.core.ConfiguredGraphFactory

What was the point where things became unclear?

Hi JanusGraph team,

We are exploring the possibility of using JanusGraph in our company. We are planning to use dynamic graph creation/deletion using ConfiguredGraphFactory. We have deployed 2 JanusGraph instances. Our plan is to write a Java library for creation, deletion, and other management activities such as index creation. I haven't yet found a way to create/delete a graph through Java APIs using ConfiguredGraphFactory.

From my investigation, the only way to do so is by connecting to Gremlin Server and sending string commands like
- ConfiguredGraphFactory.create(graphName)
- ConfiguredGraphFactory.drop(graphName)

Thanks,
anumodh

HTH,    Marc

Problem with 2 property keys (conflict with property keys ?)

Hi everyone,

I have a problem with 2 property keys, named:
"ex_value" String
"feed_name" String

I don't know why, but on every vertex I create, the ex_value property name is replaced by feed_name.

=>3686404256
>g.V(3686404256).valueMap(true)
=>[id:3686404256,label:test_example, name:[test1], feed_name:[test1]]

The feed_name property appears in the result, but I didn't use this property; I used ex_value.

>mgmt.openManagement()
>mgmt.getPropertyKey("ex_value")
=>feed_name
The result is wrong.

>mgmt.printPropertyKeys()
There are two feed_name properties; ex_value doesn't appear.

I don't know how I can resolve this problem.

I use TinkerPop’s Hadoop-Gremlin to import data.
Taking grateful-dead.txt as an example, I want to import vertices without edges.

But I get an error like this: "java.util.NoSuchElementException".

Re: Unable to drop Remote JanusGraph

Hi,
Instead of dropping the graph, have you tried running g.V().drop().iterate()?
Please note that in this case the JanusGraph schema would still exist and, depending on the backend, this solution can take much longer.

Another solution would be to drop the graph, then replace in global bindings, the graph and the g reference.
The global bindings are set in org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor and are initialized in org.apache.tinkerpop.gremlin.server.util.DefaultGraphManager.getAsBindings(). As the implementation of GraphManager is configurable, it should be possible to develop an extension that allows you to retrieve the global bindings and then replace the  graph and the g reference.

Hi,

I was using a remote server with the inmemory backend. Even after dropping the graph, I was still able to query the data; the data didn't get removed and I had to restart the server.

Hi,
> How do we know if the graph has been dropped?
As I am using Cassandra and ES as backend, I can see that the data has been dropped by sending a direct request to these components.

>   Is there any way of clearing all the data, the schema without restarting the server.
I have not found any yet, but I did not look into the Gremlin Server code. I do not have this requirement, so I did not search for it. Maybe using a proxy design pattern on the graph object would do the trick.

Nicolas
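
The proxy idea mentioned in the reply above can be sketched in plain Java as follows. This is a minimal illustration under stated assumptions, not a tested Gremlin Server extension: `GraphHandle`, `FakeGraph`, and `SwappableGraph` are hypothetical names, and the two-method interface is only a stand-in for the much larger graph interface a real proxy would have to forward.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for a graph handle; not a JanusGraph or TinkerPop type.
interface GraphHandle {
    String name();
    long vertexCount();
}

// Hypothetical fixed-content graph used only to exercise the proxy.
final class FakeGraph implements GraphHandle {
    private final String name;
    private final long vertexCount;

    FakeGraph(String name, long vertexCount) {
        this.name = name;
        this.vertexCount = vertexCount;
    }

    @Override public String name() { return name; }
    @Override public long vertexCount() { return vertexCount; }
}

// Proxy: callers keep one reference while the delegate behind it is swapped.
final class SwappableGraph implements GraphHandle {
    private final AtomicReference<GraphHandle> delegate;

    SwappableGraph(GraphHandle initial) {
        this.delegate = new AtomicReference<>(initial);
    }

    // Point the proxy at a freshly opened graph, e.g. after the old one was dropped.
    void replace(GraphHandle fresh) {
        delegate.set(fresh);
    }

    @Override public String name() { return delegate.get().name(); }
    @Override public long vertexCount() { return delegate.get().vertexCount(); }

    public static void main(String[] args) {
        SwappableGraph graph = new SwappableGraph(new FakeGraph("before-drop", 3));
        GraphHandle shared = graph; // the reference other code would hold on to
        System.out.println(shared.name() + " " + shared.vertexCount());
        graph.replace(new FakeGraph("after-drop", 0));
        System.out.println(shared.name() + " " + shared.vertexCount());
    }
}
```

The point of the design is that code holding `shared` never needs to be handed a new reference; whether the server's global bindings would accept such a wrapper in place of the real graph object is an open question here.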

Hi Nicolas,

How do we know if the graph has been dropped? You said you have to restart the server after doing that. I am doing the same, but I don't want to. Is there any way of clearing all the data and the schema without restarting the server?

Also, JanusGraphFactory.drop(getJanusGraph()) is the same as JanusGraphFactory.drop(graph) - because that function passes the same object and it still doesn't work. I don't want to restart the server as it's a remote server. How do I achieve that?

Hi,
I am able to drop remotely  the graph using the script:
JanusGraphFactory.drop(graph);[]

After the script, I need to restart janusGraph in order to re-create the graph.
Could you send the JanusGraph logs with the stack trace?

Also, instead of  JanusGraphFactory.drop(getJanusGraph());  , could you try with JanusGraphFactory.drop(graph);

Kind regards,
Nicolas
Hi,

I'm looking to drop a graph, but the JanusgraphFactory.drop(graph) doesn't work for me. I'm using JAVA to connect to the remote server and load/remove data.

Using the below properties file for remote-properties:
gremlin.remote.remoteConnectionClass=org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteConnection
gremlin.remote.driver.clusterFile=conf/remote-objects.yaml
gremlin.remote.driver.sourceName=g

Below file for remote-objects -

hosts: [hostname]
port: 8182
serializer: {
  className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0,
  config: {
    ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry]
  }
}

Using this to open the graph -
conf = new PropertiesConfiguration(propFileName);

// using the remote graph for queries
graph = JanusGraphFactory.open("inmemory");
g = graph.traversal().withRemote(conf);
return g;

Using this to drop the graph:
if (graph != null) {
LOGGER.info("Dropping janusgraph function");
JanusGraphFactory.drop(getJanusGraph());
}

Any help would be appreciated.
Thanks!

Re: Janusgraph with YARN and HBASE

Perfect!!!
That's it!
Thank you, very much!!!

Hi!

Try this
`spark.io.compression.codec=snappy`

Hi,

I'm trying to connect to a remote inmemory janusgraph server.

This is my .properties file :
gremlin.remote.remoteConnectionClass=org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteConnection
# cluster file has the remote server configuration
gremlin.remote.driver.clusterFile=./remote-objects.yaml
# source name is the global graph traversal source defined on the server
gremlin.remote.driver.sourceName=g

I have both the .properties file and remote-objects.yaml in the resources directory of the application.
However, when the .properties file refers to remote-objects.yaml, I get an error. Does anyone know how to specify the path of the YAML file in the .properties file for a Spring Boot application?

Thanks!

Janusgraph with YARN and HBASE

Hello, we have a Cluster with CLOUDERA CDH 6.3.2 and I'm trying to run Janusgraph on the Cluster with YARN and HBASE, but without success.
(it's OK with SPARK Local)

Version SPARK 2.4.2
HBASE: 2.1.0-cdh6.3.2
Janusgraph (v 0.5.2 and v0.4.1)

I did a lot of searching, but I didn't find any recent references, and they all use older versions of SPARK and Janusgraph.

Some examples:
2) http://tinkerpop.apache.org/docs/current/recipes/#olap-spark-yarn
3) http://yaaics.blogspot.com/2017/07/configuring-janusgraph-for-spark-yarn.html

According to these references, I followed the following steps:

1. Copy the following files to the JanusGraph "lib" directory:
   1. spark-yarn-2.11-2.4.0.jar
   2. scala-reflect-2.10.5.jar
   4. guice-servlet-3.0.jar
2. Generate a "/tmp/spark-gremlin-0.5.2.zip" file containing all the .jar files from "janusgraph/lib/".

`janusgraphmr.ioformat.conf.storage.hostname=XXX.XXX.XXX.XXX
spark.master=yarn
#spark.deploy-mode=client
spark.submit.deployMode=client
spark.executor.memory=1g
spark.yarn.dist.jars=/tmp/spark-gremlin-0-5-2.zip
spark.yarn.archive=/tmp/spark-gremlin-0-5-2.zip
spark.yarn.appMasterEnv.CLASSPATH=./__spark_libs__/*:[hadoop_conf_dir]
spark.executor.extraClassPath=./__spark_libs__/*:/[hadoop_conf_dir]
spark.driver.extraLibraryPath=/opt/cloudera/parcels/CDH/lib/hadoop/lib/native`

Then I ran the following commands:
`graph = GraphFactory.open('conf/hadoop-graph/test.properties')
g = graph.traversal().withComputer(SparkGraphComputer)
g.V().count()`
Can someone help me?
a) Are these problems related to version incompatibility?
b) Has anyone successfully used similar infrastructure?
c) Would anyone know how to determine a correct version of the necessary libraries?
d) Any suggestion?

Thank you all !!!

Below is a copy of the Yarn Log from my last attempt.

`ERROR org.apache.spark.scheduler.TaskSetManager  - Task 0 in stage 0.0 failed 4 times; aborting job
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, [SERVER_NAME], executor 1): java.lang.NoSuchMethodError: net.jpountz.lz4.LZ4BlockInputStream.<init>(Ljava/io/InputStream;Z)V
at org.apache.spark.io.LZ4CompressionCodec.compressedInputStream(CompressionCodec.scala:122)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$6.apply(TorrentBroadcast.scala:304)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$6.apply(TorrentBroadcast.scala:304)
at scala.Option.map(Option.scala:146)
at org.apache.spark.broadcast.TorrentBroadcast$.unBlockifyObject(TorrentBroadcast.scala:304)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1$$anonfun$apply$2.apply(TorrentBroadcast.scala:235)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:211)
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1326)
at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:207)
at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:66)
at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:66)
at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:96)
at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:89)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
at org.apache.spark.scheduler.Task.run(Task.scala:121)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)`

Thank you!!

Re: when i use Janusgraph in Apache Atlas, i found an error

Hello,

I've got the same issue with the latest version of JanusGraph and Atlas from master branch. Did you manage somehow appropriate type/serializer registration to produce GraphSON output? I'd like to visualise graph via Cytoscape or Graphexp. Thanks for any advice!

I've tried already - gremlin config (using Scylla and ES):
attributes.custom.attribute1.attribute-class=org.apache.atlas.typesystem.types.DataTypes.TypeCategory
attributes.custom.attribute1.serializer-class=org.apache.atlas.repository.graphdb.janus.serializer.TypeCategorySerializer
attributes.custom.attribute2.attribute-class=java.util.ArrayList
attributes.custom.attribute2.serializer-class=org.janusgraph.graphdb.database.serialize.attribute.SerializableSerializer
attributes.custom.attribute3.attribute-class=java.math.BigInteger
attributes.custom.attribute3.serializer-class=org.apache.atlas.repository.graphdb.janus.serializer.BigIntegerSerializer
attributes.custom.attribute4.attribute-class=java.math.BigDecimal
attributes.custom.attribute4.serializer-class=org.apache.atlas.repository.graphdb.janus.serializer.BigDecimalSerializer

then from gremlin cli:
graph.io(IoCore.graphson()).writeGraph("/atlas.json")

resulting into:
org.apache.tinkerpop.shaded.jackson.databind.JsonMappingException: Could not find a type identifier for the class : class org.apache.atlas.typesystem.types.DataTypes\$TypeCategory. Make sure the value to serialize has a type identifier registered for its class.
Hi,

See a similar question on:

HTH,   Marc

hello i am new in JanusGraph. When i use Janusgraph in Apache Atlas, i found a question, the error is :
`Could not find a type identifier for the class : class org.apache.atlas.typesystem.types.DataTypes$TypeCategory. Make sure the value to serialize has a type identifier registered for its class.
org.apache.tinkerpop.shaded.jackson.databind.JsonMappingException: Could not find a type identifier for the class : class org.apache.atlas.typesystem.types.DataTypes$TypeCategory. Make sure the value to serialize has a type identifier registered for its class.
at org.apache.tinkerpop.shaded.jackson.databind.ser.DefaultSerializerProvider._wrapAsIOE(DefaultSerializerProvider.java:509)
at org.apache.tinkerpop.shaded.jackson.databind.ser.DefaultSerializerProvider._serialize(DefaultSerializerProvider.java:482)
at org.apache.tinkerpop.shaded.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:319)
at org.apache.tinkerpop.shaded.jackson.databind.ObjectMapper._configAndWriteValue(ObjectMapper.java:3893)
at org.apache.tinkerpop.shaded.jackson.databind.ObjectMapper.writeValue(ObjectMapper.java:3164)
at org.apache.tinkerpop.gremlin.structure.io.graphson.GraphSONWriter.writeVertex(GraphSONWriter.java:82)
at org.apache.tinkerpop.gremlin.structure.io.graphson.GraphSONWriter.writeVertices(GraphSONWriter.java:110)
at org.apache.tinkerpop.gremlin.structure.io.graphson.GraphSONWriter.writeGraph(GraphSONWriter.java:71)
at org.apache.tinkerpop.gremlin.structure.io.graphson.GraphSONIo.writeGraph(GraphSONIo.java:83)
at org.apache.tinkerpop.gremlin.structure.io.Io$writeGraph.call(Unknown Source)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:116)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:128)
at groovysh_evaluate.run(groovysh_evaluate:3)
at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:236)
at org.codehaus.groovy.tools.shell.Interpreter.evaluate(Interpreter.groovy:71)
at org.codehaus.groovy.tools.shell.Groovysh.execute(Groovysh.groovy:196)
at org.apache.tinkerpop.gremlin.console.GremlinGroovysh.super$3$execute(GremlinGroovysh.groovy)
at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:98)
at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1225)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:145)
at org.apache.tinkerpop.gremlin.console.GremlinGroovysh.execute(GremlinGroovysh.groovy:72)
at org.codehaus.groovy.tools.shell.Shell.leftShift(Shell.groovy:122)
at org.codehaus.groovy.tools.shell.ShellRunner.work(ShellRunner.groovy:95)
at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$work(InteractiveShellRunner.groovy)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:98)
at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1225)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:145)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:165)
at org.codehaus.groovy.tools.shell.InteractiveShellRunner.work(InteractiveShellRunner.groovy:130)
at org.codehaus.groovy.tools.shell.ShellRunner.run(ShellRunner.groovy:59)
at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$run(InteractiveShellRunner.groovy)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.`

How can I solve it? Thank you very much.

Re: JanusGraph vs. Titan performance

Hello Lucie,

It seems that the Persistit and Cassandra backends cannot be compared directly. According to the one article that benchmarked Cassandra vs. Persistit (on TitanDB only), https://getmanta.com/blog/metadata-repository-benchmark-graph-database-titan/, Persistit is way more performant, but isn't it a single-machine-only setup? I personally doubt that it is easily comparable to Cassandra.

Also, Cassandra 2.x is really legacy. Personally, I would not consider it as a storage backend for a production JanusGraph deployment (when I started getting acquainted with JanusGraph about 2 years ago, we used Cassandra 3.x from the very beginning).

Hi,

I have a question regarding the performance of basic operations in JanusGraph. We use obsolete Titan in our project (with Persistit storage backend, so the Titan version is 0.4.4) and we wanted to compare it with JanusGraph in order to get rid of the deprecated database but to maintain (or better improve) our current performance.

A brief report about our results is provided here: https://github.com/svitaluc/graphbench/tree/master/results
The source code of the benchmark can be found on the upper level here https://github.com/svitaluc/graphbench.

Based on the results I have a few questions:

- How is it possible that the import with JanusGraph is so slow in comparison with Titan? Is there any additional setting that we perhaps missed?

- Getting vertices with their neighbors/edges has tremendously worse performance than in the case of Titan. How is this possible? It seems to me as if the model was changed and the edges are no longer stored in the same row, because I do not think that there would be such a difference between Persistit and Cassandra. On the other hand, there would be no reason for such a model change, of course.

- Is the reasoning about cache behavior correct? Is there a part of cache that cannot be influenced? Maybe in the case of JanusGraph, this issue is negligible.

Thank you for any hint and response!

Cheers,
Lucie

Re: Unable to drop Remote JanusGraph

Dipen Jain <dip...@...>

Hi,

I was using a remote server with the inmemory backend. Even after dropping the graph I was still able to query the data; it didn't get removed, and I had to restart the server.

Hi,
> How do we know if the graph has been dropped?
As I am using Cassandra and ES as backends, I can see that the data has been dropped by making a direct request to these components.

>   Is there any way of clearing all the data, the schema without restarting the server.
I have not found one yet, but I have not looked into the Gremlin Server code. I do not have this requirement, so I did not search for it. Maybe using a proxy design pattern on the graph object would do the trick.
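
A minimal sketch of what such a proxy could look like, in plain Java with no JanusGraph dependency. Everything here is hypothetical and for illustration only: the class name GraphProxy, the opener/dropper callbacks, and the reset() method are not part of JanusGraph or Gremlin Server. The idea is that callers hold the proxy rather than the graph itself, so the underlying instance can be dropped and re-opened without the holder going stale.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

/**
 * Hypothetical sketch: hides the live graph instance behind a stable
 * reference. G stands in for the graph type (for example JanusGraph).
 */
final class GraphProxy<G> {
    private final Supplier<G> opener;   // assumption: knows how to open a fresh graph
    private final Consumer<G> dropper;  // assumption: knows how to drop an existing graph
    private G delegate;                 // the current live instance

    GraphProxy(Supplier<G> opener, Consumer<G> dropper) {
        this.opener = opener;
        this.dropper = dropper;
        this.delegate = opener.get();
    }

    /** Returns the current live graph instance. */
    synchronized G get() {
        return delegate;
    }

    /** Drops the current instance, then opens a fresh one in its place. */
    synchronized void reset() {
        dropper.accept(delegate);
        delegate = opener.get();
    }
}
```

With JanusGraph, the opener would presumably wrap JanusGraphFactory.open(...) and the dropper would wrap JanusGraphFactory.drop(...). That wiring is an assumption and untested here, and it only helps if the server-side code reaches the graph through the proxy instead of a fixed binding.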

Nicolas

Hi Nicolas,

How do we know if the graph has been dropped? You said you have to restart the server after doing that. I am doing the same, but I don't want to. Is there any way of clearing all the data and the schema without restarting the server?

Also, JanusGraphFactory.drop(getJanusGraph()) is the same as JanusGraphFactory.drop(graph), because that function returns the same object, and it still doesn't work. I don't want to restart the server as it's a remote server. How do I achieve that?

Hi,
I am able to drop the graph remotely using the script:
JanusGraphFactory.drop(graph);[]

After the script, I need to restart JanusGraph in order to re-create the graph.
Could you send the JanusGraph logs with the stack trace?

Also, instead of JanusGraphFactory.drop(getJanusGraph()); could you try JanusGraphFactory.drop(graph);

Kind regards,
Nicolas

Hi,

I'm looking to drop a graph, but JanusGraphFactory.drop(graph) doesn't work for me. I'm using Java to connect to the remote server and load/remove data.

Using the below properties file for remote-properties -

gremlin.remote.remoteConnectionClass=org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteConnection
gremlin.remote.driver.clusterFile=conf/remote-objects.yaml
gremlin.remote.driver.sourceName=g

Below file for remote-objects -

hosts: [hostname]
port: 8182
serializer: {
  className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0,
  config: {
    ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry]
  }
}

Using this to open the graph -
conf = new PropertiesConfiguration(propFileName);

// using the remote graph for queries
graph = JanusGraphFactory.open("inmemory");
g = graph.traversal().withRemote(conf);
return g;

Using this to drop the graph:
if (graph != null) {
    LOGGER.info("Dropping janusgraph function");
    JanusGraphFactory.drop(getJanusGraph());
}

Any help would be appreciated.
Thanks!

Re: Unable to drop Remote JanusGraph

Hi,

I want to do it programmatically, as I mentioned in the post: something that is the equivalent of a Gremlin query.

Try ./janusgraph.sh clean


