
Re: Caused by: org.janusgraph.core.JanusGraphException: A JanusGraph graph with the same instance id [0a000439355-0b2b58ca5c222] is already open. Might required forced shutdown.

hadoopmarc@...
 

Hi Srinivas,

This took a while, because the exception message is not really helpful. There are lots of unanswered threads about this issue (search for forced shutdown and janusgraph instance). Can you try the following:
mgmt = graph.openManagement()
ids = mgmt.getOpenInstances()
// for all ids but the current_id
// current_id = ???? worst case you kill your current instance and have to start the gremlin console again...
mgmt.forceCloseInstance(ghost_id)
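For completeness, a minimal sketch of the same procedure in embedded Java code. It assumes a hypothetical properties file path, and relies on the management API marking the current instance with a "(current)" suffix so that only ghost instances get closed:

```java
import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphFactory;
import org.janusgraph.core.schema.JanusGraphManagement;

public class CloseGhostInstances {
    public static void main(String[] args) {
        // hypothetical properties file path
        JanusGraph graph = JanusGraphFactory.open("conf/janusgraph-cql.properties");
        JanusGraphManagement mgmt = graph.openManagement();
        for (String id : mgmt.getOpenInstances()) {
            // the current instance is suffixed with "(current)"; skip it
            if (!id.contains("(current)")) {
                mgmt.forceCloseInstance(id);  // close the ghost instance
            }
        }
        mgmt.commit();
        graph.close();
    }
}
```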


Re: Caused by: org.janusgraph.core.JanusGraphException: A JanusGraph graph with the same instance id [0a000439355-0b2b58ca5c222] is already open. Might required forced shutdown.

Real Life Adventure
 

Hi,
         Tried the above mentioned config changes, but of no use. Still facing the same issue.
          It would be helpful if someone can reproduce and mitigate the issue.

Thanks,
M.Srinivas


On Fri, 12 Mar 2021 at 13:19, <hadoopmarc@...> wrote:
Hi Srinivas,

If you read https://docs.janusgraph.org/basics/configured-graph-factory/#configurationmanagementgraph you will see that the configured graph configs and their graphnames are stored in the system tables. Therefore, I suspect it has to do with the names of graphs in the system table and the graphs section of your yaml config file. So, we will just have to try a few things:
  1. what happens if you leave out the line "graph: /etc/opt/janusgraph/janusgraph.properties," ?
  2. what happens if you replace graph: with uniquename: in the graphs section?
  3. does the /etc/opt/janusgraph/janusgraph.properties contain a line with graph.graphname=uniquename ?
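To illustrate points 1-3, the graphs section of the yaml file and the matching properties file could look like the sketch below. The name "graph1" is just an assumed example, not your actual config:

```yaml
graphs: {
  graph1: /etc/opt/janusgraph/janusgraph.properties
}
```

with /etc/opt/janusgraph/janusgraph.properties then containing a matching line:

```properties
graph.graphname=graph1
```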
Best wishes,    Marc


Re: Threads are unresponsive for some time after a particular amount of data transfer(119MB)

Vinayak Bali
 

Hi Marc,

I am using cluster mode to connect to JanusGraph after creating the gremlin query. A sample of the code is as follows:
Cluster cluster = Cluster.build().addContactPoint("xx.xx.xx.xx")
        .port(8182)
        .serializer(serializer)                      // serializer instance configured elsewhere
        .resultIterationBatchSize(512)
        .maxContentLength(maxContentLength)          // raised to allow large responses
        .create();
Client connect = cluster.connect();
ResultSet submit = connect.submit(gremlin, options);
I went through many blogs, but did not find any useful information on running it in embedded mode. Also, is there any way to load data from the backend into memory to speed up performance?
Request you to guide me to solve these issues:
1. Connecting in embedded mode to bypass the gremlin driver issue.
2. Loading data from the backend into memory to speed up performance.

Thanks & Regards,
Vinayak

On Mon, Mar 15, 2021 at 12:21 PM <hadoopmarc@...> wrote:
Hi Vinayak,

For embedded use of janusgraph, see:
https://docs.janusgraph.org/getting-started/basic-usage/#loading-with-an-index-backend
and replace the properties file with the one currently used by gremlin server.

With embedded use, you can simply do (if your graph is not too large):
vertices = g.V().toList()
edges = g.E().toList()
subGraph = g.E().subgraph('sub').cap('sub').next()

Best wishes,   Marc


JanusGraph 0.5.3 SparkGraph Computer with YARN Error - java.lang.ClassCastException: org.apache.hadoop.yarn.proto.YarnServiceProtos$GetNewApplicationRequestProto cannot be cast to org.apache.hadoop.hbase.shaded.com.google.protobuf.Message

kndoan94@...
 

Hello!

I am currently trying to set up SparkGraphComputer using JanusGraph with a CQL storage backend and an Elasticsearch index backend, and am receiving an error when trying to complete a simple vertex count traversal in the gremlin console:

gremlin> hadoop_graph = GraphFactory.open('conf/hadoop-graph/olap/olap-cassandra-HadoopGraph-YARN.properties')
gremlin> hg = hadoop_graph.traversal().withComputer(SparkGraphComputer)
gremlin> hg.V().count()
>> ERROR org.apache.spark.SparkContext  - Error initializing SparkContext.
java.lang.ClassCastException: org.apache.hadoop.yarn.proto.YarnServiceProtos$GetNewApplicationRequestProto cannot be cast to org.apache.hadoop.hbase.shaded.com.google.protobuf.Message
Relevant cluster details
  • JanusGraph Version 0.5.3
  • Spark-Gremlin Version 3.4.6 
  • AWS EMR Release 5.23.0
  • Spark Version 2.4.0
  • Hadoop 2.8.5
  • Cassandra/CQL version 3.11.10 
Implementation Details
Properties File
###########
# Gremlin #
###########

gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.janusgraph.hadoop.formats.cql.CqlInputFormat
gremlin.hadoop.graphWriter=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat

gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.inputLocation=none
gremlin.hadoop.outputLocation=output


###################
# Index - Elastic #
###################

janusgraphmr.ioformat.conf.index.search.backend=elasticsearch

#Elastic Basic Auth
janusgraphmr.ioformat.conf.index.search.elasticsearch.http.auth.basic.password=[password]
janusgraphmr.ioformat.conf.index.search.elasticsearch.http.auth.basic.username=[username]
janusgraphmr.ioformat.conf.index.search.elasticsearch.http.auth.type=[authtype]


#Hosts
janusgraphmr.ioformat.conf.index.search.hostname=[hosts]
janusgraphmr.ioformat.conf.index.search.index-name=[myindexname]

metrics.console.interval=60000
metrics.enabled=false


#################
# Storage - CQL #
#################

schema.default=none

janusgraphmr.ioformat.conf.storage.backend=cql
janusgraphmr.ioformat.conf.storage.batch-loading=true
janusgraphmr.ioformat.conf.storage.buffer-size=10000
janusgraphmr.ioformat.conf.storage.cql.keyspace=[keyspace]


#HOSTS and PORTS
janusgraphmr.ioformat.conf.storage.hostname=[hosts]
janusgraphmr.ioformat.conf.storage.password=[password]
janusgraphmr.ioformat.conf.storage.username=[username]
cassandra.output.native.port=9042

#InputFormat configuration
cassandra.input.partitioner.class=org.apache.cassandra.dht.Murmur3Partitioner
cassandra.input.widerows=true

####################################
# SparkGraphComputer Configuration #
####################################
spark.master=yarn
spark.submit.deployMode=client

#Spark Job Configurations
spark.sql.shuffle.partitions=1000
spark.dynamicAllocation.enabled=true
spark.shuffle.service.enabled=true
spark.driver.maxResultSize=2G

# Gremlin and Serializer Configuration
gremlin.spark.persistContext=true
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=org.janusgraph.hadoop.serialize.JanusGraphKryoRegistrator
spark.executor.extraClassPath=/home/hadoop/janusgraph-full-0.5.3/*

# Special Yarn Configuration (WIP)
spark.yarn.archives=/home/hadoop/janusgraph-full-0.5.3/lib.zip
spark.yarn.jars=/home/hadoop/janusgraph-full-0.5.3/lib/*
spark.yarn.appMasterEnv.CLASSPATH=/etc/hadoop/conf:./lib.zip/*:
spark.yarn.dist.archives=/home/hadoop/janusgraph-full-0.5.3/lib.zip
spark.yarn.dist.files=/home/hadoop/janusgraph-full-0.5.3/lib/janusgraph-cql-0.5.3.jar
spark.yarn.shuffle.stopOnFailure=true


#Spark Driver and Executors
spark.driver.extraLibraryPath=/usr/lib/hadoop/lib/native
spark.executor.extraLibraryPath=/usr/lib/hadoop/lib/native
spark.executor.extraClassPath=janusgraph-cql-0.5.3.jar:./lib.zip/*:./lib/keys/*


In addition to this - I've also tried the same configuration file and implementation steps listed above on an older AWS EMR release, which was running Hadoop 2.7:
  • JanusGraph Version 0.5.3
  • Spark-Gremlin Version 3.4.6 
  • AWS EMR Release 5.11.4
  • Spark Version 2.2.1
  • Hadoop 2.7.3
  • Cassandra/CQL version 3.11.10 

And received a similar error in the gremlin console:

java.lang.IllegalStateException: java.lang.ClassCastException: org.apache.hadoop.yarn.proto.YarnServiceProtos$GetNewApplicationRequestProto cannot be cast to org.apache.hadoop.hbase.shaded.com.google.protobuf.Message



Does anyone have any tips on resolving these issues, or have any experience with implementing JanusGraph 0.5.3 on AWS EMR?

Thanks!


ScriptExecutor Deprecated but Used in gremlin.bat

fredrick.eisele@...
 

When I look at https://github.com/JanusGraph/janusgraph/blob/ce79ec50e0c882c9ccc62e73a2054bcdb2304ece/janusgraph-dist/src/assembly/static/bin/gremlin.bat#L119

I see use of the deprecated http://tinkerpop.apache.org/javadocs/3.2.10/full/org/apache/tinkerpop/gremlin/groovy/jsr223/ScriptExecutor.html
As of http://tinkerpop.apache.org/javadocs/3.4.10/full/org/apache/tinkerpop/gremlin/groovy/jsr223 the class has been removed.

It seems the fix is to replace

java %JAVA_OPTIONS% %JAVA_ARGS% -cp %CP% org.apache.tinkerpop.gremlin.groovy.jsr223.ScriptExecutor %strg%

with

java %JAVA_OPTIONS% %JAVA_ARGS% -cp "%LIBDIR%*;%EXTDIR%;" org.apache.tinkerpop.gremlin.console.Console %*

Should an issue be opened to fix this?

 


Re: JMX authentication for cassandra

hadoopmarc@...
 

Hi Vinayak,

This question is probably better addressed to:
https://cassandra.apache.org/community/

as I cannot remember having seen this discussed in the JanusGraph community.

Best wishes,

Marc


Re: Threads are unresponsive for some time after a particular amount of data transfer(119MB)

hadoopmarc@...
 

Hi Vinayak,

For embedded use of janusgraph, see:
https://docs.janusgraph.org/getting-started/basic-usage/#loading-with-an-index-backend
and replace the properties file with the one currently used by gremlin server.

With embedded use, you can simply do (if your graph is not too large):
vertices = g.V().toList()
edges = g.E().toList()
subGraph = g.E().subgraph('sub').cap('sub').next()
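In Java, embedded use amounts to opening the graph directly from the properties file, so traversals run in-process without the gremlin driver in between. A minimal sketch, assuming a hypothetical path to the same properties file gremlin server uses:

```java
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphFactory;

public class EmbeddedExample {
    public static void main(String[] args) {
        // hypothetical path: reuse the properties file from gremlin server
        JanusGraph graph = JanusGraphFactory.open("conf/janusgraph-cql.properties");
        GraphTraversalSource g = graph.traversal();
        // traversals now execute in-process against the storage backend
        Long vertexCount = g.V().count().next();
        System.out.println("vertices: " + vertexCount);
        graph.close();
    }
}
```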

Best wishes,   Marc


Re: Threads are unresponsive for some time after a particular amount of data transfer(119MB)

Vinayak Bali
 

Hi Marc,

I went through some blogs but didn't find a method to connect to JanusGraph in embedded mode using Java. We are using Cassandra as a backend and cql to connect to it. Not sure how I will achieve the following:
1. Connecting to JanusGraph from Java in embedded mode with data already present in Cassandra (cql).
2. Is there any way to get the data from Cassandra into memory?
Please share blogs or other approaches to successfully test the above.

Thanks & Regards,
Vinayak

On Fri, Mar 12, 2021 at 9:38 PM <hadoopmarc@...> wrote:
Hi Vinayak,

As the link shows, the issue is an issue in TinkerPop, so it cannot be solved here. Of course, you can look for workarounds. As sending result sets of multiple hundreds of Mb is not a typical client operation, you might consider opening the graph in embedded mode, that is without using gremlin server.

Best wishes,   Marc


Re: JMX authentication for cassandra

Vinayak Bali
 

Hi Marc,

The article was useful and I completed the JMX authentication successfully. But when I enable password authentication for Cassandra by changing the following lines in cassandra.yaml, it stops working.

Before: 
authenticator: AllowAllAuthenticator
authorizer: AllowAllAuthorizer
After:
authenticator: PasswordAuthenticator
authorizer: CassandraAuthorizer

# Authentication backend, implementing IAuthenticator; used to identify users
# Out of the box, Cassandra provides org.apache.cassandra.auth.{AllowAllAuthenticator,
# PasswordAuthenticator}.
#
# - AllowAllAuthenticator performs no checks - set it to disable authentication.
# - PasswordAuthenticator relies on username/password pairs to authenticate
#   users. It keeps usernames and hashed passwords in system_auth.credentials table.
#   Please increase system_auth keyspace replication factor if you use this authenticator.
# Authorization backend, implementing IAuthorizer; used to limit access/provide permissions
# Out of the box, Cassandra provides org.apache.cassandra.auth.{AllowAllAuthorizer,
# CassandraAuthorizer}.
#
# - AllowAllAuthorizer allows any action to any user - set it to disable authorization.
# - CassandraAuthorizer stores permissions in system_auth.permissions table. Please
#   increase system_auth keyspace replication factor if you use this authorizer.

The comments here suggest increasing the replication factor, but I don't think that's the issue. Please suggest a blog or the changes to be made to enable password authentication for Cassandra.

Thanks & Regards,
Vinayak 


Re: .JanusGraph/Elastic - Too many dynamic script compilations error for LIST type properties

Abhay Pandit
 

Hi Naresh,

I too used to get this exception. This was solved after moving to Janusgraph v0.5.2.

Hope this helps you.

Thanks,
Abhay


On Sat, 13 Mar 2021 at 22:20, <hadoopmarc@...> wrote:
Hi Naresh,

Yes, elasticsearch, I should have recognized the "painless" scripting! This can mean the following things:
  • your use case may be unusual; would it be possible to introduce a groupby step in spark that first gathers all property updates for a vertex into one update call?
  • the default value of script.max_compilations_rate may really be too low for your use case, so it is worth trying to increase it (the elasticsearch docs do not discourage it). I think this should be done outside janusgraph, just using the elastic APIs.
  • the janusgraph code for calling elasticsearch with scripts is suboptimal; I did not investigate this other than checking for existing issues (none). This option will not help you now; if you want to create an issue on janusgraph github, please specify your system setup, the update rates, etc. You would also have to check whether your issue also holds for janusgraph 0.4.1 or 0.5.3, because 0.3.x is end of life.
Best wishes,    Marc


Re: .JanusGraph/Elastic - Too many dynamic script compilations error for LIST type properties

hadoopmarc@...
 

Hi Naresh,

Yes, elasticsearch, I should have recognized the "painless" scripting! This can mean the following things:
  • your use case may be unusual; would it be possible to introduce a groupby step in spark that first gathers all property updates for a vertex into one update call?
  • the default value of script.max_compilations_rate may really be too low for your use case, so it is worth trying to increase it (the elasticsearch docs do not discourage it). I think this should be done outside janusgraph, just using the elastic APIs.
  • the janusgraph code for calling elasticsearch with scripts is suboptimal; I did not investigate this other than checking for existing issues (none). This option will not help you now; if you want to create an issue on janusgraph github, please specify your system setup, the update rates, etc. You would also have to check whether your issue also holds for janusgraph 0.4.1 or 0.5.3, because 0.3.x is end of life.
Best wishes,    Marc


Re: .JanusGraph/Elastic - Too many dynamic script compilations error for LIST type properties

Naresh Babu Y
 

Hello Marc,
Thanks for quick reply.

I am not using gremlin server.

I am using spark: I read all messages per batch, then open a JanusGraph transaction, add the batch records and commit it.

Here are the details:
JanusGraph version: 0.3.2
Storage system: Hbase
Index: elastic


Please let me know if you have any clue at the JanusGraph transaction level or any configuration (because I am not using gremlin server).

Thanks,
Naresh


On Sat, 13 Mar 2021, 9:54 pm , <hadoopmarc@...> wrote:
Hi Naresh,

I guess that the script that the error message refers to, is the script that your client executes remotely at gremlin server. You may want to study:
https://tinkerpop.apache.org/docs/current/reference/#parameterized-scripts

which, depending on how you coded the frequent updates, can dramatically diminish the time spent on script compilation by gremlin server. This is also what the exception message means with "use indexed, or scripts with parameters instead".

Best wishes,    Marc


Re: .JanusGraph/Elastic - Too many dynamic script compilations error for LIST type properties

hadoopmarc@...
 

Hi Naresh,

I guess that the script that the error message refers to, is the script that your client executes remotely at gremlin server. You may want to study:
https://tinkerpop.apache.org/docs/current/reference/#parameterized-scripts

which, depending on how you coded the frequent updates, can dramatically diminish the time spent on script compilation by gremlin server. This is also what the exception message means with "use indexed, or scripts with parameters instead".
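A hedged sketch of what a parameterized remote script looks like with the Java gremlin driver (client setup shown for completeness; the host and the binding name "vid" are arbitrary assumptions):

```java
import java.util.HashMap;
import java.util.Map;
import org.apache.tinkerpop.gremlin.driver.Client;
import org.apache.tinkerpop.gremlin.driver.Cluster;
import org.apache.tinkerpop.gremlin.driver.ResultSet;

public class ParameterizedScript {
    public static void main(String[] args) {
        Cluster cluster = Cluster.build().addContactPoint("localhost").port(8182).create();
        Client client = cluster.connect();
        // the script text stays constant, so gremlin server can compile it once
        // and cache it; only the bound parameter values change between calls
        Map<String, Object> params = new HashMap<>();
        params.put("vid", 1234L);
        ResultSet results = client.submit("g.V(vid).valueMap()", params);
        System.out.println(results.all().join());
        cluster.close();
    }
}
```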

Best wishes,    Marc


Re: Incomplete javadoc

hadoopmarc@...
 

Hi Boxuan,

Thanks for pointing this out. Now I can provide this link when needed.

However, there are still broken links to RelationIdentifier in janusgraph-core; see the last link in my original post.

Best wishes,    Marc


.JanusGraph/Elastic - Too many dynamic script compilations error for LIST type properties

Naresh Babu Y
 

Hi,
we are using janusgraph ( version 0.3.2) with elastic 6.

When updating a node/vertex with a property of LIST cardinality which has a mixed index, we frequently get the exception below and the data is not stored/updated:
{type=illegal_argument_exception, reason=failed to execute script, caused_by={type=general_script_exception, reason=Failed to compile inline script 
[if(ctx._source["property123"] == null) ctx._source["property123"] = [];ctx._source["property123"].add("jkkhhj#1");] using lang [painless], caused_by={type=circuit_breaking_exception, reason=[script] Too many dynamic script compilations within, max: [75/5m]; please use indexed, or scripts with parameters instead; this limit can be changed by the [script.max_compilations_rate] setting, bytes_wanted=0, bytes_limit=0}}}

We have a requirement to update a property of LIST type frequently, but changing max_compilations_rate to a large number is not a good idea.

Please let me know if there is any other option to handle this in janusgraph.

Thanks,
Naresh


Re: Count Query Optimization

Boxuan Li
 

Apart from rewriting the query, there are some config options (https://docs.janusgraph.org/basics/configuration-reference/#query) worth trying:

1) Turn on query.batch
2) Turn off query.fast-property
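In a janusgraph properties file these two options would look like the fragment below (just the toggles, not tuned values):

```properties
query.batch=true
query.fast-property=false
```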


Re: Count Query Optimization

AMIYA KUMAR SAHOO
 

Hi Marc,

Vinayak's query has a filter on an inV property (property1 = B), hence I did not stop at the edge itself.

If this kind of query is frequent, a decision can be made whether it makes sense to keep a duplicate of the same value at both the vertex and the edge. That would help eliminate the traversal to the out vertex.

Regards,
Amiya


Re: Incomplete javadoc

Boxuan Li
 

On Mar 12, 2021, at 11:56 PM, hadoopmarc@... wrote:



Re: Threads are unresponsive for some time after a particular amount of data transfer(119MB)

hadoopmarc@...
 

Hi Vinayak,

As the link shows, the issue is an issue in TinkerPop, so it cannot be solved here. Of course, you can look for workarounds. As sending result sets of multiple hundreds of Mb is not a typical client operation, you might consider opening the graph in embedded mode, that is without using gremlin server.

Best wishes,   Marc


Incomplete javadoc

hadoopmarc@...
 
