Re: Error when running JanusGraph with YARN and CQL

Varun Ganesh <operatio...@...>
 

Thanks a lot for responding Marc.

Yes, I had initially tried setting spark.yarn.archive with the path to spark-gremlin.zip. However, with this approach the containers were failing with the message "Error: Could not find or load main class org.apache.spark.deploy.yarn.ExecutorLauncher".

I have yet to understand the differences between the spark.yarn.archive and the HADOOP_GREMLIN_LIBS approaches. I will update this thread as I find out more.

Thank you,
Varun

On Friday, December 11, 2020 at 2:05:35 AM UTC-5 HadoopMarc wrote:
Hi Varun,

Good job. However, your last solution will only work with everything running on a single machine. So, indeed, there is something wrong with the contents of spark-gremlin.zip or with the way it is put in the executor's local working directory. Note that you already put /Users/my_comp/Downloads/janusgraph-0.5.2/lib/janusgraph-cql-0.5.2.jar explicitly on the executor classpath while it should have been available already through ./spark-gremlin.zip/*

Oh, I think I see now what is different. You have used spark.yarn.dist.archives, while the TinkerPop recipes use spark.yarn.archive. They differ in whether the jars are extracted from the zip. I guess either can be used, provided it is done consistently. You can use the Environment tab in the Spark web UI to inspect how things are picked up by Spark.
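
One way to check the first suspicion above is to look inside the zip for the class that the executors cannot find. What follows is a minimal Python sketch, assuming that spark-gremlin.zip holds ordinary jar files; the function name find_class_in_zip is hypothetical and only serves as an illustration:

```python
import io
import zipfile


def find_class_in_zip(zip_path, class_name):
    """Return the jars inside zip_path that contain class_name.

    Hypothetical helper: assumes the archive holds ordinary .jar files.
    """
    # A class a.b.C is stored in a jar as the entry a/b/C.class
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    with zipfile.ZipFile(zip_path) as outer:
        for name in outer.namelist():
            if not name.endswith(".jar"):
                continue
            # A jar is itself a zip file, so read it into memory and open it
            with zipfile.ZipFile(io.BytesIO(outer.read(name))) as jar:
                if entry in jar.namelist():
                    hits.append(name)
    return hits


# Example call (path and class name taken from this thread):
# find_class_in_zip("/tmp/spark-gremlin.zip",
#                   "org.janusgraph.hadoop.formats.util.HadoopInputFormat")
```

An empty list would suggest that the jar providing the class never made it into the zip. A non-empty list would point instead at how the archive is unpacked or referenced on the executor classpath.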

Best wishes,    Marc

On Thursday, December 10, 2020 at 20:23:32 UTC+1, Varun Ganesh wrote:
Answering my own question: I was able to fix the above error and successfully run the count job after explicitly adding /Users/my_comp/Downloads/janusgraph-0.5.2/lib/* to spark.executor.extraClassPath.

But I am not yet sure as to why that was needed. I had assumed that adding spark-gremlin.zip to the path would have provided the required dependencies.

On Thursday, December 10, 2020 at 1:00:24 PM UTC-5 Varun Ganesh wrote:
An update on this: I tried setting the env var below:

export HADOOP_GREMLIN_LIBS=$GREMLIN_HOME/lib

After doing this I was able to successfully run the tinkerpop-modern.kryo example from the Recipes documentation
(though the guide at http://yaaics.blogspot.com/2017/07/configuring-janusgraph-for-spark-yarn.html explicitly asks us to ignore this)

Unfortunately, it is still not working with CQL. But the error is now different. Please see below:

12:46:33 ERROR org.apache.spark.scheduler.TaskSetManager - Task 3 in stage 0.0 failed 4 times; aborting job
org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 0.0 failed 4 times, most recent failure: Lost task 3.3 in stage 0.0 (TID 9, 192.168.1.160, executor 2): java.lang.NoClassDefFoundError: org/janusgraph/hadoop/formats/util/HadoopInputFormat
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:756)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
at java.security.AccessController.doPrivileged(Native Method)
... (skipping)
Caused by: java.lang.ClassNotFoundException: org.janusgraph.hadoop.formats.util.HadoopInputFormat
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
... 130 more

Is there some additional dependency that I may need to add?

Thanks in advance!
On Wednesday, December 9, 2020 at 11:49:29 PM UTC-5 Varun Ganesh wrote:
Hello,

I am trying to run SparkGraphComputer on a JanusGraph backed by Cassandra and ElasticSearch. I have previously verified that I am able to run SparkGraphComputer on a local Spark standalone cluster.

I am now trying to run it on YARN. I have a local YARN cluster running and I have verified that it can run Spark jobs.

I followed the following links:

And here is my read-cql-yarn.properties file:
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.janusgraph.hadoop.formats.cql.CqlInputFormat
gremlin.hadoop.graphWriter=org.apache.hadoop.mapreduce.lib.output.NullOutputFormat

gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.inputLocation=none
gremlin.hadoop.outputLocation=output
gremlin.spark.persistContext=true

#
# JanusGraph Cassandra InputFormat configuration
#
# These properties define the connection settings that were used while writing data to JanusGraph.
janusgraphmr.ioformat.conf.storage.backend=cql
# This specifies the hostname & port for Cassandra data store.
janusgraphmr.ioformat.conf.storage.hostname=127.0.0.1
janusgraphmr.ioformat.conf.storage.port=9042
# This specifies the keyspace where data is stored.
janusgraphmr.ioformat.conf.storage.cql.keyspace=janusgraph
# This defines the indexing backend configuration used while writing data to JanusGraph.
janusgraphmr.ioformat.conf.index.search.backend=elasticsearch
janusgraphmr.ioformat.conf.index.search.hostname=127.0.0.1
# Use the appropriate properties for the backend when using a different storage backend (HBase) or indexing backend (Solr).

#
# Apache Cassandra InputFormat configuration
#
cassandra.input.partitioner.class=org.apache.cassandra.dht.Murmur3Partitioner
cassandra.input.widerows=true

#
# SparkGraphComputer Configuration
#
spark.master=yarn
spark.submit.deployMode=client
spark.executor.memory=1g

spark.yarn.dist.archives=/tmp/spark-gremlin.zip
spark.yarn.dist.files=/Users/my_comp/Downloads/janusgraph-0.5.2/lib/janusgraph-cql-0.5.2.jar
spark.yarn.appMasterEnv.CLASSPATH=/Users/my_comp/Downloads/hadoop-2.7.2/etc/hadoop:./spark-gremlin.zip/*
spark.executor.extraClassPath=/Users/my_comp/Downloads/hadoop-2.7.2/etc/hadoop:/Users/my_comp/Downloads/janusgraph-0.5.2/lib/janusgraph-cql-0.5.2.jar:./spark-gremlin.zip/*

spark.driver.extraLibraryPath=/Users/my_comp/Downloads/hadoop-2.7.2/lib/native:/Users/my_comp/Downloads/hadoop-2.7.2/lib/native/Linux-amd64-64
spark.executor.extraLibraryPath=/Users/my_comp/Downloads/hadoop-2.7.2/lib/native:/Users/my_comp/Downloads/hadoop-2.7.2/lib/native/Linux-amd64-64

spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=org.janusgraph.hadoop.serialize.JanusGraphKryoRegistrator

After a bunch of trial and error, I was able to get it to a point where I see containers starting up in my YARN ResourceManager UI (port 8088).

Here is the code I am running (it's a simple count):
gremlin> graph = GraphFactory.open('conf/hadoop-graph/read-cql-yarn.properties')
==>hadoopgraph[cqlinputformat->nulloutputformat]
gremlin> g = graph.traversal().withComputer(SparkGraphComputer)
==>graphtraversalsource[hadoopgraph[cqlinputformat->nulloutputformat], sparkgraphcomputer]
gremlin> g.V().count()

However I am encountering the following failure:
18:49:03 ERROR org.apache.spark.scheduler.TaskSetManager - Task 2 in stage 0.0 failed 4 times; aborting job
org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 0.0 failed 4 times, most recent failure: Lost task 2.3 in stage 0.0 (TID 10, 192.168.1.160, executor 1): java.lang.IllegalStateException: unread block data
at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2862)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1682)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2366)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2290)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2148)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1647)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:483)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:441)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:370)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

I would really appreciate it if someone could shed some light on this error and advise on next steps!

Thank you!


Re: How to improve traversal query performance

HadoopMarc <bi...@...>
 

Hi Manabu,

Yes, providing an example graph works much better in exploring the problem space.  I am afraid, though, that I did not find much that will help you out.
  • on a single machine with cassandra and using gremlin console with embedded janusgraph, the total query times stated by profile() deviated significantly from the experienced wall clock times even when everything had a cold start:
                                      total profile (ms)    System.currentTimeMillis (ms)
    repeat query, no query-batch             1113                      1775
    repeat query                              622                      1096
    repeat query (warm caches)                 40                       360
    Did you do any wall clock query performance tests on your production system with warmed caches? Results might be better - or worse - than suggested by profile().
  • the values 5, 161, 9, 8,... under the repeat step add up to the number of touched edges (2828) in the graph. For this generated graph the number of traversers (2925) is dominated by this number of edges. Trying to bulk any intermediate results using sack will have little effect (contrary to what I suggested earlier). From another perspective, you can check that the following query without any path references still results in the same number of 2925 traversers:
    g.V().has('serial', within(startIds)).repeat(inE('assembled').outV()).emit().profile()
  • other people wanting to play with this graph should use the following line in Manabu's code:
    columns = line.split(' ', -1)
So, concluding, there does not seem to be much you can do about the query: you simply want a large result set from a traversal with multiple steps. Depending on the size of your graph, you could hold the graph in memory using the inmemory backend, or you could replace cassandra with cql and put it on infrastructure with SSD storage. Of course, you could also precompute and store results, or split up the query with repeat().times(1), repeat().times(2), etc. for faster intermediate results.
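
To put the timing table above in numbers, the gap between the profile() total and the wall-clock time can be computed per row. This is a minimal Python sketch that uses only the six values from the table; the variable names are illustrative:

```python
# (label, total profile() time in ms, wall-clock time in ms) from the table above
timings = [
    ("repeat query, no query-batch", 1113, 1775),
    ("repeat query", 622, 1096),
    ("repeat query (warm caches)", 40, 360),
]

# Absolute gap and ratio between wall-clock time and profile() time per row
overheads = [
    (label, wall - prof, round(wall / prof, 2))
    for label, prof, wall in timings
]

for label, gap_ms, ratio in overheads:
    print(f"{label}: {gap_ms} ms not covered by profile(), ratio {ratio}")
```

The absolute gap is a few hundred milliseconds in all three runs (662, 474 and 320 ms), so with warm caches it accounts for most of the experienced time.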

Best wishes,    Marc


On Tuesday, December 8, 2020 at 08:56:03 UTC+1, Manabu Kotani wrote:

Hi Marc,

Here are the profile outputs for the queries I tried.

1. g.V().has('serial',within('XXXXXX','YYYYYY')).inE('assembled').outV()
----------------------------------------------------------------------
gremlin> g.V().has('serial', within('1654145144','1648418968','1652445288','1654952168','1653379120','1654325440','1653383216','1658298568','1649680536','1649819672','1654964456','1649729552','1656103144','1655460032','1656111336','1654669360')).inE('assembled').outV().profile()
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
JanusGraphStep([],[serial.within([1654145144, 1...                    16          16          18.860    63.13
    \_condition=((serial = 1654145144 OR serial = 1648418968 OR serial = 1652445288 OR serial = 1654952168 OR
               serial = 1653379120 OR serial = 1654325440 OR serial = 1653383216 OR serial = 1658298568 OR se
               rial = 1649680536 OR serial = 1649819672 OR serial = 1654964456 OR serial = 1649729552 OR seri
               al = 1656103144 OR serial = 1655460032 OR serial = 1656111336 OR serial = 1654669360))
    \_orders=[]
    \_isFitted=true
    \_isOrdered=true
    \_query=multiKSQ[16]@2000
    \_index=bySerial
  optimization                                                                                 0.058
  optimization                                                                                 0.694
  backend-query                                                       16                      17.823
    \_query=bySerial:multiKSQ[16]@2000
    \_limit=2000
JanusGraphVertexStep(IN,[assembled],vertex)                           73          73          11.016    36.87
    \_condition=type[assembled]
    \_orders=[]
    \_isFitted=true
    \_isOrdered=true
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    \_multi=true
    \_vertices=16
  optimization                                                                                 0.205
  backend-query                                                       73                       9.332
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
                                            >TOTAL                     -           -          29.877        -

2. g.V().has('serial',within('XXXXXX','YYYYYY')).as('a').in('assembled').inE('assembled').where(outV(), neq('a')).outV()            // query not tested
----------------------------------------------------------------------
gremlin> g.V().has('serial', within('1654145144','1648418968','1652445288','1654952168','1653379120','1654325440','1653383216','1658298568','1649680536','1649819672','1654964456','1649729552','1656103144','1655460032','1656111336','1654669360')).as('a').in('assembled').inE('assembled').where(outV().is(neq('a'))).outV().profile()
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
JanusGraphStep([],[serial.within([1654145144, 1...                    16          16          19.980    26.52
    \_condition=((serial = 1654145144 OR serial = 1648418968 OR serial = 1652445288 OR serial = 1654952168 OR
               serial = 1653379120 OR serial = 1654325440 OR serial = 1653383216 OR serial = 1658298568 OR se
               rial = 1649680536 OR serial = 1649819672 OR serial = 1654964456 OR serial = 1649729552 OR seri
               al = 1656103144 OR serial = 1655460032 OR serial = 1656111336 OR serial = 1654669360))
    \_orders=[]
    \_isFitted=true
    \_isOrdered=true
    \_query=multiKSQ[16]@2000
    \_index=bySerial
  optimization                                                                                 0.026
  optimization                                                                                 0.588
  backend-query                                                       16                      18.813
    \_query=bySerial:multiKSQ[16]@2000
    \_limit=2000
JanusGraphVertexStep(IN,[assembled],vertex)                           73          73           6.521     8.66
    \_condition=type[assembled]
    \_orders=[]
    \_isFitted=true
    \_isOrdered=true
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    \_multi=true
    \_vertices=16
  optimization                                                                                 0.154
  backend-query                                                       73                       5.310
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
JanusGraphVertexStep(IN,[assembled],edge)                           2578        2578          20.170    26.77
    \_condition=type[assembled]
    \_orders=[]
    \_isFitted=true
    \_isOrdered=true
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    \_multi=true
    \_vertices=59
  optimization                                                                                 0.032
  backend-query                                                     2578                      10.266
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
TraversalFilterStep([EdgeVertexStep(OUT), Profi...                  2578        2578          21.824    28.97
  EdgeVertexStep(OUT)                                               2578        2578           4.776
  IsStep(neq(a))                                                                               6.172
EdgeVertexStep(OUT)                                                 2578        2578           6.842     9.08
                                            >TOTAL                     -           -          75.338        -


3. Results I want to get.
----------------------------------------------------------------------
g.V().has('serial', within('1654145144','1648418968','1652445288','1654952168','1653379120','1654325440','1653383216','1658298568','1649680536','1649819672','1654964456','1649729552','1656103144','1655460032','1656111336','1654669360')).as('a').repeat(inE('assembled').as('b').outV().as('c').simplePath()).emit().select('a').id().as('parent').select('b').values('work_date').as('work_date').select('c').values('serial').as('child').select('parent','child','work_date').order().by('parent').by('child').by('work_date').profile()
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
JanusGraphStep([],[serial.within([1654145144, 1...                    16          16          24.028     4.95
    \_condition=((serial = 1654145144 OR serial = 1648418968 OR serial = 1652445288 OR serial = 1654952168 OR
               serial = 1653379120 OR serial = 1654325440 OR serial = 1653383216 OR serial = 1658298568 OR se
               rial = 1649680536 OR serial = 1649819672 OR serial = 1654964456 OR serial = 1649729552 OR seri
               al = 1656103144 OR serial = 1655460032 OR serial = 1656111336 OR serial = 1654669360))
    \_orders=[]
    \_isFitted=true
    \_isOrdered=true
    \_query=multiKSQ[16]@2000
    \_index=bySerial
  optimization                                                                                 0.074
  optimization                                                                                 1.256
  backend-query                                                       16                     312.583
    \_query=bySerial:multiKSQ[16]@2000
    \_limit=2000
RepeatStep([JanusGraphVertexStep(IN,[assembled]...                  2925        2925         272.924    56.26
  JanusGraphVertexStep(IN,[assembled],edge)@[b]                     2925        2925         223.728
    \_condition=type[assembled]
    \_orders=[]
    \_isFitted=true
    \_isOrdered=true
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    \_multi=true
    \_vertices=9
    optimization                                                                               0.203
    backend-query                                                      5                       1.557
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.020
    backend-query                                                    161                       2.356
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.125
    backend-query                                                      9                      25.853
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.023
    backend-query                                                      8                       2.168
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.024
    backend-query                                                      0                       1.808
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.026
    backend-query                                                      5                       1.354
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.024
    backend-query                                                    161                       1.989
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.040
    backend-query                                                      9                       3.490
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.026
    backend-query                                                      0                       2.231
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.061
    backend-query                                                      5                       1.877
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.027
    backend-query                                                    161                       4.645
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.056
    backend-query                                                     10                       2.554
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.073
    backend-query                                                      9                       4.274
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.026
    backend-query                                                      0                       1.199
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.026
    backend-query                                                      5                       1.165
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.025
    backend-query                                                    161                       8.010
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.032
    backend-query                                                      9                       1.542
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.032
    backend-query                                                      4                       5.402
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.038
    backend-query                                                      5                       4.173
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.029
    backend-query                                                    161                       4.113
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.056
    backend-query                                                      9                       1.617
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.035
    backend-query                                                      0                       1.517
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.044
    backend-query                                                      5                       1.522
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.045
    backend-query                                                    161                       1.985
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.029
    backend-query                                                      9                       1.435
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.028
    backend-query                                                      0                       1.034
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.021
    backend-query                                                      3                       1.108
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.024
    backend-query                                                    161                       1.785
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.036
    backend-query                                                      9                       7.190
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.032
    backend-query                                                      8                      12.321
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.044
    backend-query                                                      0                       1.926
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.032
    backend-query                                                      5                       1.782
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.028
    backend-query                                                    161                       3.398
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.031
    backend-query                                                      9                       1.412
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.026
    backend-query                                                      0                       1.212
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.035
    backend-query                                                      5                       1.283
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.028
    backend-query                                                    161                       2.149
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.035
    backend-query                                                      9                       1.415
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.025
    backend-query                                                      1                       1.214
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.025
    backend-query                                                      3                       1.313
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.027
    backend-query                                                    161                       2.004
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.038
    backend-query                                                      9                       8.265
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.031
    backend-query                                                      0                       1.718
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.028
    backend-query                                                      5                       1.489
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.029
    backend-query                                                    161                       2.066
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.029
    backend-query                                                      9                       1.361
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.024
    backend-query                                                      2                       1.454
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.026
    backend-query                                                      5                       1.234
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.025
    backend-query                                                    161                       1.819
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.027
    backend-query                                                      9                       1.361
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.026
    backend-query                                                      0                       1.136
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.028
    backend-query                                                      5                       1.265
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.029
    backend-query                                                    161                      10.425
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.040
    backend-query                                                      9                       2.437
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.050
    backend-query                                                      0                       1.462
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.049
    backend-query                                                      5                       2.208
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.028
    backend-query                                                    163                       2.415
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.035
    backend-query                                                      9                       1.252
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.027
    backend-query                                                      0                       1.164
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.032
    backend-query                                                      4                       1.335
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.028
    backend-query                                                    161                       1.944
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.028
    backend-query                                                      9                       1.473
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.026
    backend-query                                                      0                       1.114
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.025
    backend-query                                                      3                       1.279
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.025
    backend-query                                                    161                       1.867
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.033
    backend-query                                                      9                       1.463
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.027
    backend-query                                                      0                       1.169
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
  EdgeVertexStep(OUT)@[c]                                           2925        2925           7.733
  PathFilterStep(simple)                                            2925        2925          10.508
  JanusGraphMultiQueryStep(RepeatEndStep)                           2925        2925          14.827
  RepeatEndStep                                                     2925        2925           9.754
SelectOneStep(last,a)                                               2925        2925           8.340     1.72
IdStep@[parent]                                                     2925        2925           7.347     1.51
SelectOneStep(last,b)                                               2925        2925           8.690     1.79
JanusGraphPropertiesStep([work_date],value)@[wo...                  2925        2925          35.051     7.22
SelectOneStep(last,c)                                               2925        2925           9.512     1.96
JanusGraphPropertiesStep([serial],value)@[child]                    2925        2925          79.337    16.35
    \_condition=type[serial]
    \_orders=[]
    \_isFitted=true
    \_isOrdered=true
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@811c505d
    \_multi=true
    \_vertices=302
  optimization                                                                                 0.044
  backend-query                                                      302                      53.962
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@811c505d
SelectStep(last,[parent, child, work_date])                         2925        2925          10.705     2.21
OrderGlobalStep([[value(parent), asc], [value(c...                  2925        2925          29.210     6.02
                                            >TOTAL                     -           -         485.149        -


Best regards,
Manabu

On Thursday, November 26, 2020 at 16:07:27 UTC+9 HadoopMarc wrote:
Hi Manabu,

OK, I think for this graph structure your initial query is fine for getting the right output results. Still, to better understand the impact on the performance of using sack() you might want to split up your query during experimentation:
  1. g.V().has('serial',within('XXXXXX','YYYYYY')).inE('assembled').outV()
  2. g.V().has('serial',within('XXXXXX','YYYYYY')).as('a').in('assembled').inE('assembled').where(outV(), neq('a')).outV()            // query not tested

Note that I did not use the simplePath() step, because it probably precludes bulking once the appropriate sack() steps are added.
If you want others to step in for getting the sack() steps right, please provide the gremlin steps to create your sample graph and the query you have already tried with query output and profile() output.
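
A rough illustration of why path tracking can get in the way of bulking, written in plain Python rather than Gremlin. It is a simplified model and an assumption about the mechanism, not TinkerPop or JanusGraph code: traversers sitting on the same vertex are merged into one traverser with a bulk count, unless each has to remember its own path (as simplePath() requires). The edge list is the sample graph posted in this thread; the names EDGES, children, step and two_hops are made up for the sketch.

```python
from collections import Counter

# Sample edges from this thread as (child, parent): child --assembled--> parent.
EDGES = [("B", "A"), ("C", "A"), ("D", "B"), ("E", "B"), ("E", "C"), ("F", "C")]


def children(parent):
    """Vertices reached from `parent` by inE('assembled').outV()."""
    return [child for child, p in EDGES if p == parent]


def step(traversers, track_path):
    """Advance every traverser one hop, merging equal traversers into a bulk."""
    merged = Counter()
    for path, bulk in traversers.items():
        for child in children(path[-1]):
            # Without path tracking, only the current vertex identifies a traverser.
            key = path + (child,) if track_path else (child,)
            merged[key] += bulk
    return merged


def two_hops(start, track_path):
    """Run two hops from `start` and return the resulting traversers."""
    traversers = Counter({(start,): 1})
    for _ in range(2):
        traversers = step(traversers, track_path)
    return traversers


without_path = two_hops("A", track_path=False)
with_path = two_hops("A", track_path=True)
print(len(without_path), dict(without_path))
print(len(with_path), dict(with_path))
```

In this model, vertex E is reached through both B and C. Without path tracking the two arrivals collapse into one traverser with bulk 2 (three traversers in total); with path tracking they stay separate (four traversers).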

Best wishes,      Marc


On Thursday, November 26, 2020 at 00:29:33 UTC+1 Manabu Kotani wrote:
Hi Marc,

Sorry, I forgot an attachment (Image of tree structure).

The relationships between vertices and edges are as follows.
(Label:item, serial:A)<--[Label:assembled, work_date:2020-11-24]--(Label:item, serial:B)
(Label:item, serial:A)<--[Label:assembled, work_date:2020-11-25]--(Label:item, serial:C)  
(Label:item, serial:B)<--[Label:assembled, work_date:2020-11-23]--(Label:item, serial:D)    
(Label:item, serial:B)<--[Label:assembled, work_date:2020-11-22]--(Label:item, serial:E)      
(Label:item, serial:C)<--[Label:assembled, work_date:2020-11-21]--(Label:item, serial:E)        
(Label:item, serial:C)<--[Label:assembled, work_date:2020-11-20]--(Label:item, serial:F)          
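
For readers who want to check the expected rows by hand, the relationships above can be walked outside the database. This is a minimal sketch in plain Python (not Gremlin, and not how JanusGraph executes the query; the names ASSEMBLED and descendants are made up for the illustration). It follows incoming assembled edges level by level and emits one row per edge, which is roughly what repeat(...).emit() combined with simplePath() is meant to produce.

```python
from collections import deque

# (parent, child, work_date) for each edge child --assembled--> parent listed above.
ASSEMBLED = [
    ("A", "B", "2020-11-24"),
    ("A", "C", "2020-11-25"),
    ("B", "D", "2020-11-23"),
    ("B", "E", "2020-11-22"),
    ("C", "E", "2020-11-21"),
    ("C", "F", "2020-11-20"),
]


def descendants(root):
    """Return sorted (root, child, work_date) rows for every edge below `root`."""
    rows = []
    queue = deque([(root, (root,))])  # (current vertex, path taken so far)
    while queue:
        vertex, path = queue.popleft()
        for parent, child, work_date in ASSEMBLED:
            # Skip vertices already on this path, as simplePath() would.
            if parent == vertex and child not in path:
                rows.append((root, child, work_date))
                queue.append((child, path + (child,)))
    return sorted(rows)


for row in descendants("A"):
    print(*row, sep=", ")
```

For serial A this yields six rows, with E appearing twice because it is assembled into both B and C.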



Best regards,
Manabu

On Wednesday, November 25, 2020 at 19:55:54 UTC+9 HadoopMarc wrote:
Hi Manabu,

Which edges are present between vertices A, B, C, D, E and F?

If there are only edges A-B, A-C, A-D, A-E, A-F, you do not need repeat().

Best wishes,    Marc


On Wednesday, November 25, 2020 at 09:22:34 UTC+1 Manabu Kotani wrote:
Hi Marc,

Thank you for your quick reply.

Sorry that my explanation was insufficient.
I have a graph like the one below. (There are 3 levels in this figure, but the depth is not necessarily 3 levels.)


When I query by "A" for the property "serial", I would like to get results like these.
1. A, B, 2020-11-24
2. A, C, 2020-11-25
3. A, D, 2020-11-23
4. A, E, 2020-11-21
5. A, E, 2020-11-22
6. A, F, 2020-11-20

In this situation, how should I use the until() step?

Sorry for my limited understanding, I've just started to learn Gremlin.

Best regards,
Manabu

On Wednesday, November 25, 2020 at 15:46:35 UTC+9 HadoopMarc wrote:
Hi Manabu,

repeat()/simplePath()/emit() can have valid uses, although normally you combine them with the times() or until() step to limit the number of repeats. The profile from your query suggests that the repeat step never takes effect, that is, each traversal takes only a single step from parent to child. The repeat step is not wrong in itself, but if it is not needed, you are better off leaving it out as long as you do not know its impact on performance.
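
To make the role of times() and until() concrete, here is a rough analogy in plain Python (not Gremlin). The chain P, Q, R, S is hypothetical data, and walk, max_depth and stop_at are names invented for this sketch. max_depth stands in for times(n) and stop_at for an until(...) condition; without either, the loop only ends when no children remain.

```python
# Hypothetical chain: S is assembled into R, R into Q, Q into P.
CHILDREN = {"P": ["Q"], "Q": ["R"], "R": ["S"], "S": []}


def walk(start, max_depth=None, stop_at=None):
    """Emit (depth, vertex) for the vertices found below `start`.

    max_depth plays the role of times(n); stop_at plays the role of until(...).
    """
    emitted = []
    frontier = [start]
    depth = 0
    while frontier and (max_depth is None or depth < max_depth):
        depth += 1
        next_frontier = []
        for vertex in frontier:
            for child in CHILDREN[vertex]:
                emitted.append((depth, child))
                # A traverser that meets the until-condition leaves the loop.
                if child != stop_at:
                    next_frontier.append(child)
        frontier = next_frontier
    return emitted


print(walk("P"))               # unbounded: runs until no children remain
print(walk("P", max_depth=1))  # like times(1): direct children only
print(walk("P", stop_at="R"))  # like until(...): stops descending at R
```

If every start vertex only has direct children, the bounded and unbounded walks return the same rows, which matches the remark above that the repeat step adds nothing in that case.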

Best wishes,    Marc

On Tuesday, November 24, 2020 at 08:43:36 UTC+1 Manabu Kotani wrote:
Hi Marc,

Thank you for your reply.

I'm reading the reference docs you pointed to about sack()/barrier(), but I have not been able to understand them yet.

One question.
What does the advice below mean? Should the repeat()/simplePath()/emit() steps not be used?

  • in the current traversal the repeat(), simplePath() and emit() steps have no effect if the children do not assemble children themselves. So you can leave these steps out for clarity and to be sure that they do not influence the janusgraph execution plan
Best regards,
Manabu

On Saturday, November 21, 2020 at 20:13:28 UTC+9 HadoopMarc wrote:
Hi Manabu,

Without knowing if the considerations below will really help, you may try the following:
Best wishes,    Marc

On Thursday, November 19, 2020 at 02:37:48 UTC+1 Manabu Kotani wrote:
Hi All,

I'm testing traversal query performance.
My query (please see below) takes about 1.8 sec.

Is there a way to improve performance (faster than 1.8 sec)?
I hope to get it below 500 ms.

1.Environment:
JanusGraph (0.5.2) + Cassandra (3.11.0) on Docker Desktop (Windows)

2.Schema:
------------------------------------------------------------------------------------------------
Vertex Label Name              | Partitioned | Static                                             |
---------------------------------------------------------------------------------------------------
item                           | false       | false                                              |
---------------------------------------------------------------------------------------------------
Edge Label Name                | Directed    | Unidirected | Multiplicity                         |
---------------------------------------------------------------------------------------------------
assembled                      | true        | false       | MULTI                                |
---------------------------------------------------------------------------------------------------
Property Key Name              | Cardinality | Data Type                                          |
---------------------------------------------------------------------------------------------------
serial                         | SINGLE      | class java.lang.String                             |
work_date                      | SINGLE      | class java.util.Date                               |
---------------------------------------------------------------------------------------------------
Vertex Index Name              | Type        | Unique    | Backing        | Key:           Status |
---------------------------------------------------------------------------------------------------
bySerial                       | Composite   | false     | internalindex  | serial:       ENABLED |
---------------------------------------------------------------------------------------------------
Edge Index (VCI) Name          | Type        | Unique    | Backing        | Key:           Status |
---------------------------------------------------------------------------------------------------
byWorkDate                     | Composite   | false     | internalindex  | work_date:    ENABLED |
---------------------------------------------------------------------------------------------------
Relation Index                 | Type        | Direction | Sort Key       | Order    |     Status |
---------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------  

3.Query:
g.V().has('serial',within('XXXXXX','YYYYYY',....)).as('a')    // <- 100 search keys in within()
.repeat(inE('assembled').as('b').outV().as('c').simplePath())
.emit()
.select('a').values('serial').as('parent')
.select('b').values('work_date').as('work_date')
.select('c').values('serial').as('child')
.select('parent','child','work_date')
.order().by('parent').by('child').by('work_date')
----------------------------------------------------------------------------------------------------------- 
 
4.Query Profile:
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
JanusGraphStep([],[serial.within([XXXXXX...                   100         100         159.582     8.89
    \_condition=((serial = XXXXXX OR serial = YYYYYY OR .... <- 100 search keys))
    \_orders=[]
    \_isFitted=true
    \_isOrdered=true
    \_query=multiKSQ[100]@2000
    \_index=bySerial
  optimization                                                                                 0.018
  optimization                                                                                 6.744
  backend-query                                                      100                    1074.225
    \_query=bySerial:multiKSQ[100]@2000
    \_limit=2000
RepeatStep([JanusGraphVertexStep(IN,[assembled]...                 20669       20669         857.001    47.74
  JanusGraphVertexStep(IN,[assembled],edge)@[b]                    20669       20669         633.529
    \_condition=type[assembled]
    \_orders=[]
    \_isFitted=true
    \_isOrdered=true
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    \_multi=true
    \_vertices=204
    optimization                                                                               0.477
    backend-query                                                    228                       2.076
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.150
    backend-query                                                      0                      43.366
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.093
    backend-query                                                    229                       1.978
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.107
    backend-query                                                      0                      32.738
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.111
    backend-query                                                    229                       1.577
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.107
    backend-query                                                      0                      17.827
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.085
    backend-query                                                    229                       1.517
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.108
    backend-query                                                      0                       5.729
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.071
    backend-query                                                    228                       1.993
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.083
    backend-query                                                      0                       3.335
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.150
    backend-query                                                    229                       1.890
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.137
    backend-query                                                      0                      32.593
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.110
    backend-query                                                    229                       2.253
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.069
    backend-query                                                    230                       1.624
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.070
    backend-query                                                      0                      12.797
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.116
    backend-query                                                    229                       1.579
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.090
    backend-query                                                      0                       5.764
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.107
    backend-query                                                    229                       1.651
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.134
    backend-query                                                      0                      22.327
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.074
    backend-query                                                    229                       1.756
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.075
    backend-query                                                      0                      11.145
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.069
    backend-query                                                    229                       1.947
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.086
    backend-query                                                      0                       3.727
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.100
    backend-query                                                    116                       1.492
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.085
    backend-query                                                      0                      27.159
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.132
    backend-query                                                    229                       1.524
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.100
    backend-query                                                      0                       7.173
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.075
    backend-query                                                    230                       1.880
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.114
    backend-query                                                      0                       3.696
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.085
    backend-query                                                    228                       1.645
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.143
    backend-query                                                      0                       2.924
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.105
    backend-query                                                    229                       2.010
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.316
    backend-query                                                      0                       3.806
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.095
    backend-query                                                    230                       1.854
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.185
    backend-query                                                    229                       1.936
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.099
    backend-query                                                      0                       2.135
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.070
    backend-query                                                    231                       1.479
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.067
    backend-query                                                      0                       5.907
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.069
    backend-query                                                      1                       1.129
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.109
    backend-query                                                      0                       1.069
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.082
    backend-query                                                    231                       1.245
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.072
    backend-query                                                      0                       1.175
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.064
    backend-query                                                    229                       1.308
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.078
    backend-query                                                      0                       7.058
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.145
    backend-query                                                    231                       1.655
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.115
    backend-query                                                      0                       3.946
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.067
    backend-query                                                    117                       1.231
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.063
    backend-query                                                      0                      11.856
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.065
    backend-query                                                    230                       1.606
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.072
    backend-query                                                      0                       6.973
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.070
    backend-query                                                    229                       1.445
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.088
    backend-query                                                    230                       1.836
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.098
    backend-query                                                      0                       2.552
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.088
    backend-query                                                    116                       1.450
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.060
    backend-query                                                      0                       4.072
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.062
    backend-query                                                    229                       1.421
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.058
    backend-query                                                      0                       2.342
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.058
    backend-query                                                    229                       0.999
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.070
    backend-query                                                      0                       1.847
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.063
    backend-query                                                    229                       1.171
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.064
    backend-query                                                      0                       0.999
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.051
    backend-query                                                    228                       0.991
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.070
    backend-query                                                      0                       2.107
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.215
    backend-query                                                    116                       1.678
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.069
    backend-query                                                    229                       1.578
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.081
    backend-query                                                      0                       3.649
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.096
    backend-query                                                    229                       1.619
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.066
    backend-query                                                    228                       1.549
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.070
    backend-query                                                    116                       1.610
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.154
    backend-query                                                    228                       1.746
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.092
    backend-query                                                      0                       2.958
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.093
    backend-query                                                    232                       1.698
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.143
    backend-query                                                    229                       1.719
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.081
    backend-query                                                      0                       2.809
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.065
    backend-query                                                    229                       1.410
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.082
    backend-query                                                    229                       1.458
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.109
    backend-query                                                    228                       1.651
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.066
    backend-query                                                    228                       1.417
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.111
    backend-query                                                    117                       1.536
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.188
    backend-query                                                      0                       1.660
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.132
    backend-query                                                    229                       2.361
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.110
    backend-query                                                      0                       2.384
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.140
    backend-query                                                    229                       1.680
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.067
    backend-query                                                    230                       1.342
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.070
    backend-query                                                      0                       3.129
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.118
    backend-query                                                    231                       1.397
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.169
    backend-query                                                      0                       5.665
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.114
    backend-query                                                    116                       1.780
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.128
    backend-query                                                      0                       2.316
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.108
    backend-query                                                    229                       1.521
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.083
    backend-query                                                    231                       1.508
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.074
    backend-query                                                      0                       2.327
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.092
    backend-query                                                    116                       1.509
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.285
    backend-query                                                      0                       2.007
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.079
    backend-query                                                    116                       1.245
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.134
    backend-query                                                    230                       1.521
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.092
    backend-query                                                      1                       1.278
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.064
    backend-query                                                      0                       1.104
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.076
    backend-query                                                    231                       1.287
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.079
    backend-query                                                    229                       1.768
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.098
    backend-query                                                      0                       2.570
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.110
    backend-query                                                    116                       1.489
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.061
    backend-query                                                      0                       1.756
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.055
    backend-query                                                    229                       1.133
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.060
    backend-query                                                    116                       1.241
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.056
    backend-query                                                      0                       2.435
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.056
    backend-query                                                    228                       1.099
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.061
    backend-query                                                      0                       1.017
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.080
    backend-query                                                    229                       1.217
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.065
    backend-query                                                    230                       1.448
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.065
    backend-query                                                    229                       1.546
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.079
    backend-query                                                    230                       1.955
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.165
    backend-query                                                      0                       3.284
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.102
    backend-query                                                    229                       1.936
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.081
    backend-query                                                      0                       4.640
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.072
    backend-query                                                    229                       1.384
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.062
    backend-query                                                      0                       2.224
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.088
    backend-query                                                    116                       1.419
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.069
    backend-query                                                      0                       2.289
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.070
    backend-query                                                    231                       1.474
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.071
    backend-query                                                    229                       1.646
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.072
    backend-query                                                      0                       1.408
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.068
    backend-query                                                    230                       1.974
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.090
    backend-query                                                    229                       1.923
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.151
    backend-query                                                    230                       2.211
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.074
    backend-query                                                    230                       1.234
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.059
    backend-query                                                      0                       1.695
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.125
    backend-query                                                    230                       1.199
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.064
    backend-query                                                      0                       1.089
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.057
    backend-query                                                    116                       1.807
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.085
    backend-query                                                      0                       1.299
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.074
    backend-query                                                    228                       1.397
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.081
    backend-query                                                    228                       1.776
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.079
    backend-query                                                      0                       1.980
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.101
    backend-query                                                    229                       1.571
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.070
    backend-query                                                    231                       1.483
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.064
    backend-query                                                      0                       2.260
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.060
    backend-query                                                    230                       1.471
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.070
    backend-query                                                    232                       1.305
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.060
    backend-query                                                    229                       1.246
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.063
    backend-query                                                    229                       1.093
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.053
    backend-query                                                    229                       1.420
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.062
    backend-query                                                    226                       1.596
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.145
    backend-query                                                      0                       2.730
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.059
    backend-query                                                    229                       1.550
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.076
    backend-query                                                    231                       1.622
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.058
    backend-query                                                    117                       1.224
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.108
    backend-query                                                      0                       2.025
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.062
    backend-query                                                    230                       1.251
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.058
    backend-query                                                    230                       1.223
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.068
    backend-query                                                    116                       1.224
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.051
    backend-query                                                      0                       0.937
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.045
    backend-query                                                    116                       1.597
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.058
    backend-query                                                    228                       1.595
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.063
    backend-query                                                      0                       3.238
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.058
    backend-query                                                    229                       1.573
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.078
    backend-query                                                    231                       1.894
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.092
    backend-query                                                    230                       1.717
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    optimization                                                                               0.061
    backend-query                                                    231                       1.302
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
  EdgeVertexStep(OUT)@[c]                                          20669       20669          39.223
  PathFilterStep(simple)                                           20669       20669          44.905
  JanusGraphMultiQueryStep(RepeatEndStep)                          20669       20669          65.528
  RepeatEndStep                                                    20669       20669          39.443
SelectOneStep(last,a)                                              20669       20669          44.574     2.48
JanusGraphPropertiesStep([serial],value)@[parent]                  20669       20669          92.515     5.15
    \_condition=type[serial]
    \_orders=[]
    \_isFitted=true
    \_isOrdered=true
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@811c505d
    \_multi=true
    \_vertices=100
  optimization                                                                                 0.090
  backend-query                                                      100                      12.807
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@811c505d
SelectOneStep(last,b)                                              20669       20669          41.753     2.33
JanusGraphPropertiesStep([work_date],value)@[wo...                 20669       20669          98.648     5.50
SelectOneStep(last,c)                                              20669       20669          41.674     2.32
JanusGraphPropertiesStep([serial],value)@[child]                   20669       20669         246.094    13.71
    \_condition=type[serial]
    \_orders=[]
    \_isFitted=true
    \_isOrdered=true
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@811c505d
    \_multi=true
    \_vertices=1392
  optimization                                                                                 0.060
  backend-query                                                     1392                     136.281
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@811c505d
SelectStep(last,[parent, child, work_date])                        20669       20669          49.139     2.74
OrderGlobalStep([[value(parent), asc], [value(c...                 20669       20669         164.034     9.14
                                            >TOTAL                     -           -        1795.018        -
-----------------------------------------------------------------------------------------------------------

Sorry for my poor English.
Thanks,
Manabu


Re: Profile() seems inconsistent with System.currentTimeMillis

HadoopMarc <bi...@...>
 

In the meantime I found that the difference between profile() and System.currentTimeMillis can be much larger. Apparently, the profile() step takes into account that for real queries, vertices are not present in the database cache, and it assumes some time duration to retrieve a vertex or its properties from the backend. Is there any documentation on these assumptions?
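
To check whether caching plays a role, one could time the same traversal several times with a wall-clock timer and compare the first (cold) run with later (warm) runs. Below is a minimal sketch in plain Java. It does not use any JanusGraph API: the names ElapsedTimer, timeMillis and timeRepeated are made up for illustration, and the Runnable is only a stand-in for a fully iterated traversal (for example one ending in toList()).

```java
public class ElapsedTimer {

    /** Runs the task once and returns the wall-clock duration in milliseconds. */
    public static double timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000.0;
    }

    /**
     * Times the same task several times. Index 0 is the first (possibly cold)
     * run; later entries are repeat runs that may benefit from warm caches.
     */
    public static double[] timeRepeated(Runnable task, int runs) {
        double[] durations = new double[runs];
        for (int i = 0; i < runs; i++) {
            durations[i] = timeMillis(task);
        }
        return durations;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the real query; replace the body with
        // something that fully iterates the traversal, e.g. ...toList().
        Runnable query = () -> {
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        double[] durations = timeRepeated(query, 5);
        for (int i = 0; i < durations.length; i++) {
            System.out.printf("run %d: %.3f ms%n", i, durations[i]);
        }
    }
}
```

System.nanoTime is used here instead of System.currentTimeMillis because it is monotonic and has a finer resolution, which matters for queries that take only a few hundred milliseconds.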

Best wishes,    Marc

On Friday, December 11, 2020 at 09:58:21 UTC+1, HadoopMarc wrote:


Hi,

Can anyone explain why the total duration displayed by the profile() step is more than twice as large as the time difference clocked with System.currentTimeMillis?
See below. For those who wonder, the query without profile() also takes about 300 msec.

Thanks,      Marc

gremlin> start = System.currentTimeMillis()
==>1607676127027
gremlin> g.V().has('serial', within('1654145144','1648418968','1652445288','1654952168','1653379120', '1654325440','1653383216','1658298568','1649680536','1649819672','1654964456','1649729552', '1656103144','1655460032','1656111336','1654669360')).inE('assembled').outV().profile()
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
JanusGraphStep([],[serial.within([1654145144, 1...                    16          16           0,486    59,26
    \_condition=((serial = 1654145144 OR serial = 1648418968 OR serial = 1652445288 OR serial = 1654952168 OR
               serial = 1653379120 OR serial = 1654325440 OR serial = 1653383216 OR serial = 1658298568 OR se
               rial = 1649680536 OR serial = 1649819672 OR serial = 1654964456 OR serial = 1649729552 OR seri
               al = 1656103144 OR serial = 1655460032 OR serial = 1656111336 OR serial = 1654669360))
    \_orders=[]
    \_isFitted=true
    \_isOrdered=true
    \_query=multiKSQ[16]@2000
    \_index=bySerial
  optimization                                                                                 0,009
  optimization                                                                                 0,267
JanusGraphVertexStep(IN,[assembled],vertex)                           73          73           0,334    40,74
    \_condition=type[assembled]
    \_orders=[]
    \_isFitted=true
    \_isOrdered=true
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812bd43d
    \_vertices=1
  optimization                                                                                 0,037
  optimization                                                                                 0,008
  optimization                                                                                 0,005
  optimization                                                                                 0,004
  optimization                                                                                 0,004
  optimization                                                                                 0,004
  optimization                                                                                 0,017
  optimization                                                                                 0,004
  optimization                                                                                 0,004
  optimization                                                                                 0,004
  optimization                                                                                 0,004
  optimization                                                                                 0,004
  optimization                                                                                 0,004
  optimization                                                                                 0,004
  optimization                                                                                 0,004
  optimization                                                                                 0,004
                                            >TOTAL                     -           -           0,820        -
gremlin> System.currentTimeMillis() - start
==>322



Re: Running OLAP on HBase with SparkGraphComputer fails with Error Container killed by YARN for exceeding memory limits

HadoopMarc <bi...@...>
 

Hi Roy,

I think I would first check whether the skew is absent if you count the rows reading the HBase table directly from spark (so, without using janusgraph), e.g.:

https://stackoverflow.com/questions/42019905/how-to-use-newapihadooprdd-spark-in-java-to-read-hbase-data

If this works all right, then you know that somehow in the janusgraph HBaseInputFormat the mappers do not get the right key ranges to read from.

I also thought about the storage.hbase.region-count property of janusgraph-hbase. If you specify this as 40 while creating the graph, janusgraph-hbase would create many small regions that will be compacted by HBase later on. But maybe this creates a different structure in the row keys that can be leveraged by hbase.mapreduce.tableinput.mappers.per.region.

Best wishes,     Marc


On Wednesday, December 9, 2020 at 5:16:35 PM UTC+1 Roy Yu wrote:

Hi Marc, 

The parameter hbase.mapreduce.tableinput.mappers.per.region is effective. I set it to 40, and there are now 40 tasks processing every region. But here comes a new problem: data skew. I use g.E().count() to count all the edges of the graph. While counting one region, one Spark task contained all 2.6GB of data, while the other 39 tasks contained no data. The task failed again. I checked my data. There are some vertices that have more than 1 million incident edges. So I tried to solve this problem using vertex cut (https://docs.janusgraph.org/advanced-topics/partitioning/); my graph schema is something like [mgmt.makeVertexLabel('product').partition().make()]. But when I used MR to load data into the new graph, it took more than 10 times as long as the attempt without partition(). From the HBase table detail page, I found that the data loading process was busy reading data from and writing data to the first region. The first region became the hot spot. I guess it relates to vertex ids. Could you help me again?
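
To illustrate the skew described above, here is a toy model in Python. It is only a sketch under stated assumptions: it assumes that a region's key range is cut into equal-width sub-ranges, and it uses made-up integer row keys and row sizes (`split_range`, `bytes_per_split` and the numbers are hypothetical, not HBase or JanusGraph code). If all rows cluster in a narrow band of keys, one split receives all the data no matter how many splits are requested.

```python
def split_range(start, end, n):
    """Split the integer key range [start, end) into n near-equal sub-ranges."""
    step = (end - start) / n
    bounds = [start + round(i * step) for i in range(n)] + [end]
    return list(zip(bounds[:-1], bounds[1:]))


def bytes_per_split(rows, splits):
    """Sum the row sizes that fall into each (lo, hi) key sub-range."""
    totals = [0] * len(splits)
    for key, size in rows:
        for i, (lo, hi) in enumerate(splits):
            if lo <= key < hi:
                totals[i] += size
                break
    return totals


# Hypothetical region: key space 0..4000, but every row sits in keys 10..59.
rows = [(key, 1_000_000) for key in range(10, 60)]
splits = split_range(0, 4000, 40)
totals = bytes_per_split(rows, splits)

print("splits:", len(splits))
print("bytes in first split:", totals[0])
print("bytes in all other splits:", sum(totals[1:]))
```

In this model, more mappers per region only helps when the row keys are spread over the region's key range.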

On Tuesday, December 8, 2020 at 3:13:42 PM UTC+8 HadoopMarc wrote:
Hi Roy,

As I mentioned, I did not keep up with possibly new janusgraph-hbase features. From the HBase source, I see that HBase now has a "hbase.mapreduce.tableinput.mappers.per.region" config parameter.


It should not be too difficult to adapt the janusgraph HBaseInputFormat to leverage this feature (or maybe it even works without change???).

Best wishes,

Marc

On Tuesday, December 8, 2020 at 4:21:19 AM UTC+1 Roy Yu wrote:
you seem to run on cloud infra that reduces your requested 40 Gb to 33 Gb (see https://databricks.com/session_na20/running-apache-spark-on-kubernetes-best-practices-and-pitfalls). Fact of life. 
---------------------
Sorry Marc, I misled you. The error message was generated when I set spark.executor.memory to 30G. When it failed, I increased spark.executor.memory to 40G, and it failed again. I felt desperate and came here to ask for help.
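
For reference, the 33 GB limit in the error message is consistent with a 30 GB executor heap plus Spark's default memory overhead on YARN, which is 10% of the executor memory with a minimum of 384 MB. The Python sketch below only reproduces that arithmetic; the function name is hypothetical, and it ignores any rounding YARN applies to container sizes.

```python
def yarn_container_mb(executor_memory_mb, overhead_mb=None,
                      overhead_percent=10, min_overhead_mb=384):
    """Approximate container size requested from YARN for one executor.

    If no explicit overhead is given, use the default rule:
    max(384 MB, 10% of the executor memory).
    """
    if overhead_mb is None:
        overhead_mb = max(min_overhead_mb,
                          executor_memory_mb * overhead_percent // 100)
    return executor_memory_mb + overhead_mb


# spark.executor.memory=30G with default overhead
print(yarn_container_mb(30 * 1024) / 1024, "GB")

# spark.executor.memory=30G with an explicit 10G overhead
print(yarn_container_mb(30 * 1024, overhead_mb=10 * 1024) / 1024, "GB")
```

With 30720 MB of executor memory, the default overhead is 3072 MB, giving a 33792 MB (33 GB) container.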
On Tuesday, December 8, 2020 at 10:35:19 AM UTC+8 Roy Yu wrote:
Hi Marc

Thanks for your immediate response.
I've tried setting spark.yarn.executor.memoryOverhead=10G and re-running the task, and it still failed. From the Spark task UI, I saw that 80% of the processing time is Full GC time. As you said, the 2.6GB (GZ compressed) region exploding is my root cause. Now I'm trying to reduce my region size to 1GB; if that still fails, I'm going to configure the HBase HFiles not to use a compressed format.
This was my first time running JanusGraph OLAP, and I think this is a common problem, as an HBase region size of 2.6GB (compressed) is not large; 20GB is very common in our production. If the community does not solve the problem, the JanusGraph HBase-based OLAP solution cannot be adopted by other companies either.

On Tuesday, December 8, 2020 at 12:40:40 AM UTC+8 HadoopMarc wrote:
Hi Roy,

There seem to be three things bothering you here:
  1. you did not specify spark.yarn.executor.memoryOverhead, as the exception message says. Easily solved.
  2. you seem to run on cloud infra that reduces your requested 40 Gb to 33 Gb (see https://databricks.com/session_na20/running-apache-spark-on-kubernetes-best-practices-and-pitfalls). Fact of life.
  3. the janusgraph HBaseInputFormat uses entire HBase regions as hadoop partitions, which are fed into spark tasks. The 2.6Gb region size is for compressed binary data which explodes when expanded into java objects. This is your real problem.
I did not follow the latest status of janusgraph-hbase features for the HBaseInputFormat, but you have to somehow use spark with smaller partitions than an entire HBase region.
A long time ago, I had success with skipping the HBaseInputFormat and have spark executors connect to JanusGraph themselves. That is not a quick solution, though.

Best wishes,

Marc

On Monday, December 7, 2020 at 2:10:55 PM UTC+1 Roy Yu wrote:
Error message:
ExecutorLostFailure (executor 1 exited caused by one of the running tasks) Reason: Container killed by YARN for exceeding memory limits. 33.1 GB of 33 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead or disabling yarn.nodemanager.vmem-check-enabled because of YARN-4714. 

Graph config:
spark.executor.extraJavaOptions=-XX:+UseG1GC -XX:MaxGCPauseMillis=500 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/mnt/data_1/log/spark2/gc-spark%p.log
spark.executor.cores=1
spark.executor.memory=40960m
spark.executor.instances=3

Region info:
hdfs dfs -du -h /apps/hbase/data/data/default/ky415/f069fafb3ee51d6a2e5bc2377b468bcc
67     134    /apps/hbase/data/data/default/ky415/f069fafb3ee51d6a2e5bc2377b468bcc/.regioninfo
0      0      /apps/hbase/data/data/default/ky415/f069fafb3ee51d6a2e5bc2377b468bcc/.tmp
2.6 G  5.1 G  /apps/hbase/data/data/default/ky415/f069fafb3ee51d6a2e5bc2377b468bcc/e
0      0      /apps/hbase/data/data/default/ky415/f069fafb3ee51d6a2e5bc2377b468bcc/f
0      0      /apps/hbase/data/data/default/ky415/f069fafb3ee51d6a2e5bc2377b468bcc/g
0      0      /apps/hbase/data/data/default/ky415/f069fafb3ee51d6a2e5bc2377b468bcc/h
0      0      /apps/hbase/data/data/default/ky415/f069fafb3ee51d6a2e5bc2377b468bcc/i
0      0      /apps/hbase/data/data/default/ky415/f069fafb3ee51d6a2e5bc2377b468bcc/l
0      0      /apps/hbase/data/data/default/ky415/f069fafb3ee51d6a2e5bc2377b468bcc/m
0      0      /apps/hbase/data/data/default/ky415/f069fafb3ee51d6a2e5bc2377b468bcc/recovered.edits
0      0      /apps/hbase/data/data/default/ky415/f069fafb3ee51d6a2e5bc2377b468bcc/s
0      0      /apps/hbase/data/data/default/ky415/f069fafb3ee51d6a2e5bc2377b468bcc/t
root@~$

Anybody who can help me?


Re: Error when running JanusGraph with YARN and CQL

HadoopMarc <bi...@...>
 

Hi Varun,

Good job. However, your last solution will only work with everything running on a single machine. So, indeed, there is something wrong with the contents of spark-gremlin.zip or with the way it is put in the executor's local working directory. Note that you already put /Users/my_comp/Downloads/janusgraph-0.5.2/lib/janusgraph-cql-0.5.2.jar explicitly on the executor classpath while it should have been available already through ./spark-gremlin.zip/*

Oh, I think I see now what is different. You have used spark.yarn.dist.archives, while the TinkerPop recipes use spark.yarn.archive. They behave differently in whether or not they extract the jars from the zip. I guess either can be used, provided it is done consistently. You can use the Environment tab in the Spark web UI to inspect how things are picked up by Spark.
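
A minimal Python sketch of the single-machine point above: it splits the spark.executor.extraClassPath value from the properties file in this thread and separates absolute entries, which must exist at the same path on every YARN node, from entries relative to the container's working directory. The helper names are hypothetical and the classification rule (leading "/" versus leading "./") is a simplifying assumption.

```python
def split_classpath(value):
    """Split a ':'-separated classpath string into its non-empty entries."""
    return [entry for entry in value.split(":") if entry]


def machine_local_entries(classpath):
    """Absolute paths: resolved on each node's own local filesystem."""
    return [e for e in split_classpath(classpath) if e.startswith("/")]


def container_relative_entries(classpath):
    """Paths relative to the YARN container's working directory."""
    return [e for e in split_classpath(classpath) if e.startswith("./")]


# Value of spark.executor.extraClassPath as posted in this thread.
executor_cp = (
    "/Users/my_comp/Downloads/hadoop-2.7.2/etc/hadoop:"
    "/Users/my_comp/Downloads/janusgraph-0.5.2/lib/janusgraph-cql-0.5.2.jar:"
    "./spark-gremlin.zip/*"
)

print("must exist on every node:")
for entry in machine_local_entries(executor_cp):
    print("  ", entry)

print("shipped with the job:")
for entry in container_relative_entries(executor_cp):
    print("  ", entry)
```

Only the ./spark-gremlin.zip/* entry travels with the job; the two /Users/... entries work here because the driver and executors share one machine.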

Best wishes,    Marc

On Thursday, December 10, 2020 at 8:23:32 PM UTC+1 Varun Ganesh wrote:

Answering my own question. I was able to fix the above error and successfully run the count job after explicitly adding /Users/my_comp/Downloads/janusgraph-0.5.2/lib/* to spark.executor.extraClassPath

But I am not yet sure as to why that was needed. I had assumed that adding spark-gremlin.zip to the path would have provided the required dependencies.

On Thursday, December 10, 2020 at 1:00:24 PM UTC-5 Varun Ganesh wrote:
An update on this, I tried setting the env var below:

export HADOOP_GREMLIN_LIBS=$GREMLIN_HOME/lib

After doing this I was able to successfully run the tinkerpop-modern.kryo example from the Recipes documentation
(though the guide at http://yaaics.blogspot.com/2017/07/configuring-janusgraph-for-spark-yarn.html explicitly asks us to ignore this)

Unfortunately, it is still not working with CQL. But the error is now different. Please see below:

12:46:33 ERROR org.apache.spark.scheduler.TaskSetManager - Task 3 in stage 0.0 failed 4 times; aborting job
org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 0.0 failed 4 times, most recent failure: Lost task 3.3 in stage 0.0 (TID 9, 192.168.1.160, executor 2): java.lang.NoClassDefFoundError: org/janusgraph/hadoop/formats/util/HadoopInputFormat
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:756)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
at java.security.AccessController.doPrivileged(Native Method)
... (skipping)
Caused by: java.lang.ClassNotFoundException: org.janusgraph.hadoop.formats.util.HadoopInputFormat
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
... 130 more

Is there some additional dependency that I may need to add?

Thanks in advance!
On Wednesday, December 9, 2020 at 11:49:29 PM UTC-5 Varun Ganesh wrote:
Hello,

I am trying to run SparkGraphComputer on a JanusGraph backed by Cassandra and ElasticSearch. I have previously verified that I am able to run SparkGraphComputer on a local Spark standalone cluster.

I am now trying to run it on YARN. I have a local YARN cluster running and I have verified that it can run Spark jobs.

I followed the following links:

And here is my read-cql-yarn.properties file:
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.janusgraph.hadoop.formats.cql.CqlInputFormat
gremlin.hadoop.graphWriter=org.apache.hadoop.mapreduce.lib.output.NullOutputFormat

gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.inputLocation=none
gremlin.hadoop.outputLocation=output
gremlin.spark.persistContext=true

#
# JanusGraph Cassandra InputFormat configuration
#
# These properties define the connection properties that were used while writing data to JanusGraph.
janusgraphmr.ioformat.conf.storage.backend=cql
# This specifies the hostname & port for Cassandra data store.
janusgraphmr.ioformat.conf.storage.hostname=127.0.0.1
janusgraphmr.ioformat.conf.storage.port=9042
# This specifies the keyspace where data is stored.
janusgraphmr.ioformat.conf.storage.cql.keyspace=janusgraph
# This defines the indexing backend configuration used while writing data to JanusGraph.
janusgraphmr.ioformat.conf.index.search.backend=elasticsearch
janusgraphmr.ioformat.conf.index.search.hostname=127.0.0.1
# Use the appropriate properties for the backend when using a different storage backend (HBase) or indexing backend (Solr).

#
# Apache Cassandra InputFormat configuration
#
cassandra.input.partitioner.class=org.apache.cassandra.dht.Murmur3Partitioner
cassandra.input.widerows=true

#
# SparkGraphComputer Configuration
#
spark.master=yarn
spark.submit.deployMode=client
spark.executor.memory=1g

spark.yarn.dist.archives=/tmp/spark-gremlin.zip
spark.yarn.dist.files=/Users/my_comp/Downloads/janusgraph-0.5.2/lib/janusgraph-cql-0.5.2.jar
spark.yarn.appMasterEnv.CLASSPATH=/Users/my_comp/Downloads/hadoop-2.7.2/etc/hadoop:./spark-gremlin.zip/*
spark.executor.extraClassPath=/Users/my_comp/Downloads/hadoop-2.7.2/etc/hadoop:/Users/my_comp/Downloads/janusgraph-0.5.2/lib/janusgraph-cql-0.5.2.jar:./spark-gremlin.zip/*

spark.driver.extraLibraryPath=/Users/my_comp/Downloads/hadoop-2.7.2/lib/native:/Users/my_comp/Downloads/hadoop-2.7.2/lib/native/Linux-amd64-64
spark.executor.extraLibraryPath=/Users/my_comp/Downloads/hadoop-2.7.2/lib/native:/Users/my_comp/Downloads/hadoop-2.7.2/lib/native/Linux-amd64-64

spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=org.janusgraph.hadoop.serialize.JanusGraphKryoRegistrator

After a bunch of trial and error, I was able to get it to a point where I see containers starting up on my YARN Resource manager UI (port 8088)

Here is the code I am running (it's a simple count):
gremlin> graph = GraphFactory.open('conf/hadoop-graph/read-cql-yarn.properties')
==>hadoopgraph[cqlinputformat->nulloutputformat]
gremlin> g = graph.traversal().withComputer(SparkGraphComputer)
==>graphtraversalsource[hadoopgraph[cqlinputformat->nulloutputformat], sparkgraphcomputer]
gremlin> g.V().count()

However I am encountering the following failure:
18:49:03 ERROR org.apache.spark.scheduler.TaskSetManager - Task 2 in stage 0.0 failed 4 times; aborting job
org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 0.0 failed 4 times, most recent failure: Lost task 2.3 in stage 0.0 (TID 10, 192.168.1.160, executor 1): java.lang.IllegalStateException: unread block data
at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2862)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1682)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2366)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2290)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2148)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1647)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:483)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:441)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:370)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

Would really appreciate it if someone could shed some light on this error and advise on next steps!

Thank you!


Re: Janusgraph Hadoop Spark standalone cluster - Janusgraph job always creates constant number 513 of Spark tasks

Varun Ganesh <operatio...@...>
 

Thank you Marc. I was able to reduce the number of tasks by adjusting the `num_tokens` setting on Cassandra. I am still unsure why each task takes so long, though. Hoping that this is a per-task overhead that stays the same as we process larger datasets.
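
The task counts reported in this thread (257 and 513) fit a simple pattern if one assumes Cassandra's default of 256 `num_tokens` per node and one Spark task per token range plus one extra task. That rule is an assumption made for illustration, not something confirmed in this thread, and the function name is hypothetical.

```python
def estimated_olap_tasks(num_tokens, nodes):
    """Assumed rule: one task per token range, plus one extra task."""
    return num_tokens * nodes + 1


# Assumed default num_tokens=256 on a single node and on two nodes.
print(estimated_olap_tasks(256, 1))
print(estimated_olap_tasks(256, 2))

# A smaller num_tokens value would give far fewer tasks under the same rule.
print(estimated_olap_tasks(16, 1))
```

Under this assumption, lowering `num_tokens` reduces the task count proportionally, which matches the effect described above.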

On Saturday, December 5, 2020 at 3:20:17 PM UTC-5 HadoopMarc wrote:
Hi Varun,

Not a solution, but someone in the thread below explained the 257 magic number for OLAP on a Cassandra cluster:

Marc


On Friday, December 4, 2020 at 8:48:28 PM UTC+1 Varun Ganesh wrote:
Hi,

I am facing this same issue. I am using SparkGraphComputer to read from Janusgraph backed by cassandra. `g.V().count()` takes about 3 minutes to load just two rows that I have in the graph.

I see that about 257 tasks are created. In my case, I am seeing parallelism in the Spark cluster that I am using, but each task seems to take about 5 seconds on average and there is no obvious reason why.

(I can attach a page from the Spark UI and also the properties file I am using, but I am unable to find the option to)

Would appreciate any input on solving this. Thank you!

Varun

On Friday, December 6, 2019 at 5:04:55 AM UTC-5 s...@... wrote:
Hi Dimitar,

I'm experiencing the same problem of having a seemingly uncontrollable, static number of Spark tasks. Did you ever figure out how to fix this?

Thanks,
Sture


On Friday, October 18, 2019 at 4:19:19 PM UTC+2, dim...@... wrote:
Hello,

I have setup Janusgraph 0.4.0 with Hadoop 2.9.0 and Spark 2.4.4 in a K8s cluster.
I connect to Janusgraph from gremlin console and execute: 
gremlin> og
==>graphtraversalsource[hadoopgraph[cassandra3inputformat->gryooutputformat], sparkgraphcomputer]
gremlin> g.V().count()
==>1889

It takes 25 min to do the count! It took the same time when there were no vertices (i.e. a count of 0). The Spark job shows that 513 tasks were run! The number of tasks is always a constant 513, regardless of the number of vertices.
I have set "spark.sql.shuffle.partitions=4" in the Spark job's environment, but again the number of Spark tasks was 513! My assumption is that Janusgraph somehow specifies this number of tasks when it submits the job to Spark.
The questions are:
- Why is a Janusgraph job submitted to Spark always parallelized into 513 tasks?
- How can I manage the number of tasks that are created for a Janusgraph job?
- How can I minimize the execution time of an OLAP query for this small graph (an OLTP query takes less than a second to execute)?

Thanks,
Dimitar


Re: Error when running JanusGraph with YARN and CQL

Varun Ganesh <operatio...@...>
 

An update on this, I tried setting the env var below:

export HADOOP_GREMLIN_LIBS=$GREMLIN_HOME/lib

After doing this I was able to successfully run the tinkerpop-modern.kryo example from the Recipes documentation
(though the guide at http://yaaics.blogspot.com/2017/07/configuring-janusgraph-for-spark-yarn.html explicitly asks us to ignore this)

Unfortunately, it is still not working with CQL. But the error is now different. Please see below:

12:46:33 ERROR org.apache.spark.scheduler.TaskSetManager - Task 3 in stage 0.0 failed 4 times; aborting job
org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 0.0 failed 4 times, most recent failure: Lost task 3.3 in stage 0.0 (TID 9, 192.168.1.160, executor 2): java.lang.NoClassDefFoundError: org/janusgraph/hadoop/formats/util/HadoopInputFormat
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:756)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
at java.security.AccessController.doPrivileged(Native Method)
... (skipping)
Caused by: java.lang.ClassNotFoundException: org.janusgraph.hadoop.formats.util.HadoopInputFormat
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
... 130 more

Is there some additional dependency that I may need to add?
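
One way to narrow down a NoClassDefFoundError like the one above is to ask a JVM directly whether it can see the class, and from which jar. The following is a minimal sketch using only the JDK; `ClassLocator` and `locate` are hypothetical names, not part of JanusGraph, TinkerPop or Spark. The answer only describes the JVM the code runs in, so a result from the driver says nothing about the executors' classpath.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper; uses only standard JDK calls.
final class ClassLocator {

    /** Returns the jar or directory a class is loaded from, or "not found" if it cannot be loaded. */
    static String locate(String className) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        try {
            // 'false' avoids running static initializers while probing.
            Class<?> type = Class.forName(className, false, loader);
            CodeSource source = type.getProtectionDomain().getCodeSource();
            if (source == null || source.getLocation() == null) {
                return "loaded (no code source, e.g. a JDK class)";
            }
            return source.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "not found";
        }
    }

    public static void main(String[] args) {
        // Class name taken from the stack trace above.
        System.out.println(locate("org.janusgraph.hadoop.formats.util.HadoopInputFormat"));
    }
}
```

If the class resolves on the driver but not inside a Spark task, the jars shipped to the executors are the likely place to look.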

Thanks in advance!

On Wednesday, December 9, 2020 at 11:49:29 PM UTC-5 Varun Ganesh wrote:
Hello,

I am trying to run SparkGraphComputer on a JanusGraph backed by Cassandra and ElasticSearch. I have previously verified that I am able to run SparkGraphComputer on a local Spark standalone cluster.

I am now trying to run it on YARN. I have a local YARN cluster running and I have verified that it can run Spark jobs.

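I followed the following links:
http://yaaics.blogspot.com/2017/07/configuring-janusgraph-for-spark-yarn.html
http://tinkerpop.apache.org/docs/3.4.6/recipes/#olap-spark-yarn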

And here is my read-cql-yarn.properties file:
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.janusgraph.hadoop.formats.cql.CqlInputFormat
gremlin.hadoop.graphWriter=org.apache.hadoop.mapreduce.lib.output.NullOutputFormat

gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.inputLocation=none
gremlin.hadoop.outputLocation=output
gremlin.spark.persistContext=true

#
# JanusGraph Cassandra InputFormat configuration
#
# These properties define the connection settings that were used when writing data to JanusGraph.
janusgraphmr.ioformat.conf.storage.backend=cql
# This specifies the hostname & port for Cassandra data store.
janusgraphmr.ioformat.conf.storage.hostname=127.0.0.1
janusgraphmr.ioformat.conf.storage.port=9042
# This specifies the keyspace where data is stored.
janusgraphmr.ioformat.conf.storage.cql.keyspace=janusgraph
# This defines the indexing backend configuration used while writing data to JanusGraph.
janusgraphmr.ioformat.conf.index.search.backend=elasticsearch
janusgraphmr.ioformat.conf.index.search.hostname=127.0.0.1
# Use the appropriate properties for the backend when using a different storage backend (HBase) or indexing backend (Solr).

#
# Apache Cassandra InputFormat configuration
#
cassandra.input.partitioner.class=org.apache.cassandra.dht.Murmur3Partitioner
cassandra.input.widerows=true

#
# SparkGraphComputer Configuration
#
spark.master=yarn
spark.submit.deployMode=client
spark.executor.memory=1g

spark.yarn.dist.archives=/tmp/spark-gremlin.zip
spark.yarn.dist.files=/Users/my_comp/Downloads/janusgraph-0.5.2/lib/janusgraph-cql-0.5.2.jar
spark.yarn.appMasterEnv.CLASSPATH=/Users/my_comp/Downloads/hadoop-2.7.2/etc/hadoop:./spark-gremlin.zip/*
spark.executor.extraClassPath=/Users/my_comp/Downloads/hadoop-2.7.2/etc/hadoop:/Users/my_comp/Downloads/janusgraph-0.5.2/lib/janusgraph-cql-0.5.2.jar:./spark-gremlin.zip/*

spark.driver.extraLibraryPath=/Users/my_comp/Downloads/hadoop-2.7.2/lib/native:/Users/my_comp/Downloads/hadoop-2.7.2/lib/native/Linux-amd64-64
spark.executor.extraLibraryPath=/Users/my_comp/Downloads/hadoop-2.7.2/lib/native:/Users/my_comp/Downloads/hadoop-2.7.2/lib/native/Linux-amd64-64

spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=org.janusgraph.hadoop.serialize.JanusGraphKryoRegistrator

After a bunch of trial and error, I was able to get it to a point where I see containers starting up on my YARN Resource manager UI (port 8088)

Here is the code I am running (it's a simple count):
gremlin> graph = GraphFactory.open('conf/hadoop-graph/read-cql-yarn.properties')
==>hadoopgraph[cqlinputformat->nulloutputformat]
gremlin> g = graph.traversal().withComputer(SparkGraphComputer)
==>graphtraversalsource[hadoopgraph[cqlinputformat->nulloutputformat], sparkgraphcomputer]
gremlin> g.V().count()

However I am encountering the following failure:
18:49:03 ERROR org.apache.spark.scheduler.TaskSetManager - Task 2 in stage 0.0 failed 4 times; aborting job
org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 0.0 failed 4 times, most recent failure: Lost task 2.3 in stage 0.0 (TID 10, 192.168.1.160, executor 1): java.lang.IllegalStateException: unread block data
at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2862)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1682)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2366)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2290)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2148)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1647)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:483)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:441)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:370)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

Would really appreciate it if someone could shed some light on this error and advise on next steps!

Thank you!


Re: How to open the same graph multiple times and not get the same object?

BO XUAN LI <libo...@...>
 

Thanks for sharing! I personally only use MapReduce and am not sure whether there is an existing solution for Spark.

> if there is any danger in opening multiple separate graph instances and using them to modify the graph

Opening multiple graph instances on the same JVM seems atypical, but I don’t see any problem. It would be great if you can share back in case you see any issue.

Best regards,
Boxuan

On Dec 10, 2020, at 4:14 AM, Mladen Marović <mladen...@...> wrote:

Hello Boxuan,

I need to support reindexing very large graphs. To my knowledge, the only feasible way that's supported is via the `MapReduceIndexManagement` class. This is not ideal for me as I'd like to utilise an existing Apache Spark cluster to run this job, and `MapReduceIndexManagement` is a Hadoop/MapReduce implementation. Therefore, I started writing a `SparkIndexManagement` class that's supposed to be a drop-in replacement that offers Spark support.

The basic structure of the code that processes a single partition should be something like this:

        public ScanMetrics processPartition(Iterator<Tuple2<NullWritable, VertexWritable>> vertices) {
            if (vertices.hasNext()) {
                // open the graph
                JanusGraph graph = JanusGraphFactory.open(getGraphConfiguration());

                // prepare for partition processing
                job.workerIterationStart(graph, getJobConfiguration(), metrics);
                
                // find and process each vertex
                vertices.forEachRemaining(
                    tuple -> {
                        ...
                        JanusGraphVertex vertex = ...  // load the vertex
                        job.process(vertex, metrics);
                        ...
                    }
                );
                
                // finish processing the partition
                job.workerIterationEnd(metrics);
            }

            ...
        }

At first everything seemed quite straightforward, so I implemented a quick-and-dirty solution as a proof of concept. However, after running the first buildable solution, I came upon an unexpected error: "java.lang.IllegalArgumentException: The transaction has already been closed". The confusing part was that the implementation worked when I ran the local Spark cluster as "local[1]" (which spawns only one worker thread), but when running it as "local[*]" (which spawns multiple worker threads, one per core), the error would always appear, although not always on the same task.

After some digging, I seem to have found the main cause. Loading the graph data by using `org.janusgraph.hadoop.formats.cql.CqlInputFormat` in the `SparkContext.newAPIHadoopRDD()` call returns a `JavaPairRDD<NullWritable, VertexWritable>` with several partitions, as expected. The graph used to read vertices in this input format is opened via `JanusGraphFactory.open()`. After iterating through all vertices returned by the partition, the underlying graph is closed in a final `release()` call for that partition. This makes sense because that partition is done with reading. However, when processing that partition, I need to open a graph to pass to `IndexRepairJob.workerIterationStart()`, and also create a separate read-only transaction (from that same graph) to fetch the vertex properly and pass it to `IndexRepairJob.process()`. `IndexRepairJob` also creates a write transaction to make some changes to the graph.

This would all work fine in MapReduce because there, the first `map()` step is run in its entirety first, which means that reindexing/vertex is done only after ALL partitions have been read and the `CqlInputFormat` finished its part. I don't have much experience in MapReduce, but that's how I understand it to work - a single map() result is first written on disk, and then that result is read from the disk to be the input to the subsequent map() call. On the other hand, Spark optimizes the map-reduce paradigm by chaining subsequent map() calls to keep objects in memory as much as possible. So, when this runs on a "local[*]" cluster, or a Spark executor with multiple cores, and the graph is opened via JanusGraphFactory.open(), all threads in that executor share the graph object. Each thread runs on a different RDD partition, but they can be at different phases of the reindexing process (different map() steps) at the same time. When one thread closes the graph for whatever reason (e.g. when `CqlInputFormat` finishes reading a partition), other threads simply blow up.

For example, if I have partitions/tasks with 300, 600 and 900 vertices and they all run on a single 3-core Spark executor, they'll be processed in parallel by three separate threads. The first thread will process 300 vertices and, upon iterating the final vertex, will close the underlying graph (as part of the `CqlInputFormat` implementation, from what I gathered). Closing the graph immediately closes all opened transactions. However, the same graph is used in other threads as well in parallel. The second thread might have only finished processing 350 vertices at the time the first closed the graph, so the next time it tries to write something, it crashes because it uses a transaction that's already closed.

The ideal solution should be to open separate graph instances of the same graph, one in `CqlInputFormat`, and the other that is passed to `IndexRepairJob.workerIterationStart()`, for each task. In that case, if one graph is closed, no other tasks or processing phases would be affected. I tried that out today by opening the graph using the `StandardJanusGraph` constructor (at least in my part of the code) and so far that worked well because in most of my test runs the job completed successfully. The runs that failed occurred during debugging, when the execution was stuck on a breakpoint for a while, so maybe there were some timeouts involved or something. This remains to be tested. I also strongly suspect that the problem still remains, at least in theory, because `CqlInputFormat` still uses the `JanusGraphFactory.open()` call, but the probability for that is reduced, at least in the environment and on the data I'm currently testing on. I haven't analyzed the `CqlInputFormat` code fully to understand how it behaves in that case yet.
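
The failure mode described above can be reproduced in miniature without JanusGraph. The sketch below is only an analogy: `Handle`, `openShared` and `openSeparate` are hypothetical names standing in for a graph object, a factory that hands every caller the same cached instance, and a constructor call that creates a fresh one. The interleaving of the two tasks is simulated sequentially rather than with real threads.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in classes; none of this is JanusGraph code.
final class SharedHandleDemo {

    /** Minimal resource that refuses writes once it has been closed. */
    static final class Handle {
        private volatile boolean open = true;

        void write() {
            if (!open) {
                throw new IllegalStateException("handle has already been closed");
            }
        }

        void close() {
            open = false;
        }
    }

    // Factory cache keyed by name: every caller asking for "g" gets the same object.
    private static final Map<String, Handle> CACHE = new ConcurrentHashMap<>();

    static Handle openShared(String name) {
        return CACHE.computeIfAbsent(name, n -> new Handle());
    }

    static Handle openSeparate() {
        return new Handle();
    }

    /** Returns true if the second task can still write after the first task closed its handle. */
    static boolean secondTaskSurvives(boolean shared) {
        Handle first = shared ? openShared("g") : openSeparate();
        Handle second = shared ? openShared("g") : openSeparate();
        first.close();          // task 1 has finished reading its partition
        try {
            second.write();     // task 2 is still in the middle of its partition
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("shared instance:    " + secondTaskSurvives(true));
        System.out.println("separate instances: " + secondTaskSurvives(false));
    }
}
```

With the cached instance the second task fails as soon as the first one closes the handle; with separate instances each task owns its own lifecycle, which is the property the `StandardJanusGraph` constructor approach is after.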

Admittedly, I could provide my own InputFormat class, or at least subclass it and try to hack and slash and make it work somehow, but that seriously complicates everything and defeats the purpose of everything I'm trying to do here.

Another workaround would be to limit each Spark executor to use only one core, but that seems wasteful and is definitely something I would try to avoid.

I probably missed a lot of details, but that's the general idea and my conclusions so far. Feel free to correct me if I missed anything or wrote anything wrong, as well as point me in the right direction if such an implementation already exists and I just didn't come across it. 

Best regards,

Mladen

PS An additional question here would be to see if there is any danger in opening multiple separate graph instances and using them to modify the graph, but as this is already done in the current MapReduce implementation anyway, and all my transactions are opened as read-only, I'm guessing that shouldn't pose a problem here.

On Wednesday, December 9, 2020 at 4:32:10 PM UTC+1 libo...@connect.hku.hk wrote:
Hi Mladen,

Agree with Marc, that's something you could try. If possible, could you share the reason why you have to open the same graph multiple times with different graph objects? If there is no other solution to your problem then this can be a feature request.

Best regards,
Boxuan
On Wednesday, December 9, 2020 at 2:50:48 PM UTC+8 HadoopMarc wrote:
Hi Mladen,

The constructor of StandardJanusGraph seems worth a try:


HTH,   Marc

Op dinsdag 8 december 2020 om 19:34:55 UTC+1 schreef Mladen Marović:
Hello,

I'm writing a Java program that, for various implementation details, needs to open the same graph multiple times. Currently I'm using JanusGraphFactory.open(...), but this always looks up the graph by its name in the static JanusGraphManager instance and returns the same object.

Is there a way to create two different object instances of the same JanusGraph graph? These instances need to be completely separate, so that closing one graph does not close transactions created using the other graph. I checked the documentation and inspected the code directly while debugging, but couldn't find anything useful.

Thanks in advance,

Mladen



Re: OLAP, Hadoop, Spark and Cassandra

Mladen Marović <mladen...@...>
 

A slight correction and clarification of my previous post - the total number of partitions/splits is exactly equal to total_number_of_tokens + 1. In a 3-node cassandra cluster where each node has 256 tokens (if set to default), this would result in a total of 769 partitions, in a single-node cluster this would be 257, etc. There is no "1 task that collects results, or something similar".

This makes sense when you consider that Cassandra partitions data using 64-bit row key hashes, that the total range of 64-bit integer hash values is equal to [-2^63, 2^63 - 1], and that tokens are simply 64-bit integer values used to determine what data partitions a node gets. Splitting that range with n different tokens always gives n + 1 subsets. A log excerpt from a 1-node cassandra cluster with 16 tokens confirms this:

18720 [Executor task launch worker for task 0] INFO  org.apache.spark.rdd.NewHadoopRDD  - Input split: ColumnFamilySplit((-4815577940669380240, '-2942172956248108515] @[master])
18720 [Executor task launch worker for task 1] INFO  org.apache.spark.rdd.NewHadoopRDD  - Input split: ColumnFamilySplit((7326109958794842850, '7391123213565411179] @[master])
18721 [Executor task launch worker for task 3] INFO  org.apache.spark.rdd.NewHadoopRDD  - Input split: ColumnFamilySplit((-2942172956248108515, '-2847854446434006096] @[master])
18740 [Executor task launch worker for task 2] INFO  org.apache.spark.rdd.NewHadoopRDD  - Input split: ColumnFamilySplit((-9223372036854775808, '-8839354777455528291] @[master])
28369 [Executor task launch worker for task 4] INFO  org.apache.spark.rdd.NewHadoopRDD  - Input split: ColumnFamilySplit((4104296217363716109, '7326109958794842850] @[master])
28651 [Executor task launch worker for task 5] INFO  org.apache.spark.rdd.NewHadoopRDD  - Input split: ColumnFamilySplit((8156279557766590813, '-9223372036854775808] @[master])
34467 [Executor task launch worker for task 6] INFO  org.apache.spark.rdd.NewHadoopRDD  - Input split: ColumnFamilySplit((-6978843450179888845, '-5467974851507832526] @[master])
54235 [Executor task launch worker for task 7] INFO  org.apache.spark.rdd.NewHadoopRDD  - Input split: ColumnFamilySplit((2164465249293820494, '3738744141825711063] @[master])
56122 [Executor task launch worker for task 8] INFO  org.apache.spark.rdd.NewHadoopRDD  - Input split: ColumnFamilySplit((-2847854446434006096, '180444324727144184] @[master])
60564 [Executor task launch worker for task 9] INFO  org.apache.spark.rdd.NewHadoopRDD  - Input split: ColumnFamilySplit((180444324727144184, '720824306927062455] @[master])
74783 [Executor task launch worker for task 10] INFO  org.apache.spark.rdd.NewHadoopRDD  - Input split: ColumnFamilySplit((-8839354777455528291, '-7732322859452179159] @[master])
78171 [Executor task launch worker for task 11] INFO  org.apache.spark.rdd.NewHadoopRDD  - Input split: ColumnFamilySplit((-7732322859452179159, '-6978843450179888845] @[master])
79362 [Executor task launch worker for task 12] INFO  org.apache.spark.rdd.NewHadoopRDD  - Input split: ColumnFamilySplit((3738744141825711063, '4104296217363716109] @[master])
91036 [Executor task launch worker for task 13] INFO  org.apache.spark.rdd.NewHadoopRDD  - Input split: ColumnFamilySplit((-5467974851507832526, '-4815577940669380240] @[master])
92250 [Executor task launch worker for task 14] INFO  org.apache.spark.rdd.NewHadoopRDD  - Input split: ColumnFamilySplit((1437322944493769078, '2164465249293820494] @[master])
92363 [Executor task launch worker for task 15] INFO  org.apache.spark.rdd.NewHadoopRDD  - Input split: ColumnFamilySplit((720824306927062455, '1437322944493769078] @[master])
94339 [Executor task launch worker for task 16] INFO  org.apache.spark.rdd.NewHadoopRDD  - Input split: ColumnFamilySplit((7391123213565411179, '8156279557766590813] @[master])
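
The n + 1 count can be checked with a few lines of code. The sketch below is illustrative only and is not Cassandra's or JanusGraph's actual split logic: `TokenRangeSplits` and `splits` are hypothetical names, ranges are modelled as (start, end] pairs the way the log prints them, and the tokens are assumed to be distinct and greater than the ring minimum of -2^63.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of how n tokens cut the 64-bit token ring into n + 1 ranges.
final class TokenRangeSplits {

    /** Returns the (start, end] ranges obtained by cutting the ring at the given tokens. */
    static List<long[]> splits(long[] tokens) {
        long[] sorted = tokens.clone();
        Arrays.sort(sorted);
        List<long[]> ranges = new ArrayList<>();
        long start = Long.MIN_VALUE;             // ring minimum, as seen in the log
        for (long token : sorted) {
            ranges.add(new long[] {start, token});
            start = token;
        }
        // The final range runs from the highest token back to the ring minimum.
        ranges.add(new long[] {start, Long.MIN_VALUE});
        return ranges;
    }

    public static void main(String[] args) {
        long[] tokens = {-6_000L, 1_500L, 4_000L};   // three made-up tokens
        for (long[] range : splits(tokens)) {
            System.out.println("(" + range[0] + ", " + range[1] + "]");
        }
    }
}
```

Passing 16 tokens yields 17 ranges, matching tasks 0 to 16 in the log, and 3 * 256 = 768 tokens yield the 769 partitions mentioned earlier in the thread.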

Best regards,

Mladen

On Tuesday, December 1, 2020 at 8:05:19 AM UTC+1 HadoopMarc wrote:
Hi Mladen,

Interesting read! Spark is not very sensitive to the number of tasks. I believe that for OLAP on HadoopGraph the optimum is for partitions of 256 MB or so. Larger partitions are difficult to hold in memory for reasonably sized executors; smaller ones give too much overhead. OLAP with janusgraph-hbase is much harder, because the partition size is determined by the HBase regions, which need to be large (10 GB). Also note that the entire graph needs to fit into the total memory of all executors, because graph traversal is shuffle-heavy and spilling to disk takes forever.

Best wishes,    Marc

Op maandag 30 november 2020 om 19:09:15 UTC+1 schreef Mladen Marović:
I know I'm quite late to the party, but for future reference - the number of input partitions in Spark depends on the partitioning of the source. In case of cassandra, partitioning is determined by the number of tokens each node gets (as configured by `num_tokens` in `cassandra.yaml`), which is set to 256 by default. So, if you have a 3-node cassandra cluster, by default each node should get 256 tokens, which would result in 3*256 = 768 tokens total. Since Spark reads directly from cassandra (if you're using `org.janusgraph.hadoop.formats.cql.CqlInputFormat`), that translates to 768 partitions in the input Spark RDD, or 768 tasks during processing. Add to that 1 task that collects results, or something similar, and you end up at 769. At least that was my experience.

The default value of 256 for `num_tokens` made sense in older versions, but in Cassandra 3.x a new token allocation algorithm was implemented to improve performance for operations requiring token-range scans, which is precisely what Spark does. I experimented a bit with smaller values (e.g. 16) and managed to drastically reduce the number of tasks when scanning the entire graph. For further reading, I recommend this article.



On Thursday, December 5, 2019 at 9:28:26 AM UTC+1 s...@... wrote:
Answering my own question - turned out I had had a mixup of keyspaces used between the two instances

By default, conf/hadoop-graph/read-cql.properties reads

janusgraphmr.ioformat.conf.storage.cassandra.keyspace

While for CQL it should read

janusgraphmr.ioformat.conf.storage.cql.keyspace

Also - as I made a 'named' (ve_graph) graph I had to point to that one rather than the janusgraph keyspace.

Problem 1 solved. Now to the next - how can I lower the number of 'partitions' Spark is using (here 769, as in '... on localhost (executor driver) (769/769)')?

On Wednesday, December 4, 2019 at 11:46:42 PM UTC+1, Sture Lygren wrote:
Hi,

I'm trying to get JanusGraph 0.4.0 with a Cassandra (CQL) backend setup and running as OLAP while still keeping OLTP active in order to do graph updates. I've been searching high and low for some guidance, but so far without any luck. Hopefully someone here could tune in and help?

Here's where I'm at currently

  • local Hadoop running according to https://old-docs.janusgraph.org/0.4.0/hadoop-tp3.html
  • gremlin server started as /bin/gremlin-server.sh conf/gremlin-server/gremlin-server-configuration.yaml
  • gremlin-server-configuration.yaml points to init.groovy script doing the traversal mappings for OLTP and OLAP
def globals = [:]
ve = ConfiguredGraphFactory.open("ve_graph")
OLAPGraph = GraphFactory.open('conf/hadoop-graph/read-cql.properties')
globals << [g : ve.traversal(), sg: OLAPGraph.traversal().withComputer(org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer)]
  • conf/hadoop-graph/read-cql.properties reads
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.janusgraph.hadoop.formats.cql.CqlInputFormat
gremlin.hadoop.graphWriter=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat
gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.inputLocation=none
gremlin.hadoop.outputLocation=output
gremlin.spark.persistContext=true
janusgraphmr.ioformat.conf.storage.backend=cql
janusgraphmr.ioformat.conf.storage.hostname=127.0.0.1
janusgraphmr.ioformat.conf.storage.port=9042
janusgraphmr.ioformat.conf.storage.cassandra.keyspace=janusgraph
cassandra.input.partitioner.class=org.apache.cassandra.dht.Murmur3Partitioner
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=org.janusgraph.hadoop.serialize.JanusGraphKryoRegistrator
  • Running the gremlin shell I have
         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data/sture/Scripts/janusgraph-0.4.0-hadoop2/lib/slf4j-log4j12-1.7.12.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data/sture/Scripts/janusgraph-0.4.0-hadoop2/lib/logback-classic-1.1.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
plugin activated: tinkerpop.server
plugin activated: tinkerpop.tinkergraph
plugin activated: tinkerpop.hadoop
plugin activated: tinkerpop.spark
plugin activated: tinkerpop.utilities
plugin activated: janusgraph.imports
gremlin> :remote connect tinkerpop.server conf/remote.yaml session
==>Configured localhost/127.0.0.1:8182-[655848fc-b46e-40be-8174-f0dc42cdabd4]
gremlin> :remote console
==>All scripts will now be sent to Gremlin Server - [localhost/127.0.0.1:8182]-[655848fc-b46e-40be-8174-f0dc42cdabd4] - type ':remote console' to return to local mode
gremlin> g
==>graphtraversalsource[standardjanusgraph[cql:[127.0.0.1]], standard]
gremlin>
gremlin> sg
==>graphtraversalsource[hadoopgraph[cqlinputformat->gryooutputformat], sparkgraphcomputer]
gremlin> g.V().has('lbl','System').count()
==>68
gremlin> sg.V().has('lbl','System').count()
  • The job is running for some time and while finishing the gremlin-server.log reads
253856 [Executor task launch worker for task 768] INFO  org.apache.spark.executor.Executor  - Finished task 768.0 in stage 0.0 (TID 768). 2388 bytes result sent to driver
253858 [task-result-getter-1] INFO  org.apache.spark.scheduler.TaskSetManager  - Finished task 768.0 in stage 0.0 (TID 768) in 6809 ms on localhost (executor driver) (769/769)
253861 [dag-scheduler-event-loop] INFO  org.apache.spark.scheduler.DAGScheduler  - ResultStage 0 (fold at SparkStarBarrierInterceptor.java:101) finished in 161.427 s
253861 [task-result-getter-1] INFO  org.apache.spark.scheduler.TaskSchedulerImpl  - Removed TaskSet 0.0, whose tasks have all completed, from pool
253876 [SparkGraphComputer-boss] INFO  org.apache.spark.scheduler.DAGScheduler  - Job 0 finished: fold at SparkStarBarrierInterceptor.java:101, took 161.598267 s
253888 [SparkGraphComputer-boss] INFO  org.apache.spark.rdd.MapPartitionsRDD  - Removing RDD 1 from persistence list
253901 [block-manager-slave-async-thread-pool-0] INFO  org.apache.spark.storage.BlockManager  - Removing RDD 1
  • However - the count (==> ) reads 0 for the sg traversal
I've most likely missed some crucial point here, but I'm not able to spot it. Please help.



Error when running JanusGraph with YARN and CQL

Varun Ganesh <operatio...@...>
 

Hello,

I am trying to run SparkGraphComputer on a JanusGraph backed by Cassandra and ElasticSearch. I have previously verified that I am able to run SparkGraphComputer on a local Spark standalone cluster.

I am now trying to run it on YARN. I have a local YARN cluster running and I have verified that it can run Spark jobs.

I followed the following links:
http://yaaics.blogspot.com/2017/07/configuring-janusgraph-for-spark-yarn.html
http://tinkerpop.apache.org/docs/3.4.6/recipes/#olap-spark-yarn

And here is my read-cql-yarn.properties file:
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.janusgraph.hadoop.formats.cql.CqlInputFormat
gremlin.hadoop.graphWriter=org.apache.hadoop.mapreduce.lib.output.NullOutputFormat

gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.inputLocation=none
gremlin.hadoop.outputLocation=output
gremlin.spark.persistContext=true

#
# JanusGraph Cassandra InputFormat configuration
#
# These properties define the connection settings that were used when writing data to JanusGraph.
janusgraphmr.ioformat.conf.storage.backend=cql
# This specifies the hostname & port for Cassandra data store.
janusgraphmr.ioformat.conf.storage.hostname=127.0.0.1
janusgraphmr.ioformat.conf.storage.port=9042
# This specifies the keyspace where data is stored.
janusgraphmr.ioformat.conf.storage.cql.keyspace=janusgraph
# This defines the indexing backend configuration used while writing data to JanusGraph.
janusgraphmr.ioformat.conf.index.search.backend=elasticsearch
janusgraphmr.ioformat.conf.index.search.hostname=127.0.0.1
# Use the appropriate properties for the backend when using a different storage backend (HBase) or indexing backend (Solr).

#
# Apache Cassandra InputFormat configuration
#
cassandra.input.partitioner.class=org.apache.cassandra.dht.Murmur3Partitioner
cassandra.input.widerows=true

#
# SparkGraphComputer Configuration
#
spark.master=yarn
spark.submit.deployMode=client
spark.executor.memory=1g

spark.yarn.dist.archives=/tmp/spark-gremlin.zip
spark.yarn.dist.files=/Users/my_comp/Downloads/janusgraph-0.5.2/lib/janusgraph-cql-0.5.2.jar
spark.yarn.appMasterEnv.CLASSPATH=/Users/my_comp/Downloads/hadoop-2.7.2/etc/hadoop:./spark-gremlin.zip/*
spark.executor.extraClassPath=/Users/my_comp/Downloads/hadoop-2.7.2/etc/hadoop:/Users/my_comp/Downloads/janusgraph-0.5.2/lib/janusgraph-cql-0.5.2.jar:./spark-gremlin.zip/*

spark.driver.extraLibraryPath=/Users/my_comp/Downloads/hadoop-2.7.2/lib/native:/Users/my_comp/Downloads/hadoop-2.7.2/lib/native/Linux-amd64-64
spark.executor.extraLibraryPath=/Users/my_comp/Downloads/hadoop-2.7.2/lib/native:/Users/my_comp/Downloads/hadoop-2.7.2/lib/native/Linux-amd64-64

spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=org.janusgraph.hadoop.serialize.JanusGraphKryoRegistrator

After a bunch of trial and error, I was able to get it to a point where I see containers starting up on my YARN Resource manager UI (port 8088)

Here is the code I am running (it's a simple count):
gremlin> graph = GraphFactory.open('conf/hadoop-graph/read-cql-yarn.properties')
==>hadoopgraph[cqlinputformat->nulloutputformat]
gremlin> g = graph.traversal().withComputer(SparkGraphComputer)
==>graphtraversalsource[hadoopgraph[cqlinputformat->nulloutputformat], sparkgraphcomputer]
gremlin> g.V().count()

However I am encountering the following failure:
18:49:03 ERROR org.apache.spark.scheduler.TaskSetManager - Task 2 in stage 0.0 failed 4 times; aborting job
org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 0.0 failed 4 times, most recent failure: Lost task 2.3 in stage 0.0 (TID 10, 192.168.1.160, executor 1): java.lang.IllegalStateException: unread block data
at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2862)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1682)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2366)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2290)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2148)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1647)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:483)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:441)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:370)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

Would really appreciate it if someone could shed some light on this error and advise on next steps!

Thank you!


Centric Indexes failing to support all conditions for better performance.

chrism <cmil...@...>
 

The JanusGraph documentation at https://docs.janusgraph.org/index-management/index-performance/
describes the use of a vertex-centric index [edge=battled + properties=(rating,time)]:
g.V(h).outE('battled').has('rating', 5.0).has('time', inside(10, 50)).inV()

As I understand it, profile() of the above reports _isFitted=true
to indicate that the backend query delivered all results for the conditions:
_condition=(rating = 5.0 AND time > 10 AND time < 50 AND type[battled])

Two things are obvious from the above: a vertex-centric index supports multiple property keys, and it supports both equality and range/interval constraints.
However, isFitted is false for all kinds of conditions or combinations that do not really break these rules and are still range constraints:

a) g.V(h).outE('battled').has('rating',lt(5.0)).has('time', inside(10, 50)).inV()   // P.lt used for first key
b) g.V(h).outE('battled').has('rating',gt(5.0)) // P.gt used
c) g.V(h).outE('battled').or( hasNot('rating'), has('rating',eq(5.0)) ) // OrStep() used

Even b) can be "fitted" by  has('rating',inside(5.0,Long.MAX_VALUE)) 
all that is very confusing, and probably not working as expected, what I am doing wrong? 
as from my experience only one property key can be used for query conditions and using index, the second is ignored.

Having isFitted=false does not really improve performance, as I understand it: when only one condition is applied by the index, most of my edges are fetched and then have to be filtered in memory, which is what the implementation of BasicVertexCentricQueryBuilder.java indicates.
Are there limitations not described in the JanusGraph documentation? Is it a glitch?

Can you explain how to make full use of centric indexes for edges?
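
Editor's sketch: the following toy model is not JanusGraph code, and it is not checked against BasicVertexCentricQueryBuilder. It only illustrates a general property of a composite sort key (rating, time): an equality on the first key plus a range on the second selects one contiguous run of the sorted edges, whereas a range on both keys usually does not. The class name, the sample edges and the "single slice" test are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;

// Toy model only, NOT JanusGraph code: edges kept in sort-key order (rating, time).
final class SortKeySliceDemo {

    static final class Edge {
        final double rating;
        final int time;
        Edge(double rating, int time) { this.rating = rating; this.time = time; }
    }

    // Nine hypothetical 'battled' edges, sorted by rating first and time second.
    static List<Edge> sortedEdges() {
        List<Edge> edges = new ArrayList<>();
        for (double rating : new double[] {1.0, 3.0, 5.0}) {
            for (int time : new int[] {5, 20, 60}) {
                edges.add(new Edge(rating, time));
            }
        }
        edges.sort(Comparator.comparingDouble((Edge e) -> e.rating)
                             .thenComparingInt(e -> e.time));
        return edges;
    }

    // True when the matching edges are adjacent in index order, so that one
    // contiguous slice returns exactly the result set with no in-memory filter.
    static boolean isSingleSlice(List<Edge> sorted, Predicate<Edge> condition) {
        int first = -1;
        int last = -1;
        int matches = 0;
        for (int i = 0; i < sorted.size(); i++) {
            if (condition.test(sorted.get(i))) {
                if (first < 0) first = i;
                last = i;
                matches++;
            }
        }
        return matches > 0 && (last - first + 1) == matches;
    }

    public static void main(String[] args) {
        List<Edge> edges = sortedEdges();
        System.out.println("equality + range: "
            + isSingleSlice(edges, e -> e.rating == 5.0 && e.time > 10 && e.time < 50));
        System.out.println("range + range:    "
            + isSingleSlice(edges, e -> e.rating < 5.0 && e.time > 10 && e.time < 50));
        System.out.println("single range:     "
            + isSingleSlice(edges, e -> e.rating > 3.0));
    }
}
```

In this model case a) cannot be answered by one slice, which would be consistent with isFitted=false there. Case b) is still a single slice in the model, so the sketch does not explain why profile() reports isFitted=false for it.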

Christopher


Re: How to open the same graph multiple times and not get the same object?

Mladen Marović <mladen...@...>
 

Hello Boxuan,

I need to support reindexing very large graphs. To my knowledge, the only feasible way that's supported is via the `MapReduceIndexManagement` class. This is not ideal for me as I'd like to utilise an existing Apache Spark cluster to run this job, and `MapReduceIndexManagement` is a Hadoop/MapReduce implementation. Therefore, I started writing a `SparkIndexManagement` class that's supposed to be a drop-in replacement that offers Spark support.

The basic structure of the code that processes a single partition should be something like this:

        public ScanMetrics processPartition(Iterator<Tuple2<NullWritable, VertexWritable>> vertices) {
            if (vertices.hasNext()) {
                // open the graph
                JanusGraph graph = JanusGraphFactory.open(getGraphConfiguration());

                // prepare for partition processing
                job.workerIterationStart(graph, getJobConfiguration(), metrics);
                
                // find and process each vertex
                vertices.forEachRemaining(
                    tuple -> {
                        ...
                        JanusGraphVertex vertex = ...  // load the vertex
                        job.process(vertex, metrics);
                        ...
                    }
                );
                
                // finish processing the partition
                job.workerIterationEnd(metrics);
            }

            ...
        }

At first everything seemed quite straightforward, so I implemented a quick-and-dirty solution as a proof of concept. However, after running the first buildable solution, I came upon an unexpected error: "java.lang.IllegalArgumentException: The transaction has already been closed". The confusing part was that the implementation worked when I ran the local Spark cluster as "local[1]" (which spawns only one worker thread), but when running it as "local[*]" (which spawns multiple worker threads, one per core), the error would always appear, although not always on the same task.

After some digging, I seem to have found the main cause. Loading the graph data by using `org.janusgraph.hadoop.formats.cql.CqlInputFormat` in the `SparkContext.newAPIHadoopRDD()` call returns a `JavaPairRDD<NullWritable, VertexWritable>` with several partitions, as expected. The graph used to read vertices in this input format is opened via `JanusGraphFactory.open()`. After iterating through all vertices returned by the partition, the underlying graph is closed in a final `release()` call for that partition. This makes sense because that partition is done with reading. However, when processing that partition, I need to open a graph to pass to `IndexRepairJob.workerIterationStart()`, and also create a separate read-only transaction (from that same graph) to fetch the vertex properly and pass it to `IndexRepairJob.process()`. `IndexRepairJob` also creates a write transaction to make some changes to the graph.

This would all work fine in MapReduce because there, the first `map()` step is run in its entirety first, which means that reindexing/vertex is done only after ALL partitions have been read and the `CqlInputFormat` finished its part. I don't have much experience in MapReduce, but that's how I understand it to work - a single map() result is first written on disk, and then that result is read from the disk to be the input to the subsequent map() call. On the other hand, Spark optimizes the map-reduce paradigm by chaining subsequent map() calls to keep objects in memory as much as possible. So, when this runs on a "local[*]" cluster, or a Spark executor with multiple cores, and the graph is opened via JanusGraphFactory.open(), all threads in that executor share the graph object. Each thread runs on a different RDD partition, but they can be at different phases of the reindexing process (different map() steps) at the same time. When one thread closes the graph for whatever reason (e.g. when `CqlInputFormat` finishes reading a partition), other threads simply blow up.
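
Editor's sketch: the chaining of map() steps described above can be imitated with a plain Java stream. This is only an analogy and uses no Spark API; the stage names "read" and "reindex" are made up for the example. It shows each element passing through both stages before the next element is read, so the two phases are interleaved rather than run one after the other.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Analogy only: a sequential Java stream pipelines its map() stages per element,
// similar in spirit to how chained narrow transformations run inside one task.
final class PipelinedMapDemo {

    // Returns the order in which the two stages touch each element.
    static List<String> runPipeline() {
        List<String> log = new ArrayList<>();
        Stream.of(1, 2, 3)
              .map(v -> { log.add("read:" + v); return v; })      // stage 1: input format
              .map(v -> { log.add("reindex:" + v); return v; })   // stage 2: reindex job
              .forEach(v -> { });                                  // terminal step drives the pipeline
        return log;
    }

    public static void main(String[] args) {
        // prints [read:1, reindex:1, read:2, reindex:2, read:3, reindex:3]
        System.out.println(runPipeline());
    }
}
```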

For example, if I have partitions/tasks with 300, 600 and 900 vertices and they all run on a single 3-core Spark executor, they'll be processed in parallel by three separate threads. The first thread will process 300 vertices and, upon iterating the final vertex, will close the underlying graph (as part of the `CqlInputFormat` implementation, from what I gathered). Closing the graph immediately closes all opened transactions. However, the same graph is used in other threads as well in parallel. The second thread might have only finished processing 350 vertices at the time the first closed the graph, so the next time it tries to write something, it crashes because it uses a transaction that's already closed.

The ideal solution should be to open separate graph instances of the same graph, one in `CqlInputFormat`, and the other that is passed to `IndexRepairJob.workerIterationStart()`, for each task. In that case, if one graph is closed, no other tasks or processing phases would be affected. I tried that out today by opening the graph using the `StandardJanusGraph` constructor (at least in my part of the code) and so far that worked well because in most of my test runs the job completed successfully. The runs that failed occurred during debugging, when the execution was stuck on a breakpoint for a while, so maybe there were some timeouts involved or something. This remains to be tested. I also strongly suspect that the problem still remains, at least in theory, because `CqlInputFormat` still uses the `JanusGraphFactory.open()` call, but the probability for that is reduced, at least in the environment and on the data I'm currently testing on. I haven't analyzed the `CqlInputFormat` code fully to understand how it behaves in that case yet.
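
Editor's sketch: the difference between a shared, cached graph object and separate instances can be shown with a toy model. Nothing here is JanusGraph API; `ToyGraph`, `openCached` and `openSeparate` are hypothetical stand-ins for the behaviour described above (a factory that returns the same object per name versus a direct constructor call). The demo runs sequentially so the outcome is deterministic; with real threads the same failure would simply occur at an unpredictable moment.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy model only, NOT JanusGraph code.
final class SharedGraphDemo {

    // Stand-in for a graph: once closed, every write through it fails.
    static final class ToyGraph {
        private volatile boolean open = true;

        void close() { open = false; }

        void write(String what) {
            if (!open) {
                throw new IllegalStateException("graph already closed, cannot write " + what);
            }
        }
    }

    // Hypothetical factory that caches by name and hands out the same object.
    private static final Map<String, ToyGraph> CACHE = new ConcurrentHashMap<>();

    static ToyGraph openCached(String name) {
        return CACHE.computeIfAbsent(name, n -> new ToyGraph());
    }

    // Hypothetical direct construction: every call yields an independent object.
    static ToyGraph openSeparate() {
        return new ToyGraph();
    }

    // The reader finishes its partition and closes its graph; can the writer still write?
    static boolean writeSurvivesClose(ToyGraph reader, ToyGraph writer) {
        reader.close();
        try {
            writer.write("index entry");
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("shared instance:    "
            + writeSurvivesClose(openCached("g1"), openCached("g1")));
        System.out.println("separate instances: "
            + writeSurvivesClose(openSeparate(), openSeparate()));
    }
}
```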

Admittedly, I could provide my own InputFormat class, or at least subclass it and try to hack and slash and make it work somehow, but that seriously complicates everything and defeats the purpose of everything I'm trying to do here.

Another workaround would be to limit each Spark executor to use only one core, but that seems wasteful and is definitely something I would try to avoid.

I probably missed a lot of details, but that's the general idea and my conclusions so far. Feel free to correct me if I missed anything or wrote anything wrong, as well as point me in the right direction if such an implementation already exists and I just didn't come across it. 

Best regards,

Mladen

PS An additional question here would be to see if there is any danger in opening multiple separate graph instances and using them to modify the graph, but as this is already done in the current MapReduce implementation anyway, and all my transactions are opened as read-only, I'm guessing that shouldn't pose a problem here.


On Wednesday, December 9, 2020 at 4:32:10 PM UTC+1 li...@... wrote:
Hi Mladen,

Agree with Marc, that's something you could try. If possible, could you share the reason why you have to open the same graph multiple times with different graph objects? If there is no other solution to your problem then this can be a feature request.

Best regards,
Boxuan
On Wednesday, December 9, 2020 at 2:50:48 PM UTC+8 HadoopMarc wrote:
Hi Mladen,

The constructor of StandardJanusGraph seems worth a try:


HTH,   Marc

Op dinsdag 8 december 2020 om 19:34:55 UTC+1 schreef Mladen Marović:
Hello,

I'm writing a Java program that, for various implementation details, needs to open the same graph multiple times. Currently I'm using JanusGraphFactory.open(...), but this always looks up the graph by its name in the static JanusGraphManager instance and returns the same object.

is there a way to create two different object instances of the same Janusgraph graph? These instances need to be completely separate, so that closing one graph does not close transactions created using the other graph. I checked the documentation and inspected the code directly while debugging, but couldn't find anything useful.

Thanks in advance,

Mladen


Re: Running OLAP on HBase with SparkGraphComputer fails with Error Container killed by YARN for exceeding memory limits

Roy Yu <7604...@...>
 

Hi Marc, 

The parameter hbase.mapreduce.tableinput.mappers.per.region does take effect. I set it to 40, and 40 tasks now process every region. But this brings a new problem: data skew. I use g.E().count() to count all the edges of the graph. While counting one region, one Spark task contained all 2.6 GB of data, while the other 39 tasks contained no data. The task failed again. I checked my data. There are some vertices which have more than 1 million incident edges. So I tried to solve this problem using vertex cut (https://docs.janusgraph.org/advanced-topics/partitioning/); my graph schema is something like [mgmt.makeVertexLabel('product').partition().make()]. But when I used MR to load data into the new graph, it took more than 10 times as long as the attempt without partition(). From the HBase table detail page, I found the data loading process was busy reading data from and writing data to the first region. The first region became the hot spot. I guess it relates to vertex ids. Could you help me again?

On Tuesday, December 8, 2020 at 3:13:42 PM UTC+8 HadoopMarc wrote:
Hi Roy,

As I mentioned, I did not keep up with possibly new janusgraph-hbase features. From the HBase source, I see that HBase now has a "hbase.mapreduce.tableinput.mappers.per.region" config parameter.


It should not be too difficult to adapt the janusgraph HBaseInputFormat to leverage this feature (or maybe it even works without change???).

Best wishes,

Marc

Op dinsdag 8 december 2020 om 04:21:19 UTC+1 schreef Roy Yu:
you seem to run on cloud infra that reduces your requested 40 Gb to 33 Gb (see https://databricks.com/session_na20/running-apache-spark-on-kubernetes-best-practices-and-pitfalls). Fact of life. 
---------------------
Sorry Marc, I misled you. The error message was generated when I set spark.executor.memory to 30G. When it failed, I increased spark.executor.memory to 40G, and it failed as well. I felt desperate and came here to ask for help.
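
Editor's sketch: the 33 GB limit in the error message is consistent with a 30G executor plus the default memory overhead. The arithmetic below assumes the documented Spark 2.x default for spark.yarn.executor.memoryOverhead (10% of executor memory, with a 384 MB minimum) and ignores any rounding YARN applies to its allocation increments; the class and method names are made up for the example.

```java
// Worked arithmetic only; assumes the Spark 2.x default overhead rule
// max(384 MB, 10% of spark.executor.memory) and no YARN rounding.
final class YarnContainerSize {

    static final long MIN_OVERHEAD_MB = 384;

    static long defaultOverheadMb(long executorMemoryMb) {
        return Math.max(MIN_OVERHEAD_MB, executorMemoryMb / 10);
    }

    // Memory YARN is asked for per executor: heap plus overhead.
    static long containerMb(long executorMemoryMb) {
        return executorMemoryMb + defaultOverheadMb(executorMemoryMb);
    }

    public static void main(String[] args) {
        long requested = 30L * 1024;                 // spark.executor.memory=30G
        long container = containerMb(requested);     // 30720 + 3072 = 33792 MB
        System.out.println(container + " MB = " + (container / 1024) + " GB");
    }
}
```

Under the same assumption, setting the overhead explicitly to 10G raises the container limit but does not shrink what a single task has to hold in memory.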
On Tuesday, December 8, 2020 at 10:35:19 AM UTC+8 Roy Yu wrote:
Hi Marc

Thanks for your immediate response.
I've tried setting spark.yarn.executor.memoryOverhead=10G and re-running the task, and it still failed. From the Spark task UI, I saw that 80% of the processing time is Full GC time. As you said, the 2.6 GB (GZ compressed) region exploding is my root cause. Now I'm trying to reduce my region size to 1 GB; if that still fails, I'm going to configure the HBase HFiles not to use a compressed format.
This was my first time running JanusGraph OLAP, and I think this is a common problem, as an HBase region size of 2.6 GB (compressed) is not large; 20 GB is very common in our production. If the community does not solve the problem, the JanusGraph HBase-based OLAP solution cannot be adopted by other companies either.

On Tuesday, December 8, 2020 at 12:40:40 AM UTC+8 HadoopMarc wrote:
Hi Roy,

There seem to be three things bothering you here:
  1. you did not specify spark.yarn.executor.memoryOverhead, as the exception message says. Easily solved.
  2. you seem to run on cloud infra that reduces your requested 40 Gb to 33 Gb (see https://databricks.com/session_na20/running-apache-spark-on-kubernetes-best-practices-and-pitfalls). Fact of life.
  3. the janusgraph HBaseInputFormat uses entire HBase regions as hadoop partitions, which are fed into spark tasks. The 2.6Gb region size is for compressed binary data which explodes when expanded into java objects. This is your real problem.
I did not follow the latest status of janusgraph-hbase features for the HBaseInputFormat, but you have to somehow use spark with smaller partitions than an entire HBase region.
A long time ago, I had success with skipping the HBaseInputFormat and have spark executors connect to JanusGraph themselves. That is not a quick solution, though.

Best wishes,

Marc

Op maandag 7 december 2020 om 14:10:55 UTC+1 schreef Roy Yu:
Error message:
ExecutorLostFailure (executor 1 exited caused by one of the running tasks) Reason: Container killed by YARN for exceeding memory limits. 33.1 GB of 33 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead or disabling yarn.nodemanager.vmem-check-enabled because of YARN-4714. 

Graph config:
spark.executor.extraJavaOptions=-XX:+UseG1GC -XX:MaxGCPauseMillis=500 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/mnt/data_1/log/spark2/gc-spark%p.log
spark.executor.cores=1
spark.executor.memory=40960m
spark.executor.instances=3

Region info:
hdfs dfs -du -h /apps/hbase/data/data/default/ky415/f069fafb3ee51d6a2e5bc2377b468bcc
67     134    /apps/hbase/data/data/default/ky415/f069fafb3ee51d6a2e5bc2377b468bcc/.regioninfo
0      0      /apps/hbase/data/data/default/ky415/f069fafb3ee51d6a2e5bc2377b468bcc/.tmp
2.6 G  5.1 G  /apps/hbase/data/data/default/ky415/f069fafb3ee51d6a2e5bc2377b468bcc/e
0      0      /apps/hbase/data/data/default/ky415/f069fafb3ee51d6a2e5bc2377b468bcc/f
0      0      /apps/hbase/data/data/default/ky415/f069fafb3ee51d6a2e5bc2377b468bcc/g
0      0      /apps/hbase/data/data/default/ky415/f069fafb3ee51d6a2e5bc2377b468bcc/h
0      0      /apps/hbase/data/data/default/ky415/f069fafb3ee51d6a2e5bc2377b468bcc/i
0      0      /apps/hbase/data/data/default/ky415/f069fafb3ee51d6a2e5bc2377b468bcc/l
0      0      /apps/hbase/data/data/default/ky415/f069fafb3ee51d6a2e5bc2377b468bcc/m
0      0      /apps/hbase/data/data/default/ky415/f069fafb3ee51d6a2e5bc2377b468bcc/recovered.edits
0      0      /apps/hbase/data/data/default/ky415/f069fafb3ee51d6a2e5bc2377b468bcc/s
0      0      /apps/hbase/data/data/default/ky415/f069fafb3ee51d6a2e5bc2377b468bcc/t
root@~$

Anybody who can help me?


Re: How to open the same graph multiple times and not get the same object?

Boxuan Li <libo...@...>
 

Hi Mladen,

Agree with Marc, that's something you could try. If possible, could you share the reason why you have to open the same graph multiple times with different graph objects? If there is no other solution to your problem then this can be a feature request.

Best regards,
Boxuan

On Wednesday, December 9, 2020 at 2:50:48 PM UTC+8 HadoopMarc wrote:
Hi Mladen,

The constructor of StandardJanusGraph seems worth a try:


HTH,   Marc

Op dinsdag 8 december 2020 om 19:34:55 UTC+1 schreef Mladen Marović:
Hello,

I'm writing a Java program that, for various implementation details, needs to open the same graph multiple times. Currently I'm using JanusGraphFactory.open(...), but this always looks up the graph by its name in the static JanusGraphManager instance and returns the same object.

is there a way to create two different object instances of the same Janusgraph graph? These instances need to be completely separate, so that closing one graph does not close transactions created using the other graph. I checked the documentation and inspected the code directly while debugging, but couldn't find anything useful.

Thanks in advance,

Mladen


SimplePath query is slower in 6 node vs 3 node Cassandra cluster

Varun Ganesh <operatio...@...>
 

Hello,

I am currently using Janusgraph version 0.5.2. I have a graph with about 18 million vertices and 25 million edges.

I have two versions of this graph, one backed by a 3 node Cassandra cluster and another backed by 6 Cassandra nodes (both with 3x replication factor)

I am running the below query on both of them:

g.V().hasLabel('label_A').has('some_id', 123).has('data.name', 'value1').repeat(both('sample_edge').simplePath()).until(has('data.name', 'value2')).path().by('data.name').next()

The issue is that this query takes ~130ms on the 3 node cluster whereas it takes ~400ms on the 6 node cluster.

I have tried running ".profile()" on both versions and the outputs are almost identical in terms of the steps and time taken.

g.V().hasLabel('label_A').has('some_id', 123).has('data.name', 'value1').repeat(both('sample_edge').simplePath()).until(has('data.name', 'value2')).path().by('data.name').limit(1).profile()

==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
JanusGraphStep([],[~label.eq(label_A), o...                     1           1           4.582     0.39
    \_condition=(~label = label_A AND some_id = 123 AND data.name = value1)
    \_orders=[]
    \_isFitted=true
    \_isOrdered=true
    \_query=multiKSQ[1]@8000
    \_index=someVertexByNameComposite
  optimization                                                                                 0.028
  optimization                                                                                 0.907
  backend-query                                                        1                       3.012
    \_query=someVertexByNameComposite:multiKSQ[1]@8000
    \_limit=8000
RepeatStep([JanusGraphVertexStep(BOTH,[...                     2           2        1167.493    99.45
  HasStep([data.name.eq(...                                                          803.247
  JanusGraphVertexStep(BOTH,[...                           12934       12934         334.095
    \_condition=type[sample_edge]
    \_orders=[]
    \_isFitted=true
    \_isOrdered=true
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812d311c
    \_multi=true
    \_vertices=264
    optimization                                                                               0.073
    backend-query                                                    266                       5.640
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812d311c
    optimization                                                                               0.028
    backend-query                                                  12689                     312.544
    \_query=org.janusgraph.diskstorage.keycolumnvalue.SliceQuery@812d311c
  PathFilterStep(simple)                                           12441       12441          10.980
  JanusGraphMultiQueryStep(RepeatEndStep)                           1187        1187          11.825
  RepeatEndStep                                                        2           2         810.468
RangeGlobalStep(0,1)                                                   1           1           0.419     0.04
PathStep([value(data.name)])                                 1           1           1.474     0.13
                                            >TOTAL                     -           -        1173.969        -

I'd really appreciate some input on figuring out why the query is 3x slower on 6 nodes.

I realise that you may require more context. Happy to provide more information as required!

(I had previously posted this on the forum: https://groups.google.com/g/janusgraph-users/c/nkNFaFzdr4I. But I was hoping that I might get a bit more traction through the mailing list)

 Thank you!


Re: Configuring Transaction Log feature

Pawan Shriwas <shriwa...@...>
 

Hi Sandeep,

I think I have already added the line below to indicate that the processor should pull the details from now onwards. Is it not working?

 "setStartTimeNow()"

Has anyone else faced the same thing in their Java code?

Thanks,
Pawan

On Friday, 4 December 2020 at 16:22:51 UTC+5:30 sa...@... wrote:
Pawan,
Can you check your logs for the following: Loaded unidentified ReadMarker start time...
It seems your ReadMarker is starting from 1970, so it tries to read changes since then.
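
Editor's sketch: why a read marker that starts at 1970 matters can be shown with a toy log. This is not JanusGraph code; `LogEntry`, `sampleLog` and `replayFrom` are hypothetical names, and the timestamps are invented. A start time of the Unix epoch replays every entry ever written, while a start time of "now" only picks up later entries.

```java
import java.time.Instant;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Toy model only, NOT JanusGraph code.
final class ReadMarkerDemo {

    static final class LogEntry {
        final Instant timestamp;
        final String payload;
        LogEntry(Instant timestamp, String payload) {
            this.timestamp = timestamp;
            this.payload = payload;
        }
    }

    // Two invented entries already present in the log.
    static List<LogEntry> sampleLog() {
        return Arrays.asList(
            new LogEntry(Instant.parse("2018-02-07T00:00:00Z"), "old change"),
            new LogEntry(Instant.parse("2020-11-28T00:00:00Z"), "recent change"));
    }

    // Entries a processor would see when its read marker starts at 'start'.
    static List<String> replayFrom(List<LogEntry> log, Instant start) {
        return log.stream()
                  .filter(e -> !e.timestamp.isBefore(start))
                  .map(e -> e.payload)
                  .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<LogEntry> log = sampleLog();
        System.out.println("from 1970: " + replayFrom(log, Instant.EPOCH));
        System.out.println("from a later start: "
            + replayFrom(log, Instant.parse("2020-12-04T00:00:00Z")));
    }
}
```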

Regards,
Sandeep
On Saturday, November 28, 2020 at 8:48:18 PM UTC+8 shr...@... wrote:
One correction to the last post, in the line below.

    JanusGraphTransaction tx = graph.buildTransaction().logIdentifier("TestLog").start();
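
Editor's sketch: the reason the identifier in this correction matters can be modelled with a small publish/subscribe map. This is a toy, not the JanusGraph implementation; the method names only echo the real ones for readability. It assumes, as the thread suggests, that a processor registered for log "TestLog" only sees transactions committed with that same log identifier.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Toy model only, NOT JanusGraph code: processors are keyed by log name.
final class LogIdentifierDemo {

    private final Map<String, List<Consumer<String>>> processors = new HashMap<>();

    void addLogProcessor(String logName, Consumer<String> processor) {
        processors.computeIfAbsent(logName, k -> new ArrayList<>()).add(processor);
    }

    // A commit is delivered only to processors registered under the same name.
    void commit(String logIdentifier, String change) {
        for (Consumer<String> p : processors.getOrDefault(logIdentifier, Collections.emptyList())) {
            p.accept(change);
        }
    }

    // Number of callbacks received for one commit.
    static int deliveries(String registeredName, String commitName) {
        LogIdentifierDemo log = new LogIdentifierDemo();
        List<String> received = new ArrayList<>();
        log.addLogProcessor(registeredName, received::add);
        log.commit(commitName, "vertex added");
        return received.size();
    }

    public static void main(String[] args) {
        System.out.println("mismatched names: " + deliveries("TestLog", "PawanTestLog"));
        System.out.println("matching names:   " + deliveries("TestLog", "TestLog"));
    }
}
```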



On Saturday, 28 November 2020 at 18:16:09 UTC+5:30 Pawan Shriwas wrote:
Hi Sandeep,

Please see the Java code and properties below, which I am trying locally with Cassandra CQL as the backend. This code does not give me the change log as events, which I can get via the Gremlin Console with the same script and properties. Please let me know if anything needs to be modified in the code or properties.

<!-- Java Code -->
package com.example.graph;

import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphFactory;
import org.janusgraph.core.JanusGraphTransaction;
import org.janusgraph.core.JanusGraphVertex;
import org.janusgraph.core.log.ChangeProcessor;
import org.janusgraph.core.log.ChangeState;
import org.janusgraph.core.log.LogProcessorFramework;
import org.janusgraph.core.log.TransactionId;

public class TestLog {
    public static void listenLogsEvent() {
        JanusGraph graph = JanusGraphFactory.open("/home/ist/Downloads/IM/jgraphdb_local.properties");
        LogProcessorFramework logProcessor = JanusGraphFactory.openTransactionLog(graph);

        logProcessor.addLogProcessor("TestLog").
            setProcessorIdentifier("TestLogCounter").
            setStartTimeNow().
            addProcessor(new ChangeProcessor() {
                @Override
                public void process(JanusGraphTransaction tx, TransactionId txId, ChangeState changeState) {
                    System.out.println("tx--" + tx.toString());
                    System.out.println("txId--" + txId.toString());
                    System.out.println("changeState--" + changeState.toString());
                }
            }).
            build();

        for (int i = 0; i <= 10; i++) {
            System.out.println("going to add =" + i);
            JanusGraphTransaction tx = graph.buildTransaction().logIdentifier("PawanTestLog").start();
            JanusGraphVertex a = tx.addVertex("TimeL");
            a.property("type", "HOLD");
            a.property("serialNo", "XS31B4");
            tx.commit();
            System.out.println("Vertex committed =" + a.toString());
        }
    }

    public static void main(String[] args) {
        System.out.println("starting main");
        listenLogsEvent();
    }
}

<!----- graph properties------->
gremlin.graph=org.janusgraph.core.JanusGraphFactory
graph.name=TestGraph
storage.backend = cql
storage.hostname = localhost
storage.cql.keyspace=janusgraphcql
query.fast-property = true
storage.lock.wait-time=10000
storage.batch-loading=true

Thanks in advance.

Thanks,
Pawan


On Saturday, 28 November 2020 at 16:19:20 UTC+5:30 sa...@... wrote:
Pawan,
Can you elaborate more on the program in which you are trying to embed the script?
Regards,
Sandeep

On Sat, 28 Nov 2020, 13:48 Pawan Shriwas, <shr...@...> wrote:
Hey Jason,

The same thing happens with me as well: the above script works well in the Gremlin Console, but when we use it in Java, we do not get any callback in process(). Could you help with this?


On Wednesday, 7 February 2018 at 20:28:41 UTC+5:30 Jason Plurad wrote:
It means that it will use the 'storage.backend' value as the storage. See the code in GraphDatabaseConfiguration.java. It looks like your only choice is 'default', and it seems like the option is there for the future possibility to use a different backend.

The code in the docs seemed to work ok, other than a minor change in the setStartTime() parameters. You can cut and paste this code into the Gremlin Console to use with the prepackaged distribution.

import java.util.concurrent.atomic.*;
import org.janusgraph.core.log.*;
import java.util.concurrent.*;

graph = JanusGraphFactory.open('conf/janusgraph-cassandra-es.properties');

totalHumansAdded = new AtomicInteger(0);
totalGodsAdded = new AtomicInteger(0);
logProcessor = JanusGraphFactory.openTransactionLog(graph);
logProcessor.addLogProcessor("addedPerson").
        setProcessorIdentifier("addedPersonCounter").
        setStartTime(Instant.now()).
        addProcessor(new ChangeProcessor() {
            public void process(JanusGraphTransaction tx, TransactionId txId, ChangeState changeState) {
                for (v in changeState.getVertices(Change.ADDED)) {
                    if (v.label().equals("human")) totalHumansAdded.incrementAndGet();
                    System.out.println("total humans = " + totalHumansAdded);
                }
            }
        }).
        addProcessor(new ChangeProcessor() {
            public void process(JanusGraphTransaction tx, TransactionId txId, ChangeState changeState) {
                for (v in changeState.getVertices(Change.ADDED)) {
                    if (v.label().equals("god")) totalGodsAdded.incrementAndGet();
                    System.out.println("total gods = " + totalGodsAdded);
                }
            }
        }).
        build()

tx = graph.buildTransaction().logIdentifier("addedPerson").start();
u = tx.addVertex(T.label, "human");
u.property("name", "proteros");
u.property("age", 36);
tx.commit();

If you inspect the keyspace in Cassandra afterwards, you'll see that a separate table is created for "ulog_addedPerson".

Did you have some example code of what you are attempting?



On Wednesday, February 7, 2018 at 5:55:58 AM UTC-5, Sandeep Mishra wrote:
Hi Guys,

We are trying to use the transaction log feature of JanusGraph, which is not working as expected. No callback is received at
public void process(JanusGraphTransaction janusGraphTransaction, TransactionId transactionId, ChangeState changeState) {

The JanusGraph documentation says the value for log.[X].backend is 'default'.
Not sure what exactly it means. Does it mean HBase, which is being used as the backend for data?

Please let me know if anyone has configured it.

Thanks and Regards,
Sandeep Mishra



Re: How to run groovy script in background?

HadoopMarc <bi...@...>
 

You could end your script with:

System.exit(0)

HTH,    Marc

Op woensdag 9 december 2020 om 04:16:43 UTC+1 schreef Phate:

Hi all, is it possible to run gremlin.sh in the background?
I tried to use the `-e` argument to run a Groovy script, but the job always changes to Stopped status; it only finishes when I bring it to the foreground.

```
[bin]$ touch test.groovy
[bin]$ ./gremlin.sh -e test.groovy &
[1] 21385
[bin]$ 

[1]+  Stopped                 ./gremlin.sh -e test.groovy
[bin]$ fg
./gremlin.sh -e test.groovy
```

I found there is a `stty` process, but redirecting stdout to a file and `stty -tostop` did not work.

```
[bin]$ ps
  PID TTY          TIME CMD
21347 pts/4    00:00:00 bash
21485 pts/4    00:00:06 java
21541 pts/4    00:00:00 ps
[bin]$ 

[1]+  Stopped                 ./gremlin.sh -e test.groovy
[bin]$ ps
  PID TTY          TIME CMD
21347 pts/4    00:00:00 bash
21485 pts/4    00:00:12 java
21545 pts/4    00:00:00 stty
21546 pts/4    00:00:00 ps
```
Same problem on CentOS 7 and in a Docker container. Any idea how to solve it?


Re: How to open the same graph multiple times and not get the same object?

HadoopMarc <bi...@...>
 

Hi Mladen,

The constructor of StandardJanusGraph seems worth a try:

https://github.com/JanusGraph/janusgraph/blob/master/janusgraph-core/src/main/java/org/janusgraph/graphdb/database/StandardJanusGraph.java

HTH,   Marc

Op dinsdag 8 december 2020 om 19:34:55 UTC+1 schreef Mladen Marović:

Hello,

I'm writing a Java program that, for various implementation details, needs to open the same graph multiple times. Currently I'm using JanusGraphFactory.open(...), but this always looks up the graph by its name in the static JanusGraphManager instance and returns the same object.

is there a way to create two different object instances of the same Janusgraph graph? These instances need to be completely separate, so that closing one graph does not close transactions created using the other graph. I checked the documentation and inspected the code directly while debugging, but couldn't find anything useful.

Thanks in advance,

Mladen