Re: [BLOG] Configuring JanusGraph for spark-yarn


Joe Obernberger <joseph.o...@...>
 

Marc - thank you.  I've updated the classpath and removed nearly all of the CDH jars; I had to keep Chimera and some of the HBase libs in there.  Apart from those and the jars in lib.zip, it is working as it did before.  The reason I turned DEBUG off was that it was producing 100+ GBytes of logs, nearly all of which are lines like:

18:04:29 DEBUG org.janusgraph.diskstorage.hbase.HBaseKeyColumnValueStore - Generated HBase Filter ColumnRangeFilter [\x10\xC0, \x10\xC1)
18:04:29 DEBUG org.janusgraph.graphdb.transaction.StandardJanusGraphTx - Guava vertex cache size: requested=20000 effective=20000 (min=100)
18:04:29 DEBUG org.janusgraph.graphdb.transaction.vertexcache.GuavaVertexCache - Created dirty vertex map with initial size 32
18:04:29 DEBUG org.janusgraph.graphdb.transaction.vertexcache.GuavaVertexCache - Created vertex cache with max size 20000
18:04:29 DEBUG org.janusgraph.diskstorage.hbase.HBaseKeyColumnValueStore - Generated HBase Filter ColumnRangeFilter [\x10\xC2, \x10\xC3)
18:04:29 DEBUG org.janusgraph.graphdb.transaction.StandardJanusGraphTx - Guava vertex cache size: requested=20000 effective=20000 (min=100)
18:04:29 DEBUG org.janusgraph.graphdb.transaction.vertexcache.GuavaVertexCache - Created dirty vertex map with initial size 32
18:04:29 DEBUG org.janusgraph.graphdb.transaction.vertexcache.GuavaVertexCache - Created vertex cache with max size 20000

Do those mean anything to you?  I've turned DEBUG back on for runs with smaller graphs, but so far I don't see anything helpful there apart from an exception about HADOOP_HOME not being set.
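If the sheer volume of those messages is the main problem, one option is to keep DEBUG globally while raising the level for just the chatty classes. A sketch of the relevant log4j.properties lines, with logger names taken from the messages above:

```properties
# Keep the root logger at DEBUG for troubleshooting,
# but quiet the per-transaction vertex-cache and filter chatter
log4j.logger.org.janusgraph.diskstorage.hbase.HBaseKeyColumnValueStore=INFO
log4j.logger.org.janusgraph.graphdb.transaction.StandardJanusGraphTx=INFO
log4j.logger.org.janusgraph.graphdb.transaction.vertexcache.GuavaVertexCache=INFO
```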
Here are the Spark properties; notice the nice and small extraClassPath!  :)

gremlin.graph = org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.deriveMemory = false
gremlin.hadoop.graphReader = org.janusgraph.hadoop.formats.hbase.HBaseInputFormat
gremlin.hadoop.graphWriter = org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat
gremlin.hadoop.graphWriter.hasEdges = false
gremlin.hadoop.inputLocation = none
gremlin.hadoop.jarsInDistributedCache = true
gremlin.hadoop.memoryOutputFormat = org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat
gremlin.hadoop.outputLocation = output
janusgraphmr.ioformat.conf.storage.backend = hbase
janusgraphmr.ioformat.conf.storage.hbase.region-count = 5
janusgraphmr.ioformat.conf.storage.hbase.regions-per-server = 5
janusgraphmr.ioformat.conf.storage.hbase.short-cf-names = false
janusgraphmr.ioformat.conf.storage.hbase.table = TEST0.2.0
janusgraphmr.ioformat.conf.storage.hostname = 10.22.5.65:2181
log4j.appender.STDOUT = org.apache.log4j.ConsoleAppender
log4j.logger.deng = WARNING
log4j.rootLogger = STDOUT
org.slf4j.simpleLogger.defaultLogLevel = warn
spark.akka.frameSize = 1024
spark.app.id = application_1502118729859_0041
spark.app.name = Apache TinkerPop's Spark-Gremlin
spark.authenticate = false
spark.cores.max = 64
spark.driver.appUIAddress = http://10.22.5.61:4040
spark.driver.extraJavaOptons = -XX:ReservedCodeCacheSize=100M -XX:MaxMetaspaceSize=256m -XX:CompressedClassSpaceSize=256m
spark.driver.extraLibraryPath = /opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/lib/hadoop/lib/native
spark.driver.host = 10.22.5.61
spark.driver.port = 38529
spark.dynamicAllocation.enabled = true
spark.dynamicAllocation.executorIdleTimeout = 60
spark.dynamicAllocation.minExecutors = 0
spark.dynamicAllocation.schedulerBacklogTimeout = 1
spark.eventLog.dir = hdfs://host001:8020/user/spark/applicationHistory
spark.eventLog.enabled = true
spark.executor.extraClassPath = /opt/cloudera/parcels/CDH/jars/janusgraph-hbase-0.2.0-SNAPSHOT.jar:./lib.zip/*:/opt/cloudera/parcels/CDH/lib/hbase/bin/../lib/*:/etc/hbase/conf:
spark.executor.extraJavaOptions = -XX:ReservedCodeCacheSize=100M -XX:MaxMetaspaceSize=256m -XX:CompressedClassSpaceSize=256m -Dlogback.configurationFile=logback.xml
spark.executor.extraLibraryPath = /opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/lib/hadoop/lib/native
spark.executor.heartbeatInterval = 100000
spark.executor.id = driver
spark.executor.memory = 10240m
spark.externalBlockStore.folderName = spark-27dac3f3-dfbc-4f32-b52d-ececdbcae0db
spark.kyroserializer.buffer.max = 1600m
spark.master = yarn-client
spark.network.timeout = 90000
spark.org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.param.PROXY_HOSTS = host005
spark.org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.param.PROXY_URI_BASES = http://host005:8088/proxy/application_1502118729859_0041
spark.scheduler.mode = FIFO
spark.serializer = org.apache.spark.serializer.KryoSerializer
spark.shuffle.service.enabled = true
spark.shuffle.service.port = 7337
spark.ui.filters = org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
spark.ui.killEnabled = true
spark.yarn.am.extraLibraryPath = /opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/lib/hadoop/lib/native
spark.yarn.appMasterEnv.CLASSPATH = /etc/haddop/conf:/etc/hbase/conf:./lib.zip/*
spark.yarn.config.gatewayPath = /opt/cloudera/parcels
spark.yarn.config.replacementPath = {{HADOOP_COMMON_HOME}}/../../..
spark.yarn.dist.archives = /home/graph/janusgraph-0.2.0-SNAPSHOT-hadoop2.JOE/lib.zip
spark.yarn.dist.files = /home/graph/janusgraph-0.2.0-SNAPSHOT-hadoop2.JOE/conf/logback.xml
spark.yarn.dist.jars = /opt/cloudera/parcels/CDH/jars/janusgraph-hbase-0.2.0-SNAPSHOT.jar
spark.yarn.historyServer.address = http://host001:18088
zookeeper.znode.parent = /hbase
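For reference, the gremlin.* and janusgraphmr.* entries above are the kind that are normally supplied through a HadoopGraph properties file opened from the Gremlin console, rather than set on Spark directly. A minimal sketch built from the values in the listing (the file name is hypothetical):

```properties
# conf/read-hbase.properties (hypothetical file name; values from the listing above)
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.janusgraph.hadoop.formats.hbase.HBaseInputFormat
gremlin.hadoop.graphWriter=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat
gremlin.hadoop.inputLocation=none
gremlin.hadoop.outputLocation=output
janusgraphmr.ioformat.conf.storage.backend=hbase
janusgraphmr.ioformat.conf.storage.hostname=10.22.5.65:2181
janusgraphmr.ioformat.conf.storage.hbase.table=TEST0.2.0
spark.master=yarn-client
spark.serializer=org.apache.spark.serializer.KryoSerializer
```

Such a file would then be loaded in the console with GraphFactory.open('conf/read-hbase.properties') before calling graph.traversal().withComputer(SparkGraphComputer).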


-Joe

On 8/9/2017 3:33 PM, HadoopMarc wrote:

Hi Gari and Joe,

Glad to see you testing the recipes for MapR and Cloudera respectively!  I am sure that you have realized by now that getting this to work is like walking through a minefield. If you deviate from the known path, the odds of getting through are dim, and no one wants to be in your vicinity. So, if you see a need to deviate (which there may be for the Hadoop distributions you use), you will need your mine sweeper, that is: set the logging level to DEBUG for the relevant Java packages.

This is where you deviated:
  • for Gari: you put all kinds of MapR lib folders on the application master's classpath (other classpath configs are not visible from your post)
  • for Joe: you put all kinds of Cloudera lib folders on the executors' classpath (worst of all, the spark-assembly.jar)

Probably, you are experiencing all kinds of mismatches in netty libraries, which slow down or even kill all comms between the YARN containers. The philosophy of the recipes really is to add only the minimum number of conf folders and jars to the TinkerPop/JanusGraph distribution and see from there whether any libraries are missing.
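That minimal-classpath philosophy could be sketched as follows, reusing the lib.zip archive from Joe's listing and adding cluster lib folders only when a concrete ClassNotFoundException proves they are needed:

```properties
# Ship the distribution's own jars and put only them (plus conf dirs) on the classpath.
# Paths are taken from Joe's listing; extend only when a missing class forces it.
spark.yarn.dist.archives=/home/graph/janusgraph-0.2.0-SNAPSHOT-hadoop2.JOE/lib.zip
spark.executor.extraClassPath=./lib.zip/*:/etc/hadoop/conf:/etc/hbase/conf
spark.yarn.appMasterEnv.CLASSPATH=./lib.zip/*:/etc/hadoop/conf:/etc/hbase/conf
```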


At my side, it has become apparent that I should at least add to the recipes:

  • proof of work for a medium-sized graph (say, 10M vertices and edges)
  • configs for the number of executors present in the OLAP job (instead of relying on Spark's default of 2)
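The second item could be sketched as explicit Spark settings (the values are illustrative, not recommendations), replacing reliance on the default of 2 executors:

```properties
# Fix the executor count for the OLAP job explicitly (illustrative values)
spark.executor.instances=4
spark.executor.memory=10g
spark.executor.cores=4
# If dynamic allocation stays enabled, bound it instead of fixing the count:
# spark.dynamicAllocation.minExecutors=4
# spark.dynamicAllocation.maxExecutors=16
```

Note that spark.executor.instances and spark.dynamicAllocation.enabled conflict; pick one mechanism or the other.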

So, still some work to do!


Cheers,    Marc

