Re: [BLOG] Configuring JanusGraph for spark-yarn


HadoopMarc <bi...@...>
 

Hi Gari and Joe,

Glad to see you testing the recipes for MapR and Cloudera respectively!  I am sure you have realized by now that getting this to work is like walking through a minefield. If you deviate from the known path, the odds of getting through are slim, and no one wants to be in your vicinity. So, if you see a need to deviate (which there may be for the hadoop distributions you use), you will need your mine sweeper, that is, set the logging level to DEBUG for the relevant Java packages, along the lines of the snippet below.
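
As a minimal sketch of what that could look like (assuming the log4j.properties shipped with the TinkerPop/JanusGraph distribution; the package list is illustrative, not definitive):

# log4j.properties - raise verbosity for the packages you suspect
log4j.rootLogger=INFO, STDOUT
log4j.appender.STDOUT=org.apache.log4j.ConsoleAppender
log4j.appender.STDOUT.layout=org.apache.log4j.PatternLayout
log4j.appender.STDOUT.layout.ConversionPattern=%d{HH:mm:ss} %-5p %c - %m%n
log4j.logger.org.apache.spark=DEBUG
log4j.logger.org.apache.hadoop.hbase=DEBUG
log4j.logger.org.janusgraph.diskstorage.hbase=DEBUG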

This is where you deviated:
  • for Gari: you put all kinds of MapR lib folders on the application master's classpath (other classpath configs are not visible from your post)
  • for Joe: you put all kinds of Cloudera lib folders on the executors' classpath (worst of all, the spark-assembly.jar)

Most probably, you are experiencing all kinds of mismatches in the netty libraries, which slow down or even kill all communication between the YARN containers. The philosophy of the recipes really is to add only the minimum number of conf folders and jars to the TinkerPop/JanusGraph distribution and see from there whether any libraries are missing; a sketch of that minimal starting point follows below.
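
For illustration, the minimal starting point could be as small as this (a sketch, not a tested config; it assumes the lib.zip from the recipe is shipped via spark.yarn.dist.archives, as in Joe's config further down):

spark.yarn.appMasterEnv.CLASSPATH=/etc/hadoop/conf:/etc/hbase/conf:./lib.zip/*
spark.executor.extraClassPath=/etc/hadoop/conf:/etc/hbase/conf:./lib.zip/*

Only when a ClassNotFoundException points at a distribution-specific library would you add that single jar explicitly.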


At my side, it has become apparent that I should at least add to the recipes:

  • proof that the recipes work for a medium-sized graph (say 10M vertices and edges)
  • configs for the number of executors present in the OLAP job (instead of relying on Spark's default of 2); see the sketch below
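
For the executor count, something along these lines would go into the graph's properties file (a sketch; the right numbers depend on your cluster, and note that dynamic allocation, if enabled, overrides spark.executor.instances):

spark.executor.instances=4
spark.executor.cores=2
spark.executor.memory=4096m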

So, still some work to do!


Cheers,    Marc


On Wednesday, August 9, 2017 at 7:16:23 PM UTC+2, Joseph Obernberger wrote:

Hi Marc - I did try splitting regions.  What happens when I run the SparkGraphComputer job is that it seems to hit one region server hard, then moves on to the next; it appears to run serially.  All that said, I think the big issue is that a count with Spark takes 2 hours, while running without it takes 2 minutes.  Perhaps SparkGraphComputer is not the tool to be used for handling large graphs?  Although then I'm not sure what its purpose is.
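
For reference, a count like this is typically started from the Gremlin console as follows (standard TinkerPop usage; the properties file name is just an example):

graph = GraphFactory.open('conf/hadoop-graph/read-hbase.properties')
g = graph.traversal().withComputer(SparkGraphComputer)
g.V().count()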

For reference, here is my Hadoop configuration file using Cloudera CDH 5.10.0 and JanusGraph 0.1.0:

Config:
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
#
# Hadoop Graph Configuration
#
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphInputFormat=org.janusgraph.hadoop.formats.hbase.HBaseInputFormat
gremlin.hadoop.graphOutputFormat=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat
 
# memoryOutputFormat is set twice below; java.util.Properties keeps the last value (GryoOutputFormat)
gremlin.hadoop.memoryOutputFormat=org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat
gremlin.hadoop.memoryOutputFormat=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat
gremlin.hadoop.deriveMemory=false
 
gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.inputLocation=none
gremlin.hadoop.outputLocation=output
 
# note: log4j 1.x calls this level WARN, not WARNING
log4j.rootLogger=WARN, STDOUT
log4j.logger.deng=WARN
log4j.appender.STDOUT=org.apache.log4j.ConsoleAppender
org.slf4j.simpleLogger.defaultLogLevel=warn
 
#
# JanusGraph HBase InputFormat configuration
#
janusgraphmr.ioformat.conf.storage.backend=hbase
janusgraphmr.ioformat.conf.storage.hostname=host003:2181,host004:2181,host005:2181
janusgraphmr.ioformat.conf.storage.hbase.table=Large2
janusgraphmr.ioformat.conf.storage.hbase.region-count=5
janusgraphmr.ioformat.conf.storage.hbase.regions-per-server=5
janusgraphmr.ioformat.conf.storage.hbase.short-cf-names=false
# Security configs are needed in case of a secure cluster
zookeeper.znode.parent=/hbase
 
#
# SparkGraphComputer with Yarn Configuration
#
#spark.master=yarn-client
spark.executor.extraJavaOptions=-XX:ReservedCodeCacheSize=100M -XX:MaxMetaspaceSize=256m -XX:CompressedClassSpaceSize=256m -Dlogback.configurationFile=logback.xml
spark.driver.extraJavaOptions=-XX:ReservedCodeCacheSize=100M -XX:MaxMetaspaceSize=256m -XX:CompressedClassSpaceSize=256m
spark.master=yarn-cluster
spark.executor.memory=10240m
spark.serializer=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerializer
spark.yarn.dist.archives=/home/graph/janusgraph-0.1.0-SNAPSHOT-hadoop2/lib.zip
spark.yarn.dist.files=/opt/cloudera/parcels/CDH/jars/janusgraph-hbase-0.1.0-SNAPSHOT.jar,/home/graph/janusgraph-0.1.0-SNAPSHOT-hadoop2/conf/logback.xml
spark.yarn.dist.jars=/opt/cloudera/parcels/CDH/jars/janusgraph-hbase-0.1.0-SNAPSHOT.jar
spark.yarn.appMasterEnv.CLASSPATH=/etc/hadoop/conf:/etc/hbase/conf:./lib.zip/*
spark.executor.extraLibraryPath=/opt/cloudera/parcels/CDH/lib/hadoop/native:/opt/cloudera/parcels/CDH/lib/hadoop-0.20-mapreduce/lib/native/Linux-amd64-64
spark.akka.frameSize=1024
spark.kryoserializer.buffer.max=1600m
spark.network.timeout=90000
spark.executor.heartbeatInterval=100000
spark.cores.max=64
 
#
# Relevant configs from spark-defaults.conf
#
spark.authenticate=false
spark.dynamicAllocation.enabled=true
spark.dynamicAllocation.executorIdleTimeout=60
spark.dynamicAllocation.minExecutors=0
spark.dynamicAllocation.schedulerBacklogTimeout=1
spark.eventLog.enabled=true
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.shuffle.service.enabled=true
spark.shuffle.service.port=7337
spark.ui.killEnabled=true
spark.executor.extraClassPath=/opt/cloudera/parcels/CDH/jars/janusgraph-hbase-0.1.0-SNAPSHOT.jar:./lib.zip/*:\
/opt/cloudera/parcels/CDH/lib/hbase/bin/../conf:\
/opt/cloudera/parcels/CDH/lib/hbase/bin/..:\
/opt/cloudera/parcels/CDH/lib/spark/assembly/lib/spark-assembly.jar:\
/usr/lib/jvm/java-openjdk/jre/lib/rt.jar:\
/opt/cloudera/parcels/CDH/lib/hbase/bin/../lib/*:\
/etc/hadoop/conf:\
/etc/hbase/conf:\
/opt/cloudera/parcels/CDH/lib/hadoop/libexec/../../hadoop/lib/*:\
/opt/cloudera/parcels/CDH/lib/hadoop/libexec/../../hadoop/.//*:\
/opt/cloudera/parcels/CDH/lib/hadoop/libexec/../../hadoop-yarn/lib/*:\
/opt/cloudera/parcels/CDH/lib/hadoop/libexec/../../hadoop-yarn/.//*
spark.eventLog.dir=hdfs://host001:8020/user/spark/applicationHistory
spark.yarn.historyServer.address=http://host001:18088
spark.yarn.jar=local:/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/lib/spark/lib/spark-assembly.jar
spark.driver.extraLibraryPath=/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/lib/hadoop/lib/native
spark.executor.extraLibraryPath=/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/lib/hadoop/lib/native
spark.yarn.am.extraLibraryPath=/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/lib/hadoop/lib/native
spark.yarn.config.gatewayPath=/opt/cloudera/parcels
spark.yarn.config.replacementPath={{HADOOP_COMMON_HOME}}/../../..
spark.master=yarn-client
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Thanks very much for any input / further dialog!

-Joe

On 8/8/2017 3:17 AM, HadoopMarc wrote:
Hi Joseph,

You ran into terrain I have not yet covered myself. Up till now I have been using the graben1437 PR for Titan, and for OLAP I adopted a poor man's approach where node ids are distributed over spark tasks and each spark executor makes its own Titan/HBase connection. This performs well, but does not have the nice abstraction of the HBaseInputFormat.

So, no clear answer to this one, but just some thoughts:
 - could you try to move some regions manually (e.g. from the HBase shell, see the sketch below this list) and see what it does to performance?
 - how do your OLAP vertex count times compare to the OLTP count times?
 - how does the sum of spark task execution times compare to the yarn start-to-end time difference you reported? In other words, how much of the start-to-end time is spent in waiting for timeouts?
 - unless you managed to create a vertex with > 1GB size, the RowTooBigException sounds like a bug (which you can report on JanusGraph's github page). HBase does not like large rows at all, so vertex/edge properties should not have blob values.
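
For the first point, a region can be moved from the HBase shell like this (the encoded region name and server name are placeholders to be copied from the HBase master UI):

hbase shell
move 'ENCODED_REGIONNAME', 'HOSTNAME,16020,STARTCODE'
balance_switch false

Disabling the balancer keeps HBase from moving the region back while you measure.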
 
@(David Robinson): do you have any additional thoughts on this?

Cheers,    Marc

On Monday, August 7, 2017 at 11:12:02 PM UTC+2, Joseph Obernberger wrote:

Hi Marc - I've been able to get it to run longer, but am now getting a RowTooBigException from HBase.  How does JanusGraph store data in HBase?  The current max size of a row is 1 GByte, which makes me think this error is covering something else up.
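
For reference, that 1 GByte limit corresponds to the HBase setting hbase.table.max.rowsize (default 1073741824 bytes). If a legitimately huge row were the cause, it could be raised in hbase-site.xml as a diagnostic (a sketch only; rows this big usually indicate a modeling problem rather than a limit to lift):

<property>
  <name>hbase.table.max.rowsize</name>
  <value>4294967296</value>
</property>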

What I'm seeing so far in testing with a 5 server cluster - each machine with 128G of RAM:
HBase table is 1.5G in size, split across 7 regions, and has 20,001,105 rows.  Doing a g.V().count() takes 2 hours and results in 3,842,755 vertices.

Another HBase table is 5.7G in size, split across 10 regions, has 57,620,276 rows, and took 6.5 hours to run the count, resulting in 10,859,491 nodes.  When running, it looks like it hits one server very hard even though the YARN tasks are distributed across the cluster.  One HBase node gets hammered.

The RowTooBigException is below.  Anything to try?  Thank you for any help!


org.janusgraph.core.JanusGraphException: Could not process individual retrieval call
                at org.janusgraph.graphdb.query.QueryUtil.processIntersectingRetrievals(QueryUtil.java:257)
                at org.janusgraph.graphdb.transaction.StandardJanusGraphTx$6.execute(StandardJanusGraphTx.java:1269)
                at org.janusgraph.graphdb.transaction.StandardJanusGraphTx$6.execute(StandardJanusGraphTx.java:1137)
                at org.janusgraph.graphdb.query.QueryProcessor$LimitAdjustingIterator.getNewIterator(QueryProcessor.java:209)
                at org.janusgraph.graphdb.query.LimitAdjustingIterator.hasNext(LimitAdjustingIterator.java:75)
                at org.janusgraph.graphdb.query.ResultSetIterator.nextInternal(ResultSetIterator.java:54)
                at org.janusgraph.graphdb.query.ResultSetIterator.next(ResultSetIterator.java:67)
                at org.janusgraph.graphdb.query.ResultSetIterator.next(ResultSetIterator.java:28)
                at com.google.common.collect.Iterators$7.computeNext(Iterators.java:651)
                at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
                at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
                at org.janusgraph.hadoop.formats.util.input.current.JanusGraphHadoopSetupImpl.getTypeInspector(JanusGraphHadoopSetupImpl.java:60)
                at org.janusgraph.hadoop.formats.util.JanusGraphVertexDeserializer.<init>(JanusGraphVertexDeserializer.java:55)
                at org.janusgraph.hadoop.formats.util.GiraphInputFormat.lambda$static$0(GiraphInputFormat.java:49)
                at org.janusgraph.hadoop.formats.util.GiraphInputFormat$RefCountedCloseable.acquire(GiraphInputFormat.java:100)
                at org.janusgraph.hadoop.formats.util.GiraphRecordReader.<init>(GiraphRecordReader.java:47)
                at org.janusgraph.hadoop.formats.util.GiraphInputFormat.createRecordReader(GiraphInputFormat.java:67)
                at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:166)
                at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:133)
                at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:65)
                at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
                at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
                at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
                at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
                at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
                at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
                at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
                at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
                at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
                at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
                at org.apache.spark.scheduler.Task.run(Task.scala:89)
                at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
                at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
                at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
                at java.lang.Thread.run(Thread.java:745)
Caused by: org.janusgraph.core.JanusGraphException: Could not call index
                at org.janusgraph.graphdb.transaction.StandardJanusGraphTx$6$6.call(StandardJanusGraphTx.java:1262)
                at org.janusgraph.graphdb.query.QueryUtil.processIntersectingRetrievals(QueryUtil.java:255)
                ... 34 more
Caused by: org.janusgraph.core.JanusGraphException: Could not execute operation due to backend exception
                at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:57)
                at org.janusgraph.diskstorage.BackendTransaction.executeRead(BackendTransaction.java:444)
                at org.janusgraph.diskstorage.BackendTransaction.indexQuery(BackendTransaction.java:395)
                at org.janusgraph.graphdb.query.graph.MultiKeySliceQuery.execute(MultiKeySliceQuery.java:51)
                at org.janusgraph.graphdb.database.IndexSerializer.query(IndexSerializer.java:529)
                at org.janusgraph.graphdb.transaction.StandardJanusGraphTx$6$6$1.lambda$call$5(StandardJanusGraphTx.java:1258)
                at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:97)
                at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:89)
                at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:81)
                at org.janusgraph.graphdb.transaction.StandardJanusGraphTx$6$6$1.call(StandardJanusGraphTx.java:1258)
                at org.janusgraph.graphdb.transaction.StandardJanusGraphTx$6$6$1.call(StandardJanusGraphTx.java:1255)
                at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4742)
                at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3527)
                at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2319)
                at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2282)
                at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2197)
                at com.google.common.cache.LocalCache.get(LocalCache.java:3937)
                at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4739)
                at org.janusgraph.graphdb.transaction.StandardJanusGraphTx$6$6.call(StandardJanusGraphTx.java:1255)
                ... 35 more
Caused by: org.janusgraph.diskstorage.TemporaryBackendException: Could not successfully complete backend operation due to repeated temporary exceptions after PT10S
                at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:101)
                at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:55)
                ... 53 more
Caused by: org.janusgraph.diskstorage.TemporaryBackendException: Temporary failure in storage backend
                at org.janusgraph.diskstorage.hbase.HBaseKeyColumnValueStore.getHelper(HBaseKeyColumnValueStore.java:202)
                at org.janusgraph.diskstorage.hbase.HBaseKeyColumnValueStore.getSlice(HBaseKeyColumnValueStore.java:90)
                at org.janusgraph.diskstorage.keycolumnvalue.KCVSProxy.getSlice(KCVSProxy.java:77)
                at org.janusgraph.diskstorage.keycolumnvalue.KCVSProxy.getSlice(KCVSProxy.java:77)
                at org.janusgraph.diskstorage.BackendTransaction$5.call(BackendTransaction.java:398)
                at org.janusgraph.diskstorage.BackendTransaction$5.call(BackendTransaction.java:395)
                at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:69)
                ... 54 more
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=35, exceptions:
Sat Aug 05 07:22:03 EDT 2017, RpcRetryingCaller{globalStartTime=1501932111280, pause=100, retries=35}, org.apache.hadoop.hbase.regionserver.RowTooBigException: org.apache.hadoop.hbase.regionserver.RowTooBigException: Max row size allowed: 1073741824, but the row is bigger than that.
                at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:564)
                at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:147)
                at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:5697)
                at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:5856)
                at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:5634)
                at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:5611)
                at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:5597)
                at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:6792)
                at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:6770)
                at org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2023)
                at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33644)
                at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2170)
                at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109)
                at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:185)
                at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:165)


On 8/6/2017 3:50 PM, HadoopMarc wrote:
Hi ... and others,  I have been offline for a few weeks enjoying a holiday and will start looking into your questions and making the suggested corrections. Thanks for following the recipes and helping others with them.

..., did you run the recipe on the same HDP sandbox and same TinkerPop version? I remember (from 4 weeks ago) that copying the zookeeper.znode.parent property from the HBase configs to the JanusGraph configs was essential to get JanusGraph's HBaseInputFormat working (that is: read graph data for the spark tasks); see the snippet below.
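
Concretely, that means copying the value from the cluster's hbase-site.xml into the graph's properties file (the /hbase-unsecure value is the usual HDP default; a vanilla HBase install uses /hbase):

# value copied from hbase-site.xml
zookeeper.znode.parent=/hbase-unsecure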

Cheers,    Marc

On Monday, July 24, 2017 at 10:12:13 AM UTC+2, spi...@... wrote:
Hi, thanks for your post.
I followed the post, but I ran into a problem.
15:58:49,110  INFO SecurityManager:58 - Changing view acls to: rc
15:58:49,110  INFO SecurityManager:58 - Changing modify acls to: rc
15:58:49,110  INFO SecurityManager:58 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(rc); users with modify permissions: Set(rc)
15:58:49,111  INFO Client:58 - Submitting application 25 to ResourceManager
15:58:49,320  INFO YarnClientImpl:274 - Submitted application application_1500608983535_0025
15:58:49,321  INFO SchedulerExtensionServices:58 - Starting Yarn extension services with app application_1500608983535_0025 and attemptId None
15:58:50,325  INFO Client:58 - Application report for application_1500608983535_0025 (state: ACCEPTED)
15:58:50,326  INFO Client:58 -
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1500883129115
final status: UNDEFINED
user: rc
15:58:51,330  INFO Client:58 - Application report for application_1500608983535_0025 (state: ACCEPTED)
15:58:52,333  INFO Client:58 - Application report for application_1500608983535_0025 (state: ACCEPTED)
15:58:53,335  INFO Client:58 - Application report for application_1500608983535_0025 (state: ACCEPTED)
15:58:54,337  INFO Client:58 - Application report for application_1500608983535_0025 (state: ACCEPTED)
15:58:55,340  INFO Client:58 - Application report for application_1500608983535_0025 (state: ACCEPTED)
15:58:56,343  INFO Client:58 - Application report for application_1500608983535_0025 (state: ACCEPTED)
15:58:56,802  INFO YarnSchedulerBackend$YarnSchedulerEndpoint:58 - ApplicationMaster registered as NettyRpcEndpointRef(null)
15:58:56,822  INFO YarnClientSchedulerBackend:58 - Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> dl-rc-optd-ambari-master-v-test-1.host.dataengine.com,dl-rc-optd-ambari-master-v-test-2.host.dataengine.com, PROXY_URI_BASES -> http://dl-rc-optd-ambari-master-v-test-1.host.dataengine.com:8088/proxy/application_1500608983535_0025,http://dl-rc-optd-ambari-master-v-test-2.host.dataengine.com:8088/proxy/application_1500608983535_0025), /proxy/application_1500608983535_0025
15:58:56,824  INFO JettyUtils:58 - Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
15:58:57,346  INFO Client:58 - Application report for application_1500608983535_0025 (state: RUNNING)
15:58:57,347  INFO Client:58 -
client token: N/A
diagnostics: N/A
ApplicationMaster host: 10.200.48.154
ApplicationMaster RPC port: 0
queue: default
start time: 1500883129115
final status: UNDEFINED
user: rc
15:58:57,348  INFO YarnClientSchedulerBackend:58 - Application application_1500608983535_0025 has started running.
15:58:57,358  INFO Utils:58 - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 47514.
15:58:57,358  INFO NettyBlockTransferService:58 - Server created on 47514
15:58:57,360  INFO BlockManagerMaster:58 - Trying to register BlockManager
15:58:57,363  INFO BlockManagerMasterEndpoint:58 - Registering block manager 10.200.48.112:47514 with 2.4 GB RAM, BlockManagerId(driver, 10.200.48.112, 47514)
15:58:57,366  INFO BlockManagerMaster:58 - Registered BlockManager
15:58:57,585  INFO EventLoggingListener:58 - Logging events to hdfs:///spark-history/application_1500608983535_0025
15:59:07,177  WARN YarnSchedulerBackend$YarnSchedulerEndpoint:70 - Container marked as failed: container_e170_1500608983535_0025_01_000002 on host: dl-rc-optd-ambari-slave-v-test-1.host.dataengine.com. Exit status: 1. Diagnostics: Exception from container-launch.
Container id: container_e170_1500608983535_0025_01_000002
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:576)
at org.apache.hadoop.util.Shell.run(Shell.java:487)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:753)
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:371)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:303)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Shell output: main : command provided 1
main : run as user is rc
main : requested yarn user is rc


Container exited with a non-zero exit code 1
Display stack trace? [yN]
15:59:57,702  WARN TransportChannelHandler:79 - Exception in connection from 10.200.48.155/10.200.48.155:50921
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:192)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:313)
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:242)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:748)
15:59:57,704 ERROR TransportResponseHandler:132 - Still have 1 requests outstanding when connection from 10.200.48.155/10.200.48.155:50921 is closed
15:59:57,706  WARN NettyRpcEndpointRef:91 - Error sending message [message = RequestExecutors(0,0,Map())] in 1 attempts
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:192)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:313)
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:242)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:748)

I am confused about that. Could you please help me?



On Thursday, July 6, 2017 at 4:15:37 PM UTC+8, HadoopMarc wrote:

Readers wanting to run OLAP queries on a real spark-yarn cluster might want to check my recent post:

http://yaaics.blogspot.nl/2017/07/configuring-janusgraph-for-spark-yarn.html

Regards,  Marc