Re: Strange behaviors for Janusgraph 0.5.3 on AWS EMR
hadoopmarc@...
Hi Alessandro,
The Executors tab of the Spark UI shows the product of spark.executor.instances and spark.executor.cores. I guess spark.executor.instances defaults to one, and EMR might limit the number of executor cores?
It also won't hurt to explicitly specify
spark.submit.deployMode=client
assuming EMR allows it. I am not sure whether Gremlin Console needs client mode to have the count results returned. And with a "zero" result in the Gremlin Console, did you mean 0 or just ==> ?
To have output written to "output", you have to configure the distributed storage so that "output" is a path on HDFS. Each executor writes its output to a partition on the distributed storage, so you would have 768 partitions in the output directory. Be aware that TinkerPop uses somewhat unusual naming in the output directory.
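As a rough sketch, the settings discussed here could sit together in the properties file that the HadoopGraph is opened with. The executor numbers below are placeholders, not recommendations, and I am assuming spark.master=yarn on EMR and that HDFS is the default file system, so that a relative "output" resolves to a path on HDFS:

```
# Sketch only: the values are assumptions, adjust them to your cluster
spark.master=yarn
spark.submit.deployMode=client
# Set both explicitly instead of relying on defaults (placeholder numbers)
spark.executor.instances=4
spark.executor.cores=4
# TinkerPop output directory; relative to the default file system (assumed HDFS)
gremlin.hadoop.outputLocation=output
```

With these placeholder numbers, the Executors tab should show 4 x 4 = 16 cores if EMR honours both settings.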
Best wishes, Marc