Not able to run queries using SparkGraphComputer from Java
Sai Supraj R
Hi,
I am getting the following error when running queries using SparkGraphComputer from Java.
Caused by: java.lang.RuntimeException: java.lang.IllegalArgumentException: Edge with id already exists: 1469152598528
at org.janusgraph.hadoop.formats.util.JanusGraphVertexDeserializer.readHadoopVertex(JanusGraphVertexDeserializer.java:182)
at org.janusgraph.hadoop.formats.util.HadoopRecordReader.nextKeyValue(HadoopRecordReader.java:69)
at org.apache.spark.rdd.NewHadoopRDD$$anon$1.hasNext(NewHadoopRDD.scala:230)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at org.apache.spark.shuffle.sort.UnsafeShuffleWriter.write(UnsafeShuffleWriter.java:187)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
at org.apache.spark.scheduler.Task.run(Task.scala:121)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
... 3 more
code:
Graph graph = JanusGraphFactory.open("read-cql.properties");
GraphTraversalSource g = createTraversal();
long x = g.V().count().next(); // count() only builds the traversal; next() executes it
read-cql.properties:
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.janusgraph.hadoop.formats.cql.CqlInputFormat
gremlin.hadoop.graphWriter=org.apache.hadoop.mapreduce.lib.output.NullOutputFormat
gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.inputLocation=none
gremlin.hadoop.outputLocation=output
gremlin.spark.persistContext=true

janusgraphmr.ioformat.conf.storage.backend=cql
# This specifies the hostname & port for the Cassandra data store.
janusgraphmr.ioformat.conf.storage.hostname=10.88.68.52,10.88.68.11,10.88.68.47
janusgraphmr.ioformat.conf.storage.port=9042
# This specifies the keyspace where data is stored.
janusgraphmr.ioformat.conf.storage.cql.keyspace=iqvia

cassandra.input.partitioner.class=org.apache.cassandra.dht.Murmur3Partitioner
cassandra.input.widerows=true

spark.master=local[*]
spark.executor.memory=1g
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=org.janusgraph.hadoop.serialize.JanusGraphKryoRegistrator
Thanks
Sai
Sai Supraj R
Hi Marc,
Sorry, my bad, I posted the wrong code.
I used Graph graph = GraphFactory.open("read-cql.properties");
and I got the above error.
Thanks
Sai
On Thu, May 6, 2021 at 10:11 AM <hadoopmarc@...> wrote:
Hi Sai,
The calling code you present is not complete.
The first line should read (because HadoopGraph does not derive from JanusGraph):
Graph graph = GraphFactory.open("read-cql.properties");
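Put together, a minimal sketch of the calling code could look as follows (this assumes your createTraversal() is meant to bind the traversal to SparkGraphComputer; adjust if it does something else):

import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer;
import org.apache.tinkerpop.gremlin.structure.Graph;
import org.apache.tinkerpop.gremlin.structure.util.GraphFactory;

// HadoopGraph must be opened with the generic GraphFactory, not JanusGraphFactory.
Graph graph = GraphFactory.open("read-cql.properties");
// Bind the traversal source to SparkGraphComputer for OLAP execution.
GraphTraversalSource g = graph.traversal().withComputer(SparkGraphComputer.class);
long x = g.V().count().next();

Best wishes, Marc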
Sai Supraj R
Hi Marc,
I got this when querying using OLTP:
gremlin> g.V(1469152598528)
==>v[1469152598528]
gremlin> g.V(1469152598528).elementMap()
==>[id:1469152598528,label:vertex]
I am also trying to run SparkGraphComputer with YARN on EMR.
Spark version = 2.4.4
Scala version = 2.12.10
java.io.FileNotFoundException: File file:/home/hadoop/.sparkStaging/application_1618505307369/__spark_libs__910446852825.zip does not exist
at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:671)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:992)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:661)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:464)
at org.apache.hadoop.yarn.util.FSDownload.verifyAndCopy(FSDownload.java:269)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:67)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:414)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:411)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:411)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(ContainerLocalizer.java:243)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:236)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:224)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
I followed this blog but ended up with the above exception:
Thanks
Sai
On Fri, May 7, 2021 at 7:33 AM <hadoopmarc@...> wrote:
Hi Sai,
What happens in createTraversal()?
What do you get with g.V(1469152598528).elementMap() if you open the graph for OLTP queries?
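For reference, opening the graph for OLTP could look like the sketch below (connection settings copied from your read-cql.properties; treat them as placeholders for your own values):

import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphFactory;

// Open JanusGraph directly against Cassandra; no HadoopGraph or Spark involved.
JanusGraph graph = JanusGraphFactory.build()
        .set("storage.backend", "cql")
        .set("storage.hostname", "10.88.68.52,10.88.68.11,10.88.68.47")
        .set("storage.cql.keyspace", "iqvia")
        .open();
GraphTraversalSource g = graph.traversal();
// Inspect the vertex that the OLAP job reported as a duplicate edge source.
System.out.println(g.V(1469152598528L).elementMap().next());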
Best wishes, Marc
hadoopmarc@...
Hi Sai,
The blog you mentioned is a bit outdated and is for Spark 1.x. To get an idea of what changes are needed to get OLAP running with Spark 2.x, you can take a look at:
https://tinkerpop.apache.org/docs/current/recipes/#olap-spark-yarn
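The FileNotFoundException above usually means the YARN containers try to fetch the Spark libraries from one worker's local filesystem instead of a shared location. As a rough illustration only (these are standard Spark-on-YARN property names, but the archive path is a placeholder you would have to stage yourself), the spark section of read-cql.properties would change along these lines:

# Run on YARN instead of a local Spark master.
spark.master=yarn
spark.submit.deployMode=client
# Ship the Spark jars from a location every container can reach (e.g. HDFS),
# so YARN does not localize them from a single worker's local filesystem.
spark.yarn.archive=hdfs:///user/hadoop/spark-archive.zip
spark.executor.memory=1g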
Best wishes, Marc