Not able to run queries using spark graph computer from java
Sai Supraj R
Hi,
I am getting the following error when running queries using SparkGraphComputer from Java:

Caused by: java.lang.RuntimeException: java.lang.IllegalArgumentException: Edge with id already exists: 1469152598528
at org.janusgraph.hadoop.formats.util.JanusGraphVertexDeserializer.readHadoopVertex(JanusGraphVertexDeserializer.java:182)
at org.janusgraph.hadoop.formats.util.HadoopRecordReader.nextKeyValue(HadoopRecordReader.java:69)
at org.apache.spark.rdd.NewHadoopRDD$$anon$1.hasNext(NewHadoopRDD.scala:230)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at org.apache.spark.shuffle.sort.UnsafeShuffleWriter.write(UnsafeShuffleWriter.java:187)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
at org.apache.spark.scheduler.Task.run(Task.scala:121)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
... 3 more

Code:

Graph graph = JanusGraphFactory.open("read-cql.properties");

read-cql.properties:

gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.janusgraph.hadoop.formats.cql.CqlInputFormat
gremlin.hadoop.graphWriter=org.apache.hadoop.mapreduce.lib.output.NullOutputFormat
gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.inputLocation=none
gremlin.hadoop.outputLocation=output
gremlin.spark.persistContext=true
janusgraphmr.ioformat.conf.storage.backend=cql
# This specifies the hostname & port for the Cassandra data store.
janusgraphmr.ioformat.conf.storage.hostname=10.88.68.52,10.88.68.11,10.88.68.47
janusgraphmr.ioformat.conf.storage.port=9042
# This specifies the keyspace where data is stored.
janusgraphmr.ioformat.conf.storage.cql.keyspace=iqvia
cassandra.input.partitioner.class=org.apache.cassandra.dht.Murmur3Partitioner
cassandra.input.widerows=true
spark.master=local[*]
spark.executor.memory=1g
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=org.janusgraph.hadoop.serialize.JanusGraphKryoRegistrator

Thanks
Sai
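For reference, a minimal sketch of the calling side for an OLAP read over this properties file might look as follows (a sketch only; it assumes the read-cql.properties above and requires a running Cassandra and Spark environment, so it is not runnable standalone):

```java
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer;
import org.apache.tinkerpop.gremlin.structure.Graph;
import org.apache.tinkerpop.gremlin.structure.util.GraphFactory;

public class OlapRead {
    public static void main(String[] args) throws Exception {
        // GraphFactory (not JanusGraphFactory) is needed here, because
        // gremlin.graph in the properties file points at HadoopGraph.
        Graph graph = GraphFactory.open("read-cql.properties");

        // Route all traversals through SparkGraphComputer (OLAP mode).
        GraphTraversalSource g = graph.traversal().withComputer(SparkGraphComputer.class);

        // A simple full-scan query; the deserialization error above
        // surfaces while Spark reads the input splits for queries like this.
        System.out.println(g.V().count().next());

        graph.close();
    }
}
```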
hadoopmarc@...
Hi Sai,
The calling code you present is not complete. The first line should read (because HadoopGraph does not derive from JanusGraph):

Graph graph = GraphFactory.open("read-cql.properties");

Best wishes, Marc
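To make the distinction concrete, here is a small sketch contrasting the two factory classes (the janusgraph-cql.properties file name is hypothetical, and both calls need a reachable storage backend to actually run):

```java
import org.apache.tinkerpop.gremlin.structure.Graph;
import org.apache.tinkerpop.gremlin.structure.util.GraphFactory;
import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphFactory;

public class OpenGraphs {
    public static void main(String[] args) {
        // read-cql.properties declares gremlin.graph=HadoopGraph, so it must
        // go through TinkerPop's generic factory, which returns a Graph:
        Graph olapGraph = GraphFactory.open("read-cql.properties");

        // JanusGraphFactory is only for direct (OLTP) JanusGraph connections,
        // e.g. a plain CQL configuration (file name hypothetical):
        JanusGraph oltpGraph = JanusGraphFactory.open("janusgraph-cql.properties");
    }
}
```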
Sai Supraj R
Hi Marc,

Sorry, my mistake: I posted the wrong code. I did use

Graph graph = GraphFactory.open("read-cql.properties");

and I still got the above error.

Thanks
Sai
hadoopmarc@...
Hi Sai,
What happens in createTraversal()? What do you get with g.V(1469152598528).elementMap() if you open the graph for OLTP queries?

Best wishes, Marc
Sai Supraj R
Hi Marc,

I got this when querying using OLTP:

gremlin> g.V(1469152598528)
==>v[1469152598528]
gremlin> g.V(1469152598528).elementMap()
==>[id:1469152598528,label:vertex]

I am also trying to run SparkGraphComputer with YARN on EMR.

Spark version = 2.4.4
Scala version = 2.12.10

java.io.FileNotFoundException: File file:/home/hadoop/.sparkStaging/application_1618505307369/__spark_libs__910446852825.zip does not exist
    at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:671)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:992)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:661)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:464)
    at org.apache.hadoop.yarn.util.FSDownload.verifyAndCopy(FSDownload.java:269)
    at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:67)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:414)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:411)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:411)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(ContainerLocalizer.java:243)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:236)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:224)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

I followed this blog but ended up with the above exception.

Thanks
Sai
hadoopmarc@...
Hi Sai,
The blog you mentioned is a bit outdated and is for Spark 1.x. To get an idea of what changes are needed to get OLAP running with Spark 2.x, you can take a look at: https://tinkerpop.apache.org/docs/current/recipes/#olap-spark-yarn

Best wishes, Marc
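As a rough illustration of the direction the recipe takes (property names are standard Spark-on-YARN settings; the archive path is a placeholder for your cluster, so treat this as a sketch, not a working config):

```properties
# Replace spark.master=local[*] with YARN submission:
spark.master=yarn
spark.submit.deployMode=client
# Stage the Spark jars on HDFS so every NodeManager can localize them.
# A FileNotFoundException on a file:/.../__spark_libs__*.zip path typically
# means the containers are looking for the staged archive on a local path
# that only exists on the driver machine.
spark.yarn.archive=hdfs:///user/hadoop/spark-libs.zip
```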