Unknown compressor type with SparkGraphComputer


Ajay Srivastava <Ajay.Sr...@...>
 

Hi,

I am executing a Gremlin query using SparkGraphComputer and get the following exception:

gremlin> :remote connect tinkerpop.server conf/remote.yaml
==>Configured dev-3/192.101.167.171:8182

gremlin> :> olapgraph.traversal().withComputer(org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer).V().count()

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.IllegalArgumentException: Unknown compressor type for id: 2
at org.janusgraph.graphdb.database.serialize.attribute.StringSerializer$CompressionType.getFromId(StringSerializer.java:273)
at org.janusgraph.graphdb.database.serialize.attribute.StringSerializer.read(StringSerializer.java:104)
at org.janusgraph.graphdb.database.serialize.attribute.StringSerializer.read(StringSerializer.java:38)
at org.janusgraph.graphdb.database.serialize.StandardSerializer.readObjectInternal(StandardSerializer.java:236)
at org.janusgraph.graphdb.database.serialize.StandardSerializer.readObject(StandardSerializer.java:224)
at org.janusgraph.graphdb.database.EdgeSerializer.readPropertyValue(EdgeSerializer.java:203)
at org.janusgraph.graphdb.database.EdgeSerializer.readPropertyValue(EdgeSerializer.java:193)
at org.janusgraph.graphdb.database.EdgeSerializer.parseRelation(EdgeSerializer.java:129)
at org.janusgraph.hadoop.formats.util.JanusGraphVertexDeserializer.readHadoopVertex(JanusGraphVertexDeserializer.java:100)
at org.janusgraph.hadoop.formats.util.GiraphRecordReader.nextKeyValue(GiraphRecordReader.java:60)
at org.apache.spark.rdd.NewHadoopRDD$$anon$1.hasNext(NewHadoopRDD.scala:168)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at scala.collection.convert.Wrappers$IteratorWrapper.hasNext(Wrappers.scala:29)
at org.apache.tinkerpop.gremlin.util.iterator.IteratorUtils$4.advance(IteratorUtils.java:298)
at org.apache.tinkerpop.gremlin.util.iterator.IteratorUtils$4.hasNext(IteratorUtils.java:269)
at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:126)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

Here is my olapgraph configuration -

# Hadoop Graph Configuration
#
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphInputFormat=org.janusgraph.hadoop.formats.hbase.HBaseInputFormat
gremlin.hadoop.graphOutputFormat=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat

gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.inputLocation=none
gremlin.hadoop.outputLocation=output

#
# JanusGraph HBase InputFormat configuration
#
janusgraphmr.ioformat.conf.storage.backend=hbase
janusgraphmr.ioformat.conf.storage.hostname=dev-1
janusgraphmr.ioformat.conf.storage.hbase.table=tryJanus

#
# SparkGraphComputer Configuration
#
spark.master=local[4]
spark.executor.memory=1g
spark.serializer=org.apache.spark.serializer.KryoSerializer
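
For reference, the same count can also be run by opening this properties file directly in the Gremlin Console, bypassing the server (a sketch, assuming the file is saved as conf/hadoop-graph/read-hbase.properties):

gremlin> graph = GraphFactory.open('conf/hadoop-graph/read-hbase.properties')
gremlin> g = graph.traversal().withComputer(org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer)
gremlin> g.V().count()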

Is my configuration correct?
Have I missed setting any property here?

Regards,
Ajay


Ajay Srivastava <Ajay.Sr...@...>
 

The compression on all of the column families is set to gz, which is the second entry (id = 1) in the enum of both HBase and JanusGraph, so there should not be any problem running this query.
This query and others work fine in embedded mode and through Gremlin's OLTP graph. Is StringSerializer called only when SparkGraphComputer is used?
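
For reference, here is a minimal sketch of the kind of id-to-enum lookup that is failing (my own simplification in Groovy, not the actual JanusGraph source):

enum CompressionType { NO_COMPRESSION, GZIP }   // ids 0 and 1

def getFromId(int id) {
    def types = CompressionType.values()
    if (id < 0 || id >= types.length)
        throw new IllegalArgumentException('Unknown compressor type for id: ' + id)
    types[id]
}

assert getFromId(1) == CompressionType.GZIP
getFromId(2)   // throws: Unknown compressor type for id: 2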

I have also run a major compaction twice in HBase, but the problem remains.

Is this a bug?
Is anyone already running SparkGraphComputer with HBase as the backend without any problems?


Regards,
Ajay





marc.d...@...
 

Hi Ajay,

Never tried this myself. Could you post your remote.yaml and gremlin-server.yaml files too (if they deviate from the ones in the distribution)? That way others can more easily recognize serialization config problems.
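
For example, a client-side serializer entry that registers the JanusGraph IoRegistry would look something like the line below (just an illustration, not taken from your setup):

serializer: { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}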

Cheers,    Marc



Ajay Srivastava <Ajay.Sr...@...>
 

Hi Marc,

gremlin-server.yaml:

host: dev-3
port: 8182
scriptEvaluationTimeout: 30000
channelizer: org.apache.tinkerpop.gremlin.server.channel.WebSocketChannelizer
graphs: {
  graph: conf/gremlin-server/socket-janusgraph-hbase-server.properties,
  olapgraph: conf/hadoop-graph/read-hbase.properties }
plugins:
  - janusgraph.imports
scriptEngines: {
  gremlin-groovy: {
    imports: [java.lang.Math],
    staticImports: [java.lang.Math.PI],
    scripts: [scripts/empty-sample.groovy]}}
serializers:
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoLiteMessageSerializerV1d0, config: {ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV2d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
processors:
  - { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }}
  - { className: org.apache.tinkerpop.gremlin.server.op.traversal.TraversalOpProcessor, config: { cacheExpirationTime: 600000, cacheMaxSize: 1000 }}
metrics: {
  consoleReporter: {enabled: true, interval: 180000},
  csvReporter: {enabled: true, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv},
  jmxReporter: {enabled: true},
  slf4jReporter: {enabled: true, interval: 180000},
  gangliaReporter: {enabled: false, interval: 180000, addressingMode: MULTICAST},
  graphiteReporter: {enabled: false, interval: 180000}}
maxInitialLineLength: 4096
maxHeaderSize: 8192
maxChunkSize: 8192
maxContentLength: 65536
maxAccumulationBufferComponents: 1024
resultIterationBatchSize: 64
writeBufferLowWaterMark: 32768
writeBufferHighWaterMark: 65536
ssl: {
  enabled: false}

remote.yaml:

hosts: [dev-3]
port: 8182
serializer: { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}


Regards,
Ajay





marc.d...@...
 

Hi Ajay,

I have no idea what is happening here, but since you are using the Gremlin Console, you could try remote-object.yaml as an alternative.
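
Something like this (a sketch, assuming the conf/remote-objects.yaml that ships with the distribution; adjust the hosts and port as in your current remote.yaml):

gremlin> :remote connect tinkerpop.server conf/remote-objects.yaml
gremlin> :> olapgraph.traversal().withComputer(org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer).V().count()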

Anyone else?

Marc


