Erratic behavior with Cassandra input and SparkGraphComputer OLAP engine


Samik R <sam...@...>
 

Hi,

I am testing out SparkGraphComputer for OLAP queries, reading data directly from a JG-Cassandra-ES instance. Everything is running on a single VM, and I built JG on the box by cloning the repo. I am using Hadoop 2.7.1 with Spark 1.6.1, and Cassandra 2.1.9 (same as packaged).

I am using the properties file mentioned in this SO thread, mostly because the setup matches mine. I initially tried a smaller graph with ~1K nodes and 1.5K edges, and things seemed to work fine. However, when I try OLAP queries with ~300K nodes, I run into various issues.

  • Initially, I hit this exception: "java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: Frame size (20784689) larger than max length (15728640)!". After some research, I added the following line to the properties file: cassandra.thrift.framed.size_mb=200
  • On the next try, the Cassandra process died when I ran the query. The Gremlin Server and ES processes were still running, though.

gremlin> graph = GraphFactory.open("conf/hadoop-graph/read-cassandra.properties")
==>hadoopgraph[cassandrainputformat->gryooutputformat]
gremlin> g = graph.traversal().withComputer(SparkGraphComputer)
==>graphtraversalsource[hadoopgraph[cassandrainputformat->gryooutputformat], sparkgraphcomputer]
gremlin> g.V().count()
[Stage 0:===21:16:23 ERROR org.apache.spark.executor.Executor  - Exception in task 4.0 in stage 0.0 (TID 4)
java.lang.RuntimeException: org.apache.thrift.transport.TTransportException
...

org.janusgraph.core.JanusGraphException: Could not execute operation due to backend exception
...

Caused by: org.janusgraph.diskstorage.PermanentBackendException: Permanent failure in storage backend
...

Caused by: org.apache.thrift.transport.TTransportException

...


  • I restarted janusgraph and retried the same query. This time the query went through, but the same exception reappeared when I tried a groupCount.

gremlin> g.V().count()
==>108156
gremlin> g.V().groupCount().by(T.label)
[Stage 0:>                           21:23:49 ERROR org.apache.spark.executor.Executor  - Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.RuntimeException: org.apache.thrift.transport.TTransportException


  • Another restart, and the groupCount() query went through, but the gremlin shell got killed when I tried the count query. All three daemons (gremlin, Cassandra and ES) were still running though.

gremlin> g.V().groupCount().by(T.label)
[Stage 2:>                                                          (0 +==>[hotLead:1,proactiveChatInvite:1,chatSession:906,webPage:56921,buttonChatInvite:1,webPurchase:1,visitor:1269,webSession:27378,device:21677,cart:1]
gremlin> g.V().count()
[Stage 0:>                           Killed                                    
samik@samik-lap:~/git/janusgraph$ Write failed: Broken pipe


This all seems pretty erratic to me. Any suggestions on getting consistent results?


Regards.

-Samik
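
For readers who hit the same frame-size error, the numbers in the exception message can be checked with a little arithmetic. The sketch below is illustrative only: it assumes the limit is counted in units of 1024 * 1024 bytes (the reported maximum, 15728640, is exactly 15 such units), and the helper names to_mib and min_framed_size_mb are made up for this example. Under that assumption the rejected frame was about 19.8 MB, so the value of 200 used above leaves plenty of headroom.

```python
# Values copied from the TTransportException message quoted in the post.
FRAME_SIZE_BYTES = 20_784_689   # size of the frame that was rejected
MAX_LENGTH_BYTES = 15_728_640   # limit in effect when the error occurred

MIB = 1024 * 1024  # assumption: the *_mb setting counts units of 1024*1024 bytes


def to_mib(n_bytes):
    """Convert a byte count to MiB as a float."""
    return n_bytes / MIB


def min_framed_size_mb(n_bytes):
    """Smallest whole number of MiB that can hold n_bytes (ceiling division)."""
    return -(-n_bytes // MIB)


if __name__ == "__main__":
    print(f"limit in effect : {to_mib(MAX_LENGTH_BYTES):.2f} MiB")
    print(f"rejected frame  : {to_mib(FRAME_SIZE_BYTES):.2f} MiB")
    print(f"minimum setting : {min_framed_size_mb(FRAME_SIZE_BYTES)} MiB")
```

Larger partitions would need a larger value, so the minimum computed here applies only to the one frame reported in this error.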


Samik R <sam...@...>
 

Sorry, the SO thread response from which I have taken the properties is this one: http://stackoverflow.com/a/40180104/194742, not the one mentioned in the question. They are similar though.
Thanks.
-Samik


Samik Raychaudhuri <sam...@...>
 

This was seemingly happening because I was running out of memory on the VM. I am consistently getting results after ensuring enough memory is available.
Thanks.
-Samik

On 03-Mar-17 11:27 PM, Samik R wrote:

Sorry, the SO thread response from which I have taken the properties is this one: http://stackoverflow.com/a/40180104/194742, not the one mentioned in the question. They are similar though.
Thanks.
-Samik

--
You received this message because you are subscribed to the Google Groups "JanusGraph users list" group.
To unsubscribe from this group and stop receiving emails from it, send an email to janusgra...@....
For more options, visit https://groups.google.com/d/optout.
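
Since the fix was to make sure the VM had enough free memory, a quick pre-flight check before starting an OLAP job may help. The following is a minimal sketch, not something from the thread: it assumes a Linux VM where /proc/meminfo exposes a MemAvailable line, and the 4096 MiB threshold and the function names are hypothetical placeholders to be replaced with whatever the job actually needs.

```python
def mem_available_mib(meminfo_text):
    """Return the MemAvailable value from /proc/meminfo-style text in MiB, or None."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            # Line format on Linux: "MemAvailable:    1234567 kB"
            return int(line.split()[1]) / 1024
    return None


def enough_memory(meminfo_text, required_mib):
    """True only if MemAvailable is present and at least required_mib."""
    available = mem_available_mib(meminfo_text)
    return available is not None and available >= required_mib


if __name__ == "__main__":
    REQUIRED_MIB = 4096  # hypothetical threshold, not taken from the thread
    try:
        with open("/proc/meminfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not Linux, or /proc is unavailable
    if enough_memory(text, REQUIRED_MIB):
        print("Enough memory available; go ahead with the OLAP query.")
    else:
        print("Memory looks tight (or could not be read); free some up first.")
```

The parsing is kept separate from the file read so the check can be exercised on sample text without touching the real system.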


HadoopMarc <m.c.d...@...>
 

Hi Samik,

Thank you too for posting the solution.

Marc

On Tuesday, 7 March 2017 at 04:37:32 UTC+1, Samik R wrote:

This was seemingly happening because I was running out of memory on the VM. I am consistently getting results after ensuring enough memory is available.
Thanks.
-Samik
