Erratic behavior with Cassandra input and SparkGraphComputer OLAP engine
Samik R <sam...@...>
Hi, I am testing out SparkGraphComputer for OLAP queries, directly reading data from a JG-Cassandra-ES instance. Everything is running on a single VM, and I have built JG on the box by cloning the repo. I am using Hadoop version 2.7.1 with Spark 1.6.1, and Cassandra version 2.1.9 (same as packaged). I am using the properties file mentioned in this SO thread, mostly because the setup matches mine. I initially tried a smaller graph with ~1K nodes and 1.5K edges, and things seemed to work fine. However, when I try OLAP queries on a graph with ~300K nodes, I run into various issues.
gremlin> graph = GraphFactory.open("conf/hadoop-graph/read-cassandra.properties")
org.janusgraph.core.JanusGraphException: Could not execute operation due to backend exception
Caused by: org.janusgraph.diskstorage.PermanentBackendException: Permanent failure in storage backend
Caused by: org.apache.thrift.transport.TTransportException
...
gremlin> g.V().count()
gremlin> g.V().groupCount().by(T.label)
These all seem pretty erratic to me. Any suggestions on getting consistent results with this?
Regards.
-Samik
Samik R <sam...@...>
Sorry, the SO thread response from which I have taken the properties is this one: http://stackoverflow.com/a/40180104/194742, not the one mentioned in the question. They are similar though.
Thanks.
-Samik
Samik Raychaudhuri <sam...@...>
This was seemingly happening because I was
running out of memory on the VM. I am consistently getting results
after ensuring enough memory is available.
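
[Editor's sketch] For readers who hit the same symptom, the fragment below shows the kind of memory-related settings that could be reviewed in the Hadoop graph properties file. It is only an illustration under stated assumptions: the thread does not say which settings, if any, were changed, and the values here are hypothetical. `spark.master` and `spark.executor.memory` are standard Spark properties. If Spark runs in local mode, the job executes inside the Gremlin Console JVM, so the heap given to that JVM and the free memory on the VM are likely to matter more than the executor setting.

```properties
# Hypothetical values, not taken from the thread; size them to the
# memory that is actually free on the VM.

# Fewer local worker threads means fewer partitions held in memory at once.
spark.master=local[2]

# Heap per executor; mainly relevant when Spark is not running in local mode.
spark.executor.memory=2g
```

Lowering the thread count trades speed for a smaller peak footprint, which may be the safer choice on a single VM that also hosts Cassandra and Elasticsearch.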
Thanks. -Samik On 03-Mar-17 11:27 PM, Samik R wrote:
HadoopMarc <m.c.d...@...>
Hi Samik,
Thank you too for posting the solution.
Marc
On Tuesday, 7 March 2017 04:37:32 UTC+1, Samik R wrote: