
Re: A few questions about JanusGraph.

Jason Plurad <plu...@...>
 

Sounds like the documentation could use some improvements to help make this more clear. I've opened up an issue to track it.

1) What is the relation between Gremlin server (bin/gremlin-server.bat) and the JanusGraph server (bin/janusgraph.sh)?

The pre-packaged distribution of JanusGraph starts an instance of Cassandra, Elasticsearch, and Gremlin Server to allow users to get started quickly.

You can start a Gremlin Server manually with bin/gremlin-server.sh.

2) Properties in janusgraph-cassandra-es-server.properties vs. JanusGraphFactory.build().set()...open()

If you want to connect to the same graph that the Gremlin Server has defined, yes, you should use the same properties. Using a properties file makes this easier to reuse, e.g. JanusGraphFactory.open("conf/gremlin-server/janusgraph-cassandra-es-server.properties"), but if you call JanusGraphFactory.build().set()...open() with the same properties, you'll be connecting to the same graph.
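To illustrate, here is a minimal Java sketch of the two equivalent approaches (the builder values are assumed to match the server's properties file):

import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphFactory;

// Option 1: reuse the server's properties file directly
JanusGraph g1 = JanusGraphFactory.open("conf/gremlin-server/janusgraph-cassandra-es-server.properties");

// Option 2: set the same properties programmatically
JanusGraph g2 = JanusGraphFactory.build().
    set("storage.backend", "cassandra").
    set("storage.hostname", "localhost").
    open();

As long as the storage settings agree, both handles point at the same underlying graph.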

3) In the above API, I haven't specified the JanusGraph server endpoint (the URL or the port), so which server is my Java code connecting to?

Your code is connecting to the Cassandra server. When you configure a graph using JanusGraphFactory.open(), your application is creating an embedded graph instance. It is not connecting to the graph instance running on the JanusGraph server. The graph data is ultimately stored in Cassandra, so both the JanusGraph Server and your application are working with the same graph data.

That being said, you could connect to the graph instance on the Gremlin Server using a remote connection as described in the TinkerPop docs.
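For example, a rough sketch using TinkerPop's gremlin-driver (the host and port are assumptions matching the default Gremlin Server configuration):

import org.apache.tinkerpop.gremlin.driver.Client;
import org.apache.tinkerpop.gremlin.driver.Cluster;
import org.apache.tinkerpop.gremlin.driver.ResultSet;

Cluster cluster = Cluster.build("localhost").port(8182).create();
Client client = cluster.connect();
// the script is evaluated on the server, against the graph the server hosts
ResultSet results = client.submit("g.V().has('name', 'manoj').count()");
System.out.println(results.one().getLong());
cluster.close();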

4) Does Java API use websockets, and can JanusGraph server run on a different machine (right now, my Cassandra and gremlin server run on the same machine)?

In the scenario where you have an embedded graph instance, your calls to the graph are not using WebSockets. Your application is communicating directly with the Cassandra storage backend using Thrift. A Gremlin Server can run on a different machine than the storage backend. The janusgraph-cassandra-es-server.properties file lets the Gremlin Server know where to find the storage backend (see the storage.hostname property).
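For example, if Cassandra lived on a separate machine, the server's properties file would simply point at it (the hostname below is made up):

storage.backend=cassandra
# Cassandra running on a different host than the Gremlin Server
storage.hostname=192.168.1.50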

5) Is Java API the same as Gremlin language / API?

JanusGraph implements the Apache TinkerPop APIs, including Gremlin. When you are doing graph traversals, you are dealing with TinkerPop's Gremlin language -- e.g. g.V().has("name", "manoj").toList(). The schema and index APIs are specific to JanusGraph because these are not provided by the TinkerPop abstraction.
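A small Java sketch of the distinction, assuming an open graph and its traversal source g (illustrative only):

import java.util.List;
import org.apache.tinkerpop.gremlin.structure.Vertex;
import org.janusgraph.core.schema.JanusGraphManagement;

// TinkerPop's Gremlin traversal API -- portable across TinkerPop-enabled graphs
List<Vertex> found = g.V().has("name", "manoj").toList();

// JanusGraph-specific schema API -- not part of the TinkerPop abstraction
JanusGraphManagement mgmt = graph.openManagement();
mgmt.makePropertyKey("name").dataType(String.class).make();
mgmt.commit();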

6) Where is the documentation / examples for REST API (for adding / querying vertices, edges)?

JanusGraph doesn't have much for that at the moment. Gremlin Server can be configured to support an HTTP endpoint which evaluates arbitrary Gremlin. It doesn't expose specific endpoints for /vertices or /edges, but you can do all of that and more with the Gremlin endpoint.
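For example, once the server's YAML is switched to the HttpChannelizer, any Gremlin can be posted to it (the query is just an example):

curl -X POST -d '{"gremlin": "g.V().count()"}' http://localhost:8182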

7) How can one achieve graph namespacing?

If they are completely separate graphs, creating separate keyspaces works great. You could host them all within the same graph by making sure that the labels and property names don't overlap. You could also consider TinkerPop's PartitionStrategy.
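A rough sketch of TinkerPop's PartitionStrategy (the partition key and names are made up):

import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.process.traversal.strategy.decoration.PartitionStrategy;

PartitionStrategy employeesOnly = PartitionStrategy.build().
    partitionKey("_partition").
    writePartition("employees").
    readPartitions("employees").
    create();
// this traversal source reads and writes only the "employees" slice of the graph
GraphTraversalSource gEmployees = graph.traversal().withStrategies(employeesOnly);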

8) If the graphs have to be stored in different Cassandra keyspaces, how can I connect to these different graphs / keyspaces from the same Java application?

Create a separate graph instance for each keyspace using storage.cassandra.keyspace in the configuration. You can define multiple graphs in the gremlin-server.yaml configuration with different properties files. Similarly, you can connect to multiple graph instances from your application.
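For example, a Java sketch with illustrative keyspace names:

import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphFactory;

JanusGraph employees = JanusGraphFactory.build().
    set("storage.backend", "cassandra").
    set("storage.hostname", "localhost").
    set("storage.cassandra.keyspace", "employees").
    open();

JanusGraph vehicles = JanusGraphFactory.build().
    set("storage.backend", "cassandra").
    set("storage.hostname", "localhost").
    set("storage.cassandra.keyspace", "vehicles").
    open();

Each factory call returns an independent graph instance backed by its own keyspace.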


On Tuesday, August 8, 2017 at 11:54:15 AM UTC-4, Manoj Waikar wrote:
Hi,

I have read the JanusGraph documentation and the GraphOfTheGodsFactory.java file, and I also have a small sample running. However, I am still not clear about the following doubts related to JanusGraph -

1) What is the relation between Gremlin server (bin/gremlin-server.bat) and the JanusGraph server (bin/janusgraph.sh)?

2) I've specified my Cassandra-related configuration values in the conf/gremlin-server/janusgraph-cassandra-es-server.properties file, and this file is used when running the gremlin server. While using the Java API (from Scala), I do the following -

val graph: JanusGraph = JanusGraphFactory.build().
  set("storage.backend", "cassandra").
  set("storage.hostname", "localhost").
  set("storage.cassandra.keyspace", "MyJG").
  set("storage.username", "username").
  set("storage.password", "password").
  open()

Should I be using the same (conf/gremlin-server/janusgraph-cassandra-es-server.properties) file which I use to start the gremlin server from my Java code?

3) In the above API, I haven't specified the JanusGraph server endpoint (the URL or the port), so which server is my Java code connecting to?

4) Does Java API use websockets, and can JanusGraph server run on a different machine (right now, my Cassandra and gremlin server run on the same machine)?

5) Is Java API the same as Gremlin language / API?

6) Where is the documentation / examples for REST API (for adding / querying vertices, edges)?

7) How can one achieve graph namespacing? So, for example, if I have to create three different graphs for employees, vehicles and cities, how can I segregate the data for these three graphs? Can I give a name / id to the graph? Or do these graphs have to be stored in different Cassandra keyspaces?

8) If the graphs have to be stored in different Cassandra keyspaces, how can I connect to these different graphs / keyspaces from the same Java application?

Thanks in advance for the help.


Re: Do We Need Specialized Graph Databases? Benchmarking Real-Time Social Networking Applications

Raymond Canzanese <r...@...>
 

Looking forward to reading about your colleagues' findings, Jason. Not using indices would certainly at least partially explain the poor performance, given the types of queries they were making.


On Monday, August 7, 2017 at 1:44:33 PM UTC-4, Stephen Mallette wrote:
It did use parameters. They basically forked Jonathan Ellithorpe's work and converted all the embedded Gremlin to strings.


Not sure how much they modified the Gremlin statements from the Ellithorpe repo. I stopped digging into it once I didn't see vertex-centric indices defined, among other data modelling choices I probably wouldn't have taken. LDBC is "complex" in the sense that it takes time to dig into - it hasn't really been a priority for me.

I'm not sure why Gremlin Server got smacked around so badly in what they did. I couldn't find anything about how it was set up at all. They used TinkerPop 3.2.3 for their work - there have been a lot of enhancements since then in relation to memory management, so perhaps newer versions would have fared better in their tests. Again, hard to say what could/would have happened without spending a decent amount of time on it.

> then we'll be able to see what improvements can be made to the benchmark itself or within TinkerPop and JanusGraph

very cool, jason. glad your colleagues could spend some time on that. it would be nice to hear what they find. 



On Mon, Aug 7, 2017 at 1:05 PM, Jason Plurad <p...@...> wrote:
This blew up on Twitter last month: https://twitter.com/adriancolyer/status/883226836561518594

The testing setup was less than ideal for Titan. Cassandra isn't really meant for a single-node install.

The paper picked on Gremlin Server, but it didn't disclose anything about the server configuration. Some of the latency for the Gremlin Server-based runs could have been because they weren't using parameterized script bindings. Using the Gremlin Server is not a requirement for using Titan at all, and I'm aware of projects that don't even use it.

There's a team in my company that is trying to reproduce the results in that paper, then we'll be able to see what improvements can be made to the benchmark itself or within TinkerPop and JanusGraph.



On Thursday, August 3, 2017 at 2:05:23 PM UTC-4, Raymond Canzanese wrote:
Has everyone seen this article out of the University of Waterloo, which concludes TinkerPop 3 to be not ready for prime time?

Do We Need Specialized Graph Databases? Benchmarking Real-Time Social Networking Applications
Anil Pacaci, Alice Zhou, Jimmy Lin, and M. Tamer Özsu
10.1145/3078447.3078459

Interested to know what other folks think of this testing setup and set of conclusions.




Creating a gremlin pipeline from an arraylist

Raymond Canzanese <r...@...>
 

I have an arraylist a of edges that I want to make gremlin queries over.  In the old days, I would do: 

a._() 

And have a pipeline I could work with.  Now it seems I can do:

g.inject(a).unfold()

or

g.E(a)

Which of these techniques should I prefer?  Is one of them more efficient than the other?


A few questions about JanusGraph.

Manoj Waikar <mmwa...@...>
 

Hi,

I have read the JanusGraph documentation and the GraphOfTheGodsFactory.java file, and I also have a small sample running. However, I am still not clear about the following doubts related to JanusGraph -

1) What is the relation between Gremlin server (bin/gremlin-server.bat) and the JanusGraph server (bin/janusgraph.sh)?

2) I've specified my Cassandra-related configuration values in the conf/gremlin-server/janusgraph-cassandra-es-server.properties file, and this file is used when running the gremlin server. While using the Java API (from Scala), I do the following -

val graph: JanusGraph = JanusGraphFactory.build().
  set("storage.backend", "cassandra").
  set("storage.hostname", "localhost").
  set("storage.cassandra.keyspace", "MyJG").
  set("storage.username", "username").
  set("storage.password", "password").
  open()

Should I be using the same (conf/gremlin-server/janusgraph-cassandra-es-server.properties) file which I use to start the gremlin server from my Java code?

3) In the above API, I haven't specified the JanusGraph server endpoint (the URL or the port), so which server is my Java code connecting to?

4) Does Java API use websockets, and can JanusGraph server run on a different machine (right now, my Cassandra and gremlin server run on the same machine)?

5) Is Java API the same as Gremlin language / API?

6) Where is the documentation / examples for REST API (for adding / querying vertices, edges)?

7) How can one achieve graph namespacing? So, for example, if I have to create three different graphs for employees, vehicles and cities, how can I segregate the data for these three graphs? Can I give a name / id to the graph? Or do these graphs have to be stored in different Cassandra keyspaces?

8) If the graphs have to be stored in different Cassandra keyspaces, how can I connect to these different graphs / keyspaces from the same Java application?

Thanks in advance for the help.


Re: [BLOG] Configuring JanusGraph for spark-yarn

Joe Obernberger <joseph.o...@...>
 

Hi Marc - thank you very much for your reply.  I like your idea about moving regions manually and will try that.  As to OLAP vs OLTP (I assume Spark vs none), yes I have those times.
For a 1.5G table in HBase the count just using the gremlin shell without using the SparkGraphComputer:

graph = JanusGraphFactory.open('conf/graph.properties')
g=graph.traversal()
g.V().count()

takes just under 1 minute. Using Spark it takes about 2 hours, so something isn't right. They both return 3,842,755 vertices. When I run it with Spark, it hits one of the region servers hard, doing over 30k requests per second for those 2 hours.
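For reference, the Spark-based count was run along these lines in the Gremlin console (a sketch; the hadoop-graph properties path is an assumption):

graph = GraphFactory.open('conf/hadoop-graph/read-hbase.properties')
g = graph.traversal().withComputer(SparkGraphComputer)
g.V().count()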

-Joe


On 8/8/2017 3:17 AM, HadoopMarc wrote:

Hi Joseph,

You ran into terrain I have not yet covered myself. Up till now I have been using the graben1437 PR for Titan, and for OLAP I adopted a poor man's approach where node ids are distributed over Spark tasks and each Spark executor makes its own Titan/HBase connection. This performs well, but does not have the nice abstraction of the HBaseInputFormat.

So, no clear answer to this one, but just some thoughts:
 - could you try to move some regions manually and see what it does to performance?
 - how do your OLAP vertex count times compare to the OLTP count times?
 - how does the sum of spark task execution times compare to the yarn start-to-end time difference you reported? In other words, how much of the start-to-end time is spent in waiting for timeouts?
 - unless you managed to create a vertex with > 1GB size, the RowTooBigException sounds like a bug (which you can report on JanusGraph's github page). HBase does not like large rows at all, so vertex/edge properties should not have blob values.
 
@(David Robinson): do you have any additional thoughts on this?

Cheers,    Marc

On Monday, August 7, 2017 at 11:12:02 PM UTC+2, Joseph Obernberger wrote:

Hi Marc - I've been able to get it to run longer, but am now getting a RowTooBigException from HBase.  How does JanusGraph store data in HBase?  The current max size of a row is 1 GByte, which makes me think this error is covering something else up.

What I'm seeing so far in testing with a 5 server cluster - each machine with 128G of RAM:
HBase table is 1.5G in size, split across 7 regions, and has 20,001,105 rows.  To do a g.V().count() takes 2 hours and results in 3,842,755 vertices.

Another HBase table is 5.7G in size, split across 10 regions, has 57,620,276 rows, and took 6.5 hours to run the count, resulting in 10,859,491 nodes.  When running, it looks like it hits one server very hard even though the YARN tasks are distributed across the cluster.  One HBase node gets hammered.

The RowTooBigException is below.  Anything to try?  Thank you for any help!


org.janusgraph.core.JanusGraphException: Could not process individual retrieval call
                at org.janusgraph.graphdb.query.QueryUtil.processIntersectingRetrievals(QueryUtil.java:257)
                at org.janusgraph.graphdb.transaction.StandardJanusGraphTx$6.execute(StandardJanusGraphTx.java:1269)
                at org.janusgraph.graphdb.transaction.StandardJanusGraphTx$6.execute(StandardJanusGraphTx.java:1137)
                at org.janusgraph.graphdb.query.QueryProcessor$LimitAdjustingIterator.getNewIterator(QueryProcessor.java:209)
                at org.janusgraph.graphdb.query.LimitAdjustingIterator.hasNext(LimitAdjustingIterator.java:75)
                at org.janusgraph.graphdb.query.ResultSetIterator.nextInternal(ResultSetIterator.java:54)
                at org.janusgraph.graphdb.query.ResultSetIterator.next(ResultSetIterator.java:67)
                at org.janusgraph.graphdb.query.ResultSetIterator.next(ResultSetIterator.java:28)
                at com.google.common.collect.Iterators$7.computeNext(Iterators.java:651)
                at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
                at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
                at org.janusgraph.hadoop.formats.util.input.current.JanusGraphHadoopSetupImpl.getTypeInspector(JanusGraphHadoopSetupImpl.java:60)
                at org.janusgraph.hadoop.formats.util.JanusGraphVertexDeserializer.<init>(JanusGraphVertexDeserializer.java:55)
                at org.janusgraph.hadoop.formats.util.GiraphInputFormat.lambda$static$0(GiraphInputFormat.java:49)
                at org.janusgraph.hadoop.formats.util.GiraphInputFormat$RefCountedCloseable.acquire(GiraphInputFormat.java:100)
                at org.janusgraph.hadoop.formats.util.GiraphRecordReader.<init>(GiraphRecordReader.java:47)
                at org.janusgraph.hadoop.formats.util.GiraphInputFormat.createRecordReader(GiraphInputFormat.java:67)
                at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:166)
                at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:133)
                at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:65)
                at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
                at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
                at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
                at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
                at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
                at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
                at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
                at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
                at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
                at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
                at org.apache.spark.scheduler.Task.run(Task.scala:89)
                at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
                at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
                at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
                at java.lang.Thread.run(Thread.java:745)
Caused by: org.janusgraph.core.JanusGraphException: Could not call index
                at org.janusgraph.graphdb.transaction.StandardJanusGraphTx$6$6.call(StandardJanusGraphTx.java:1262)
                at org.janusgraph.graphdb.query.QueryUtil.processIntersectingRetrievals(QueryUtil.java:255)
                ... 34 more
Caused by: org.janusgraph.core.JanusGraphException: Could not execute operation due to backend exception
                at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:57)
                at org.janusgraph.diskstorage.BackendTransaction.executeRead(BackendTransaction.java:444)
                at org.janusgraph.diskstorage.BackendTransaction.indexQuery(BackendTransaction.java:395)
                at org.janusgraph.graphdb.query.graph.MultiKeySliceQuery.execute(MultiKeySliceQuery.java:51)
                at org.janusgraph.graphdb.database.IndexSerializer.query(IndexSerializer.java:529)
                at org.janusgraph.graphdb.transaction.StandardJanusGraphTx$6$6$1.lambda$call$5(StandardJanusGraphTx.java:1258)
                at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:97)
                at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:89)
                at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:81)
                at org.janusgraph.graphdb.transaction.StandardJanusGraphTx$6$6$1.call(StandardJanusGraphTx.java:1258)
                at org.janusgraph.graphdb.transaction.StandardJanusGraphTx$6$6$1.call(StandardJanusGraphTx.java:1255)
                at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4742)
                at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3527)
                at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2319)
                at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2282)
                at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2197)
                at com.google.common.cache.LocalCache.get(LocalCache.java:3937)
                at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4739)
                at org.janusgraph.graphdb.transaction.StandardJanusGraphTx$6$6.call(StandardJanusGraphTx.java:1255)
                ... 35 more
Caused by: org.janusgraph.diskstorage.TemporaryBackendException: Could not successfully complete backend operation due to repeated temporary exceptions after PT10S
                at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:101)
                at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:55)
                ... 53 more
Caused by: org.janusgraph.diskstorage.TemporaryBackendException: Temporary failure in storage backend
                at org.janusgraph.diskstorage.hbase.HBaseKeyColumnValueStore.getHelper(HBaseKeyColumnValueStore.java:202)
                at org.janusgraph.diskstorage.hbase.HBaseKeyColumnValueStore.getSlice(HBaseKeyColumnValueStore.java:90)
                at org.janusgraph.diskstorage.keycolumnvalue.KCVSProxy.getSlice(KCVSProxy.java:77)
                at org.janusgraph.diskstorage.keycolumnvalue.KCVSProxy.getSlice(KCVSProxy.java:77)
                at org.janusgraph.diskstorage.BackendTransaction$5.call(BackendTransaction.java:398)
                at org.janusgraph.diskstorage.BackendTransaction$5.call(BackendTransaction.java:395)
                at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:69)
                ... 54 more
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=35, exceptions:
Sat Aug 05 07:22:03 EDT 2017, RpcRetryingCaller{globalStartTime=1501932111280, pause=100, retries=35}, org.apache.hadoop.hbase.regionserver.RowTooBigException: org.apache.hadoop.hbase.regionserver.RowTooBigException: Max row size allowed: 1073741824, but the row is bigger than that.
                at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:564)
                at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:147)
                at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:5697)
                at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:5856)
                at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:5634)
                at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:5611)
                at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:5597)
                at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:6792)
                at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:6770)
                at org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2023)
                at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33644)
                at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2170)
                at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109)
                at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:185)
                at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:165)


On 8/6/2017 3:50 PM, HadoopMarc wrote:
Hi ... and others, I have been offline for a few weeks enjoying a holiday and will start looking into your questions and make the suggested corrections. Thanks for following the recipes and helping others with them.

..., did you run the recipe on the same HDP sandbox and same TinkerPop version? I remember (from 4 weeks ago) that copying the zookeeper.znode.parent property from the HBase configs to the JanusGraph configs was essential to get JanusGraph's HBaseInputFormat working (that is: read graph data for the spark tasks).
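For example, JanusGraph passes HBase settings through its ext namespace, so the property can be copied like this (a sketch; the value shown is the usual HDP sandbox default and is an assumption):

storage.hbase.ext.zookeeper.znode.parent=/hbase-unsecure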

Cheers,    Marc

On Monday, July 24, 2017 at 10:12:13 AM UTC+2, spi...@... wrote:
Hi, thanks for your post.
I did it according to the post, but I ran into a problem:
15:58:49,110  INFO SecurityManager:58 - Changing view acls to: rc
15:58:49,110  INFO SecurityManager:58 - Changing modify acls to: rc
15:58:49,110  INFO SecurityManager:58 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(rc); users with modify permissions: Set(rc)
15:58:49,111  INFO Client:58 - Submitting application 25 to ResourceManager
15:58:49,320  INFO YarnClientImpl:274 - Submitted application application_1500608983535_0025
15:58:49,321  INFO SchedulerExtensionServices:58 - Starting Yarn extension services with app application_1500608983535_0025 and attemptId None
15:58:50,325  INFO Client:58 - Application report for application_1500608983535_0025 (state: ACCEPTED)
15:58:50,326  INFO Client:58 -
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1500883129115
final status: UNDEFINED
user: rc
15:58:51,330  INFO Client:58 - Application report for application_1500608983535_0025 (state: ACCEPTED)
15:58:52,333  INFO Client:58 - Application report for application_1500608983535_0025 (state: ACCEPTED)
15:58:53,335  INFO Client:58 - Application report for application_1500608983535_0025 (state: ACCEPTED)
15:58:54,337  INFO Client:58 - Application report for application_1500608983535_0025 (state: ACCEPTED)
15:58:55,340  INFO Client:58 - Application report for application_1500608983535_0025 (state: ACCEPTED)
15:58:56,343  INFO Client:58 - Application report for application_1500608983535_0025 (state: ACCEPTED)
15:58:56,802  INFO YarnSchedulerBackend$YarnSchedulerEndpoint:58 - ApplicationMaster registered as NettyRpcEndpointRef(null)
15:58:56,822  INFO YarnClientSchedulerBackend:58 - Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> dl-rc-optd-ambari-master-v-test-1.host.dataengine.com,dl-rc-optd-ambari-master-v-test-2.host.dataengine.com, PROXY_URI_BASES -> http://dl-rc-optd-ambari-master-v-test-1.host.dataengine.com:8088/proxy/application_1500608983535_0025,http://dl-rc-optd-ambari-master-v-test-2.host.dataengine.com:8088/proxy/application_1500608983535_0025), /proxy/application_1500608983535_0025
15:58:56,824  INFO JettyUtils:58 - Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
15:58:57,346  INFO Client:58 - Application report for application_1500608983535_0025 (state: RUNNING)
15:58:57,347  INFO Client:58 -
client token: N/A
diagnostics: N/A
ApplicationMaster host: 10.200.48.154
ApplicationMaster RPC port: 0
queue: default
start time: 1500883129115
final status: UNDEFINED
user: rc
15:58:57,348  INFO YarnClientSchedulerBackend:58 - Application application_1500608983535_0025 has started running.
15:58:57,358  INFO Utils:58 - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 47514.
15:58:57,358  INFO NettyBlockTransferService:58 - Server created on 47514
15:58:57,360  INFO BlockManagerMaster:58 - Trying to register BlockManager
15:58:57,363  INFO BlockManagerMasterEndpoint:58 - Registering block manager 10.200.48.112:47514 with 2.4 GB RAM, BlockManagerId(driver, 10.200.48.112, 47514)
15:58:57,366  INFO BlockManagerMaster:58 - Registered BlockManager
15:58:57,585  INFO EventLoggingListener:58 - Logging events to hdfs:///spark-history/application_1500608983535_0025
15:59:07,177  WARN YarnSchedulerBackend$YarnSchedulerEndpoint:70 - Container marked as failed: container_e170_1500608983535_0025_01_000002 on host: dl-rc-optd-ambari-slave-v-test-1.host.dataengine.com. Exit status: 1. Diagnostics: Exception from container-launch.
Container id: container_e170_1500608983535_0025_01_000002
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:576)
at org.apache.hadoop.util.Shell.run(Shell.java:487)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:753)
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:371)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:303)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Shell output: main : command provided 1
main : run as user is rc
main : requested yarn user is rc


Container exited with a non-zero exit code 1
Display stack trace? [yN]
15:59:57,702  WARN TransportChannelHandler:79 - Exception in connection from 10.200.48.155/10.200.48.155:50921
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:192)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:313)
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:242)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:748)
15:59:57,704 ERROR TransportResponseHandler:132 - Still have 1 requests outstanding when connection from 10.200.48.155/10.200.48.155:50921 is closed
15:59:57,706  WARN NettyRpcEndpointRef:91 - Error sending message [message = RequestExecutors(0,0,Map())] in 1 attempts
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:192)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:313)
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:242)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:748)

I am confused about that. Could you please help me?



On Thursday, July 6, 2017 at 4:15:37 PM UTC+8, HadoopMarc wrote:

Readers wanting to run OLAP queries on a real spark-yarn cluster might want to check my recent post:

http://yaaics.blogspot.nl/2017/07/configuring-janusgraph-for-spark-yarn.html

Regards,  Marc


Re: Index on a vertex label from Java

Jason Plurad <plu...@...>
 

You can't create an index on a vertex label right now. See https://github.com/JanusGraph/janusgraph/issues/283

You can create an index on a property. For example, you could define a property called "mylabel", create a composite index on it, then do g.V().has("mylabel", "foo").count().next().
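A minimal Java sketch of that workaround (the index name is made up; reindexing of existing data is omitted):

import org.apache.tinkerpop.gremlin.structure.Vertex;
import org.janusgraph.core.PropertyKey;
import org.janusgraph.core.schema.JanusGraphManagement;

JanusGraphManagement mgmt = graph.openManagement();
PropertyKey mylabel = mgmt.makePropertyKey("mylabel").dataType(String.class).make();
mgmt.buildIndex("byMylabel", Vertex.class).addKey(mylabel).buildCompositeIndex();
mgmt.commit();

// at query time, the composite index serves this lookup
long count = graph.traversal().V().has("mylabel", "foo").count().next();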


On Monday, August 7, 2017 at 5:06:19 PM UTC-4, Peter Schwarz wrote:
How does one create an index on a vertex label from Java?  I want to speed up queries that retrieve or count the vertices with a particular label, e.g. g.V().hasLabel("foo").count().next().  In Gremlin-Groovy, I think you can use getPropertyKey(T.label) to reference the key that represents a label and pass that to addKey, but this does not work in Java because getPropertyKey expects a String and T.label is an enum.  What's the right way to do this?


Re: Exception in thread "main" java.lang.OutOfMemoryError: Java heap space while loading bulk data

Misha Brukman <mbru...@...>
 

You might consider using a format other than JSON which can be easily read incrementally, such as CSV or a more compact binary encoding, or you may want to use a streaming JSON reader which will read as little as possible to generate a meaningful callback. Several options exist (I have not tested these).
If you search on Stack Overflow, you'll find others have had exactly the same issue as here with Python and JSON files, and the answers to those questions all pointed to incremental JSON parser libraries as the solution to the OOMs.
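For instance, a Groovy sketch with Jackson's streaming parser (assuming Jackson is on the classpath; the token handling is left as a stub):

def factory = new com.fasterxml.jackson.core.JsonFactory()
def parser = factory.createParser(new File('/tmp/dummy.json'))
while (parser.nextToken() != null) {
    // handle one token (START_OBJECT, FIELD_NAME, values, ...) at a time
    // instead of materializing the whole file as a single String
}
parser.close()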

On Tue, Aug 8, 2017 at 10:49 AM, Amyth Arora <aroras....@...> wrote:
Thanks Robert, I am going to check the file size and its contents when I reach home, and also will try to load the file through the shell and post the update here.

On Tue, 8 Aug 2017 at 8:14 PM, Robert Dale <rob...@...> wrote:
Well, whatever, but your stacktrace points to: String fileContents = new File(jsonPath).getText('UTF-8')
Thus, your file does not fit in memory - either in available system memory or within the JVM max memory.

Robert Dale

On Tue, Aug 8, 2017 at 10:38 AM, Amyth Arora <aroras....@...> wrote:
Hi Robert,

The file is about 325 MB in size and contains info about a million vertices and a million edges. Also, I forgot to mention that prior to testing on Bigtable, I tried the same script and file to test Janus with a Cassandra backend on the same machine, which worked fine.


While testing this with Cassandra, I experienced a similar issue, which went away with the introduction of the following configuration options:

storage.batch-loading
ids.block-size

But in the case of Cassandra, the error was thrown while creating the edges. In the case of Bigtable, the exception is thrown as soon as the script is executed.

On Tue, 8 Aug 2017 at 7:51 PM, Robert Dale <rob...@...> wrote:
Looks like your file doesn't fit in memory.

Robert Dale

On Tue, Aug 8, 2017 at 9:23 AM, Amyth Arora <aroras....@...> wrote:
Hi Everyone,

I am trying to upload some dummy data for testing purposes to JanusGraph (Google Cloud Bigtable backend). I have a groovy script, as follows, that I execute while running the gremlin console; it creates the schema, indexes, vertexes and edges.

import groovy.json.JsonSlurper;
import java.util.ArrayList;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.janusgraph.core.JanusGraphFactory;
import org.janusgraph.core.PropertyKey;
import org.janusgraph.core.Multiplicity;
import org.janusgraph.core.schema.SchemaAction;
import org.janusgraph.core.schema.SchemaStatus;
import org.janusgraph.core.util.JanusGraphCleanup;
import org.janusgraph.graphdb.database.StandardJanusGraph;
import org.janusgraph.graphdb.database.management.ManagementSystem;

/**
 * Given a json file, populates data into the given JanusGraph DB
 */
class JanusGraphBuilder {

    String graphPath;
    StandardJanusGraph graph;
    ManagementSystem management;
    GraphTraversalSource traversal;

    def dummyData;


    public void main(String jsonPath, String janusGraphPath) {
        this.graphPath  = janusGraphPath
        this.initGraph()
        this.initialize(jsonPath)
        this.populate()
    }

    public void createEdges(def edges) {
        println "Preparing edges."
        edges.each {
            def relation = it.edge
            def properties = it.properties
            def vertexFrom = this.traversal.V().has("uid", it.nodes[0])[0]
            def vertexTo = this.traversal.V().has("uid", it.nodes[1])[0]
            def newEdge = vertexFrom.addEdge(relation, vertexTo)
            properties.each {
                if (it.key == 'score') {
                    it.value = Float.parseFloat(it.value.toString())
                }
                newEdge.property(it.key, it.value)
            }
        }
        this.graph.tx().commit()
        println "Created edges successfully"
    }

    public void createVertexes(def vertexes) {
        println "Preparing vertices."
        vertexes.each {
            def uniqueLabel = it.labels[0]
            def properties = it.properties
            def newVertex = this.graph.addVertex(label, uniqueLabel)
            properties.each {
                newVertex.property(it.key, it.value)
            }
        }
        this.graph.tx().commit()
        println "Created vertices successfully"
    }

    public void createSchema() {
        println "Preparing schema."
        // Do not create indexes while another transaction is in progress
        this.graph.tx().rollback()
        this.management = this.graph.openManagement()
        this.management.set('ids.block-size', 20000000)

        // Make property keys
        def uid = this.management.makePropertyKey("uid").dataType(String.class).make()
        def name = this.management.makePropertyKey("name").dataType(String.class).make()
        def number = this.management.makePropertyKey("number").dataType(String.class).make()
        def email = this.management.makePropertyKey("email").dataType(String.class).make()
        def score = this.management.makePropertyKey("score").dataType(Float.class).make()
        def linkedinId = this.management.makePropertyKey("linkedin_id").dataType(String.class).make()
        def linkedinUrl = this.management.makePropertyKey("profile_url").dataType(String.class).make()
        def imageUrl = this.management.makePropertyKey("image_url").dataType(String.class).make()
        def instituteName = this.management.makePropertyKey("institute_name").dataType(String.class).make()
        def companyName = this.management.makePropertyKey("company_name").dataType(String.class).make()
        def jobId = this.management.makePropertyKey("job_id").dataType(String.class).make()

        // Define Vertex Labels
        this.management.makeVertexLabel("person").make();
        this.management.makeVertexLabel("candidate").make();
        this.management.makeVertexLabel("recruiter").make();
        this.management.makeVertexLabel("employee").make();
        this.management.makeVertexLabel("linkedin").make();
        this.management.makeVertexLabel("job").make();
        this.management.makeVertexLabel("company").make();
        this.management.makeVertexLabel("institute").make();
        def phoneV = this.management.makeVertexLabel("phone").make();
        def emailV = this.management.makeVertexLabel("email").make();

        // Define Edge Labels
        this.management.makeEdgeLabel("knows").make();
        this.management.makeEdgeLabel("has").make();
        this.management.makeEdgeLabel("provided_by").make();
        this.management.makeEdgeLabel("studied_at").make();
        this.management.makeEdgeLabel("worked_at").make();
        this.management.makeEdgeLabel("posted").make();
        this.management.makeEdgeLabel("liked").make();
        this.management.makeEdgeLabel("worked_with").make();
        this.management.makeEdgeLabel("studied_with").make();
        this.management.makeEdgeLabel("is_a_match_for").make();

        // Create indexes
        this.management.buildIndex('uniqueUid', Vertex.class).addKey(uid).unique().buildCompositeIndex()
        this.management.buildIndex('uniqueEmail', Vertex.class).addKey(email).indexOnly(emailV).unique().buildCompositeIndex()
        this.management.buildIndex('uniqueNumber', Vertex.class).addKey(number).indexOnly(phoneV).unique().buildCompositeIndex()
        this.management.commit()
        this.management.awaitGraphIndexStatus(this.graph, 'uniqueUid').call()
        this.management = this.graph.openManagement()
        this.management.updateIndex(this.management.getGraphIndex('uniqueUid'), SchemaAction.REINDEX).get()

        this.management.commit()

        println "Created schema successfully"
    }

    public void populate() {
        // Create db schema
        this.createSchema()

        // Create vertexes from the given dummy data
        def vertexTransaction = this.graph.newTransaction()
        def vertexes = this.dummyData.vertexes;
        this.createVertexes(vertexes)
        vertexTransaction.commit()
        this.initGraph()

        def edgeTransaction = this.graph.newTransaction()
        // Create edges from the given dummy data
        def edges = this.dummyData.edges;
        this.createEdges(edges)
        edgeTransaction.commit()
        this.initGraph()

        println "Graph population successfully accomplished. Please hit Ctrl+C to exit."
    }

    public void initialize(String jsonPath) {
        String fileContents = new File(jsonPath).getText('UTF-8')
        def slurper = new JsonSlurper()
        def results = slurper.parseText(fileContents)
        this.dummyData = results;
        this.resetData()
    }

    public void resetData() {
        // Remove all the data from the storage backend
        this.graph.close()
        JanusGraphCleanup.clear(this.graph)
        this.initGraph()
    }

    public void initGraph() {
        this.graph = JanusGraphFactory.open(this.graphPath)
        this.traversal = this.graph.traversal()
    }
}


JanusGraphBuilder graphBuilder = new JanusGraphBuilder()
graphBuilder.main("/tmp/dummy.json", "conf/testconf.properties")

I have already updated the heap size in the `JAVA_OPTIONS` environment variable to `-Xmx2048m`. Also, here is what my configuration looks like.

storage.backend=hbase

## Google cloud BIGTABLE configuration options
storage.hbase.ext.hbase.client.connection.impl=com.google.cloud.bigtable.hbase1_0.BigtableConnection

storage.hostname=localhost
cache.db-cache = true
cache.db-cache-clean-wait = 20
cache.db-cache-time = 180000
cache.db-cache-size = 0.5

# Bulk Loading
storage.batch-loading = true
ids.block-size=20000000

I execute the script using the following command:

./bin/gremlin.sh -e ~/projects/xonnect/utils/population_scripts/populateJanus.groovy

The problem is, even before the schema is created, it throws an out-of-memory error for Java heap space. Following is the output.

18:44:29,112  INFO BigtableSession:75 - Bigtable options: BigtableOptions{dataHost=bigtable.googleapis.com, tableAdminHost=bigtableadmin.googleapis.com, instanceAdminHost=bigtableadmin.googleapis.com, projectId=formal-theater-175812, instanceId=xonnect, userAgent=hbase-1.2.4, credentialType=DefaultCredentials, port=443, dataChannelCount=10, retryOptions=RetryOptions{retriesEnabled=true, allowRetriesWithoutTimestamp=false, statusToRetryOn=[UNAUTHENTICATED, INTERNAL, ABORTED, UNAVAILABLE, DEADLINE_EXCEEDED], initialBackoffMillis=5, maxElapsedBackoffMillis=60000, backoffMultiplier=2.0, streamingBufferSize=60, readPartialRowTimeoutMillis=60000, maxScanTimeoutRetries=3}, bulkOptions=BulkOptions{asyncMutatorCount=2, useBulkApi=true, bulkMaxKeyCount=25, bulkMaxRequestSize=1048576, autoflushMs=0, maxInflightRpcs=500, maxMemory=190893260, enableBulkMutationThrottling=false, bulkMutationRpcTargetMs=100}, callOptionsConfig=CallOptionsConfig{useTimeout=false, shortRpcTimeoutMs=60000, longRpcTimeoutMs=600000}, usePlaintextNegotiation=false}.
18:44:30,851  INFO Backend:183 - Initiated backend operations thread pool of size 8
18:44:35,048  INFO IndexSerializer:85 - Hashing index keys
18:44:36,910  INFO KCVSLog:744 - Loaded unidentified ReadMarker start time 2017-08-08T13:14:36.898Z into org.janusgraph.diskstorage.log.kcvs.KCVSLog$MessagePuller@35835fa
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3332)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:596)
at java.lang.StringBuilder.append(StringBuilder.java:190)
at org.codehaus.groovy.runtime.IOGroovyMethods.getText(IOGroovyMethods.java:886)
at org.codehaus.groovy.runtime.ResourceGroovyMethods.getText(ResourceGroovyMethods.java:588)
at org.codehaus.groovy.runtime.dgm$964.invoke(Unknown Source)
at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite$PojoMetaMethodSiteNoUnwrapNoCoerce.invoke(PojoMetaMethodSite.java:274)
at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite.call(PojoMetaMethodSite.java:56)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
at JanusGraphBuilder.initialize(Script1.groovy:146)
at JanusGraphBuilder.main(Script1.groovy:29)
at JanusGraphBuilder$main.call(Unknown Source)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:133)
at Script1.run(Script1.groovy:168)
at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:619)
at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:448)
at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:421)
at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:212)
at org.apache.tinkerpop.gremlin.groovy.jsr223.ScriptExecutor.evaluate(ScriptExecutor.java:55)
at org.apache.tinkerpop.gremlin.groovy.jsr223.ScriptExecutor.main(ScriptExecutor.java:44)

Any help will be much appreciated. Thanks.


--
Thanks & Regards
Amyth Arora
Email: aroras....@...
Twitter - @mytharora


Re: Cache expiration time

Jason Plurad <plu...@...>


According to the docs, it is a GLOBAL_OFFLINE configuration setting: "These options can only be changed for the entire database cluster at once when all instances are shut down." You'll need to set the value using the ManagementSystem.

If you want to do it through a remote console session, you could try something like this:

gremlin> :remote connect tinkerpop.server conf/remote.yaml
==>Configured localhost/127.0.0.1:8182
gremlin> :> mgmt = graph.openManagement(); mgmt.set('cache.db-cache-time', 360000); mgmt.commit(); true
==>true

At this point, the value is set but it is not active. You need to restart the Gremlin Server so the new configuration is picked up.

Another thing you should be aware of when working with GLOBAL_OFFLINE properties is that you can't change the value if there are multiple open graph instances -- for example, you have the Gremlin Server started and also make a direct connection with JanusGraphFactory.open(). You should shut down all connections so there is only 1 remaining (you can verify with mgmt.getOpenInstances()) before attempting to set the configuration property.

-- Jason


On Tuesday, August 8, 2017 at 7:42:17 AM UTC-4, Ohad Pinchevsky wrote:
Hi,

I am trying to increase/disable the cache expiration time using the cache.db-cache-time property.
I changed the value to 0 and restarted the Gremlin server, but it seems it is not working (based on execution time: first time slow, second fast, waiting, third time slow again).

What am I missing?

Thanks,
Ohad


      Re: Exception in thread "main" java.lang.OutOfMemoryError: Java heap space while loading bulk data

      Amyth Arora <aroras....@...>
       

      Thanks Robert, I am going to check the file size  and its contents when I reach home and also will try to load the file through the shell and post the update here.

      On Tue, 8 Aug 2017 at 8:14 PM, Robert Dale <rob...@...> wrote:
      Well, whatever, but your stacktrace points to:  String fileContents = new File(jsonPath).getText('UTF-8')
      Thus, your file does not fit in memory - either available system memory or within jvm max memory.

      Robert Dale

      On Tue, Aug 8, 2017 at 10:38 AM, Amyth Arora <aroras....@...> wrote:
      Hi Robert,

      The file is about 325 mb in size and contains info about a million vertices and a million edges. Also I forgot to mention that prior to testing on bigtable I tried the same script and file to test janus with cassandra backend on the same machine, which worked fine.


      while testing this with cassandra, I experienced a similar issue, which went away by the introduction of following configuration options:

      storage.batch-loading
      ids.block-size

      But in case of cassandra, the error was thrown while creating the edges. In case of bigtable, The exception is thrown as soon as the script is executed.

      On Tue, 8 Aug 2017 at 7:51 PM, Robert Dale <rob...@...> wrote:
      Looks like your file doesn't fit in memory.

      Robert Dale

      On Tue, Aug 8, 2017 at 9:23 AM, Amyth Arora <aroras....@...> wrote:
      Hi Everyone,

      I am trying to upload some dummy data for testing purposes to janusgraph (google cloud bigtable backend). I have a groovy script as follows that I execute while running the gremlin console that creates the schema, indexes, vertexes and edges.

      import groovy.json.JsonSlurper;
      import java.util.ArrayList;
      import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
      import org.janusgraph.core.JanusGraphFactory;
      import org.janusgraph.core.PropertyKey;
      import org.janusgraph.core.Multiplicity;
      import org.janusgraph.core.schema.SchemaAction;
      import org.janusgraph.core.schema.SchemaStatus;
      import org.janusgraph.core.util.JanusGraphCleanup;
      import org.janusgraph.graphdb.database.StandardJanusGraph;
      import org.janusgraph.graphdb.database.management.ManagementSystem;

      /**
       * Given a json file, populates data into the given JanusGraph DB
       */
      class JanusGraphBuilder {

          String graphPath;
          StandardJanusGraph graph;
          ManagementSystem management;
          GraphTraversalSource traversal;

          def dummyData;


          public void main(String jsonPath, String janusGraphPath) {
              this.graphPath  = janusGraphPath
              this.initGraph()
              this.initialize(jsonPath)
              this.populate()
          }

          public void createEdges(def edges) {
              println "Preparing edges."
              edges.each {
                  def relation = it.edge
                  def properties = it.properties
                  def vertexFrom = this.traversal.V().has("uid", it.nodes[0])[0]
                  def vertexTo = this.traversal.V().has("uid", it.nodes[1])[0]
                  def newEdge = vertexFrom.addEdge(relation, vertexTo)
                  properties.each {
                      if (it.key == 'score') {
                          it.value = Float.parseFloat(it.value.toString())
                      }
                      newEdge.property(it.key, it.value)
                  }
              }
              this.graph.tx().commit()
              println "Created edges successfully"
          }

          public void createVertexes(def vertexes) {
              println "Preparing vertices."
              vertexes.each {
                  def uniqueLabel = it.labels[0]
                  def properties = it.properties
                  def newVertex = this.graph.addVertex(label, uniqueLabel)
                  properties.each {
                      newVertex.property(it.key, it.value)
                  }
              }
              this.graph.tx().commit()
              println "Created vertices successfully"
          }

          public void createSchema() {
              println "Preparing schema."
              // Do not create indexes while another transaction is in progress
              this.graph.tx().rollback()
              this.management = this.graph.openManagement()
              this.management.set('ids.block-size', 20000000)

              // Make property keys
              def uid = this.management.makePropertyKey("uid").dataType(String.class).make()
              def name = this.management.makePropertyKey("name").dataType(String.class).make()
              def number = this.management.makePropertyKey("number").dataType(String.class).make()
              def email = this.management.makePropertyKey("email").dataType(String.class).make()
              def score = this.management.makePropertyKey("score").dataType(Float.class).make()
              def linkedinId = this.management.makePropertyKey("linkedin_id").dataType(String.class).make()
              def linkedinUrl = this.management.makePropertyKey("profile_url").dataType(String.class).make()
              def imageUrl = this.management.makePropertyKey("image_url").dataType(String.class).make()
              def instituteName = this.management.makePropertyKey("institute_name").dataType(String.class).make()
              def companyName = this.management.makePropertyKey("company_name").dataType(String.class).make()
              def jobId = this.management.makePropertyKey("job_id").dataType(String.class).make()

              // Define Vertex Labels
              this.management.makeVertexLabel("person").make();
              this.management.makeVertexLabel("candidate").make();
              this.management.makeVertexLabel("recruiter").make();
              this.management.makeVertexLabel("employee").make();
              this.management.makeVertexLabel("linkedin").make();
              this.management.makeVertexLabel("job").make();
              this.management.makeVertexLabel("company").make();
              this.management.makeVertexLabel("institute").make();
              def phoneV = this.management.makeVertexLabel("phone").make();
              def emailV = this.management.makeVertexLabel("email").make();

              // Define Edge Labels
              this.management.makeEdgeLabel("knows").make();
              this.management.makeEdgeLabel("has").make();
              this.management.makeEdgeLabel("provided_by").make();
              this.management.makeEdgeLabel("studied_at").make();
              this.management.makeEdgeLabel("worked_at").make();
              this.management.makeEdgeLabel("posted").make();
              this.management.makeEdgeLabel("liked").make();
              this.management.makeEdgeLabel("worked_with").make();
              this.management.makeEdgeLabel("studied_with").make();
              this.management.makeEdgeLabel("is_a_match_for").make();

              // Create indexes
              this.management.buildIndex('uniqueUid', Vertex.class).addKey(uid).unique().buildCompositeIndex()
              this.management.buildIndex('uniqueEmail', Vertex.class).addKey(email).indexOnly(emailV).unique().buildCompositeIndex()
              this.management.buildIndex('uniqueNumber', Vertex.class).addKey(number).indexOnly(phoneV).unique().buildCompositeIndex()
              this.management.commit()
              this.management.awaitGraphIndexStatus(this.graph, 'uniqueUid').call()
              this.management = this.graph.openManagement()
              this.management.updateIndex(this.management.getGraphIndex('uniqueUid'), SchemaAction.REINDEX).get()

              this.management.commit()

              println "Created schema successfully"
          }

          public void populate() {
              // Create db schema
              this.createSchema()

              // Create vertexes from the given dummy data
              def vertexTransaction = this.graph.newTransaction()
              def vertexes = this.dummyData.vertexes;
              this.createVertexes(vertexes)
              vertexTransaction.commit()
              this.initGraph()

              def edgeTransaction = this.graph.newTransaction()
              // Create edges from the given dummy data
              def edges = this.dummyData.edges;
              this.createEdges(edges)
              edgeTransaction.commit()
              this.initGraph()

              println "Graph population successfully accomplished. Please hit Ctrl+C to exit."
          }

          public void initialize(String jsonPath) {
              String fileContents = new File(jsonPath).getText('UTF-8')
              def slurper = new JsonSlurper()
              def results = slurper.parseText(fileContents)
              this.dummyData = results;
              this.resetData()
          }

          public void resetData() {
              // Remove all the data from the storage backend
              this.graph.close()
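              // Note: JanusGraphCleanup.clear expects a closed graph instance, hence the close() above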
              JanusGraphCleanup.clear(this.graph)
              this.initGraph()
          }

          public void initGraph() {
              this.graph = JanusGraphFactory.open(this.graphPath)
              this.traversal = this.graph.traversal()
          }
      }


      JanusGraphBuilder graphBuilder = new JanusGraphBuilder()
      graphBuilder.main("/tmp/dummy.json", "conf/testconf.properties")

      I have already updated the heap size in the `JAVA_OPTIONS` environment variable to `-Xmx2048m`. Also, here is what my configuration looks like:

      storage.backend=hbase

      ## Google cloud BIGTABLE configuration options
      storage.hbase.ext.hbase.client.connection.impl=com.google.cloud.bigtable.hbase1_0.BigtableConnection

      storage.hostname=localhost
      cache.db-cache = true
      cache.db-cache-clean-wait = 20
      cache.db-cache-time = 180000
      cache.db-cache-size = 0.5

      # Bulk Loading
      storage.batch-loading = true
      ids.block-size=20000000

      I execute the script using the following command:

      ./bin/gremlin.sh -e ~/projects/xonnect/utils/population_scripts/populateJanus.groovy

      The problem is, even before the schema is created, it throws an OutOfMemoryError for Java heap space. Following is the output:

      18:44:29,112  INFO BigtableSession:75 - Bigtable options: BigtableOptions{dataHost=bigtable.googleapis.com, tableAdminHost=bigtableadmin.googleapis.com, instanceAdminHost=bigtableadmin.googleapis.com, projectId=formal-theater-175812, instanceId=xonnect, userAgent=hbase-1.2.4, credentialType=DefaultCredentials, port=443, dataChannelCount=10, retryOptions=RetryOptions{retriesEnabled=true, allowRetriesWithoutTimestamp=false, statusToRetryOn=[UNAUTHENTICATED, INTERNAL, ABORTED, UNAVAILABLE, DEADLINE_EXCEEDED], initialBackoffMillis=5, maxElapsedBackoffMillis=60000, backoffMultiplier=2.0, streamingBufferSize=60, readPartialRowTimeoutMillis=60000, maxScanTimeoutRetries=3}, bulkOptions=BulkOptions{asyncMutatorCount=2, useBulkApi=true, bulkMaxKeyCount=25, bulkMaxRequestSize=1048576, autoflushMs=0, maxInflightRpcs=500, maxMemory=190893260, enableBulkMutationThrottling=false, bulkMutationRpcTargetMs=100}, callOptionsConfig=CallOptionsConfig{useTimeout=false, shortRpcTimeoutMs=60000, longRpcTimeoutMs=600000}, usePlaintextNegotiation=false}.
      18:44:30,851  INFO Backend:183 - Initiated backend operations thread pool of size 8
      18:44:35,048  INFO IndexSerializer:85 - Hashing index keys
      18:44:36,910  INFO KCVSLog:744 - Loaded unidentified ReadMarker start time 2017-08-08T13:14:36.898Z into org.janusgraph.diskstorage.log.kcvs.KCVSLog$MessagePuller@35835fa
      Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
      at java.util.Arrays.copyOf(Arrays.java:3332)
      at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
      at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:596)
      at java.lang.StringBuilder.append(StringBuilder.java:190)
      at org.codehaus.groovy.runtime.IOGroovyMethods.getText(IOGroovyMethods.java:886)
      at org.codehaus.groovy.runtime.ResourceGroovyMethods.getText(ResourceGroovyMethods.java:588)
      at org.codehaus.groovy.runtime.dgm$964.invoke(Unknown Source)
      at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite$PojoMetaMethodSiteNoUnwrapNoCoerce.invoke(PojoMetaMethodSite.java:274)
      at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite.call(PojoMetaMethodSite.java:56)
      at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
      at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
      at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
      at JanusGraphBuilder.initialize(Script1.groovy:146)
      at JanusGraphBuilder.main(Script1.groovy:29)
      at JanusGraphBuilder$main.call(Unknown Source)
      at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
      at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
      at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:133)
      at Script1.run(Script1.groovy:168)
      at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:619)
      at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:448)
      at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:421)
      at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:212)
      at org.apache.tinkerpop.gremlin.groovy.jsr223.ScriptExecutor.evaluate(ScriptExecutor.java:55)
      at org.apache.tinkerpop.gremlin.groovy.jsr223.ScriptExecutor.main(ScriptExecutor.java:44)

      Any help will be much appreciated. Thanks.


      --
      You received this message because you are subscribed to the Google Groups "JanusGraph users list" group.
      To unsubscribe from this group and stop receiving emails from it, send an email to janusgra...@....
      For more options, visit https://groups.google.com/d/optout.
      --
      Thanks & Regards
      -----------------------------------------------------

      Amyth Arora
      -----------------------------------------------------

      Web:

      -----------------------------------------------------

      Email Addresses:
      aroras....@...,
      -----------------------------------------------------

      Social Profiles:
      Twitter - @mytharora
        -----------------------------------------------------



          Re: Jetty ALPN/NPN has not been properly configured.

          Amyth Arora <aroras....@...>
           

          Hi Misha,

          Yes, I did, I'm sorry. I have found the solution to this; I'm going to post it and close the issue on GitHub once I reach home. Sorry about the duplicate issue.


          On Tue, 8 Aug 2017 at 7:56 PM, Misha Brukman <mbru...@...> wrote:
          Looks like you've also filed a GitHub issue on this: https://github.com/JanusGraph/janusgraph/issues/450 — please use either the mailing list or GitHub to report issues, but not both, as duplication isn't helpful.

          Since this might be a bug, let's follow up on GitHub.

          On Tue, Aug 8, 2017 at 6:46 AM, Amyth Arora <aroras....@...> wrote:

          Hey, we have a mid-sized (10 million vertices, 5 billion edges) graph database (currently running on Neo4j) at our organization. We are in the process of moving to JanusGraph with Google Cloud Bigtable as the storage backend.

          I am facing the following issue while connecting to Bigtable from the Gremlin shell. I receive the following error when I try to instantiate the graph.


          Jetty ALPN/NPN has not been properly configured.
          


          Here is the traceback.


          java.lang.IllegalArgumentException: Could not instantiate implementation: org.janusgraph.diskstorage.hbase.HBaseStoreManager
          	at org.janusgraph.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:69)
          	at org.janusgraph.diskstorage.Backend.getImplementationClass(Backend.java:480)
          	at org.janusgraph.diskstorage.Backend.getStorageManager(Backend.java:414)
          	at org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.<init>(GraphDatabaseConfiguration.java:1343)
          	at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:107)
          	at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:75)
          	at org.janusgraph.core.JanusGraphFactory$open.call(Unknown Source)
          	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
          	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
          	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
          	at groovysh_evaluate.run(groovysh_evaluate:3)
          	at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
          	at org.codehaus.groovy.tools.shell.Interpreter.evaluate(Interpreter.groovy:70)
          	at org.codehaus.groovy.tools.shell.Groovysh.execute(Groovysh.groovy:190)
          	at org.apache.tinkerpop.gremlin.console.GremlinGroovysh.super$3$execute(GremlinGroovysh.groovy)
          	at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
          	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          	at java.lang.reflect.Method.invoke(Method.java:498)
          	at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
          	at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
          	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1215)
          	at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:132)
          	at org.apache.tinkerpop.gremlin.console.GremlinGroovysh.execute(GremlinGroovysh.groovy:72)
          	at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
          	at org.codehaus.groovy.tools.shell.Shell.leftShift(Shell.groovy:122)
          	at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
          	at org.codehaus.groovy.tools.shell.ShellRunner.work(ShellRunner.groovy:95)
          	at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$work(InteractiveShellRunner.groovy)
          	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
          	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          	at java.lang.reflect.Method.invoke(Method.java:498)
          	at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
          	at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
          	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1215)
          	at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:132)
          	at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:152)
          	at org.codehaus.groovy.tools.shell.InteractiveShellRunner.work(InteractiveShellRunner.groovy:124)
          	at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
          	at org.codehaus.groovy.tools.shell.ShellRunner.run(ShellRunner.groovy:59)
          	at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$run(InteractiveShellRunner.groovy)
          	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
          	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          	at java.lang.reflect.Method.invoke(Method.java:498)
          	at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
          	at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
          	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1215)
          	at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:132)
          	at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:152)
          	at org.codehaus.groovy.tools.shell.InteractiveShellRunner.run(InteractiveShellRunner.groovy:83)
          	at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
          	at org.apache.tinkerpop.gremlin.console.Console.<init>(Console.groovy:152)
          	at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
          	at org.apache.tinkerpop.gremlin.console.Console.main(Console.groovy:455)
          Caused by: java.lang.reflect.InvocationTargetException
          	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
          	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
          	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
          	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
          	at org.janusgraph.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:58)
          	... 54 more
          Caused by: org.janusgraph.diskstorage.PermanentBackendException: Permanent failure in storage backend
          	at org.janusgraph.diskstorage.hbase.HBaseStoreManager.<init>(HBaseStoreManager.java:336)
          	... 59 more
          Caused by: java.io.IOException: java.lang.reflect.InvocationTargetException
          	at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:240)
          	at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:218)
          	at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:119)
          	at org.janusgraph.diskstorage.hbase.HBaseCompat1_0.createConnection(HBaseCompat1_0.java:43)
          	at org.janusgraph.diskstorage.hbase.HBaseStoreManager.<init>(HBaseStoreManager.java:334)
          	... 59 more
          Caused by: java.lang.reflect.InvocationTargetException
          	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
          	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
          	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
          	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
          	at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
          	... 63 more
          Caused by: java.lang.IllegalArgumentException: Jetty ALPN/NPN has not been properly configured.
          	at com.google.bigtable.repackaged.io.grpc.netty.GrpcSslContexts.selectApplicationProtocolConfig(GrpcSslContexts.java:174)
          	at com.google.bigtable.repackaged.io.grpc.netty.GrpcSslContexts.configure(GrpcSslContexts.java:151)
          	at com.google.bigtable.repackaged.io.grpc.netty.GrpcSslContexts.configure(GrpcSslContexts.java:139)
          	at com.google.bigtable.repackaged.io.grpc.netty.GrpcSslContexts.forClient(GrpcSslContexts.java:109)
          	at com.google.bigtable.repackaged.com.google.cloud.bigtable.grpc.BigtableSession.createSslContext(BigtableSession.java:126)
          	at com.google.bigtable.repackaged.com.google.cloud.bigtable.grpc.BigtableSession.createNettyChannel(BigtableSession.java:475)
          	at com.google.bigtable.repackaged.com.google.cloud.bigtable.grpc.BigtableSession$4.create(BigtableSession.java:401)
          	at com.google.bigtable.repackaged.com.google.cloud.bigtable.grpc.io.ChannelPool.<init>(ChannelPool.java:246)
          	at com.google.bigtable.repackaged.com.google.cloud.bigtable.grpc.BigtableSession.createChannelPool(BigtableSession.java:404)
          	at com.google.bigtable.repackaged.com.google.cloud.bigtable.grpc.BigtableSession.createManagedPool(BigtableSession.java:416)
          	at com.google.bigtable.repackaged.com.google.cloud.bigtable.grpc.BigtableSession.getDataChannelPool(BigtableSession.java:274)
          	at com.google.bigtable.repackaged.com.google.cloud.bigtable.grpc.BigtableSession.<init>(BigtableSession.java:247)
          	at org.apache.hadoop.hbase.client.AbstractBigtableConnection.<init>(AbstractBigtableConnection.java:143)
          	at com.google.cloud.bigtable.hbase1_0.BigtableConnection.<init>(BigtableConnection.java:56)
          


          Following are my JanusGraph configuration files:


          gremlin-server.yaml

          host: localhost
          port: 8182
          scriptEvaluationTimeout: 30000
          channelizer: org.apache.tinkerpop.gremlin.server.channel.WebSocketChannelizer
          graphs: {
            graph: conf/testconf.properties}
          plugins:
            - janusgraph.imports
          scriptEngines: {
            gremlin-groovy: {
              imports: [java.lang.Math],
              staticImports: [java.lang.Math.PI],
              scripts: [scripts/empty-sample.groovy]}}
          serializers:
            - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
            - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoLiteMessageSerializerV1d0, config: {ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
            - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}
            - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
            - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV2d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
            - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
          processors:
            - { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }}
            - { className: org.apache.tinkerpop.gremlin.server.op.traversal.TraversalOpProcessor, config: { cacheExpirationTime: 600000, cacheMaxSize: 1000 }}
          metrics: {
            consoleReporter: {enabled: true, interval: 180000},
            csvReporter: {enabled: true, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv},
            jmxReporter: {enabled: true},
            slf4jReporter: {enabled: true, interval: 180000},
            gangliaReporter: {enabled: false, interval: 180000, addressingMode: MULTICAST},
            graphiteReporter: {enabled: false, interval: 180000}}
          maxInitialLineLength: 4096
          maxHeaderSize: 8192
          maxChunkSize: 8192
          maxContentLength: 65536
          maxAccumulationBufferComponents: 1024
          resultIterationBatchSize: 64
          writeBufferLowWaterMark: 32768
          writeBufferHighWaterMark: 65536
          ssl: {
            enabled: false}
          


          testconf.properties

          storage.backend=hbase
          
          ## Google cloud BIGTABLE configuration options
          storage.hbase.ext.hbase.client.connection.impl=com.google.cloud.bigtable.hbase1_0.BigtableConnection
          storage.hbase.ext.google.bigtable.project.id=my-project-id
          storage.hbase.ext.google.bigtable.instance.id=myinstanceid
          
          #!storage.hostname=127.0.0.1
          cache.db-cache = true
          cache.db-cache-clean-wait = 20
          cache.db-cache-time = 180000
          cache.db-cache-size = 0.5
          

          I have added the google-cloud-bigtable and netty-tcnative-boringssl-static jar files to the lib folder in the JanusGraph directory.
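
          For context, this error usually means that neither an ALPN-capable netty-tcnative nor the Jetty ALPN boot classes were visible to the JVM when the gRPC channel was created. A commonly suggested workaround for gRPC clients on Java 8 is to load the Jetty ALPN agent through the JVM options before starting the console -- a sketch only; the agent jar path and version below are hypothetical and would need to match your setup:

          # Hypothetical path/version -- the agent must match your Java 8 update level
          export JAVA_OPTIONS="$JAVA_OPTIONS -javaagent:/path/to/jetty-alpn-agent-2.0.6.jar"
          ./bin/gremlin.sh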

          Here is how I am trying to instantiate the graph:

          ./bin/gremlin.sh
          graph = JanusGraphFactory.open('conf/testconf.properties')
          


          And this gives me the above error. Is there something that I am missing?


          NOTE

          Also, when I run the Gremlin Server, I get the following warning message:


          807  [main] WARN  org.apache.tinkerpop.gremlin.server.GremlinServer  - Graph [graph] configured at [conf/testconf.properties] could not be instantiated and will not be available in Gremlin Server.  GraphFactory message: Configuration must contain a valid 'gremlin.graph' setting
          java.lang.RuntimeException: Configuration must contain a valid 'gremlin.graph' setting
          	at org.apache.tinkerpop.gremlin.structure.util.GraphFactory.open(GraphFactory.java:57)
          	at org.apache.tinkerpop.gremlin.structure.util.GraphFactory.open(GraphFactory.java:104)
          	at org.apache.tinkerpop.gremlin.server.GraphManager.lambda$new$8(GraphManager.java:55)
          	at java.util.LinkedHashMap$LinkedEntrySet.forEach(LinkedHashMap.java:671)
          	at org.apache.tinkerpop.gremlin.server.GraphManager.<init>(GraphManager.java:53)
          	at org.apache.tinkerpop.gremlin.server.util.ServerGremlinExecutor.<init>(ServerGremlinExecutor.java:83)
          	at org.apache.tinkerpop.gremlin.server.GremlinServer.<init>(GremlinServer.java:110)
          	at org.apache.tinkerpop.gremlin.server.GremlinServer.main(GremlinServer.java:344)
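
          For reference, that warning comes from TinkerPop's GraphFactory, which requires a gremlin.graph key in each graph's properties file to know which factory class to instantiate. For JanusGraph the entry would be:

          gremlin.graph=org.janusgraph.core.JanusGraphFactory

          JanusGraphFactory.open() in the console does not consult this key, which is why the same properties file can work from the Gremlin shell while failing to load in Gremlin Server.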



            Re: Exception in thread "main" java.lang.OutOfMemoryError: Java heap space while loading bulk data

            Robert Dale <rob...@...>
             

            Well, whatever, but your stacktrace points to: String fileContents = new File(jsonPath).getText('UTF-8')
            Thus, your file does not fit in memory -- either in available system memory or within the JVM's maximum heap.
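
            For example, parsing from a reader instead of slurping the whole file into one String avoids the StringBuilder growth shown in the stack trace. A minimal sketch (JsonSlurper still materializes the parsed object graph in memory, so a larger -Xmx may be needed as well):

            import groovy.json.JsonSlurper

            def slurper = new JsonSlurper()
            // Stream the file through a reader; no intermediate fileContents String is built.
            def dummyData = new File('/tmp/dummy.json').withReader('UTF-8') { reader ->
                slurper.parse(reader)
            }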

            Robert Dale



              Re: Exception in thread "main" java.lang.OutOfMemoryError: Java heap space while loading bulk data

              Amyth Arora <aroras....@...>
               

              Hi Robert,

              The file is about 325 MB in size and contains about a million vertices and a million edges. Also, I forgot to mention that prior to testing on Bigtable, I tried the same script and file against JanusGraph with the Cassandra backend on the same machine, which worked fine.


              While testing this with Cassandra, I experienced a similar issue, which went away after introducing the following configuration options:

              storage.batch-loading
              ids.block-size

              But in the case of Cassandra, the error was thrown while creating the edges. In the case of Bigtable, the exception is thrown as soon as the script is executed.

              On Tue, 8 Aug 2017 at 7:51 PM, Robert Dale <rob...@...> wrote:
              Looks like your file doesn't fit in memory.

              Robert Dale



                Re: Jetty ALPN/NPN has not been properly configured.

                Misha Brukman <mbru...@...>
                 

                Looks like you've also filed a GitHub issue on this: https://github.com/JanusGraph/janusgraph/issues/450 — please use either the mailing list or GitHub to report issues, but not both, as duplication isn't helpful.

                Since this might be a bug, let's follow up on GitHub.
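                One commonly reported trigger for this error is gRPC's ALPN requirement on Java 8: the Bigtable client bundles a repackaged Netty, which may not pick up a netty-tcnative jar that only sits on the regular classpath. A hedged workaround sketch, assuming a JDK 8 runtime (the alpn-boot version must exactly match your JDK build, and the path below is a placeholder):

                # Assumption: gremlin.sh honors JAVA_OPTIONS, and the alpn-boot jar
                # version matches the exact JDK 8 build in use.
                export JAVA_OPTIONS="$JAVA_OPTIONS -Xbootclasspath/p:/path/to/alpn-boot-<version-matching-your-jdk>.jar"
                ./bin/gremlin.sh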

                On Tue, Aug 8, 2017 at 6:46 AM, Amyth Arora <aroras....@...> wrote:

                Hey, we have a mid-sized (10 million vertices, 5 billion edges) graph database (currently running on Neo4j) at our organization, and we are in the process of moving to JanusGraph with Google Cloud Bigtable as the storage backend.

                I am facing the following issue while connecting to Bigtable from the Gremlin shell. I receive the following error when I try to instantiate the graph.


                Jetty ALPN/NPN has not been properly configured.
                


                Here is the stack trace.


                java.lang.IllegalArgumentException: Could not instantiate implementation: org.janusgraph.diskstorage.hbase.HBaseStoreManager
                	at org.janusgraph.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:69)
                	at org.janusgraph.diskstorage.Backend.getImplementationClass(Backend.java:480)
                	at org.janusgraph.diskstorage.Backend.getStorageManager(Backend.java:414)
                	at org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.<init>(GraphDatabaseConfiguration.java:1343)
                	at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:107)
                	at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:75)
                	at org.janusgraph.core.JanusGraphFactory$open.call(Unknown Source)
                	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
                	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
                	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
                	at groovysh_evaluate.run(groovysh_evaluate:3)
                	at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
                	at org.codehaus.groovy.tools.shell.Interpreter.evaluate(Interpreter.groovy:70)
                	at org.codehaus.groovy.tools.shell.Groovysh.execute(Groovysh.groovy:190)
                	at org.apache.tinkerpop.gremlin.console.GremlinGroovysh.super$3$execute(GremlinGroovysh.groovy)
                	at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
                	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
                	at java.lang.reflect.Method.invoke(Method.java:498)
                	at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
                	at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
                	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1215)
                	at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:132)
                	at org.apache.tinkerpop.gremlin.console.GremlinGroovysh.execute(GremlinGroovysh.groovy:72)
                	at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
                	at org.codehaus.groovy.tools.shell.Shell.leftShift(Shell.groovy:122)
                	at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
                	at org.codehaus.groovy.tools.shell.ShellRunner.work(ShellRunner.groovy:95)
                	at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$work(InteractiveShellRunner.groovy)
                	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
                	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
                	at java.lang.reflect.Method.invoke(Method.java:498)
                	at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
                	at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
                	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1215)
                	at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:132)
                	at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:152)
                	at org.codehaus.groovy.tools.shell.InteractiveShellRunner.work(InteractiveShellRunner.groovy:124)
                	at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
                	at org.codehaus.groovy.tools.shell.ShellRunner.run(ShellRunner.groovy:59)
                	at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$run(InteractiveShellRunner.groovy)
                	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
                	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
                	at java.lang.reflect.Method.invoke(Method.java:498)
                	at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
                	at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
                	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1215)
                	at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:132)
                	at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:152)
                	at org.codehaus.groovy.tools.shell.InteractiveShellRunner.run(InteractiveShellRunner.groovy:83)
                	at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
                	at org.apache.tinkerpop.gremlin.console.Console.<init>(Console.groovy:152)
                	at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
                	at org.apache.tinkerpop.gremlin.console.Console.main(Console.groovy:455)
                Caused by: java.lang.reflect.InvocationTargetException
                	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
                	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
                	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
                	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
                	at org.janusgraph.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:58)
                	... 54 more
                Caused by: org.janusgraph.diskstorage.PermanentBackendException: Permanent failure in storage backend
                	at org.janusgraph.diskstorage.hbase.HBaseStoreManager.<init>(HBaseStoreManager.java:336)
                	... 59 more
                Caused by: java.io.IOException: java.lang.reflect.InvocationTargetException
                	at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:240)
                	at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:218)
                	at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:119)
                	at org.janusgraph.diskstorage.hbase.HBaseCompat1_0.createConnection(HBaseCompat1_0.java:43)
                	at org.janusgraph.diskstorage.hbase.HBaseStoreManager.<init>(HBaseStoreManager.java:334)
                	... 59 more
                Caused by: java.lang.reflect.InvocationTargetException
                	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
                	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
                	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
                	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
                	at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
                	... 63 more
                Caused by: java.lang.IllegalArgumentException: Jetty ALPN/NPN has not been properly configured.
                	at com.google.bigtable.repackaged.io.grpc.netty.GrpcSslContexts.selectApplicationProtocolConfig(GrpcSslContexts.java:174)
                	at com.google.bigtable.repackaged.io.grpc.netty.GrpcSslContexts.configure(GrpcSslContexts.java:151)
                	at com.google.bigtable.repackaged.io.grpc.netty.GrpcSslContexts.configure(GrpcSslContexts.java:139)
                	at com.google.bigtable.repackaged.io.grpc.netty.GrpcSslContexts.forClient(GrpcSslContexts.java:109)
                	at com.google.bigtable.repackaged.com.google.cloud.bigtable.grpc.BigtableSession.createSslContext(BigtableSession.java:126)
                	at com.google.bigtable.repackaged.com.google.cloud.bigtable.grpc.BigtableSession.createNettyChannel(BigtableSession.java:475)
                	at com.google.bigtable.repackaged.com.google.cloud.bigtable.grpc.BigtableSession$4.create(BigtableSession.java:401)
                	at com.google.bigtable.repackaged.com.google.cloud.bigtable.grpc.io.ChannelPool.<init>(ChannelPool.java:246)
                	at com.google.bigtable.repackaged.com.google.cloud.bigtable.grpc.BigtableSession.createChannelPool(BigtableSession.java:404)
                	at com.google.bigtable.repackaged.com.google.cloud.bigtable.grpc.BigtableSession.createManagedPool(BigtableSession.java:416)
                	at com.google.bigtable.repackaged.com.google.cloud.bigtable.grpc.BigtableSession.getDataChannelPool(BigtableSession.java:274)
                	at com.google.bigtable.repackaged.com.google.cloud.bigtable.grpc.BigtableSession.<init>(BigtableSession.java:247)
                	at org.apache.hadoop.hbase.client.AbstractBigtableConnection.<init>(AbstractBigtableConnection.java:143)
                	at com.google.cloud.bigtable.hbase1_0.BigtableConnection.<init>(BigtableConnection.java:56)
                


                Following are my JanusGraph configuration files:


                gremlin-server.yaml

                host: localhost
                port: 8182
                scriptEvaluationTimeout: 30000
                channelizer: org.apache.tinkerpop.gremlin.server.channel.WebSocketChannelizer
                graphs: {
                  graph: conf/testconf.properties}
                plugins:
                  - janusgraph.imports
                scriptEngines: {
                  gremlin-groovy: {
                    imports: [java.lang.Math],
                    staticImports: [java.lang.Math.PI],
                    scripts: [scripts/empty-sample.groovy]}}
                serializers:
                  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
                  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoLiteMessageSerializerV1d0, config: {ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
                  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}
                  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
                  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV2d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
                  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
                processors:
                  - { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }}
                  - { className: org.apache.tinkerpop.gremlin.server.op.traversal.TraversalOpProcessor, config: { cacheExpirationTime: 600000, cacheMaxSize: 1000 }}
                metrics: {
                  consoleReporter: {enabled: true, interval: 180000},
                  csvReporter: {enabled: true, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv},
                  jmxReporter: {enabled: true},
                  slf4jReporter: {enabled: true, interval: 180000},
                  gangliaReporter: {enabled: false, interval: 180000, addressingMode: MULTICAST},
                  graphiteReporter: {enabled: false, interval: 180000}}
                maxInitialLineLength: 4096
                maxHeaderSize: 8192
                maxChunkSize: 8192
                maxContentLength: 65536
                maxAccumulationBufferComponents: 1024
                resultIterationBatchSize: 64
                writeBufferLowWaterMark: 32768
                writeBufferHighWaterMark: 65536
                ssl: {
                  enabled: false}
                


                testconf.properties

                storage.backend=hbase
                
                ## Google cloud BIGTABLE configuration options
                storage.hbase.ext.hbase.client.connection.impl=com.google.cloud.bigtable.hbase1_0.BigtableConnection
                storage.hbase.ext.google.bigtable.project.id=my-project-id
                storage.hbase.ext.google.bigtable.instance.id=myinstanceid
                
                #!storage.hostname=127.0.0.1
                cache.db-cache = true
                cache.db-cache-clean-wait = 20
                cache.db-cache-time = 180000
                cache.db-cache-size = 0.5
                

                I have added the google-cloud-bigtable and netty-tcnative-boringssl-static jar files to the lib folder in the JanusGraph directory.

                Here is how I am trying to instantiate the graph:

                ./bin/gremlin.sh
                graph = JanusGraphFactory.open('conf/testconf.properties')
                


                And this gives me the above error. Is there something that I am missing?


                NOTE

                Also, when I run the Gremlin Server, I get the following warning message:


                807  [main] WARN  org.apache.tinkerpop.gremlin.server.GremlinServer  - Graph [graph] configured at [conf/testconf.properties] could not be instantiated and will not be available in Gremlin Server.  GraphFactory message: Configuration must contain a valid 'gremlin.graph' setting
                java.lang.RuntimeException: Configuration must contain a valid 'gremlin.graph' setting
                	at org.apache.tinkerpop.gremlin.structure.util.GraphFactory.open(GraphFactory.java:57)
                	at org.apache.tinkerpop.gremlin.structure.util.GraphFactory.open(GraphFactory.java:104)
                	at org.apache.tinkerpop.gremlin.server.GraphManager.lambda$new$8(GraphManager.java:55)
                	at java.util.LinkedHashMap$LinkedEntrySet.forEach(LinkedHashMap.java:671)
                	at org.apache.tinkerpop.gremlin.server.GraphManager.<init>(GraphManager.java:53)
                	at org.apache.tinkerpop.gremlin.server.util.ServerGremlinExecutor.<init>(ServerGremlinExecutor.java:83)
                	at org.apache.tinkerpop.gremlin.server.GremlinServer.<init>(GremlinServer.java:110)
                	at org.apache.tinkerpop.gremlin.server.GremlinServer.main(GremlinServer.java:344)
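                For what it's worth, GraphFactory only instantiates graphs whose properties file names a factory class, so that particular warning should go away after adding the following line to testconf.properties:

                # Tell TinkerPop's GraphFactory which factory opens this graph
                gremlin.graph=org.janusgraph.core.JanusGraphFactory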



                Re: Exception in thread "main" java.lang.OutOfMemoryError: Java heap space while loading bulk data

                Robert Dale <rob...@...>
                 

                Looks like your file doesn't fit in memory.

                Robert Dale
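                To make that concrete: the trace points at getText() inside initialize(), which reads the whole JSON file into one String before parsing. A minimal sketch of a lighter variant of that method, assuming the parsed object tree itself still fits in heap, is to let JsonSlurper read from the file directly:

                import groovy.json.JsonSlurper

                // Hypothetical drop-in replacement for the initialize() method in the
                // posted JanusGraphBuilder script.
                public void initialize(String jsonPath) {
                    // Parse straight from the File instead of getText('UTF-8') + parseText();
                    // this avoids first materializing the file contents as one large String.
                    this.dummyData = new JsonSlurper().parse(new File(jsonPath))
                    this.resetData()
                }

                If the data set itself is large, committing vertices and edges in fixed-size batches rather than in one commit at the end also keeps the transaction's in-memory state bounded.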

                On Tue, Aug 8, 2017 at 9:23 AM, Amyth Arora <aroras....@...> wrote:
                Hi Everyone,

                I am trying to upload some dummy data for testing purposes to JanusGraph (Google Cloud Bigtable backend). I have the following Groovy script, which I execute from the Gremlin Console; it creates the schema, indexes, vertices, and edges.

                import groovy.json.JsonSlurper;
                import java.util.ArrayList;
                import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
                import org.janusgraph.core.JanusGraphFactory;
                import org.janusgraph.core.PropertyKey;
                import org.janusgraph.core.Multiplicity;
                import org.janusgraph.core.schema.SchemaAction;
                import org.janusgraph.core.schema.SchemaStatus;
                import org.janusgraph.core.util.JanusGraphCleanup;
                import org.janusgraph.graphdb.database.StandardJanusGraph;
                import org.janusgraph.graphdb.database.management.ManagementSystem;

                /**
                 * Given a json file, populates data into the given JanusGraph DB
                 */
                class JanusGraphBuilder {

                    String graphPath;
                    StandardJanusGraph graph;
                    ManagementSystem management;
                    GraphTraversalSource traversal;

                    def dummyData;


                    public void main(String jsonPath, String janusGraphPath) {
                        this.graphPath  = janusGraphPath
                        this.initGraph()
                        this.initialize(jsonPath)
                        this.populate()
                    }

                    public void createEdges(def edges) {
                        println "Preparing edges."
                        edges.each {
                            def relation = it.edge
                            def properties = it.properties
                            def vertexFrom = this.traversal.V().has("uid", it.nodes[0])[0]
                            def vertexTo = this.traversal.V().has("uid", it.nodes[1])[0]
                            def newEdge = vertexFrom.addEdge(relation, vertexTo)
                            properties.each {
                                if (it.key == 'score') {
                                    it.value = Float.parseFloat(it.value.toString())
                                }
                                newEdge.property(it.key, it.value)
                            }
                        }
                        this.graph.tx().commit()
                        println "Created edges successfully"
                    }

                    public void createVertexes(def vertexes) {
                        println "Preparing vertices."
                        vertexes.each {
                            def uniqueLabel = it.labels[0]
                            def properties = it.properties
                            def newVertex = this.graph.addVertex(label, uniqueLabel)
                            properties.each {
                                newVertex.property(it.key, it.value)
                            }
                        }
                        this.graph.tx().commit()
                        println "Created vertices successfully"
                    }

                    public void createSchema() {
                        println "Preparing schema."
                        // Do not create indexes while another transaction is in progress
                        this.graph.tx().rollback()
                        this.management = this.graph.openManagement()
                        this.management.set('ids.block-size', 20000000)

                        // Make property keys
                        def uid = this.management.makePropertyKey("uid").dataType(String.class).make()
                        def name = this.management.makePropertyKey("name").dataType(String.class).make()
                        def number = this.management.makePropertyKey("number").dataType(String.class).make()
                        def email = this.management.makePropertyKey("email").dataType(String.class).make()
                        def score = this.management.makePropertyKey("score").dataType(Float.class).make()
                        def linkedinId = this.management.makePropertyKey("linkedin_id").dataType(String.class).make()
                        def linkedinUrl = this.management.makePropertyKey("profile_url").dataType(String.class).make()
                        def imageUrl = this.management.makePropertyKey("image_url").dataType(String.class).make()
                        def instituteName = this.management.makePropertyKey("institute_name").dataType(String.class).make()
                        def companyName = this.management.makePropertyKey("company_name").dataType(String.class).make()
                        def jobId = this.management.makePropertyKey("job_id").dataType(String.class).make()

                        // Define Vertex Labels
                        this.management.makeVertexLabel("person").make();
                        this.management.makeVertexLabel("candidate").make();
                        this.management.makeVertexLabel("recruiter").make();
                        this.management.makeVertexLabel("employee").make();
                        this.management.makeVertexLabel("linkedin").make();
                        this.management.makeVertexLabel("job").make();
                        this.management.makeVertexLabel("company").make();
                        this.management.makeVertexLabel("institute").make();
                        def phoneV = this.management.makeVertexLabel("phone").make();
                        def emailV = this.management.makeVertexLabel("email").make();

                        // Define Edge Labels
                        this.management.makeEdgeLabel("knows").make();
                        this.management.makeEdgeLabel("has").make();
                        this.management.makeEdgeLabel("provided_by").make();
                        this.management.makeEdgeLabel("studied_at").make();
                        this.management.makeEdgeLabel("worked_at").make();
                        this.management.makeEdgeLabel("posted").make();
                        this.management.makeEdgeLabel("liked").make();
                        this.management.makeEdgeLabel("worked_with").make();
                        this.management.makeEdgeLabel("studied_with").make();
                        this.management.makeEdgeLabel("is_a_match_for").make();

                        // Create indexes
                        this.management.buildIndex('uniqueUid', Vertex.class).addKey(uid).unique().buildCompositeIndex()
                        this.management.buildIndex('uniqueEmail', Vertex.class).addKey(email).indexOnly(emailV).unique().buildCompositeIndex()
                        this.management.buildIndex('uniqueNumber', Vertex.class).addKey(number).indexOnly(phoneV).unique().buildCompositeIndex()
                        this.management.commit()
                        this.management.awaitGraphIndexStatus(this.graph, 'uniqueUid').call()
                        this.management = this.graph.openManagement()
                        this.management.updateIndex(this.management.getGraphIndex('uniqueUid'), SchemaAction.REINDEX).get()

                        this.management.commit()

                        println "Created schema successfully"
                    }

                    public void populate() {
                        // Create db schema
                        this.createSchema()

                        // Create vertexes from the given dummy data
                        def vertexTransaction = this.graph.newTransaction()
                        def vertexes = this.dummyData.vertexes;
                        this.createVertexes(vertexes)
                        vertexTransaction.commit()
                        this.initGraph()

                        def edgeTransaction = this.graph.newTransaction()
                        // Create edges from the given dummy data
                        def edges = this.dummyData.edges;
                        this.createEdges(edges)
                        edgeTransaction.commit()
                        this.initGraph()

                        println "Graph population successfully accomplished. Please hit Ctrl+C to exit."
                    }

                    public void initialize(String jsonPath) {
                        String fileContents = new File(jsonPath).getText('UTF-8')
                        def slurper = new JsonSlurper()
                        def results = slurper.parseText(fileContents)
                        this.dummyData = results;
                        this.resetData()
                    }

                    public void resetData() {
                        // Remove all the data from the storage backend
                        this.graph.close()
                        JanusGraphCleanup.clear(this.graph)
                        this.initGraph()
                    }

                    public void initGraph() {
                        this.graph = JanusGraphFactory.open(this.graphPath)
                        this.traversal = this.graph.traversal()
                    }
                }


                JanusGraphBuilder graphBuilder = new JanusGraphBuilder()
                graphBuilder.main("/tmp/dummy.json", "conf/testconf.properties")

                I have already increased the heap size in the `JAVA_OPTIONS` environment variable to `-Xmx2048m`. Also, here is what my configuration looks like.

                storage.backend=hbase

                ## Google cloud BIGTABLE configuration options
                storage.hbase.ext.hbase.client.connection.impl=com.google.cloud.bigtable.hbase1_0.BigtableConnection

                storage.hostname=localhost
                cache.db-cache = true
                cache.db-cache-clean-wait = 20
                cache.db-cache-time = 180000
                cache.db-cache-size = 0.5

                # Bulk Loading
                storage.batch-loading = true
                ids.block-size=20000000

                I execute the script using the following command:

                ./bin/gremlin.sh -e ~/projects/xonnect/utils/population_scripts/populateJanus.groovy

                The problem is that it throws an out-of-memory error for Java heap space even before the schema is created. Following is the output.

                18:44:29,112  INFO BigtableSession:75 - Bigtable options: BigtableOptions{dataHost=bigtable.googleapis.com, tableAdminHost=bigtableadmin.googleapis.com, instanceAdminHost=bigtableadmin.googleapis.com, projectId=formal-theater-175812, instanceId=xonnect, userAgent=hbase-1.2.4, credentialType=DefaultCredentials, port=443, dataChannelCount=10, retryOptions=RetryOptions{retriesEnabled=true, allowRetriesWithoutTimestamp=false, statusToRetryOn=[UNAUTHENTICATED, INTERNAL, ABORTED, UNAVAILABLE, DEADLINE_EXCEEDED], initialBackoffMillis=5, maxElapsedBackoffMillis=60000, backoffMultiplier=2.0, streamingBufferSize=60, readPartialRowTimeoutMillis=60000, maxScanTimeoutRetries=3}, bulkOptions=BulkOptions{asyncMutatorCount=2, useBulkApi=true, bulkMaxKeyCount=25, bulkMaxRequestSize=1048576, autoflushMs=0, maxInflightRpcs=500, maxMemory=190893260, enableBulkMutationThrottling=false, bulkMutationRpcTargetMs=100}, callOptionsConfig=CallOptionsConfig{useTimeout=false, shortRpcTimeoutMs=60000, longRpcTimeoutMs=600000}, usePlaintextNegotiation=false}.
                18:44:30,851  INFO Backend:183 - Initiated backend operations thread pool of size 8
                18:44:35,048  INFO IndexSerializer:85 - Hashing index keys
                18:44:36,910  INFO KCVSLog:744 - Loaded unidentified ReadMarker start time 2017-08-08T13:14:36.898Z into org.janusgraph.diskstorage.log.kcvs.KCVSLog$MessagePuller@35835fa
                Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
                at java.util.Arrays.copyOf(Arrays.java:3332)
                at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
                at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:596)
                at java.lang.StringBuilder.append(StringBuilder.java:190)
                at org.codehaus.groovy.runtime.IOGroovyMethods.getText(IOGroovyMethods.java:886)
                at org.codehaus.groovy.runtime.ResourceGroovyMethods.getText(ResourceGroovyMethods.java:588)
                at org.codehaus.groovy.runtime.dgm$964.invoke(Unknown Source)
                at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite$PojoMetaMethodSiteNoUnwrapNoCoerce.invoke(PojoMetaMethodSite.java:274)
                at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite.call(PojoMetaMethodSite.java:56)
                at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
                at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
                at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
                at JanusGraphBuilder.initialize(Script1.groovy:146)
                at JanusGraphBuilder.main(Script1.groovy:29)
                at JanusGraphBuilder$main.call(Unknown Source)
                at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
                at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
                at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:133)
                at Script1.run(Script1.groovy:168)
                at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:619)
                at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:448)
                at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:421)
                at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:212)
                at org.apache.tinkerpop.gremlin.groovy.jsr223.ScriptExecutor.evaluate(ScriptExecutor.java:55)
                at org.apache.tinkerpop.gremlin.groovy.jsr223.ScriptExecutor.main(ScriptExecutor.java:44)

                Any help will be much appreciated. Thanks.




                Exception in thread "main" java.lang.OutOfMemoryError: Java heap space while loading bulk data

                Amyth Arora <aroras....@...>
                 

                Hi Everyone,

                I am trying to upload some dummy data for testing purposes to JanusGraph (Google Cloud Bigtable backend). I have the following Groovy script, which I execute from the Gremlin Console; it creates the schema, indexes, vertices, and edges.

                import groovy.json.JsonSlurper;
                import java.util.ArrayList;
                import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
                import org.janusgraph.core.JanusGraphFactory;
                import org.janusgraph.core.PropertyKey;
                import org.janusgraph.core.Multiplicity;
                import org.janusgraph.core.schema.SchemaAction;
                import org.janusgraph.core.schema.SchemaStatus;
                import org.janusgraph.core.util.JanusGraphCleanup;
                import org.janusgraph.graphdb.database.StandardJanusGraph;
                import org.janusgraph.graphdb.database.management.ManagementSystem;

                /**
                 * Given a json file, populates data into the given JanusGraph DB
                 */
                class JanusGraphBuilder {

                    String graphPath;
                    StandardJanusGraph graph;
                    ManagementSystem management;
                    GraphTraversalSource traversal;

                    def dummyData;


                    public void main(String jsonPath, String janusGraphPath) {
                        this.graphPath  = janusGraphPath
                        this.initGraph()
                        this.initialize(jsonPath)
                        this.populate()
                    }

                    public void createEdges(def edges) {
                        println "Preparing edges."
                        edges.each {
                            def relation = it.edge
                            def properties = it.properties
                            def vertexFrom = this.traversal.V().has("uid", it.nodes[0])[0]
                            def vertexTo = this.traversal.V().has("uid", it.nodes[1])[0]
                            def newEdge = vertexFrom.addEdge(relation, vertexTo)
                            properties.each {
                                if (it.key == 'score') {
                                    it.value = Float.parseFloat(it.value.toString())
                                }
                                newEdge.property(it.key, it.value)
                            }
                        }
                        this.graph.tx().commit()
                        println "Created edges successfully"
                    }

                    public void createVertexes(def vertexes) {
                        println "Preparing vertices."
                        vertexes.each {
                            def uniqueLabel = it.labels[0]
                            def properties = it.properties
                            def newVertex = this.graph.addVertex(label, uniqueLabel)
                            properties.each {
                                newVertex.property(it.key, it.value)
                            }
                        }
                        this.graph.tx().commit()
                        println "Created vertices successfully"
                    }

                    public void createSchema() {
                        println "Preparing schema."
                        // Do not create indexes while another transaction is in progress
                        this.graph.tx().rollback()
                        this.management = this.graph.openManagement()
                        this.management.set('ids.block-size', 20000000)

                        // Make property keys
                        def uid = this.management.makePropertyKey("uid").dataType(String.class).make()
                        def name = this.management.makePropertyKey("name").dataType(String.class).make()
                        def number = this.management.makePropertyKey("number").dataType(String.class).make()
                        def email = this.management.makePropertyKey("email").dataType(String.class).make()
                        def score = this.management.makePropertyKey("score").dataType(Float.class).make()
                        def linkedinId = this.management.makePropertyKey("linkedin_id").dataType(String.class).make()
                        def linkedinUrl = this.management.makePropertyKey("profile_url").dataType(String.class).make()
                        def imageUrl = this.management.makePropertyKey("image_url").dataType(String.class).make()
                        def instituteName = this.management.makePropertyKey("institute_name").dataType(String.class).make()
                        def companyName = this.management.makePropertyKey("company_name").dataType(String.class).make()
                        def jobId = this.management.makePropertyKey("job_id").dataType(String.class).make()

                        // Define Vertex Labels
                        this.management.makeVertexLabel("person").make();
                        this.management.makeVertexLabel("candidate").make();
                        this.management.makeVertexLabel("recruiter").make();
                        this.management.makeVertexLabel("employee").make();
                        this.management.makeVertexLabel("linkedin").make();
                        this.management.makeVertexLabel("job").make();
                        this.management.makeVertexLabel("company").make();
                        this.management.makeVertexLabel("institute").make();
                        def phoneV = this.management.makeVertexLabel("phone").make();
                        def emailV = this.management.makeVertexLabel("email").make();

                        // Define Edge Labels
                        this.management.makeEdgeLabel("knows").make();
                        this.management.makeEdgeLabel("has").make();
                        this.management.makeEdgeLabel("provided_by").make();
                        this.management.makeEdgeLabel("studied_at").make();
                        this.management.makeEdgeLabel("worked_at").make();
                        this.management.makeEdgeLabel("posted").make();
                        this.management.makeEdgeLabel("liked").make();
                        this.management.makeEdgeLabel("worked_with").make();
                        this.management.makeEdgeLabel("studied_with").make();
                        this.management.makeEdgeLabel("is_a_match_for").make();

                        // Create indexes
                        this.management.buildIndex('uniqueUid', Vertex.class).addKey(uid).unique().buildCompositeIndex()
                        this.management.buildIndex('uniqueEmail', Vertex.class).addKey(email).indexOnly(emailV).unique().buildCompositeIndex()
                        this.management.buildIndex('uniqueNumber', Vertex.class).addKey(number).indexOnly(phoneV).unique().buildCompositeIndex()
                        this.management.commit()
                        this.management.awaitGraphIndexStatus(this.graph, 'uniqueUid').call()
                        this.management = this.graph.openManagement()
                        this.management.updateIndex(this.management.getGraphIndex('uniqueUid'), SchemaAction.REINDEX).get()

                        this.management.commit()

                        println "Created schema successfully"
                    }

                    public void populate() {
                        // Create db schema
                        this.createSchema()

                        // Create vertexes from the given dummy data
                        def vertexTransaction = this.graph.newTransaction()
                        def vertexes = this.dummyData.vertexes;
                        this.createVertexes(vertexes)
                        vertexTransaction.commit()
                        this.initGraph()

                        def edgeTransaction = this.graph.newTransaction()
                        // Create edges from the given dummy data
                        def edges = this.dummyData.edges;
                        this.createEdges(edges)
                        edgeTransaction.commit()
                        this.initGraph()

                        println "Graph population successfully accomplished. Please hit Ctrl+C to exit."
                    }

                    public void initialize(String jsonPath) {
                        String fileContents = new File(jsonPath).getText('UTF-8')
                        def slurper = new JsonSlurper()
                        def results = slurper.parseText(fileContents)
                        this.dummyData = results;
                        this.resetData()
                    }

                    public void resetData() {
                        // Remove all the data from the storage backend
                        this.graph.close()
                        JanusGraphCleanup.clear(this.graph)
                        this.initGraph()
                    }

                    public void initGraph() {
                        this.graph = JanusGraphFactory.open(this.graphPath)
                        this.traversal = this.graph.traversal()
                    }
                }


                JanusGraphBuilder graphBuilder = new JanusGraphBuilder()
                graphBuilder.main("/tmp/dummy.json", "conf/testconf.properties")

                I have already increased the heap size in the `JAVA_OPTIONS` environment variable to `-Xmx2048m`. Also, here is what my configuration looks like.

                storage.backend=hbase

                ## Google cloud BIGTABLE configuration options
                storage.hbase.ext.hbase.client.connection.impl=com.google.cloud.bigtable.hbase1_0.BigtableConnection
                storage.hbase.ext.google.bigtable.project.id=my-project-id
                storage.hbase.ext.google.bigtable.instance.id=my-instance-id

                storage.hostname=localhost
                cache.db-cache = true
                cache.db-cache-clean-wait = 20
                cache.db-cache-time = 180000
                cache.db-cache-size = 0.5

                # Bulk Loading
                storage.batch-loading = true
                ids.block-size=20000000

                I execute the script using the following command:

                ./bin/gremlin.sh -e ~/projects/xonnect/utils/population_scripts/populateJanus.groovy

                The problem is that it throws an out-of-memory error for Java heap space even before the schema is created. Following is the output.

                18:44:29,112  INFO BigtableSession:75 - Bigtable options: BigtableOptions{dataHost=bigtable.googleapis.com, tableAdminHost=bigtableadmin.googleapis.com, instanceAdminHost=bigtableadmin.googleapis.com, projectId=formal-theater-175812, instanceId=xonnect, userAgent=hbase-1.2.4, credentialType=DefaultCredentials, port=443, dataChannelCount=10, retryOptions=RetryOptions{retriesEnabled=true, allowRetriesWithoutTimestamp=false, statusToRetryOn=[UNAUTHENTICATED, INTERNAL, ABORTED, UNAVAILABLE, DEADLINE_EXCEEDED], initialBackoffMillis=5, maxElapsedBackoffMillis=60000, backoffMultiplier=2.0, streamingBufferSize=60, readPartialRowTimeoutMillis=60000, maxScanTimeoutRetries=3}, bulkOptions=BulkOptions{asyncMutatorCount=2, useBulkApi=true, bulkMaxKeyCount=25, bulkMaxRequestSize=1048576, autoflushMs=0, maxInflightRpcs=500, maxMemory=190893260, enableBulkMutationThrottling=false, bulkMutationRpcTargetMs=100}, callOptionsConfig=CallOptionsConfig{useTimeout=false, shortRpcTimeoutMs=60000, longRpcTimeoutMs=600000}, usePlaintextNegotiation=false}.
                18:44:30,851  INFO Backend:183 - Initiated backend operations thread pool of size 8
                18:44:35,048  INFO IndexSerializer:85 - Hashing index keys
                18:44:36,910  INFO KCVSLog:744 - Loaded unidentified ReadMarker start time 2017-08-08T13:14:36.898Z into org.janusgraph.diskstorage.log.kcvs.KCVSLog$MessagePuller@35835fa
                Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
                at java.util.Arrays.copyOf(Arrays.java:3332)
                at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
                at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:596)
                at java.lang.StringBuilder.append(StringBuilder.java:190)
                at org.codehaus.groovy.runtime.IOGroovyMethods.getText(IOGroovyMethods.java:886)
                at org.codehaus.groovy.runtime.ResourceGroovyMethods.getText(ResourceGroovyMethods.java:588)
                at org.codehaus.groovy.runtime.dgm$964.invoke(Unknown Source)
                at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite$PojoMetaMethodSiteNoUnwrapNoCoerce.invoke(PojoMetaMethodSite.java:274)
                at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite.call(PojoMetaMethodSite.java:56)
                at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
                at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
                at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
                at JanusGraphBuilder.initialize(Script1.groovy:146)
                at JanusGraphBuilder.main(Script1.groovy:29)
                at JanusGraphBuilder$main.call(Unknown Source)
                at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
                at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
                at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:133)
                at Script1.run(Script1.groovy:168)
                at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:619)
                at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:448)
                at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:421)
                at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:212)
                at org.apache.tinkerpop.gremlin.groovy.jsr223.ScriptExecutor.evaluate(ScriptExecutor.java:55)
                at org.apache.tinkerpop.gremlin.groovy.jsr223.ScriptExecutor.main(ScriptExecutor.java:44)

                Any help will be much appreciated. Thanks.


                Amyth (twitter.com/mytharora)


                Cache expiration time

                Ohad Pinchevsky <ohad.pi...@...>
                 

                Hi,

                I am trying to increase/disable the cache expiration time using the cache.db-cache-time property.
                I changed the value to 0 and restarted the Gremlin Server, but it seems it is not working (based on execution time: slow the first time, fast the second; after waiting, slow again the third time).

                What am I missing?

                Thanks,
                Ohad
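                For context on why a properties-file edit plus restart may not take effect: cache.db-cache-time is a GLOBAL_OFFLINE option, so its value is stored in the cluster-wide configuration rather than read from the local file on each start. A hedged sketch of changing it through the management system, assuming all other JanusGraph instances are shut down first (the properties path below is a placeholder):

                // Open the graph from the last remaining instance, then update the
                // cluster-wide setting via the management API.
                graph = JanusGraphFactory.open('conf/my-graph.properties')
                mgmt = graph.openManagement()
                mgmt.set('cache.db-cache-time', 0)   // 0 means cache entries never expire
                mgmt.commit()
                graph.close()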


                Jetty ALPN/NPN has not been properly configured.

                Amyth Arora <aroras....@...>
                 

                Hey, we have a mid-sized (10 million vertices, 5 billion edges) graph database (currently running on Neo4j) at our organization, and we are in the process of moving to JanusGraph with Google Cloud Bigtable as the storage backend.

                I am facing the following issue while connecting to Bigtable from the Gremlin shell. I receive the following error when I try to instantiate the graph.


                Jetty ALPN/NPN has not been properly configured.
                


                Here is the stack trace.


                java.lang.IllegalArgumentException: Could not instantiate implementation: org.janusgraph.diskstorage.hbase.HBaseStoreManager
                	at org.janusgraph.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:69)
                	at org.janusgraph.diskstorage.Backend.getImplementationClass(Backend.java:480)
                	at org.janusgraph.diskstorage.Backend.getStorageManager(Backend.java:414)
                	at org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.<init>(GraphDatabaseConfiguration.java:1343)
                	at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:107)
                	at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:75)
                	at org.janusgraph.core.JanusGraphFactory$open.call(Unknown Source)
                	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
                	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
                	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
                	at groovysh_evaluate.run(groovysh_evaluate:3)
                	at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
                	at org.codehaus.groovy.tools.shell.Interpreter.evaluate(Interpreter.groovy:70)
                	at org.codehaus.groovy.tools.shell.Groovysh.execute(Groovysh.groovy:190)
                	at org.apache.tinkerpop.gremlin.console.GremlinGroovysh.super$3$execute(GremlinGroovysh.groovy)
                	at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
                	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
                	at java.lang.reflect.Method.invoke(Method.java:498)
                	at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
                	at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
                	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1215)
                	at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:132)
                	at org.apache.tinkerpop.gremlin.console.GremlinGroovysh.execute(GremlinGroovysh.groovy:72)
                	at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
                	at org.codehaus.groovy.tools.shell.Shell.leftShift(Shell.groovy:122)
                	at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
                	at org.codehaus.groovy.tools.shell.ShellRunner.work(ShellRunner.groovy:95)
                	at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$work(InteractiveShellRunner.groovy)
                	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
                	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
                	at java.lang.reflect.Method.invoke(Method.java:498)
                	at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
                	at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
                	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1215)
                	at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:132)
                	at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:152)
                	at org.codehaus.groovy.tools.shell.InteractiveShellRunner.work(InteractiveShellRunner.groovy:124)
                	at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
                	at org.codehaus.groovy.tools.shell.ShellRunner.run(ShellRunner.groovy:59)
                	at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$run(InteractiveShellRunner.groovy)
                	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
                	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
                	at java.lang.reflect.Method.invoke(Method.java:498)
                	at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
                	at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
                	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1215)
                	at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:132)
                	at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:152)
                	at org.codehaus.groovy.tools.shell.InteractiveShellRunner.run(InteractiveShellRunner.groovy:83)
                	at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
                	at org.apache.tinkerpop.gremlin.console.Console.<init>(Console.groovy:152)
                	at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
                	at org.apache.tinkerpop.gremlin.console.Console.main(Console.groovy:455)
                Caused by: java.lang.reflect.InvocationTargetException
                	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
                	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
                	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
                	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
                	at org.janusgraph.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:58)
                	... 54 more
                Caused by: org.janusgraph.diskstorage.PermanentBackendException: Permanent failure in storage backend
                	at org.janusgraph.diskstorage.hbase.HBaseStoreManager.<init>(HBaseStoreManager.java:336)
                	... 59 more
                Caused by: java.io.IOException: java.lang.reflect.InvocationTargetException
                	at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:240)
                	at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:218)
                	at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:119)
                	at org.janusgraph.diskstorage.hbase.HBaseCompat1_0.createConnection(HBaseCompat1_0.java:43)
                	at org.janusgraph.diskstorage.hbase.HBaseStoreManager.<init>(HBaseStoreManager.java:334)
                	... 59 more
                Caused by: java.lang.reflect.InvocationTargetException
                	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
                	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
                	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
                	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
                	at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
                	... 63 more
                Caused by: java.lang.IllegalArgumentException: Jetty ALPN/NPN has not been properly configured.
                	at com.google.bigtable.repackaged.io.grpc.netty.GrpcSslContexts.selectApplicationProtocolConfig(GrpcSslContexts.java:174)
                	at com.google.bigtable.repackaged.io.grpc.netty.GrpcSslContexts.configure(GrpcSslContexts.java:151)
                	at com.google.bigtable.repackaged.io.grpc.netty.GrpcSslContexts.configure(GrpcSslContexts.java:139)
                	at com.google.bigtable.repackaged.io.grpc.netty.GrpcSslContexts.forClient(GrpcSslContexts.java:109)
                	at com.google.bigtable.repackaged.com.google.cloud.bigtable.grpc.BigtableSession.createSslContext(BigtableSession.java:126)
                	at com.google.bigtable.repackaged.com.google.cloud.bigtable.grpc.BigtableSession.createNettyChannel(BigtableSession.java:475)
                	at com.google.bigtable.repackaged.com.google.cloud.bigtable.grpc.BigtableSession$4.create(BigtableSession.java:401)
                	at com.google.bigtable.repackaged.com.google.cloud.bigtable.grpc.io.ChannelPool.<init>(ChannelPool.java:246)
                	at com.google.bigtable.repackaged.com.google.cloud.bigtable.grpc.BigtableSession.createChannelPool(BigtableSession.java:404)
                	at com.google.bigtable.repackaged.com.google.cloud.bigtable.grpc.BigtableSession.createManagedPool(BigtableSession.java:416)
                	at com.google.bigtable.repackaged.com.google.cloud.bigtable.grpc.BigtableSession.getDataChannelPool(BigtableSession.java:274)
                	at com.google.bigtable.repackaged.com.google.cloud.bigtable.grpc.BigtableSession.<init>(BigtableSession.java:247)
                	at org.apache.hadoop.hbase.client.AbstractBigtableConnection.<init>(AbstractBigtableConnection.java:143)
                	at com.google.cloud.bigtable.hbase1_0.BigtableConnection.<init>(BigtableConnection.java:56)
                


                Following are my JanusGraph configuration files:


                gremlin-server.yaml

                host: localhost
                port: 8182
                scriptEvaluationTimeout: 30000
                channelizer: org.apache.tinkerpop.gremlin.server.channel.WebSocketChannelizer
                graphs: {
                  graph: conf/testconf.properties}
                plugins:
                  - janusgraph.imports
                scriptEngines: {
                  gremlin-groovy: {
                    imports: [java.lang.Math],
                    staticImports: [java.lang.Math.PI],
                    scripts: [scripts/empty-sample.groovy]}}
                serializers:
                  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
                  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoLiteMessageSerializerV1d0, config: {ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
                  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}
                  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
                  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV2d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
                  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
                processors:
                  - { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }}
                  - { className: org.apache.tinkerpop.gremlin.server.op.traversal.TraversalOpProcessor, config: { cacheExpirationTime: 600000, cacheMaxSize: 1000 }}
                metrics: {
                  consoleReporter: {enabled: true, interval: 180000},
                  csvReporter: {enabled: true, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv},
                  jmxReporter: {enabled: true},
                  slf4jReporter: {enabled: true, interval: 180000},
                  gangliaReporter: {enabled: false, interval: 180000, addressingMode: MULTICAST},
                  graphiteReporter: {enabled: false, interval: 180000}}
                maxInitialLineLength: 4096
                maxHeaderSize: 8192
                maxChunkSize: 8192
                maxContentLength: 65536
                maxAccumulationBufferComponents: 1024
                resultIterationBatchSize: 64
                writeBufferLowWaterMark: 32768
                writeBufferHighWaterMark: 65536
                ssl: {
                  enabled: false}
                


                testconf.properties

                storage.backend=hbase
                
                ## Google cloud BIGTABLE configuration options
                storage.hbase.ext.hbase.client.connection.impl=com.google.cloud.bigtable.hbase1_0.BigtableConnection
                storage.hbase.ext.google.bigtable.project.id=my-project-id
                storage.hbase.ext.google.bigtable.instance.id=myinstanceid
                
                #!storage.hostname=127.0.0.1
                cache.db-cache = true
                cache.db-cache-clean-wait = 20
                cache.db-cache-time = 180000
                cache.db-cache-size = 0.5
                

                I have added the google-cloud-bigtable and
                netty-tcnative-boringssl-static jar files to the lib folder in the JanusGraph directory.

                Here is how I am trying to instantiate the graph:

                ./bin/gremlin.sh
                graph = JanusGraphFactory.open('conf/testconf.properties')
                


                And this gives me the above error. Is there something that I am missing?


                NOTE

                Also, when I run the Gremlin Server I get the following warning message:


                807  [main] WARN  org.apache.tinkerpop.gremlin.server.GremlinServer  - Graph [graph] configured at [conf/testconf.properties] could not be instantiated and will not be available in Gremlin Server.  GraphFactory message: Configuration must contain a valid 'gremlin.graph' setting
                java.lang.RuntimeException: Configuration must contain a valid 'gremlin.graph' setting
                	at org.apache.tinkerpop.gremlin.structure.util.GraphFactory.open(GraphFactory.java:57)
                	at org.apache.tinkerpop.gremlin.structure.util.GraphFactory.open(GraphFactory.java:104)
                	at org.apache.tinkerpop.gremlin.server.GraphManager.lambda$new$8(GraphManager.java:55)
                	at java.util.LinkedHashMap$LinkedEntrySet.forEach(LinkedHashMap.java:671)
                	at org.apache.tinkerpop.gremlin.server.GraphManager.<init>(GraphManager.java:53)
                	at org.apache.tinkerpop.gremlin.server.util.ServerGremlinExecutor.<init>(ServerGremlinExecutor.java:83)
                	at org.apache.tinkerpop.gremlin.server.GremlinServer.<init>(GremlinServer.java:110)
                	at org.apache.tinkerpop.gremlin.server.GremlinServer.main(GremlinServer.java:344)
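
                From the GraphFactory message, I assume testconf.properties also needs a gremlin.graph entry so Gremlin Server knows which factory class to use, i.e. adding the standard JanusGraph line to the properties file:

                gremlin.graph=org.janusgraph.core.JanusGraphFactory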


                Re: [BLOG] Configuring JanusGraph for spark-yarn

                HadoopMarc <bi...@...>
                 

                Hi Joseph,

                You ran into terrain I have not yet covered myself. Until now I have been using the graben1437 PR for Titan, and for OLAP I adopted a poor man's approach where node ids are distributed over Spark tasks and each Spark executor makes its own Titan/HBase connection. This performs well, but does not have the nice abstraction of the HBaseInputFormat.
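
                To illustrate, a bare-bones sketch of that approach (the class name, partition count, and properties file below are made up, so treat it as a sketch rather than a drop-in implementation):

                import java.util.Iterator;
                import java.util.List;
                import org.apache.spark.api.java.JavaSparkContext;
                import org.janusgraph.core.JanusGraph;
                import org.janusgraph.core.JanusGraphFactory;

                public class PoorMansOlap {
                    // Distribute vertex ids over Spark partitions; each executor opens its
                    // own JanusGraph/HBase connection and processes its share of the ids.
                    public static void process(JavaSparkContext sc, List<Long> vertexIds) {
                        sc.parallelize(vertexIds, 100).foreachPartition((Iterator<Long> ids) -> {
                            // one connection per partition, not per vertex
                            JanusGraph graph = JanusGraphFactory.open("conf/janusgraph-hbase.properties");
                            try {
                                while (ids.hasNext()) {
                                    graph.traversal().V(ids.next()).valueMap().tryNext(); // real work goes here
                                }
                            } finally {
                                graph.close();
                            }
                        });
                    }
                }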

                So, no clear answer to this one, but just some thoughts:
                 - could you try to move some regions manually and see what it does to performance?
                 - how do your OLAP vertex count times compare to the OLTP count times? (a small console sketch for comparing the two follows this list)
                 - how does the sum of Spark task execution times compare to the YARN start-to-end time difference you reported? In other words, how much of the start-to-end time is spent waiting for timeouts?
                 - unless you managed to create a vertex with > 1GB size, the RowTooBigException sounds like a bug (which you can report on JanusGraph's GitHub page). HBase does not like large rows at all, so vertex/edge properties should not have blob values.
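
                For the OLAP vs. OLTP comparison, something like the following from the Gremlin Console (the properties file names are just examples; use your own graph and HadoopGraph configs):

                // OLTP count, straight against the graph
                graph = JanusGraphFactory.open('conf/janusgraph-hbase.properties')
                g = graph.traversal()
                g.V().count()

                // OLAP count via SparkGraphComputer
                graph = GraphFactory.open('conf/hadoop-graph/read-hbase.properties')
                g = graph.traversal().withComputer(SparkGraphComputer)
                g.V().count()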
                 
                @(David Robinson): do you have any additional thoughts on this?

                Cheers,    Marc



                Re: [BLOG] Configuring JanusGraph for spark-yarn

                Joe Obernberger <joseph.o...@...>
                 

                Hi Marc - I've been able to get it to run longer, but am now getting a RowTooBigException from HBase.  How does JanusGraph store data in HBase?  The current max size of a row is 1 GByte, which makes me think this error is covering something else up.

                What I'm seeing so far in testing with a 5-server cluster - each machine with 128G of RAM:
                One HBase table is 1.5G in size, split across 7 regions, and has 20,001,105 rows.  A g.V().count() takes 2 hours and returns 3,842,755 vertices.

                Another HBase table is 5.7G in size, split across 10 regions with 57,620,276 rows; the count took 6.5 hours and returned 10,859,491 vertices.  When running, it looks like it hits one server very hard even though the YARN tasks are distributed across the cluster.  One HBase node gets hammered.

                The RowTooBigException is below.  Anything to try?  Thank you for any help!


                org.janusgraph.core.JanusGraphException: Could not process individual retrieval call
                                at org.janusgraph.graphdb.query.QueryUtil.processIntersectingRetrievals(QueryUtil.java:257)
                                at org.janusgraph.graphdb.transaction.StandardJanusGraphTx$6.execute(StandardJanusGraphTx.java:1269)
                                at org.janusgraph.graphdb.transaction.StandardJanusGraphTx$6.execute(StandardJanusGraphTx.java:1137)
                                at org.janusgraph.graphdb.query.QueryProcessor$LimitAdjustingIterator.getNewIterator(QueryProcessor.java:209)
                                at org.janusgraph.graphdb.query.LimitAdjustingIterator.hasNext(LimitAdjustingIterator.java:75)
                                at org.janusgraph.graphdb.query.ResultSetIterator.nextInternal(ResultSetIterator.java:54)
                                at org.janusgraph.graphdb.query.ResultSetIterator.next(ResultSetIterator.java:67)
                                at org.janusgraph.graphdb.query.ResultSetIterator.next(ResultSetIterator.java:28)
                                at com.google.common.collect.Iterators$7.computeNext(Iterators.java:651)
                                at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
                                at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
                                at org.janusgraph.hadoop.formats.util.input.current.JanusGraphHadoopSetupImpl.getTypeInspector(JanusGraphHadoopSetupImpl.java:60)
                                at org.janusgraph.hadoop.formats.util.JanusGraphVertexDeserializer.<init>(JanusGraphVertexDeserializer.java:55)
                                at org.janusgraph.hadoop.formats.util.GiraphInputFormat.lambda$static$0(GiraphInputFormat.java:49)
                                at org.janusgraph.hadoop.formats.util.GiraphInputFormat$RefCountedCloseable.acquire(GiraphInputFormat.java:100)
                                at org.janusgraph.hadoop.formats.util.GiraphRecordReader.<init>(GiraphRecordReader.java:47)
                                at org.janusgraph.hadoop.formats.util.GiraphInputFormat.createRecordReader(GiraphInputFormat.java:67)
                                at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:166)
                                at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:133)
                                at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:65)
                                at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
                                at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
                                at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
                                at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
                                at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
                                at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
                                at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
                                at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
                                at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
                                at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
                                at org.apache.spark.scheduler.Task.run(Task.scala:89)
                                at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
                                at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
                                at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
                                at java.lang.Thread.run(Thread.java:745)
                Caused by: org.janusgraph.core.JanusGraphException: Could not call index
                                at org.janusgraph.graphdb.transaction.StandardJanusGraphTx$6$6.call(StandardJanusGraphTx.java:1262)
                                at org.janusgraph.graphdb.query.QueryUtil.processIntersectingRetrievals(QueryUtil.java:255)
                                ... 34 more
                Caused by: org.janusgraph.core.JanusGraphException: Could not execute operation due to backend exception
                                at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:57)
                                at org.janusgraph.diskstorage.BackendTransaction.executeRead(BackendTransaction.java:444)
                                at org.janusgraph.diskstorage.BackendTransaction.indexQuery(BackendTransaction.java:395)
                                at org.janusgraph.graphdb.query.graph.MultiKeySliceQuery.execute(MultiKeySliceQuery.java:51)
                                at org.janusgraph.graphdb.database.IndexSerializer.query(IndexSerializer.java:529)
                                at org.janusgraph.graphdb.transaction.StandardJanusGraphTx$6$6$1.lambda$call$5(StandardJanusGraphTx.java:1258)
                                at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:97)
                                at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:89)
                                at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:81)
                                at org.janusgraph.graphdb.transaction.StandardJanusGraphTx$6$6$1.call(StandardJanusGraphTx.java:1258)
                                at org.janusgraph.graphdb.transaction.StandardJanusGraphTx$6$6$1.call(StandardJanusGraphTx.java:1255)
                                at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4742)
                                at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3527)
                                at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2319)
                                at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2282)
                                at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2197)
                                at com.google.common.cache.LocalCache.get(LocalCache.java:3937)
                                at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4739)
                                at org.janusgraph.graphdb.transaction.StandardJanusGraphTx$6$6.call(StandardJanusGraphTx.java:1255)
                                ... 35 more
                Caused by: org.janusgraph.diskstorage.TemporaryBackendException: Could not successfully complete backend operation due to repeated temporary exceptions after PT10S
                                at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:101)
                                at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:55)
                                ... 53 more
                Caused by: org.janusgraph.diskstorage.TemporaryBackendException: Temporary failure in storage backend
                                at org.janusgraph.diskstorage.hbase.HBaseKeyColumnValueStore.getHelper(HBaseKeyColumnValueStore.java:202)
                                at org.janusgraph.diskstorage.hbase.HBaseKeyColumnValueStore.getSlice(HBaseKeyColumnValueStore.java:90)
                                at org.janusgraph.diskstorage.keycolumnvalue.KCVSProxy.getSlice(KCVSProxy.java:77)
                                at org.janusgraph.diskstorage.keycolumnvalue.KCVSProxy.getSlice(KCVSProxy.java:77)
                                at org.janusgraph.diskstorage.BackendTransaction$5.call(BackendTransaction.java:398)
                                at org.janusgraph.diskstorage.BackendTransaction$5.call(BackendTransaction.java:395)
                                at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:69)
                                ... 54 more
                Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=35, exceptions:
                Sat Aug 05 07:22:03 EDT 2017, RpcRetryingCaller{globalStartTime=1501932111280, pause=100, retries=35}, org.apache.hadoop.hbase.regionserver.RowTooBigException: org.apache.hadoop.hbase.regionserver.RowTooBigException: Max row size allowed: 1073741824, but the row is bigger than that.
                                at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:564)
                                at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:147)
                                at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:5697)
                                at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:5856)
                                at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:5634)
                                at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:5611)
                                at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:5597)
                                at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:6792)
                                at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:6770)
                                at org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2023)
                                at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33644)
                                at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2170)
                                at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109)
                                at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:185)
                                at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:165)


                On 8/6/2017 3:50 PM, HadoopMarc wrote:

                Hi ... and others,  I have been offline for a few weeks enjoying a holiday and will start looking into your questions and make the suggested corrections. Thanks for following the recipes and helping others with it.

                ..., did you run the recipe on the same HDP sandbox and the same TinkerPop version? I remember (from 4 weeks ago) that copying the zookeeper.znode.parent property from the HBase configs to the JanusGraph configs was essential to get JanusGraph's HBaseInputFormat working (that is: reading graph data for the Spark tasks).
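
                For example, something along these lines in the JanusGraph properties file (the /hbase-unsecure path below is the HDP sandbox default; take the actual value from your cluster's hbase-site.xml):

                storage.hbase.ext.zookeeper.znode.parent=/hbase-unsecure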

                Cheers,    Marc

                On Monday, July 24, 2017 at 10:12:13 UTC+2, spi...@... wrote:
                Hi, thanks for your post.
                I followed it, but I ran into a problem.
                15:58:49,110  INFO SecurityManager:58 - Changing view acls to: rc
                15:58:49,110  INFO SecurityManager:58 - Changing modify acls to: rc
                15:58:49,110  INFO SecurityManager:58 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(rc); users with modify permissions: Set(rc)
                15:58:49,111  INFO Client:58 - Submitting application 25 to ResourceManager
                15:58:49,320  INFO YarnClientImpl:274 - Submitted application application_1500608983535_0025
                15:58:49,321  INFO SchedulerExtensionServices:58 - Starting Yarn extension services with app application_1500608983535_0025 and attemptId None
                15:58:50,325  INFO Client:58 - Application report for application_1500608983535_0025 (state: ACCEPTED)
                15:58:50,326  INFO Client:58 -
                client token: N/A
                diagnostics: N/A
                ApplicationMaster host: N/A
                ApplicationMaster RPC port: -1
                queue: default
                start time: 1500883129115
                final status: UNDEFINED
                user: rc
                15:58:51,330  INFO Client:58 - Application report for application_1500608983535_0025 (state: ACCEPTED)
                15:58:52,333  INFO Client:58 - Application report for application_1500608983535_0025 (state: ACCEPTED)
                15:58:53,335  INFO Client:58 - Application report for application_1500608983535_0025 (state: ACCEPTED)
                15:58:54,337  INFO Client:58 - Application report for application_1500608983535_0025 (state: ACCEPTED)
                15:58:55,340  INFO Client:58 - Application report for application_1500608983535_0025 (state: ACCEPTED)
                15:58:56,343  INFO Client:58 - Application report for application_1500608983535_0025 (state: ACCEPTED)
                15:58:56,802  INFO YarnSchedulerBackend$YarnSchedulerEndpoint:58 - ApplicationMaster registered as NettyRpcEndpointRef(null)
                15:58:56,822  INFO YarnClientSchedulerBackend:58 - Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> dl-rc-optd-ambari-master-v-test-1.host.dataengine.com,dl-rc-optd-ambari-master-v-test-2.host.dataengine.com, PROXY_URI_BASES -> http://dl-rc-optd-ambari-master-v-test-1.host.dataengine.com:8088/proxy/application_1500608983535_0025,http://dl-rc-optd-ambari-master-v-test-2.host.dataengine.com:8088/proxy/application_1500608983535_0025), /proxy/application_1500608983535_0025
                15:58:56,824  INFO JettyUtils:58 - Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
                15:58:57,346  INFO Client:58 - Application report for application_1500608983535_0025 (state: RUNNING)
                15:58:57,347  INFO Client:58 -
                client token: N/A
                diagnostics: N/A
                ApplicationMaster host: 10.200.48.154
                ApplicationMaster RPC port: 0
                queue: default
                start time: 1500883129115
                final status: UNDEFINED
                user: rc
                15:58:57,348  INFO YarnClientSchedulerBackend:58 - Application application_1500608983535_0025 has started running.
                15:58:57,358  INFO Utils:58 - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 47514.
                15:58:57,358  INFO NettyBlockTransferService:58 - Server created on 47514
                15:58:57,360  INFO BlockManagerMaster:58 - Trying to register BlockManager
                15:58:57,363  INFO BlockManagerMasterEndpoint:58 - Registering block manager 10.200.48.112:47514 with 2.4 GB RAM, BlockManagerId(driver, 10.200.48.112, 47514)
                15:58:57,366  INFO BlockManagerMaster:58 - Registered BlockManager
                15:58:57,585  INFO EventLoggingListener:58 - Logging events to hdfs:///spark-history/application_1500608983535_0025
                15:59:07,177  WARN YarnSchedulerBackend$YarnSchedulerEndpoint:70 - Container marked as failed: container_e170_1500608983535_0025_01_000002 on host: dl-rc-optd-ambari-slave-v-test-1.host.dataengine.com. Exit status: 1. Diagnostics: Exception from container-launch.
                Container id: container_e170_1500608983535_0025_01_000002
                Exit code: 1
                Stack trace: ExitCodeException exitCode=1:
                at org.apache.hadoop.util.Shell.runCommand(Shell.java:576)
                at org.apache.hadoop.util.Shell.run(Shell.java:487)
                at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:753)
                at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:371)
                at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:303)
                at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
                at java.util.concurrent.FutureTask.run(FutureTask.java:266)
                at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
                at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
                at java.lang.Thread.run(Thread.java:745)

                Shell output: main : command provided 1
                main : run as user is rc
                main : requested yarn user is rc


                Container exited with a non-zero exit code 1
                Display stack trace? [yN]15:59:57,702  WARN TransportChannelHandler:79 - Exception in connection from 10.200.48.155/10.200.48.155:50921
                java.io.IOException: Connection reset by peer
                at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
                at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
                at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
                at sun.nio.ch.IOUtil.read(IOUtil.java:192)
                at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
                at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:313)
                at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
                at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:242)
                at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
                at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
                at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
                at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
                at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
                at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
                at java.lang.Thread.run(Thread.java:748)
                15:59:57,704 ERROR TransportResponseHandler:132 - Still have 1 requests outstanding when connection from 10.200.48.155/10.200.48.155:50921 is closed
                15:59:57,706  WARN NettyRpcEndpointRef:91 - Error sending message [message = RequestExecutors(0,0,Map())] in 1 attempts
                java.io.IOException: Connection reset by peer
                at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
                at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
                at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
                at sun.nio.ch.IOUtil.read(IOUtil.java:192)
                at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
                at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:313)
                at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
                at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:242)
                at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
                at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
                at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
                at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
                at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
                at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
                at java.lang.Thread.run(Thread.java:748)

                I am confused about that. Could you please help me?



                On Thursday, July 6, 2017 at 4:15:37 PM UTC+8, HadoopMarc wrote:

                Readers wanting to run OLAP queries on a real spark-yarn cluster might want to check my recent post:

                http://yaaics.blogspot.nl/2017/07/configuring-janusgraph-for-spark-yarn.html

                Regards,  Marc


                Re: Best practice setup for Go driver development & identifying the websocket serialization format

                John Helmsen <john....@...>
                 

                I don't think you have a serialization problem here; it's more of an idiosyncrasy of how gremgo works.  I've used it a bit, and whenever gremgo returns a value, it returns it as an empty interface, or perhaps an array of empty interfaces:
                https://tour.golang.org/methods/14

                Go's value-to-string printing is actually quite smart and unwraps your empty interface to a string when it prints out the answer.  The format you are seeing is the result of this transformation.

                The empty interface is used as the return type since many different types of data could be contained inside.  You will have to write your own code to perform the unwrapping in order to work with the contained data; a sketch follows.
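
                As a rough sketch of what that unwrapping can look like (the response value below is fabricated for illustration; gremgo's actual return layout may differ):

                package main

                import "fmt"

                // describe recursively unwraps an empty-interface value by switching on
                // the shapes a Gremlin driver commonly hands back.
                func describe(v interface{}) {
                    switch t := v.(type) {
                    case string:
                        fmt.Println("string:", t)
                    case []interface{}:
                        for _, e := range t {
                            describe(e)
                        }
                    case map[string]interface{}:
                        for k, val := range t {
                            fmt.Printf("%s -> %v\n", k, val)
                        }
                    default:
                        fmt.Printf("other (%T): %v\n", t, v)
                    }
                }

                func main() {
                    // a fabricated response: a list holding a map, as a driver might return
                    describe([]interface{}{map[string]interface{}{"name": "manoj"}})
                }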