Exception in thread "main" java.lang.OutOfMemoryError: Java heap space while loading bulk data


Misha Brukman <mbru...@...>
 

You might consider using a format other than JSON that can easily be read incrementally, such as CSV or a more compact binary encoding.
Alternatively, you may want to use a streaming JSON reader, which reads as little as possible before generating a meaningful callback. Several such libraries exist (I have not tested them).
If you search on Stack Overflow, you'll find others have had exactly the same issue as here with Python and JSON files, and the answers to those questions all pointed to incremental JSON parser libraries for Python as the solution to the OOMs.
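
To make the incremental-reading suggestion concrete, here is a minimal Groovy sketch. It assumes the data has been re-exported with one JSON object per line (a layout not used in the original script), and `eachBatch` is a hypothetical helper name, not part of Groovy or JanusGraph. Only `batchSize` parsed records are held in memory at any time.

```groovy
import groovy.json.JsonSlurper

// Hypothetical helper: reads a file containing one JSON object per line and
// passes the parsed records to `handler` in lists of at most `batchSize`.
def eachBatch(File file, int batchSize, Closure handler) {
    def slurper = new JsonSlurper()
    def batch = []
    file.eachLine('UTF-8') { String line ->
        if (!line.trim()) return              // skip blank lines
        batch << slurper.parseText(line)      // parse a single record only
        if (batch.size() >= batchSize) {
            handler(batch)
            batch = []                        // release the processed records
        }
    }
    if (batch) handler(batch)                 // flush the remainder
}

// Usage sketch with a small temporary file standing in for the real data.
def demo = File.createTempFile('vertices', '.jsonl')
demo.deleteOnExit()
demo.setText('{"uid":"a"}\n{"uid":"b"}\n{"uid":"c"}\n', 'UTF-8')
eachBatch(demo, 2) { records ->
    // In the loader, this is where vertices would be added and the
    // transaction committed once per batch (assumption, not shown here).
    println "processing ${records.size()} record(s)"
}
```

Committing once per batch inside the handler would also keep each transaction small, instead of holding a million uncommitted elements.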

On Tue, Aug 8, 2017 at 10:49 AM, Amyth Arora <aroras....@...> wrote:
Thanks Robert, I am going to check the file size and its contents when I reach home. I will also try to load the file through the shell and post an update here.

On Tue, 8 Aug 2017 at 8:14 PM, Robert Dale <rob...@...> wrote:
Well, whatever, but your stacktrace points to:  String fileContents = new File(jsonPath).getText('UTF-8')
Thus, your file does not fit in memory - either available system memory or within jvm max memory.

Robert Dale

On Tue, Aug 8, 2017 at 10:38 AM, Amyth Arora <aroras....@...> wrote:
Hi Robert,

The file is about 325 MB in size and contains info about a million vertices and a million edges. I also forgot to mention that, before testing on Bigtable, I ran the same script and file against JanusGraph with the Cassandra backend on the same machine, and that worked fine.


While testing this with Cassandra, I experienced a similar issue, which went away once I introduced the following configuration options:

storage.batch-loading
ids.block-size

But in the case of Cassandra, the error was thrown while creating the edges. In the case of Bigtable, the exception is thrown as soon as the script is executed.

On Tue, 8 Aug 2017 at 7:51 PM, Robert Dale <rob...@...> wrote:
Looks like your file doesn't fit in memory.

Robert Dale
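
One way to reduce the memory needed at the failing line is to let `JsonSlurper` read from the file directly rather than from a fully loaded `String`. The sketch below is a possible replacement for the `getText` plus `parseText` pair, not a tested fix for this setup; `loadJson` is a hypothetical helper name. It uses the `CHARACTER_SOURCE` parser type, which the Groovy documentation describes as intended for large files. The parsed result is still a complete in-memory tree, so this only removes the intermediate copy of the text.

```groovy
import groovy.json.JsonParserType
import groovy.json.JsonSlurper

// Hypothetical helper: parses a JSON file without first reading it into a String.
def loadJson(File file) {
    def slurper = new JsonSlurper(type: JsonParserType.CHARACTER_SOURCE)
    // parse(File, charset) reads from the file itself; the whole parsed
    // structure is still returned as nested maps and lists.
    slurper.parse(file, 'UTF-8')
}

// Usage sketch with a tiny temporary file in the same shape as the dummy data.
def sample = File.createTempFile('dummy', '.json')
sample.deleteOnExit()
sample.setText('{"vertexes":[{"labels":["person"],"properties":{"uid":"a"}}],"edges":[]}', 'UTF-8')
def data = loadJson(sample)
println "vertexes: ${data.vertexes.size()}, edges: ${data.edges.size()}"
```

If the parsed tree itself is too large for the heap, reading the data record by record is the remaining option.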

On Tue, Aug 8, 2017 at 9:23 AM, Amyth Arora <aroras....@...> wrote:
Hi Everyone,

I am trying to upload some dummy data for testing purposes to JanusGraph (Google Cloud Bigtable backend). I run the following Groovy script through the Gremlin console; it creates the schema, indexes, vertices and edges.

import groovy.json.JsonSlurper;
import java.util.ArrayList;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.janusgraph.core.JanusGraphFactory;
import org.janusgraph.core.PropertyKey;
import org.janusgraph.core.Multiplicity;
import org.janusgraph.core.schema.SchemaAction;
import org.janusgraph.core.schema.SchemaStatus;
import org.janusgraph.core.util.JanusGraphCleanup;
import org.janusgraph.graphdb.database.StandardJanusGraph;
import org.janusgraph.graphdb.database.management.ManagementSystem;

/**
 * Given a json file, populates data into the given JanusGraph DB
 */
class JanusGraphBuilder {

    String graphPath;
    StandardJanusGraph graph;
    ManagementSystem management;
    GraphTraversalSource traversal;

    def dummyData;


    public void main(String jsonPath, String janusGraphPath) {
        this.graphPath  = janusGraphPath
        this.initGraph()
        this.initialize(jsonPath)
        this.populate()
    }

    public void createEdges(def edges) {
        println "Preparing edges."
        edges.each {
            def relation = it.edge
            def properties = it.properties
            def vertexFrom = this.traversal.V().has("uid", it.nodes[0])[0]
            def vertexTo = this.traversal.V().has("uid", it.nodes[1])[0]
            def newEdge = vertexFrom.addEdge(relation, vertexTo)
            properties.each {
                if (it.key == 'score') {
                    it.value = Float.parseFloat(it.value.toString())
                }
                newEdge.property(it.key, it.value)
            }
        }
        this.graph.tx().commit()
        println "Created edges successfully"
    }

    public void createVertexes(def vertexes) {
        println "Preparing vertices."
        vertexes.each {
            def uniqueLabel = it.labels[0]
            def properties = it.properties
            def newVertex = this.graph.addVertex(label, uniqueLabel)
            properties.each {
                newVertex.property(it.key, it.value)
            }
        }
        this.graph.tx().commit()
        println "Created vertices successfully"
    }

    public void createSchema() {
        println "Preparing schema."
        // Do not create indexes while another transaction is in progress
        this.graph.tx().rollback()
        this.management = this.graph.openManagement()
        this.management.set('ids.block-size', 20000000)

        // Make property keys
        def uid = this.management.makePropertyKey("uid").dataType(String.class).make()
        def name = this.management.makePropertyKey("name").dataType(String.class).make()
        def number = this.management.makePropertyKey("number").dataType(String.class).make()
        def email = this.management.makePropertyKey("email").dataType(String.class).make()
        def score = this.management.makePropertyKey("score").dataType(Float.class).make()
        def linkedinId = this.management.makePropertyKey("linkedin_id").dataType(String.class).make()
        def linkedinUrl = this.management.makePropertyKey("profile_url").dataType(String.class).make()
        def imageUrl = this.management.makePropertyKey("image_url").dataType(String.class).make()
        def instituteName = this.management.makePropertyKey("institute_name").dataType(String.class).make()
        def companyName = this.management.makePropertyKey("company_name").dataType(String.class).make()
        def jobId = this.management.makePropertyKey("job_id").dataType(String.class).make()

        // Define Vertex Labels
        this.management.makeVertexLabel("person").make();
        this.management.makeVertexLabel("candidate").make();
        this.management.makeVertexLabel("recruiter").make();
        this.management.makeVertexLabel("employee").make();
        this.management.makeVertexLabel("linkedin").make();
        this.management.makeVertexLabel("job").make();
        this.management.makeVertexLabel("company").make();
        this.management.makeVertexLabel("institute").make();
        def phoneV = this.management.makeVertexLabel("phone").make();
        def emailV = this.management.makeVertexLabel("email").make();

        // Define Edge Labels
        this.management.makeEdgeLabel("knows").make();
        this.management.makeEdgeLabel("has").make();
        this.management.makeEdgeLabel("provided_by").make();
        this.management.makeEdgeLabel("studied_at").make();
        this.management.makeEdgeLabel("worked_at").make();
        this.management.makeEdgeLabel("posted").make();
        this.management.makeEdgeLabel("liked").make();
        this.management.makeEdgeLabel("worked_with").make();
        this.management.makeEdgeLabel("studied_with").make();
        this.management.makeEdgeLabel("is_a_match_for").make();

        // Create indexes
        this.management.buildIndex('uniqueUid', Vertex.class).addKey(uid).unique().buildCompositeIndex()
        this.management.buildIndex('uniqueEmail', Vertex.class).addKey(email).indexOnly(emailV).unique().buildCompositeIndex()
        this.management.buildIndex('uniqueNumber', Vertex.class).addKey(number).indexOnly(phoneV).unique().buildCompositeIndex()
        this.management.commit()
        this.management.awaitGraphIndexStatus(this.graph, 'uniqueUid').call()
        this.management = this.graph.openManagement()
        this.management.updateIndex(this.management.getGraphIndex('uniqueUid'), SchemaAction.REINDEX).get()

        this.management.commit()

        println "Created schema successfully"
    }

    public void populate() {
        // Create db schema
        this.createSchema()

        // Create vertexes from the given dummy data
        def vertexTransaction = this.graph.newTransaction()
        def vertexes = this.dummyData.vertexes;
        this.createVertexes(vertexes)
        vertexTransaction.commit()
        this.initGraph()

        def edgeTransaction = this.graph.newTransaction()
        // Create edges from the given dummy data
        def edges = this.dummyData.edges;
        this.createEdges(edges)
        edgeTransaction.commit()
        this.initGraph()

        println "Graph population successfully accomplished. Please hit Ctrl+C to exit."
    }

    public void initialize(String jsonPath) {
        String fileContents = new File(jsonPath).getText('UTF-8')
        def slurper = new JsonSlurper()
        def results = slurper.parseText(fileContents)
        this.dummyData = results;
        this.resetData()
    }

    public void resetData() {
        // Remove all the data from the storage backend
        this.graph.close()
        JanusGraphCleanup.clear(this.graph)
        this.initGraph()
    }

    public void initGraph() {
        this.graph = JanusGraphFactory.open(this.graphPath)
        this.traversal = this.graph.traversal()
    }
}


JanusGraphBuilder graphBuilder = new JanusGraphBuilder()
graphBuilder.main("/tmp/dummy.json", "conf/testconf.properties")

I have already increased the heap size to `-Xmx2048m` in the `JAVA_OPTIONS` environment variable. Here is what my configuration looks like:

storage.backend=hbase

## Google cloud BIGTABLE configuration options
storage.hbase.ext.hbase.client.connection.impl=com.google.cloud.bigtable.hbase1_0.BigtableConnection

storage.hostname=localhost
cache.db-cache = true
cache.db-cache-clean-wait = 20
cache.db-cache-time = 180000
cache.db-cache-size = 0.5

# Bulk Loading
storage.batch-loading = true
ids.block-size=20000000

I execute the script using the following command:

./bin/gremlin.sh -e ~/projects/xonnect/utils/population_scripts/populateJanus.groovy

The problem is that, even before the schema is created, it throws an out-of-memory error for Java heap space. The output follows.

18:44:29,112  INFO BigtableSession:75 - Bigtable options: BigtableOptions{dataHost=bigtable.googleapis.com, tableAdminHost=bigtableadmin.googleapis.com, instanceAdminHost=bigtableadmin.googleapis.com, projectId=formal-theater-175812, instanceId=xonnect, userAgent=hbase-1.2.4, credentialType=DefaultCredentials, port=443, dataChannelCount=10, retryOptions=RetryOptions{retriesEnabled=true, allowRetriesWithoutTimestamp=false, statusToRetryOn=[UNAUTHENTICATED, INTERNAL, ABORTED, UNAVAILABLE, DEADLINE_EXCEEDED], initialBackoffMillis=5, maxElapsedBackoffMillis=60000, backoffMultiplier=2.0, streamingBufferSize=60, readPartialRowTimeoutMillis=60000, maxScanTimeoutRetries=3}, bulkOptions=BulkOptions{asyncMutatorCount=2, useBulkApi=true, bulkMaxKeyCount=25, bulkMaxRequestSize=1048576, autoflushMs=0, maxInflightRpcs=500, maxMemory=190893260, enableBulkMutationThrottling=false, bulkMutationRpcTargetMs=100}, callOptionsConfig=CallOptionsConfig{useTimeout=false, shortRpcTimeoutMs=60000, longRpcTimeoutMs=600000}, usePlaintextNegotiation=false}.
18:44:30,851  INFO Backend:183 - Initiated backend operations thread pool of size 8
18:44:35,048  INFO IndexSerializer:85 - Hashing index keys
18:44:36,910  INFO KCVSLog:744 - Loaded unidentified ReadMarker start time 2017-08-08T13:14:36.898Z into org.janusgraph.diskstorage.log.kcvs.KCVSLog$MessagePuller@35835fa
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3332)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:596)
at java.lang.StringBuilder.append(StringBuilder.java:190)
at org.codehaus.groovy.runtime.IOGroovyMethods.getText(IOGroovyMethods.java:886)
at org.codehaus.groovy.runtime.ResourceGroovyMethods.getText(ResourceGroovyMethods.java:588)
at org.codehaus.groovy.runtime.dgm$964.invoke(Unknown Source)
at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite$PojoMetaMethodSiteNoUnwrapNoCoerce.invoke(PojoMetaMethodSite.java:274)
at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite.call(PojoMetaMethodSite.java:56)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
at JanusGraphBuilder.initialize(Script1.groovy:146)
at JanusGraphBuilder.main(Script1.groovy:29)
at JanusGraphBuilder$main.call(Unknown Source)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:133)
at Script1.run(Script1.groovy:168)
at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:619)
at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:448)
at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:421)
at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:212)
at org.apache.tinkerpop.gremlin.groovy.jsr223.ScriptExecutor.evaluate(ScriptExecutor.java:55)
at org.apache.tinkerpop.gremlin.groovy.jsr223.ScriptExecutor.main(ScriptExecutor.java:44)

Any help will be much appreciated. Thanks.


--
You received this message because you are subscribed to the Google Groups "JanusGraph users list" group.
To unsubscribe from this group and stop receiving emails from it, send an email to janusgraph-users+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
--
Thanks & Regards
-----------------------------------------------------

Amyth Arora
-----------------------------------------------------

Web:

-----------------------------------------------------

Email Addresses:
aroras....@...,
-----------------------------------------------------

Social Profiles:
Twitter - @mytharora
    -----------------------------------------------------



      Amyth Arora <aroras....@...>
       

      Thanks Robert, I am going to check the file size and its contents when I reach home. I will also try to load the file through the shell and post an update here.



          Robert Dale <rob...@...>
           

          Well, whatever, but your stacktrace points to:  String fileContents = new File(jsonPath).getText('UTF-8')
          Thus, your file does not fit in memory - either available system memory or within jvm max memory.

          Robert Dale
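
A rough calculation shows why a 325 MB file can exhaust a 2 GB heap at that line. The sketch below assumes a JVM that stores each character in two bytes (as Java releases before 9 do) and mostly single-byte UTF-8 input. The factor of three is a worst-case assumption for the moment a full `StringBuilder` grows, when the old array and a new one of about twice the size are both live. `estimateHeapNeed` is a hypothetical helper, and the figures are estimates, not measurements.

```groovy
// Rough estimate only; the factors below are assumptions, not measurements.
Map estimateHeapNeed(long fileBytes, int bytesPerChar = 2) {
    long textBytes = fileBytes * bytesPerChar   // decoded text held as chars
    // Worst case during growth: old array plus a new array about twice as
    // large exist at the same time while Arrays.copyOf runs.
    long growthPeak = textBytes * 3
    [textBytes: textBytes, growthPeak: growthPeak]
}

long MB = 1024L * 1024L
def est = estimateHeapNeed(325L * 1024 * 1024)  // file size reported in the thread
long maxHeap = Runtime.runtime.maxMemory()      // effective heap limit of this JVM

println "decoded text:       ~${est.textBytes.intdiv(MB)} MB"
println "peak during growth: ~${est.growthPeak.intdiv(MB)} MB"
println "JVM max heap:       ~${maxHeap.intdiv(MB)} MB"
```

Running the last three lines inside the Gremlin console also shows whether the `-Xmx2048m` setting was actually picked up: if the reported max heap is far from 2048 MB, the option did not reach the JVM.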

          On Tue, Aug 8, 2017 at 10:38 AM, Amyth Arora <aroras....@...> wrote:
          Hi Robert,

          The file is about 325 MB in size and contains info about a million vertices and a million edges. I also forgot to mention that, before testing on Bigtable, I ran the same script and file against JanusGraph with the Cassandra backend on the same machine, and that worked fine.


          While testing this with Cassandra, I experienced a similar issue, which went away once I introduced the following configuration options:

          storage.batch-loading
          ids.block-size

          But in the case of Cassandra, the error was thrown while creating the edges. In the case of Bigtable, the exception is thrown as soon as the script is executed.

          On Tue, 8 Aug 2017 at 7:51 PM, Robert Dale <rob...@...> wrote:
          Looks like your file doesn't fit in memory.

          Robert Dale

          On Tue, Aug 8, 2017 at 9:23 AM, Amyth Arora <aroras....@...> wrote:
          Hi Everyone,

          I am trying to upload some dummy data for testing purposes to JanusGraph (Google Cloud Bigtable backend). I run the following Groovy script through the Gremlin console; it creates the schema, indexes, vertices and edges.

          import groovy.json.JsonSlurper;
          import java.util.ArrayList;
          import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
          import org.janusgraph.core.JanusGraphFactory;
          import org.janusgraph.core.PropertyKey;
          import org.janusgraph.core.Multiplicity;
          import org.janusgraph.core.schema.SchemaAction;
          import org.janusgraph.core.schema.SchemaStatus;
          import org.janusgraph.core.util.JanusGraphCleanup;
          import org.janusgraph.graphdb.database.StandardJanusGraph;
          import org.janusgraph.graphdb.database.management.ManagementSystem;

          /**
           * Given a json file, populates data into the given JanusGraph DB
           */
          class JanusGraphBuilder {

              String graphPath;
              StandardJanusGraph graph;
              ManagementSystem management;
              GraphTraversalSource traversal;

              def dummyData;


              public void main(String jsonPath, String janusGraphPath) {
                  this.graphPath  = janusGraphPath
                  this.initGraph()
                  this.initialize(jsonPath)
                  this.populate()
              }

              public void createEdges(def edges) {
                  println "Preparing edges."
                  edges.each {
                      def relation = it.edge
                      def properties = it.properties
                      def vertexFrom = this.traversal.V().has("uid", it.nodes[0])[0]
                      def vertexTo = this.traversal.V().has("uid", it.nodes[1])[0]
                      def newEdge = vertexFrom.addEdge(relation, vertexTo)
                      properties.each {
                          if (it.key == 'score') {
                              it.value = Float.parseFloat(it.value.toString())
                          }
                          newEdge.property(it.key, it.value)
                      }
                  }
                  this.graph.tx().commit()
                  println "Created edges successfully"
              }

              public void createVertexes(def vertexes) {
                  println "Preparing vertices."
                  vertexes.each {
                      def uniqueLabel = it.labels[0]
                      def properties = it.properties
                      def newVertex = this.graph.addVertex(label, uniqueLabel)
                      properties.each {
                          newVertex.property(it.key, it.value)
                      }
                  }
                  this.graph.tx().commit()
                  println "Created vertices successfully"
              }

              public void createSchema() {
                  println "Preparing schema."
                  // Do not create indexes while another transaction is in progress
                  this.graph.tx().rollback()
                  this.management = this.graph.openManagement()
                  this.management.set('ids.block-size', 20000000)

                  // Make property keys
                  def uid = this.management.makePropertyKey("uid").dataType(String.class).make()
                  def name = this.management.makePropertyKey("name").dataType(String.class).make()
                  def number = this.management.makePropertyKey("number").dataType(String.class).make()
                  def email = this.management.makePropertyKey("email").dataType(String.class).make()
                  def score = this.management.makePropertyKey("score").dataType(Float.class).make()
                  def linkedinId = this.management.makePropertyKey("linkedin_id").dataType(String.class).make()
                  def linkedinUrl = this.management.makePropertyKey("profile_url").dataType(String.class).make()
                  def imageUrl = this.management.makePropertyKey("image_url").dataType(String.class).make()
                  def instituteName = this.management.makePropertyKey("institute_name").dataType(String.class).make()
                  def companyName = this.management.makePropertyKey("company_name").dataType(String.class).make()
                  def jobId = this.management.makePropertyKey("job_id").dataType(String.class).make()

                  // Define Vertex Labels
                  this.management.makeVertexLabel("person").make();
                  this.management.makeVertexLabel("candidate").make();
                  this.management.makeVertexLabel("recruiter").make();
                  this.management.makeVertexLabel("employee").make();
                  this.management.makeVertexLabel("linkedin").make();
                  this.management.makeVertexLabel("job").make();
                  this.management.makeVertexLabel("company").make();
                  this.management.makeVertexLabel("institute").make();
                  def phoneV = this.management.makeVertexLabel("phone").make();
                  def emailV = this.management.makeVertexLabel("email").make();

                  // Define Edge Labels
                  this.management.makeEdgeLabel("knows").make();
                  this.management.makeEdgeLabel("has").make();
                  this.management.makeEdgeLabel("provided_by").make();
                  this.management.makeEdgeLabel("studied_at").make();
                  this.management.makeEdgeLabel("worked_at").make();
                  this.management.makeEdgeLabel("posted").make();
                  this.management.makeEdgeLabel("liked").make();
                  this.management.makeEdgeLabel("worked_with").make();
                  this.management.makeEdgeLabel("studied_with").make();
                  this.management.makeEdgeLabel("is_a_match_for").make();

                  // Create indexes
                  this.management.buildIndex('uniqueUid', Vertex.class).addKey(uid).unique().buildCompositeIndex()
                  this.management.buildIndex('uniqueEmail', Vertex.class).addKey(email).indexOnly(emailV).unique().buildCompositeIndex()
                  this.management.buildIndex('uniqueNumber', Vertex.class).addKey(number).indexOnly(phoneV).unique().buildCompositeIndex()
                  this.management.commit()
                  this.management.awaitGraphIndexStatus(this.graph, 'uniqueUid').call()
                  this.management = this.graph.openManagement()
                  this.management.updateIndex(this.management.getGraphIndex('uniqueUid'), SchemaAction.REINDEX).get()

                  this.management.commit()

                  println "Created schema successfully"
              }

              public void populate() {
                  // Create db schema
                  this.createSchema()

                  // Create vertexes from the given dummy data
                  def vertexTransaction = this.graph.newTransaction()
                  def vertexes = this.dummyData.vertexes;
                  this.createVertexes(vertexes)
                  vertexTransaction.commit()
                  this.initGraph()

                  def edgeTransaction = this.graph.newTransaction()
                  // Create edges from the given dummy data
                  def edges = this.dummyData.edges;
                  this.createEdges(edges)
                  edgeTransaction.commit()
                  this.initGraph()

                  println "Graph population successfully accomplished. Please hit Ctrl+C to exit."
              }

              public void initialize(String jsonPath) {
                  String fileContents = new File(jsonPath).getText('UTF-8')
                  def slurper = new JsonSlurper()
                  def results = slurper.parseText(fileContents)
                  this.dummyData = results;
                  this.resetData()
              }

              public void resetData() {
                  // Remove all the data from the storage backend
                  this.graph.close()
                  JanusGraphCleanup.clear(this.graph)
                  this.initGraph()
              }

              public void initGraph() {
                  this.graph = JanusGraphFactory.open(this.graphPath)
                  this.traversal = this.graph.traversal()
              }
          }


          JanusGraphBuilder graphBuilder = new JanusGraphBuilder()
          graphBuilder.main("/tmp/dummy.json", "conf/testconf.properties")

          I have already updated the heap size in the `JAVA_OPTIONS` environment variable to `-Xmx2048m`. Here is what my configuration looks like:

          storage.backend=hbase

          ## Google cloud BIGTABLE configuration options
          storage.hbase.ext.hbase.client.connection.impl=com.google.cloud.bigtable.hbase1_0.BigtableConnection

          storage.hostname=localhost
          cache.db-cache = true
          cache.db-cache-clean-wait = 20
          cache.db-cache-time = 180000
          cache.db-cache-size = 0.5

          # Bulk Loading
          storage.batch-loading = true
          ids.block-size=20000000

          I execute the script using the following command:

          ./bin/gremlin.sh -e ~/projects/xonnect/utils/population_scripts/populateJanus.groovy

          The problem is that, even before the schema is created, it throws an out-of-memory error for Java heap space. The output follows.

          18:44:29,112  INFO BigtableSession:75 - Bigtable options: BigtableOptions{dataHost=bigtable.googleapis.com, tableAdminHost=bigtableadmin.googleapis.com, instanceAdminHost=bigtableadmin.googleapis.com, projectId=formal-theater-175812, instanceId=xonnect, userAgent=hbase-1.2.4, credentialType=DefaultCredentials, port=443, dataChannelCount=10, retryOptions=RetryOptions{retriesEnabled=true, allowRetriesWithoutTimestamp=false, statusToRetryOn=[UNAUTHENTICATED, INTERNAL, ABORTED, UNAVAILABLE, DEADLINE_EXCEEDED], initialBackoffMillis=5, maxElapsedBackoffMillis=60000, backoffMultiplier=2.0, streamingBufferSize=60, readPartialRowTimeoutMillis=60000, maxScanTimeoutRetries=3}, bulkOptions=BulkOptions{asyncMutatorCount=2, useBulkApi=true, bulkMaxKeyCount=25, bulkMaxRequestSize=1048576, autoflushMs=0, maxInflightRpcs=500, maxMemory=190893260, enableBulkMutationThrottling=false, bulkMutationRpcTargetMs=100}, callOptionsConfig=CallOptionsConfig{useTimeout=false, shortRpcTimeoutMs=60000, longRpcTimeoutMs=600000}, usePlaintextNegotiation=false}.
          18:44:30,851  INFO Backend:183 - Initiated backend operations thread pool of size 8
          18:44:35,048  INFO IndexSerializer:85 - Hashing index keys
          18:44:36,910  INFO KCVSLog:744 - Loaded unidentified ReadMarker start time 2017-08-08T13:14:36.898Z into org.janusgraph.diskstorage.log.kcvs.KCVSLog$MessagePuller@35835fa
          Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
          at java.util.Arrays.copyOf(Arrays.java:3332)
          at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
          at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:596)
          at java.lang.StringBuilder.append(StringBuilder.java:190)
          at org.codehaus.groovy.runtime.IOGroovyMethods.getText(IOGroovyMethods.java:886)
          at org.codehaus.groovy.runtime.ResourceGroovyMethods.getText(ResourceGroovyMethods.java:588)
          at org.codehaus.groovy.runtime.dgm$964.invoke(Unknown Source)
          at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite$PojoMetaMethodSiteNoUnwrapNoCoerce.invoke(PojoMetaMethodSite.java:274)
          at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite.call(PojoMetaMethodSite.java:56)
          at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
          at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
          at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
          at JanusGraphBuilder.initialize(Script1.groovy:146)
          at JanusGraphBuilder.main(Script1.groovy:29)
          at JanusGraphBuilder$main.call(Unknown Source)
          at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
          at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
          at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:133)
          at Script1.run(Script1.groovy:168)
          at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:619)
          at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:448)
          at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:421)
          at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:212)
          at org.apache.tinkerpop.gremlin.groovy.jsr223.ScriptExecutor.evaluate(ScriptExecutor.java:55)
          at org.apache.tinkerpop.gremlin.groovy.jsr223.ScriptExecutor.main(ScriptExecutor.java:44)

          Any help will be much appreciated. Thanks.
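Before changing the loader, it may be worth confirming from inside the script that the larger heap setting actually reached the JVM, and comparing it with the size of the input file. This is a minimal sketch using only `Runtime` and `File` from the JDK. The helper name `heapVersusFile` is hypothetical, and the path is the one passed to the script above.

```groovy
/**
 * Hypothetical helper: reports the JVM's maximum heap and the size of the
 * input file, both in whole megabytes.
 */
Map heapVersusFile(String path) {
    long mb = 1024L * 1024L
    long maxHeapMb = Runtime.runtime.maxMemory().intdiv(mb)
    long fileMb = new File(path).length().intdiv(mb)   // 0 if the file is missing
    [maxHeapMb: maxHeapMb, fileMb: fileMb]
}

println heapVersusFile('/tmp/dummy.json')
```

If the reported maximum heap is well below 2048 MB, the option was probably not picked up by the launcher.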


          --
          You received this message because you are subscribed to the Google Groups "JanusGraph users list" group.
          To unsubscribe from this group and stop receiving emails from it, send an email to janusgraph-users+unsubscribe@googlegroups.com.
          For more options, visit https://groups.google.com/d/optout.
          --
          Thanks & Regards
          -----------------------------------------------------

          Amyth Arora
          -----------------------------------------------------

          Web:

          -----------------------------------------------------

          Email Addresses:
          aroras....@...,
          -----------------------------------------------------

          Social Profiles:
          Twitter - @mytharora
            -----------------------------------------------------


            Amyth Arora <aroras....@...>
             

            Hi Robert,

            The file is about 325 MB in size and contains info about a million vertices and a million edges. I also forgot to mention that, prior to testing on Bigtable, I tried the same script and file to test JanusGraph with a Cassandra backend on the same machine, which worked fine.


            While testing this with Cassandra, I experienced a similar issue, which went away after I introduced the following configuration options:

            storage.batch-loading
            ids.block-size

            But in the case of Cassandra, the error was thrown while creating the edges. In the case of Bigtable, the exception is thrown as soon as the script is executed.
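A rough back-of-the-envelope estimate suggests why a 325 MB file can exhaust a 2048 MB heap when it is read into a single String. The sketch assumes a runtime where a String stores two bytes per character (as on Java 8, which the stack trace hints at but does not prove) and mostly single-byte UTF-8 content on disk. The function name is hypothetical, and the figures are estimates, not measurements.

```groovy
/**
 * Hypothetical estimate of the heap needed to read a text file into one
 * String through a growing StringBuilder.
 * Assumptions: about 1 byte per character on disk, 2 bytes per character
 * in memory.
 */
Map estimateReadHeapMb(long fileSizeMb) {
    long textMb = fileSizeMb * 2
    // When the builder grows, the old array and a roughly doubled new array
    // coexist (the Arrays.copyOf frame in the stack trace), so the transient
    // peak can approach three times the size of the text.
    long growthPeakUpperMb = textMb * 3
    [textMb: textMb, growthPeakUpperMb: growthPeakUpperMb]
}

// For 325 MB: about 650 MB of text, and up to about 1950 MB while growing.
println estimateReadHeapMb(325)
```

The parsed JSON tree built afterwards by `JsonSlurper` needs additional heap on top of this, so the text alone does not have to reach the limit for the load to fail.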

              -----------------------------------------------------


              Robert Dale <rob...@...>
               

              Looks like your file doesn't fit in memory.

              Robert Dale

              On Tue, Aug 8, 2017 at 9:23 AM, Amyth Arora <aroras....@...> wrote:
              Hi Everyone,

              I am trying to upload some dummy data for testing purposes to JanusGraph (Google Cloud Bigtable backend). I have the following Groovy script, which I execute with the Gremlin console and which creates the schema, indexes, vertices and edges.

              import groovy.json.JsonSlurper;
              import java.util.ArrayList;
              import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
              import org.janusgraph.core.JanusGraphFactory;
              import org.janusgraph.core.PropertyKey;
              import org.janusgraph.core.Multiplicity;
              import org.janusgraph.core.schema.SchemaAction;
              import org.janusgraph.core.schema.SchemaStatus;
              import org.janusgraph.core.util.JanusGraphCleanup;
              import org.janusgraph.graphdb.database.StandardJanusGraph;
              import org.janusgraph.graphdb.database.management.ManagementSystem;

              /**
               * Given a json file, populates data into the given JanusGraph DB
               */
              class JanusGraphBuilder {

                  String graphPath;
                  StandardJanusGraph graph;
                  ManagementSystem management;
                  GraphTraversalSource traversal;

                  def dummyData;


                  public void main(String jsonPath, String janusGraphPath) {
                      this.graphPath  = janusGraphPath
                      this.initGraph()
                      this.initialize(jsonPath)
                      this.populate()
                  }

                  public void createEdges(def edges) {
                      println "Preparing edges."
                      edges.each {
                          def relation = it.edge
                          def properties = it.properties
                          def vertexFrom = this.traversal.V().has("uid", it.nodes[0])[0]
                          def vertexTo = this.traversal.V().has("uid", it.nodes[1])[0]
                          def newEdge = vertexFrom.addEdge(relation, vertexTo)
                          properties.each {
                              if (it.key == 'score') {
                                  it.value = Float.parseFloat(it.value.toString())
                              }
                              newEdge.property(it.key, it.value)
                          }
                      }
                      this.graph.tx().commit()
                      println "Created edges successfully"
                  }

                  public void createVertexes(def vertexes) {
                      println "Preparing vertices."
                      vertexes.each {
                          def uniqueLabel = it.labels[0]
                          def properties = it.properties
                          def newVertex = this.graph.addVertex(label, uniqueLabel)
                          properties.each {
                              newVertex.property(it.key, it.value)
                          }
                      }
                      this.graph.tx().commit()
                      println "Created vertices successfully"
                  }

                  public void createSchema() {
                      println "Preparing schema."
                      // Do not create indexes while another transaction is in progress
                      this.graph.tx().rollback()
                      this.management = this.graph.openManagement()
                      this.management.set('ids.block-size', 20000000)

                      // Make property keys
                      def uid = this.management.makePropertyKey("uid").dataType(String.class).make()
                      def name = this.management.makePropertyKey("name").dataType(String.class).make()
                      def number = this.management.makePropertyKey("number").dataType(String.class).make()
                      def email = this.management.makePropertyKey("email").dataType(String.class).make()
                      def score = this.management.makePropertyKey("score").dataType(Float.class).make()
                      def linkedinId = this.management.makePropertyKey("linkedin_id").dataType(String.class).make()
                      def linkedinUrl = this.management.makePropertyKey("profile_url").dataType(String.class).make()
                      def imageUrl = this.management.makePropertyKey("image_url").dataType(String.class).make()
                      def instituteName = this.management.makePropertyKey("institute_name").dataType(String.class).make()
                      def companyName = this.management.makePropertyKey("company_name").dataType(String.class).make()
                      def jobId = this.management.makePropertyKey("job_id").dataType(String.class).make()

                      // Define Vertex Labels
                      this.management.makeVertexLabel("person").make();
                      this.management.makeVertexLabel("candidate").make();
                      this.management.makeVertexLabel("recruiter").make();
                      this.management.makeVertexLabel("employee").make();
                      this.management.makeVertexLabel("linkedin").make();
                      this.management.makeVertexLabel("job").make();
                      this.management.makeVertexLabel("company").make();
                      this.management.makeVertexLabel("institute").make();
                      def phoneV = this.management.makeVertexLabel("phone").make();
                      def emailV = this.management.makeVertexLabel("email").make();

                      // Define Edge Labels
                      this.management.makeEdgeLabel("knows").make();
                      this.management.makeEdgeLabel("has").make();
                      this.management.makeEdgeLabel("provided_by").make();
                      this.management.makeEdgeLabel("studied_at").make();
                      this.management.makeEdgeLabel("worked_at").make();
                      this.management.makeEdgeLabel("posted").make();
                      this.management.makeEdgeLabel("liked").make();
                      this.management.makeEdgeLabel("worked_with").make();
                      this.management.makeEdgeLabel("studied_with").make();
                      this.management.makeEdgeLabel("is_a_match_for").make();

                      // Create indexes
                      this.management.buildIndex('uniqueUid', Vertex.class).addKey(uid).unique().buildCompositeIndex()
                      this.management.buildIndex('uniqueEmail', Vertex.class).addKey(email).indexOnly(emailV).unique().buildCompositeIndex()
                      this.management.buildIndex('uniqueNumber', Vertex.class).addKey(number).indexOnly(phoneV).unique().buildCompositeIndex()
                      this.management.commit()
                      this.management.awaitGraphIndexStatus(this.graph, 'uniqueUid').call()
                      this.management = this.graph.openManagement()
                      this.management.updateIndex(this.management.getGraphIndex('uniqueUid'), SchemaAction.REINDEX).get()

                      this.management.commit()

                      println "Created schema successfully"
                  }

                  public void populate() {
                      // Create db schema
                      this.createSchema()

                      // Create vertexes from the given dummy data
                      def vertexTransaction = this.graph.newTransaction()
                      def vertexes = this.dummyData.vertexes;
                      this.createVertexes(vertexes)
                      vertexTransaction.commit()
                      this.initGraph()

                      def edgeTransaction = this.graph.newTransaction()
                      // Create edges from the given dummy data
                      def edges = this.dummyData.edges;
                      this.createEdges(edges)
                      edgeTransaction.commit()
                      this.initGraph()

                      println "Graph population successfully accomplished. Please hit Ctrl+C to exit."
                  }

                  public void initialize(String jsonPath) {
                      String fileContents = new File(jsonPath).getText('UTF-8')
                      def slurper = new JsonSlurper()
                      def results = slurper.parseText(fileContents)
                      this.dummyData = results;
                      this.resetData()
                  }

                  public void resetData() {
                      // Remove all the data from the storage backend
                      this.graph.close()
                      JanusGraphCleanup.clear(this.graph)
                      this.initGraph()
                  }

                  public void initGraph() {
                      this.graph = JanusGraphFactory.open(this.graphPath)
                      this.traversal = this.graph.traversal()
                  }
              }


              JanusGraphBuilder graphBuilder = new JanusGraphBuilder()
              graphBuilder.main("/tmp/dummy.json", "conf/testconf.properties")

              I have already updated the heap size in the  `JAVA_OPTIONS` environment variable to `-Xmx2048m`. Also here is how my configuration looks like.

              storage.backend=hbase

              ## Google cloud BIGTABLE configuration options
              storage.hbase.ext.hbase.client.connection.impl=com.google.cloud.bigtable.hbase1_0.BigtableConnection

              storage.hostname=localhost
              cache.db-cache = true
              cache.db-cache-clean-wait = 20
              cache.db-cache-time = 180000
              cache.db-cache-size = 0.5

              # Bulk Loading
              storage.batch-loading = true
              ids.block-size=20000000

              I execute the script using the following command:

              ./bin/gremlin.sh -e ~/projects/xonnect/utils/population_scripts/populateJanus.groovy

              The problem is, even before the schema is created it throws out of memory error for java heap space. Following is the output.

              18:44:29,112  INFO BigtableSession:75 - Bigtable options: BigtableOptions{dataHost=bigtable.googleapis.com, tableAdminHost=bigtableadmin.googleapis.com, instanceAdminHost=bigtableadmin.googleapis.com, projectId=formal-theater-175812, instanceId=xonnect, userAgent=hbase-1.2.4, credentialType=DefaultCredentials, port=443, dataChannelCount=10, retryOptions=RetryOptions{retriesEnabled=true, allowRetriesWithoutTimestamp=false, statusToRetryOn=[UNAUTHENTICATED, INTERNAL, ABORTED, UNAVAILABLE, DEADLINE_EXCEEDED], initialBackoffMillis=5, maxElapsedBackoffMillis=60000, backoffMultiplier=2.0, streamingBufferSize=60, readPartialRowTimeoutMillis=60000, maxScanTimeoutRetries=3}, bulkOptions=BulkOptions{asyncMutatorCount=2, useBulkApi=true, bulkMaxKeyCount=25, bulkMaxRequestSize=1048576, autoflushMs=0, maxInflightRpcs=500, maxMemory=190893260, enableBulkMutationThrottling=false, bulkMutationRpcTargetMs=100}, callOptionsConfig=CallOptionsConfig{useTimeout=false, shortRpcTimeoutMs=60000, longRpcTimeoutMs=600000}, usePlaintextNegotiation=false}.
              18:44:30,851  INFO Backend:183 - Initiated backend operations thread pool of size 8
              18:44:35,048  INFO IndexSerializer:85 - Hashing index keys
              18:44:36,910  INFO KCVSLog:744 - Loaded unidentified ReadMarker start time 2017-08-08T13:14:36.898Z into org.janusgraph.diskstorage.log.kcvs.KCVSLog$MessagePuller@35835fa
              Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
              at java.util.Arrays.copyOf(Arrays.java:3332)
              at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
              at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:596)
              at java.lang.StringBuilder.append(StringBuilder.java:190)
              at org.codehaus.groovy.runtime.IOGroovyMethods.getText(IOGroovyMethods.java:886)
              at org.codehaus.groovy.runtime.ResourceGroovyMethods.getText(ResourceGroovyMethods.java:588)
              at org.codehaus.groovy.runtime.dgm$964.invoke(Unknown Source)
              at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite$PojoMetaMethodSiteNoUnwrapNoCoerce.invoke(PojoMetaMethodSite.java:274)
              at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite.call(PojoMetaMethodSite.java:56)
              at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
              at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
              at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
              at JanusGraphBuilder.initialize(Script1.groovy:146)
              at JanusGraphBuilder.main(Script1.groovy:29)
              at JanusGraphBuilder$main.call(Unknown Source)
              at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
              at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
              at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:133)
              at Script1.run(Script1.groovy:168)
              at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:619)
              at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:448)
              at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:421)
              at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:212)
              at org.apache.tinkerpop.gremlin.groovy.jsr223.ScriptExecutor.evaluate(ScriptExecutor.java:55)
              at org.apache.tinkerpop.gremlin.groovy.jsr223.ScriptExecutor.main(ScriptExecutor.java:44)

              Any help will be much appreciated. Thanks.




              Amyth Arora <aroras....@...>
               

              Hi Everyone,

              I am trying to upload some dummy data for testing purposes to JanusGraph (Google Cloud Bigtable backend). I have the following Groovy script, which I run through the Gremlin Console; it creates the schema, indexes, vertices and edges.

              import groovy.json.JsonSlurper;
              import java.util.ArrayList;
              import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
              import org.janusgraph.core.JanusGraphFactory;
              import org.janusgraph.core.PropertyKey;
              import org.janusgraph.core.Multiplicity;
              import org.janusgraph.core.schema.SchemaAction;
              import org.janusgraph.core.schema.SchemaStatus;
              import org.janusgraph.core.util.JanusGraphCleanup;
              import org.janusgraph.graphdb.database.StandardJanusGraph;
              import org.janusgraph.graphdb.database.management.ManagementSystem;

              /**
               * Given a json file, populates data into the given JanusGraph DB
               */
              class JanusGraphBuilder {

                  String graphPath;
                  StandardJanusGraph graph;
                  ManagementSystem management;
                  GraphTraversalSource traversal;

                  def dummyData;


                  public void main(String jsonPath, String janusGraphPath) {
                      this.graphPath  = janusGraphPath
                      this.initGraph()
                      this.initialize(jsonPath)
                      this.populate()
                  }

                  public void createEdges(def edges) {
                      println "Preparing edges."
                      edges.each {
                          def relation = it.edge
                          def properties = it.properties
                          def vertexFrom = this.traversal.V().has("uid", it.nodes[0])[0]
                          def vertexTo = this.traversal.V().has("uid", it.nodes[1])[0]
                          def newEdge = vertexFrom.addEdge(relation, vertexTo)
                          properties.each {
                              if (it.key == 'score') {
                                  it.value = Float.parseFloat(it.value.toString())
                              }
                              newEdge.property(it.key, it.value)
                          }
                      }
                      this.graph.tx().commit()
                      println "Created edges successfully"
                  }

                  public void createVertexes(def vertexes) {
                      println "Preparing vertices."
                      vertexes.each {
                          def uniqueLabel = it.labels[0]
                          def properties = it.properties
                          def newVertex = this.graph.addVertex(label, uniqueLabel)
                          properties.each {
                              newVertex.property(it.key, it.value)
                          }
                      }
                      this.graph.tx().commit()
                      println "Created vertices successfully"
                  }

                  public void createSchema() {
                      println "Preparing schema."
                      // Do not create indexes while another transaction is in progress
                      this.graph.tx().rollback()
                      this.management = this.graph.openManagement()
                      this.management.set('ids.block-size', 20000000)

                      // Make property keys
                      def uid = this.management.makePropertyKey("uid").dataType(String.class).make()
                      def name = this.management.makePropertyKey("name").dataType(String.class).make()
                      def number = this.management.makePropertyKey("number").dataType(String.class).make()
                      def email = this.management.makePropertyKey("email").dataType(String.class).make()
                      def score = this.management.makePropertyKey("score").dataType(Float.class).make()
                      def linkedinId = this.management.makePropertyKey("linkedin_id").dataType(String.class).make()
                      def linkedinUrl = this.management.makePropertyKey("profile_url").dataType(String.class).make()
                      def imageUrl = this.management.makePropertyKey("image_url").dataType(String.class).make()
                      def instituteName = this.management.makePropertyKey("institute_name").dataType(String.class).make()
                      def companyName = this.management.makePropertyKey("company_name").dataType(String.class).make()
                      def jobId = this.management.makePropertyKey("job_id").dataType(String.class).make()

                      // Define Vertex Labels
                      this.management.makeVertexLabel("person").make();
                      this.management.makeVertexLabel("candidate").make();
                      this.management.makeVertexLabel("recruiter").make();
                      this.management.makeVertexLabel("employee").make();
                      this.management.makeVertexLabel("linkedin").make();
                      this.management.makeVertexLabel("job").make();
                      this.management.makeVertexLabel("company").make();
                      this.management.makeVertexLabel("institute").make();
                      def phoneV = this.management.makeVertexLabel("phone").make();
                      def emailV = this.management.makeVertexLabel("email").make();

                      // Define Edge Labels
                      this.management.makeEdgeLabel("knows").make();
                      this.management.makeEdgeLabel("has").make();
                      this.management.makeEdgeLabel("provided_by").make();
                      this.management.makeEdgeLabel("studied_at").make();
                      this.management.makeEdgeLabel("worked_at").make();
                      this.management.makeEdgeLabel("posted").make();
                      this.management.makeEdgeLabel("liked").make();
                      this.management.makeEdgeLabel("worked_with").make();
                      this.management.makeEdgeLabel("studied_with").make();
                      this.management.makeEdgeLabel("is_a_match_for").make();

                      // Create indexes
                      this.management.buildIndex('uniqueUid', Vertex.class).addKey(uid).unique().buildCompositeIndex()
                      this.management.buildIndex('uniqueEmail', Vertex.class).addKey(email).indexOnly(emailV).unique().buildCompositeIndex()
                      this.management.buildIndex('uniqueNumber', Vertex.class).addKey(number).indexOnly(phoneV).unique().buildCompositeIndex()
                      this.management.commit()
                      this.management.awaitGraphIndexStatus(this.graph, 'uniqueUid').call()
                      this.management = this.graph.openManagement()
                      this.management.updateIndex(this.management.getGraphIndex('uniqueUid'), SchemaAction.REINDEX).get()

                      this.management.commit()

                      println "Created schema successfully"
                  }

                  public void populate() {
                      // Create db schema
                      this.createSchema()

                      // Create vertexes from the given dummy data
                      def vertexTransaction = this.graph.newTransaction()
                      def vertexes = this.dummyData.vertexes;
                      this.createVertexes(vertexes)
                      vertexTransaction.commit()
                      this.initGraph()

                      def edgeTransaction = this.graph.newTransaction()
                      // Create edges from the given dummy data
                      def edges = this.dummyData.edges;
                      this.createEdges(edges)
                      edgeTransaction.commit()
                      this.initGraph()

                      println "Graph population successfully accomplished. Please hit Ctrl+C to exit."
                  }

                  public void initialize(String jsonPath) {
                      String fileContents = new File(jsonPath).getText('UTF-8')
                      def slurper = new JsonSlurper()
                      def results = slurper.parseText(fileContents)
                      this.dummyData = results;
                      this.resetData()
                  }

                  public void resetData() {
                      // Remove all the data from the storage backend
                      this.graph.close()
                      JanusGraphCleanup.clear(this.graph)
                      this.initGraph()
                  }

                  public void initGraph() {
                      this.graph = JanusGraphFactory.open(this.graphPath)
                      this.traversal = this.graph.traversal()
                  }
              }


              JanusGraphBuilder graphBuilder = new JanusGraphBuilder()
              graphBuilder.main("/tmp/dummy.json", "conf/testconf.properties")
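
The `initialize` method above loads the entire file into a single `String` and then parses it, so the raw text and the parsed tree both sit on the heap at the same time. As a minimal sketch of an incremental alternative, assuming the data were re-exported with one record per line (the original `/tmp/dummy.json` is a single JSON document, so this layout is an assumption), a reader can hand each line to a callback and keep only one record in memory. The class name `LineRecordReader` and the `handler` callback are hypothetical; parsing each line and the JanusGraph calls are left to the callback.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class LineRecordReader {

    /**
     * Reads the source line by line and passes every non-blank line to the handler.
     * Only the current line is held in memory. Returns the number of records handled.
     */
    public static long forEachRecord(Reader source, Consumer<String> handler) throws IOException {
        long count = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines between records
                }
                handler.accept(line);
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a file with one record per line (hypothetical sample data).
        String sample = "{\"uid\":\"a\"}\n\n{\"uid\":\"b\"}\n";
        List<String> seen = new ArrayList<>();
        long n = forEachRecord(new StringReader(sample), seen::add);
        System.out.println(n + " records: " + seen);
    }
}
```

With a real file, `Files.newBufferedReader(Paths.get(jsonPath), StandardCharsets.UTF_8)` would take the place of the `StringReader`.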

              I have already increased the heap size by setting the `JAVA_OPTIONS` environment variable to `-Xmx2048m`. Here is what my configuration looks like:

              storage.backend=hbase

              ## Google cloud BIGTABLE configuration options
              storage.hbase.ext.hbase.client.connection.impl=com.google.cloud.bigtable.hbase1_0.BigtableConnection
              storage.hbase.ext.google.bigtable.project.id=my-project-id
              storage.hbase.ext.google.bigtable.instance.id=my-instance-id

              storage.hostname=localhost
              cache.db-cache = true
              cache.db-cache-clean-wait = 20
              cache.db-cache-time = 180000
              cache.db-cache-size = 0.5

              # Bulk Loading
              storage.batch-loading = true
              ids.block-size=20000000
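
The settings above tune how JanusGraph handles a bulk load, but the script still adds every vertex (and later every edge) before issuing a single commit. A common complementary approach is to commit in bounded batches so that uncommitted changes do not pile up. The sketch below shows only the counting logic, in plain Java. `BatchCommitter`, the batch size of 1000, and the `commit` callback (which in the script would wrap something like `graph.tx().commit()`) are hypothetical names for illustration, not part of JanusGraph.

```java
public class BatchCommitter {
    private final int batchSize;
    private final Runnable commit;
    private int pending = 0;
    private int commits = 0;

    public BatchCommitter(int batchSize, Runnable commit) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.commit = commit;
    }

    /** Call once after each element is added; commits when the batch is full. */
    public void elementAdded() {
        pending++;
        if (pending >= batchSize) {
            flush();
        }
    }

    /** Commits any remaining elements; call once after the loop. Does nothing if none are pending. */
    public void flush() {
        if (pending == 0) {
            return;
        }
        commit.run();
        commits++;
        pending = 0;
    }

    public int getCommits() {
        return commits;
    }

    public static void main(String[] args) {
        // The lambda stands in for a real transaction commit.
        BatchCommitter committer = new BatchCommitter(1000, () -> System.out.println("commit"));
        for (int i = 0; i < 2500; i++) {
            committer.elementAdded(); // in the script this would follow each addVertex call
        }
        committer.flush(); // commits the final partial batch of 500
        System.out.println(committer.getCommits() + " commits");
    }
}
```

A suitable batch size depends on the backend and the size of each element, so the value here is only a placeholder.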

              I execute the script using the following command:

              ./bin/gremlin.sh -e ~/projects/xonnect/utils/population_scripts/populateJanus.groovy

              The problem is that, even before the schema is created, it throws an out-of-memory error for Java heap space. The output follows:

              18:44:29,112  INFO BigtableSession:75 - Bigtable options: BigtableOptions{dataHost=bigtable.googleapis.com, tableAdminHost=bigtableadmin.googleapis.com, instanceAdminHost=bigtableadmin.googleapis.com, projectId=formal-theater-175812, instanceId=xonnect, userAgent=hbase-1.2.4, credentialType=DefaultCredentials, port=443, dataChannelCount=10, retryOptions=RetryOptions{retriesEnabled=true, allowRetriesWithoutTimestamp=false, statusToRetryOn=[UNAUTHENTICATED, INTERNAL, ABORTED, UNAVAILABLE, DEADLINE_EXCEEDED], initialBackoffMillis=5, maxElapsedBackoffMillis=60000, backoffMultiplier=2.0, streamingBufferSize=60, readPartialRowTimeoutMillis=60000, maxScanTimeoutRetries=3}, bulkOptions=BulkOptions{asyncMutatorCount=2, useBulkApi=true, bulkMaxKeyCount=25, bulkMaxRequestSize=1048576, autoflushMs=0, maxInflightRpcs=500, maxMemory=190893260, enableBulkMutationThrottling=false, bulkMutationRpcTargetMs=100}, callOptionsConfig=CallOptionsConfig{useTimeout=false, shortRpcTimeoutMs=60000, longRpcTimeoutMs=600000}, usePlaintextNegotiation=false}.
              18:44:30,851  INFO Backend:183 - Initiated backend operations thread pool of size 8
              18:44:35,048  INFO IndexSerializer:85 - Hashing index keys
              18:44:36,910  INFO KCVSLog:744 - Loaded unidentified ReadMarker start time 2017-08-08T13:14:36.898Z into org.janusgraph.diskstorage.log.kcvs.KCVSLog$MessagePuller@35835fa
              Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
              at java.util.Arrays.copyOf(Arrays.java:3332)
              at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
              at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:596)
              at java.lang.StringBuilder.append(StringBuilder.java:190)
              at org.codehaus.groovy.runtime.IOGroovyMethods.getText(IOGroovyMethods.java:886)
              at org.codehaus.groovy.runtime.ResourceGroovyMethods.getText(ResourceGroovyMethods.java:588)
              at org.codehaus.groovy.runtime.dgm$964.invoke(Unknown Source)
              at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite$PojoMetaMethodSiteNoUnwrapNoCoerce.invoke(PojoMetaMethodSite.java:274)
              at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite.call(PojoMetaMethodSite.java:56)
              at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
              at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
              at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
              at JanusGraphBuilder.initialize(Script1.groovy:146)
              at JanusGraphBuilder.main(Script1.groovy:29)
              at JanusGraphBuilder$main.call(Unknown Source)
              at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
              at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
              at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:133)
              at Script1.run(Script1.groovy:168)
              at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:619)
              at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:448)
              at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:421)
              at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:212)
              at org.apache.tinkerpop.gremlin.groovy.jsr223.ScriptExecutor.evaluate(ScriptExecutor.java:55)
              at org.apache.tinkerpop.gremlin.groovy.jsr223.ScriptExecutor.main(ScriptExecutor.java:44)
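
The top frames of the trace show a `StringBuilder` growing its backing array inside `getText`, which means the old and the new array exist side by side during the copy. The following back-of-the-envelope calculation, in plain Java, estimates that peak. It rests on assumptions not stated in this message: a file of roughly 325 MB (the size reported elsewhere in this thread) made up mostly of single-byte characters, strings stored as 2-byte `char`s as on Java 8, a builder that starts at the default capacity of 16, and growth to twice the old capacity plus 2. The class and method names are hypothetical.

```java
public class HeapEstimate {

    /** Capacity (in chars) a builder ends with after growing from `initial` until it can hold `chars`. */
    static long finalCapacity(long initial, long chars) {
        long capacity = initial;
        while (capacity < chars) {
            capacity = capacity * 2 + 2; // assumed growth rule: double, plus 2
        }
        return capacity;
    }

    /** Bytes live during the last growth step: old array plus new array, at 2 bytes per char. */
    static long peakBytesDuringLastGrowth(long initial, long chars) {
        long last = finalCapacity(initial, chars);
        if (last == initial) {
            return initial * 2; // no growth was needed
        }
        long previous = (last - 2) / 2; // invert the growth rule
        return (previous + last) * 2;
    }

    public static void main(String[] args) {
        long chars = 325_000_000L;            // assumed: about one char per byte of the file
        long maxHeap = 2048L * 1024 * 1024;   // -Xmx2048m
        long peak = peakBytesDuringLastGrowth(16, chars);
        System.out.printf("peak during copy: %d MiB of a %d MiB heap%n", peak >> 20, maxHeap >> 20);
    }
}
```

Under these assumptions the two arrays alone take about 1.7 GiB of the 2 GiB limit, before counting anything else the console has loaded, which is consistent with the failure in `Arrays.copyOf`. The later `toString()` and `parseText` calls would need further copies on top of that.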

              Any help will be much appreciated. Thanks.


              Amyth (twitter.com/mytharora)