REINDEXING Big Graph
Abhay Pandit
Hi Team,

I am currently trying to REINDEX with Hadoop MapReduce, following the reference in the JanusGraph documentation:
https://docs.janusgraph.org/index-management/index-reindexing/#reindex-example-on-mapreduce

I wrote my implementation in Java. It runs fine, but only in local mode. To run in cluster mode I need to pass Hadoop configuration, and the documentation does not make clear how to pass any external configuration so that the job runs on Hadoop or on a YARN cluster.

If anybody has tried this against a big graph, for example one with a billion nodes, can you guide me?

My Java implementation:

    JanusGraph janusGraph = JanusGraphFactory.open(janusConfig);
    JanusGraphManagement management;
    management = janusGraph.openManagement();
    JanusGraphIndex graphIndex = management.getGraphIndex("AddressId");
    MapReduceIndexManagement mapReduceIndexManagement = new MapReduceIndexManagement(janusGraph);
    ScanMetrics scanMetrics = mapReduceIndexManagement.updateIndex(graphIndex, SchemaAction.REINDEX).get();

janusConfig:

    gremlin.graph=org.janusgraph.core.JanusGraphFactory
    storage.backend=cql
    storage.hostname=127.0.0.1
    storage.port=9042
    storage.keyspace=janusgraph
    cache.db-cache = false
    cache.db-cache-clean-wait = 20
    cache.db-cache-time = 180000
    cache.db-cache-size = 0.25
    index.search.backend=elasticsearch
    index.search.hostname=127.0.0.1

Console log:

    [INFO] 2021-02-17 13:37:55,173 LocalJobRunner Map Task Executor #0 org.apache.hadoop.mapred.LocalJobRunner - {} -
    [INFO] 2021-02-17 13:37:56,141 task-1 org.apache.hadoop.mapreduce.Job - {} - map 14% reduce 0%
    [INFO] 2021-02-17 13:37:57,384 LocalJobRunner Map Task Executor #0 org.apache.hadoop.mapred.Task - {} - Task:attempt_local67526867_0001_m_000035_0 is done. And is in the process of committing
    [INFO] 2021-02-17 13:37:57,384 LocalJobRunner Map Task Executor #0 org.apache.hadoop.mapred.LocalJobRunner - {} - map
    [INFO] 2021-02-17 13:37:57,384 LocalJobRunner Map Task Executor #0 org.apache.hadoop.mapred.Task - {} - Task 'attempt_local67526867_0001_m_000035_0' done.
    [INFO] 2021-02-17 13:37:57,385 LocalJobRunner Map Task Executor #0 org.apache.hadoop.mapred.Task - {} - Final Counters for attempt_local67526867_0001_m_000035_0: Counters: 16

Thanks,
Abhay
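
Note on the log: the LocalJobRunner entries appear when Hadoop's mapreduce.framework.name is left at its default value, "local". Below is a minimal sketch, using only the Java standard library, of the standard Hadoop properties that usually have to change for a YARN run. The class name, method names, and host values are hypothetical placeholders. How these settings reach the reindex job is an assumption to verify against the JanusGraph version in use: one route is copying each entry into an org.apache.hadoop.conf.Configuration with Configuration.set(name, value), if your MapReduceIndexManagement accepts one. Another is putting the cluster's site XML files on the classpath.

```java
import java.util.Properties;

public class ClusterHadoopSettings {

    // Standard Hadoop property names; the values supplied below are placeholders.
    static final String FRAMEWORK = "mapreduce.framework.name";
    static final String DEFAULT_FS = "fs.defaultFS";
    static final String RM_HOST = "yarn.resourcemanager.hostname";

    /** Collects the settings that move a job from the local runner to YARN. */
    static Properties clusterSettings(String defaultFs, String resourceManagerHost) {
        Properties p = new Properties();
        p.setProperty(FRAMEWORK, "yarn");
        p.setProperty(DEFAULT_FS, defaultFs);
        p.setProperty(RM_HOST, resourceManagerHost);
        return p;
    }

    /** Hadoop falls back to "local" when mapreduce.framework.name is unset. */
    static boolean runsLocally(Properties p) {
        return "local".equals(p.getProperty(FRAMEWORK, "local"));
    }

    public static void main(String[] args) {
        // Hypothetical host names; replace with the real NameNode and ResourceManager.
        Properties p = clusterSettings("hdfs://namenode.example:8020", "rm.example");
        p.forEach((k, v) -> System.out.println(k + "=" + v));
        System.out.println("runs locally: " + runsLocally(p));
    }
}
```

If the job still logs LocalJobRunner after these values are supplied, the settings are probably not reaching the job's Hadoop configuration at all, which is worth checking before tuning anything else.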