HBase ScannerTimeoutException
HadoopMarc <m.c.d...@...>
Hi Joseph,

If you want to process all vertices (a map operation), you need an OLAP query (which currently works only for read-only tasks):

http://docs.janusgraph.org/latest/hadoop-tp3.html
http://tinkerpop.apache.org/docs/3.2.3/reference/#sparkgraphcomputer

If you want to filter the total set of vertices, you need an index on one or more properties of your vertices:

http://docs.janusgraph.org/latest/indexes.html

What do you want to accomplish, apart from looping over the vertices in your graph?

HTH, Marc

On Thursday, May 11, 2017 at 16:18:50 UTC+2, Joseph Obernberger wrote:
> Hi All - I'm using a loop to do a task on all vertices in fairly large
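As a rough illustration of the OLAP approach linked above, a traversal can be routed through SparkGraphComputer from Java. This is a sketch only: the properties file name and its contents are assumptions, and it needs the janusgraph-hadoop and TinkerPop Spark dependencies on the classpath (see the hadoop-tp3 docs for a working HBase configuration).

```java
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer;
import org.apache.tinkerpop.gremlin.structure.Graph;
import org.apache.tinkerpop.gremlin.structure.util.GraphFactory;

public class OlapCount {
    public static void main(String[] args) {
        // Hypothetical HadoopGraph configuration pointing at the HBase-backed graph.
        Graph graph = GraphFactory.open("conf/read-hbase.properties");
        // Submit the traversal to Spark instead of iterating vertex by vertex
        // through a single HBase scanner.
        GraphTraversalSource g = graph.traversal().withComputer(SparkGraphComputer.class);
        Long count = g.V().count().next();
        System.out.println("vertices: " + count);
    }
}
```

Note that, as Marc says, this path is read-only: a VertexProgram is required to mutate the graph from OLAP.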
Joe Obernberger <joseph.o...@...>
Hi Marc - thank you for the reply. I've written Java code that takes some data and uses it to generate a graph. After that data is put into JanusGraph, I loop over all the nodes in the graph so that I can query an external database and add edges/nodes where appropriate for this particular task. This is all in Java. I've not used an OLAP query, but it looks like it comes straight from Gremlin, so I should be able to do it from Java? Still investigating.

-Joe

On 5/11/2017 11:14 AM, HadoopMarc wrote:
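For concreteness, the enrichment loop Joe describes might look roughly like the following sketch. Every identifier here (the properties file, the "id" property, the lookupRelated external-database call) is invented for illustration; this is also the pattern that can trigger the ScannerTimeoutException in the subject line, because the full g.V() scan holds an HBase scanner open while slow per-vertex work runs.

```java
import java.util.Iterator;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.structure.Vertex;
import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphFactory;

public class EnrichLoop {
    public static void main(String[] args) {
        // Hypothetical JanusGraph-over-HBase configuration file.
        JanusGraph graph = JanusGraphFactory.open("conf/janusgraph-hbase.properties");
        GraphTraversalSource g = graph.traversal();

        // Full scan of all vertices: each hasNext()/next() pulls from an
        // HBase scanner, which times out if per-vertex work takes too long.
        Iterator<Vertex> vertices = g.V();
        while (vertices.hasNext()) {
            Vertex v = vertices.next();
            // Hypothetical external-database lookup for related ids.
            for (String relatedId : lookupRelated(v.<String>value("id"))) {
                Vertex other = g.V().has("id", relatedId).tryNext()
                        .orElseGet(() -> g.addV().property("id", relatedId).next());
                v.addEdge("related", other);
            }
        }
        graph.tx().commit();
        graph.close();
    }

    // Placeholder for the external-database query described in the thread.
    static Iterable<String> lookupRelated(String id) {
        throw new UnsupportedOperationException("external DB call goes here");
    }
}
```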
HadoopMarc <m.c.d...@...>
Hi Joseph,

It sounds like OLAP is not going to help you here (you would need a Gremlin database query step or a custom VertexProgram). You need a JanusGraph index on a unique property of your vertices. Then a query like

g.V().has('yourprop', 'yourpropvalue').next()

will return the vertex using the index, rather than doing an HBase table scan as you did before. This approach also allows you to make your code multi-threaded, as long as you add vertices and edges in the right order.

The JanusGraph and TinkerPop docs on bulk loading might provide further insight:

http://docs.janusgraph.org/latest/bulk-loading.html
http://tinkerpop.apache.org/docs/current/reference/#_loading_with_bulkloadervertexprogram

The BulkLoaderVertexProgram is in effect an OLAP approach, but it assumes that you have all your data organized before you start loading the graph.

Cheers, Marc

On Friday, May 12, 2017 at 21:34:01 UTC+2, Joseph Obernberger wrote:
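Marc's index suggestion can be sketched in Java roughly as follows. The property name, index name, and properties file are placeholders; the schema block is a one-time setup step and must run before the indexed property is used (see the indexing docs for reindexing an existing property).

```java
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.structure.Vertex;
import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphFactory;
import org.janusgraph.core.PropertyKey;
import org.janusgraph.core.schema.JanusGraphManagement;

public class IndexedLookup {
    public static void main(String[] args) {
        // Hypothetical JanusGraph-over-HBase configuration file.
        JanusGraph graph = JanusGraphFactory.open("conf/janusgraph-hbase.properties");

        // One-time schema setup: a composite index backs exact-match has() lookups.
        JanusGraphManagement mgmt = graph.openManagement();
        PropertyKey key = mgmt.makePropertyKey("yourprop").dataType(String.class).make();
        mgmt.buildIndex("byYourProp", Vertex.class)
            .addKey(key)
            .unique()              // enforce one vertex per property value
            .buildCompositeIndex();
        mgmt.commit();

        // Later: this lookup hits the index instead of scanning the HBase table.
        GraphTraversalSource g = graph.traversal();
        Vertex v = g.V().has("yourprop", "yourpropvalue").next();
        graph.close();
    }
}
```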