Re: Indexes on edge properties
Jason Plurad <plu...@...>
Did you create the index before inserting data? Did you make sure to commit the management transaction, i.e. mgmt.commit()? In this example below, you can see a name property getting created and indexed for edges.
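[The example did not survive in this digest; a sketch of the sort of definition meant, reconstructed here — the edge label and index name are illustrative:]

    mgmt = graph.openManagement()
    name = mgmt.makePropertyKey('name').dataType(String.class).make()
    follows = mgmt.makeEdgeLabel('follows').make()
    mgmt.buildIndex('byNameOnEdges', Edge.class).addKey(name).buildCompositeIndex()
    mgmt.commit()

    // add an edge with that property and verify the index is hit
    v1 = graph.addVertex(); v2 = graph.addVertex()
    v1.addEdge('follows', v2).property('name', 'x')
    graph.tx().commit()
    g = graph.traversal()
    g.E().has('name', 'x')   // should no longer warn about a full scan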
For the "iterating over all vertices", the reason is because the message is hardcoded to say "vertices". Perhaps "elements" might be better. -- Jason
On Tuesday, July 4, 2017 at 9:47:29 AM UTC-4, Thijs Broersen wrote:
|
|
Indexes on edge properties
Thijs Broersen <mht.b...@...>
I have set an index on an edge property, but when I execute my query
it does not use the index and gives me: "Query requires iterating over all vertices..." I created the index using:
Any suggestions? And why does it say "iterating over all vertices" instead of "iterating over all edges"?
|
|
Handling backend connection issues on Gremlin-Server start up
Carlos <512.qua...@...>
So I'm using Cassandra as my backend, and I've noticed that if I accidentally start my services out of order, Gremlin-Server will still start up successfully but complain about not being able to connect to Cassandra. It will still listen for connections and proceed to accept them, but any traversal request just results in an error being returned. a) Is it normal behavior for Gremlin-Server to continue even though the selected backend cannot be contacted on start-up? b) If it is, is there a way to send a command to Gremlin-Server so it will attempt a reconnection once I know Cassandra is up?
|
|
Use of mgmt.setTTL to expire edges
ke...@...
Hi, I am experimenting with using the mgmt.setTTL option to automatically expire edges after 7 days. This is working well and generating much less overhead than trying to execute a task that finds and drops the edges. However, this option does not drop the entries from external indexes in Elasticsearch. I imagine that this is because the TTL is passed directly to the storage backend (Cassandra in my case) and so the expiry only happens there. Is this working as expected? Are there plans to sync the deletions in Elasticsearch? Or should I plan to manually run a purge from Elasticsearch in line with the 7-day expiry in Cassandra? Thanks for any of your views! Kevin
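[For reference, the kind of TTL definition being described looks roughly like this — a minimal sketch, with the edge label 'visited' as an illustrative assumption; JanusGraph's setTTL takes a java.time.Duration:]

    import java.time.Duration

    mgmt = graph.openManagement()
    visited = mgmt.makeEdgeLabel('visited').make()
    mgmt.setTTL(visited, Duration.ofDays(7))   // expiry is enforced by the storage backend (Cassandra here)
    mgmt.commit()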
|
|
Re: keyspaces in JanusGraph
Jean-Baptiste Musso <jbm...@...>
Hello, Side note to your question: I suggest that you use https://www.npmjs.com/package/gremlin instead, as the gremlin-client package is deprecated (unsure if that shows up in the log when installing). Jean-Baptiste
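[In practice that switch is just:]

    npm uninstall gremlin-client
    npm install gremlin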
On Wed, 28 Jun 2017 at 17:52, Peter Musial <pmmu...@...> wrote:
-- Jean-Baptiste
|
|
Re: keyspaces in JanusGraph
Ted Wilmes <twi...@...>
Hi Peter, You can set up JanusGraph to use a keyspace of your choosing. See the Cassandra section of the configuration reference for the relevant properties: http://docs.janusgraph.org/latest/config-ref.html#_storage_cassandra. --Ted
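[A minimal sketch of the relevant properties; the keyspace name 'mygraph' is illustrative:]

    storage.backend=cassandrathrift
    storage.hostname=127.0.0.1
    storage.cassandra.keyspace=mygraph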
On Wednesday, June 28, 2017 at 10:52:04 AM UTC-5, Peter wrote:
|
|
Re: Support for ES 5 and Cassandra datastax driver
Ted Wilmes <twi...@...>
Hi Mountu, Elasticsearch 5.x and a new CQL storage adapter will be in the next release, 0.2.0. Thanks, Ted
On Sunday, July 2, 2017 at 6:44:42 AM UTC-5, Mountu Jinwala wrote:
|
|
Re: java.io.EOFException in kryo+blvp error in bulk loading
marc.d...@...
Hi Eliz and Meng, Did the sequence of gremlin commands work for the tinkerpop-modern.kryo and grateful-dead.kryo example files? How did you create the test.kryo file? Marc
On Wednesday, June 28, 2017 at 15:02:27 UTC+2, Elizabeth wrote:
|
|
Support for ES 5 and Cassandra datastax driver
Mountu Jinwala <maji...@...>
Hi, does JanusGraph support Elasticsearch 5.4 and Cassandra using the DataStax CQL driver instead of the Astyanax client?
|
|
Re: Who is using JanusGraph in production?
Kelvin Lawrence <kelvin....@...>
Hi Jimmy, as you would expect, here at IBM we have a lot of projects underway that will use JanusGraph. I try not to do product ads on open source lists, but for sure we are adopters; in fact, I have the Gremlin console up in front of me, connected to a JanusGraph, as I type this :-) Cheers, Kelvin
On Friday, April 7, 2017 at 8:15:31 AM UTC-5, Jimmy wrote:
|
|
keyspaces in JanusGraph
Peter Musial <pmmu...@...>
Hi All, Cassandra allows the definition of multiple keyspaces. I am using Node.js with the gremlin-client module (npm install gremlin-client) to handle query execution. Although I know how to set up a keyspace, it is not clear to me how to initialize the Gremlin client to use a specific keyspace. Is there a programmatic way of selecting a keyspace, or is it only possible when loading gremlin-server with a specific configuration file? Thanks, Peter
|
|
java.io.EOFException in kryo+blvp error in bulk loading
Elizabeth <hlf...@...>
Hi all, I was using the Kryo format and BulkLoaderVertexProgram to load large files into JanusGraph and encountered an error:

gremlin> hdfs.copyFromLocal('data/test.kryo','data/test.kryo')
==>null
gremlin> graph = GraphFactory.open('conf/hadoop-graph/hadoop-load.properties')
==>hadoopgraph[gryoinputformat->gryooutputformat]
gremlin> blvp = BulkLoaderVertexProgram.build().writeGraph('conf/janusgraph-hbase-es.properties').create(graph)
==>BulkLoaderVertexProgram[bulkLoader=IncrementalBulkLoader, vertexIdProperty=bulkLoader.vertex.id, userSuppliedIds=false, keepOriginalIds=true, batchSize=0]
gremlin> result = graph.compute(SparkGraphComputer).program(blvp).submit().get()
20:21:32 ERROR org.apache.spark.executor.Executor - Exception in task 0.0 in stage 0.0 (TID 0)
java.io.EOFException
    at java.io.DataInputStream.readByte(DataInputStream.java:267)
    at org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoRecordReader.seekToHeader(GryoRecordReader.java:93)
    at org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoRecordReader.initialize(GryoRecordReader.java:85)
    at org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoInputFormat.createRecordReader(GryoInputFormat.java:38)

Has anyone had this error before? Please help me with this last step! Any idea is appreciated! Best, Meng
|
|
Loading 10k nodes on Janusgraph/BerkeleyDB
Damien Seguy <damie...@...>
Hi, I'm running JanusGraph 0.1.1 on OS X, with BerkeleyDB as the backend and -Xms256m / -Xmx5g. I'm trying to load GraphSON files of various sizes into Janus. When a file is below 10k nodes, loading usually goes well; it is much faster with 200 tokens than with 9000 (sounds normal). When I reach 10k tokens, something goes wrong and BerkeleyDB emits a lot of errors:

176587 [pool-6-thread-1] WARN org.janusgraph.diskstorage.log.kcvs.KCVSLog - Could not read messages for timestamp [2017-06-27T16:28:42.502Z] (this read will be retried)
org.janusgraph.core.JanusGraphException: Could not execute operation due to backend exception
Caused by: com.sleepycat.je.ThreadInterruptedException: (JE 7.3.7) Environment must be closed, caused by: com.sleepycat.je.ThreadInterruptedException: Environment invalid because of previous exception: (JE 7.3.7) db/berkeley java.lang.InterruptedException THREAD_INTERRUPTED: InterruptedException may cause incorrect internal state, unable to continue. Environment is invalid and must be closed.

The load script is simple: graph.io(IoCore.graphson()).readGraph("/tmp/file.graphson"); There are no indexes (yet). Sometimes I manage to query the graph over another connection (the loading never ends), and g.V().count() returns 10000. This looks like a transaction/batch-size issue, but I don't know where to go with that information. I'm sure there is something huge that I'm missing. Any pointer would be helpful. Damien Seguy
|
|
Re: MapReduceIndexManagement reindex not completing successfully
Nigel Brown <nigel...@...>
|
|
MapReduceIndexManagement reindex not completing successfully
nigel...@...
I am using a snapshot build, janusgraph-0.2.0-SNAPSHOT-hadoop2, and I am trying to reindex a mixed index using a MapReduce job.
This starts up and tries to do some work (I have tried stepping through the code). There is a warning at start-up:
I don't think that is relevant. After a short time I get multiple warnings like this:
Eventually the job fails:
Other MapReduce jobs seem to run (e.g. PageRank and some of the other demos), and I can reindex with the management API, but I am assuming our graphs will get too big for that. This is a small graph with a few thousand nodes, with Cassandra running locally on one machine. Any comments or hints on getting this running would be most welcome.
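[For reference, the documented MapReduce reindex pattern looks roughly like this; the index name 'mixedIdx' is illustrative:]

    mgmt = graph.openManagement()
    mr = new MapReduceIndexManagement(graph)
    mr.updateIndex(mgmt.getGraphIndex('mixedIdx'), SchemaAction.REINDEX).get()
    mgmt.commit()

    // the non-MapReduce management API alternative for small graphs:
    // mgmt.updateIndex(mgmt.getGraphIndex('mixedIdx'), SchemaAction.REINDEX).get()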
|
|
Re: professional support for JanusGraph
Peter Musial <pmmu...@...>
Good point. Thank you.
On Monday, June 26, 2017 at 2:28:04 PM UTC-4, Kelvin Lawrence wrote:
|
|
Re: professional support for JanusGraph
Kelvin Lawrence <kelvin....@...>
Hi Peter, I try not to do product ads on open source mailing lists, so I'll just mention, in case others find this thread, that there are definitely going to be announcements in this area from at least one company that I am familiar with. Cheers, Kelvin
On Friday, June 23, 2017 at 10:30:12 AM UTC-5, Peter Musial wrote:
|
|
Re: bulk loading error
HadoopMarc <m.c.d...@...>
And this was the answer that Eliz referred to above: Hi Eliz, Good to hear that you are making progress. I do not see this post on the gremlin users list. Would you be so kind as to post it there? I'll then add the answers below. As to your questions:
On Monday, June 26, 2017 at 15:30:03 UTC+2, Ted Wilmes wrote:
|
|
Re: bulk loading error
Ted Wilmes <twi...@...>
Hi Eliz, For your first code snippet, you'll need to add a periodic commit every X vertices instead of committing only after you've loaded the whole file. That X will vary depending on your hardware, etc., but you can experiment and find what gives you the best performance; I'd suggest starting at 100 and going from there. Once you get that working, you could try loading data in parallel by spinning up multiple threads that are addV'ing and periodically committing. For the second approach, using the TinkerPop BulkLoaderVertexProgram, you do not need to download TinkerPop separately. Looking at your stack trace, I think you may just be missing a bit when you constructed the vertex program. Did you call create at the end of its construction, like in this little snippet?

    blvp = BulkLoaderVertexProgram.build().
               bulkLoader(OneTimeBulkLoader).
               writeGraph(writeGraphConf).
               create(modern)

create takes the input graph that you're reading from as an argument. --Ted
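[To make the periodic-commit suggestion concrete, a minimal sketch; the file path, vertex label, and property key are assumptions, not from the thread:]

    g = graph.traversal()
    batchSize = 100
    count = 0
    new File('/tmp/uids.txt').eachLine { line ->
        g.addV('userId').property('uid', line.trim() as Long).iterate()
        if (++count % batchSize == 0) graph.tx().commit()   // periodic commit
    }
    graph.tx().commit()   // commit the final partial batch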
On Sunday, June 25, 2017 at 8:48:57 PM UTC-5, Elizabeth wrote:
|
|
bulk loading error
Elizabeth <hlf...@...>
Hi Marc, This is for your request to post here :) Thanks so much! I indeed followed "the powers of ten" and made the load even simpler -- I do not check whether the vertex already exists; I have done that beforehand. Here is the code, just readLine and addVertex row by row:

def loadTestSchema(graph) {
    g = graph.traversal()
    t = System.
    new File("/home/dev/
    graph.tx().commit()
    u = System.
    print u/1000 + " seconds \n"
    g = graph.traversal()
    g.V().has('uid', 1)
}

The schema is as follows:

def defineTestSchema(graph) {
    mgmt = graph.openManagement()
    g = graph.traversal()
    // vertex labels
    userId = mgmt.makeVertexLabel("userId")
    // edge labels
    relatedby = mgmt.makeEdgeLabel("relatedby"
    // vertex and edge properties
    uid = mgmt.makePropertyKey("uid").
    // global indices
    //mgmt.buildIndex("byuid", Vertex.class).addKey(uid).
    mgmt.buildIndex("byuid", Vertex.class).addKey(uid).
    mgmt.commit()
    //mgmt = graph.openManagement()
    //mgmt.updateIndex(mgmt.
    //mgmt.commit()
}

The configuration file is janusgraph-hbase-es.properties:

gremlin.graph=org.janusgraph.
storage.backend=hbase
storage.batch-loading=true
schema.default=none
storage.hostname=127.0.0.1
cache.db-cache = true
cache.db-cache-clean-wait = 20
cache.db-cache-time = 180000
cache.db-cache-size = 0.5
index.search.elasticsearch.
index.search.backend=
index.search.hostname=127.0.0.

However, the loading time is still very long:

100         0.026 seconds
10k         49.001 seconds
100k        35.827 seconds
1 million   379.05 seconds
10 million: error

gremlin> loadTestSchema(graph)
15:59:27 WARN org.janusgraph.
GC overhead limit exceeded
Type ':help' or ':h' for help.
Display stack trace? [yN]y
java.lang.OutOfMemoryError: GC overhead limit exceeded

What I am wondering is:
1) Why does bulk loading seem not to work, even though I have already set storage.batch-loading=
2) How can I solve the GC overhead limit exceeded error?
3) At the same time, I am using Kryo + BulkLoaderVertexProgram to load, and the last step failed:

gremlin> graph.compute(
No signature of method: org.apache.tinkerpop.gremlin.
Possible solutions: program(org.apache.tinkerpop.

Do I need to install TinkerPop 3 besides JanusGraph to use this graph.compute( ?

Many thanks! Eliz
|
|