Re: Backup & Restore of JanusGraph Data with Mixed Index Backend (Elasticsearch)
Boxuan Li
Hi Florian,
JanusGraph's philosophy is that your primary storage (ScyllaDB in your case) is the authoritative source of truth, and that inconsistency between the mixed index backend and the storage layer is tolerable. For example, a transaction succeeds if the data is persisted in the primary storage but not in the mixed index backend. To fix such inconsistencies, you can periodically run the reindex OLAP job, and you can set up the transaction recovery process as described in https://docs.janusgraph.org/advanced-topics/recovery/#transaction-failure. For your use case, I would suggest running the reindex job after you restore the data.
Cheers, Boxuan
|
|
Re: Configured graph factory not working after making changes to gremlin-server.yaml
Sai Supraj R
Hi Marc, I have listed all the properties in the config file. I am not sure why the configuration is not applied when Gremlin Server is restarted.
gremlin> :remote connect tinkerpop.server conf/remote.yaml session
==>Configured localhost/127.0.0.1:8182-[35b35e81-8881-420e-9a6a-092114b96202]
gremlin> :remote console
==>All scripts will now be sent to Gremlin Server - [localhost/127.0.0.1:8182]-[35b35e81-8881-420e-9a6a-092114b96202] - type ':remote console' to return to local mode
gremlin> map = new HashMap();
gremlin> ConfiguredGraphFactory.getGraphNames()
gremlin> ConfiguredGraphFactory.open("******")
Please create configuration for this graph using the ConfigurationManagementGraph#createConfiguration API.
Type ':help' or ':h' for help.
Display stack trace? [yN]N
Thanks
Sai
On Fri, Apr 30, 2021 at 9:30 AM Sai Supraj R via lists.lfaidata.foundation <suprajratakonda=gmail.com@...> wrote:
|
|
Re: Configured graph factory not working after making changes to gremlin-server.yaml
Sai Supraj R
Hi Marc, I tried commenting it out and setting it to false, but I got the same error message.
gremlin> ConfiguredGraphFactory.createConfiguration(new MapConfiguration(map));
Must provide vertex id
Type ':help' or ':h' for help.
Display stack trace? [yN]N
Thanks
Sai
On Fri, Apr 30, 2021 at 2:34 AM <hadoopmarc@...> wrote: Hi Sai,
|
|
Backup & Restore of JanusGraph Data with Mixed Index Backend (Elasticsearch)
florian.caesar
Hi, Florian
|
|
Re: Configured graph factory not working after making changes to gremlin-server.yaml
hadoopmarc@...
Hi Sai,
I suspect this is related to your setting:
#do not auto generate graph vertex id
graph.set-vertex-id=true
Can you try without?
Best wishes, Marc
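Before restarting the server it can help to confirm what the properties file actually sets. The following is only a minimal sketch in plain Java, assuming the file uses standard Java properties syntax; the class and method names are hypothetical and it does not call any JanusGraph API. It shows that a line commented out with # is ignored, while an active graph.set-vertex-id=true line is picked up.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of JanusGraph: parses properties text and
// reports whether graph.set-vertex-id is actively set to true.
class VertexIdSettingCheck {
    static boolean customVertexIdsEnabled(String propertiesText) {
        Properties props = new Properties();
        try {
            // Lines starting with '#' are treated as comments and skipped.
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Absent key is treated as "false" for this check only (an assumption of the sketch).
        String value = props.getProperty("graph.set-vertex-id", "false");
        return Boolean.parseBoolean(value.trim());
    }

    public static void main(String[] args) {
        String active = "#do not auto generate graph vertex id\ngraph.set-vertex-id=true\n";
        String commentedOut = "#graph.set-vertex-id=true\n";
        System.out.println(customVertexIdsEnabled(active));
        System.out.println(customVertexIdsEnabled(commentedOut));
    }
}
```

This only inspects the local file. Whether a changed value takes effect on an already initialized graph depends on the mutability level of the option, which this sketch does not cover.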
|
|
Re: Transaction Cache vs. DB Cache Questions
Hi Joe,
just as Boxuan already said, the cache size is crucial for this task. But assuming your graph is large, only a fraction of the vertices will fit into the cache even if scaled appropriately. The problem that I see here is that for large graphs, the chance of finding a vertex in the cache is small, if you iterate over your queries in a random order. If you can come up with an execution order where vertices which have a similar 2-hop neighborhood are processed in temporal proximity to each other, that would greatly improve the cache hit rate. Best regards, Florian
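The effect of execution order on the hit rate can be illustrated with a toy simulation. This is only a sketch under loud assumptions: the "graph" is a ring in which the 2-hop neighbourhood of vertex v is v-2..v+2, and the cache is a plain LRU map built on java.util.LinkedHashMap. It is not JanusGraph's cache implementation, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Toy model only: compares LRU hit rates for locality-ordered vs. shuffled queries.
class CacheLocalityDemo {
    /** Minimal LRU cache using LinkedHashMap's access-order mode. */
    static final class LruCache extends LinkedHashMap<Integer, Boolean> {
        private final int capacity;

        LruCache(int capacity) {
            super(16, 0.75f, true); // true = order entries by most recent access
            this.capacity = capacity;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<Integer, Boolean> eldest) {
            return size() > capacity; // evict the least recently used entry
        }
    }

    /** Assumed toy topology: vertices on a ring, 2-hop neighbourhood is v-2..v+2. */
    static List<Integer> twoHop(int v, int n) {
        List<Integer> out = new ArrayList<>();
        for (int d = -2; d <= 2; d++) {
            out.add(Math.floorMod(v + d, n));
        }
        return out;
    }

    /** Runs one 2-hop "query" per start vertex and returns hits / lookups. */
    static double hitRate(List<Integer> queryOrder, int n, int cacheSize) {
        LruCache cache = new LruCache(cacheSize);
        long hits = 0;
        long lookups = 0;
        for (int v : queryOrder) {
            for (int u : twoHop(v, n)) {
                lookups++;
                if (cache.get(u) != null) {
                    hits++;
                } else {
                    cache.put(u, Boolean.TRUE); // simulate loading from storage
                }
            }
        }
        return (double) hits / lookups;
    }

    static double orderedHitRate(int n, int cacheSize) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            order.add(i);
        }
        return hitRate(order, n, cacheSize);
    }

    static double shuffledHitRate(int n, int cacheSize, long seed) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            order.add(i);
        }
        Collections.shuffle(order, new Random(seed));
        return hitRate(order, n, cacheSize);
    }

    public static void main(String[] args) {
        int n = 100_000;
        int cacheSize = 1_000; // cache holds 1% of the vertices
        System.out.printf("ordered:  %.3f%n", orderedHitRate(n, cacheSize));
        System.out.printf("shuffled: %.3f%n", shuffledHitRate(n, cacheSize, 42L));
    }
}
```

In this toy model each query in the ordered run shares four of its five vertices with the previous query, so most lookups are served from the cache, while the shuffled run rarely finds anything cached. Real graphs have far less regular neighbourhoods, so the gap will differ, but the direction of the effect is the point made above.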
|
|
Re: Configured graph factory not working after making changes to gremlin-server.yaml
Sai Supraj R
Hi,
This is the gremlin-server.yaml file:

# Copyright 2019 JanusGraph Authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

host: 0.0.0.0
port: 8182
scriptEvaluationTimeout: 30000
channelizer: org.janusgraph.channelizers.JanusGraphWebSocketChannelizer
graphManager: org.janusgraph.graphdb.management.JanusGraphManager
graphs: {
  ConfigurationManagementGraph: conf/janusgraph-scylla-configurationgraph.properties
}
scriptEngines: {
  gremlin-groovy: {
    plugins: { org.janusgraph.graphdb.tinkerpop.plugin.JanusGraphGremlinPlugin: {},
               org.apache.tinkerpop.gremlin.server.jsr223.GremlinServerGremlinPlugin: {},
               org.apache.tinkerpop.gremlin.tinkergraph.jsr223.TinkerGraphGremlinPlugin: {},
               org.apache.tinkerpop.gremlin.jsr223.ImportGremlinPlugin: {classImports: [java.lang.Math], methodImports: [java.lang.Math#*]},
               org.apache.tinkerpop.gremlin.jsr223.ScriptFileGremlinPlugin: {files: []}}}}
serializers:
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { serializeResultToString: true }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  # Older serialization versions for backwards compatibility:
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoLiteMessageSerializerV1d0, config: {ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV2d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
processors:
  - { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }}
  - { className: org.apache.tinkerpop.gremlin.server.op.traversal.TraversalOpProcessor, config: { cacheExpirationTime: 600000, cacheMaxSize: 1000 }}
metrics: {
  consoleReporter: {enabled: true, interval: 180000},
  csvReporter: {enabled: true, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv},
  jmxReporter: {enabled: true},
  slf4jReporter: {enabled: true, interval: 180000},
  gangliaReporter: {enabled: false, interval: 180000, addressingMode: MULTICAST},
  graphiteReporter: {enabled: false, interval: 180000}}
maxInitialLineLength: 4096
maxHeaderSize: 8192
maxChunkSize: 8192
maxContentLength: 65536
maxAccumulationBufferComponents: 1024
resultIterationBatchSize: 64
writeBufferLowWaterMark: 32768
writeBufferHighWaterMark: 65536

This is the properties file:

# Copyright 2019 JanusGraph Authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# JanusGraph configuration sample: Cassandra over a socket
#
# This file connects to a Cassandra daemon running on localhost via
# Thrift. Cassandra must already be started before starting JanusGraph
# with this file.

# The implementation of graph factory that will be used by gremlin server
#
# Default: org.janusgraph.core.JanusGraphFactory
# Data Type: String
# Mutability: LOCAL
# gremlin.graph=org.janusgraph.core.JanusGraphFactory
gremlin.graph = org.janusgraph.core.ConfiguredGraphFactory
graph.graphname=ConfigurationManagementGraph

# The primary persistence provider used by JanusGraph. This is required.
# It should be set one of JanusGraph's built-in shorthand names for its
# standard storage backends (shorthands: berkeleyje, cassandrathrift,
# cassandra, astyanax, embeddedcassandra, cql, hbase, inmemory) or to the
# full package and classname of a custom/third-party StoreManager
# implementation.
#
# Default: (no default value)
# Data Type: String
# Mutability: LOCAL
storage.backend=cql

# The hostname or comma-separated list of hostnames of storage backend
# servers. This is only applicable to some storage backends, such as
# cassandra and hbase.
#
# Default: 127.0.0.1
# Data Type: class java.lang.String[]
# Mutability: LOCAL
storage.hostname=******

# The name of JanusGraph's keyspace. It will be created if it does not
# exist.
#
# Default: janusgraph
# Data Type: String
# Mutability: LOCAL
storage.cql.keyspace=*****

# Whether to enable JanusGraph's database-level cache, which is shared
# across all transactions. Enabling this option speeds up traversals by
# holding hot graph elements in memory, but also increases the likelihood
# of reading stale data. Disabling it forces each transaction to
# independently fetch graph elements from storage before reading/writing
# them.
#
# Default: false
# Data Type: Boolean
# Mutability: MASKABLE
cache.db-cache = true

# How long, in milliseconds, database-level cache will keep entries after
# flushing them. This option is only useful on distributed storage
# backends that are capable of acknowledging writes without necessarily
# making them immediately visible.
#
# Default: 50
# Data Type: Integer
# Mutability: GLOBAL_OFFLINE
#
# Settings with mutability GLOBAL_OFFLINE are centrally managed in
# JanusGraph's storage backend. After starting the database for the first
# time, this file's copy of this setting is ignored. Use JanusGraph's
# Management System to read or modify this value after bootstrapping.
cache.db-cache-clean-wait = 20

# Default expiration time, in milliseconds, for entries in the
# database-level cache. Entries are evicted when they reach this age even
# if the cache has room to spare. Set to 0 to disable expiration (cache
# entries live forever or until memory pressure triggers eviction when set
# to 0).
#
# Default: 10000
# Data Type: Long
# Mutability: GLOBAL_OFFLINE
#
# Settings with mutability GLOBAL_OFFLINE are centrally managed in
# JanusGraph's storage backend. After starting the database for the first
# time, this file's copy of this setting is ignored. Use JanusGraph's
# Management System to read or modify this value after bootstrapping.
cache.db-cache-time = 180000

# Size of JanusGraph's database level cache. Values between 0 and 1 are
# interpreted as a percentage of VM heap, while larger values are
# interpreted as an absolute size in bytes.
#
# Default: 0.3
# Data Type: Double
# Mutability: MASKABLE
cache.db-cache-size = 0.5

storage.cql.write-consistency-level = QUORUM
storage.cql.read-consistency-level = QUORUM
#storage.cql.replication-strategy-class = "NetworkTopologyStrategy"
#storage.cql.replication-strategy-options = "us-east,3"
storage.cql.protocol-version=4
storage.read-time=100000
storage.write-time=100000

#do not auto generate graph vertex id
graph.set-vertex-id=true

When I try to open the graph I am getting this error:
gremlin> ConfiguredGraphFactory.open("ConfigurationManagementGraph")
Please create configuration for this graph using the ConfigurationManagementGraph#createConfiguration API.
Type ':help' or ':h' for help.
Display stack trace? [yN]N

When trying to create a new graph:
gremlin> map = new HashMap<String, Object>();
gremlin> map.put("storage.backend", "cql");
==>null
gremlin> map.put("storage.hostname", "127.0.0.1");
==>null
gremlin> map.put("graph.graphname", "graph1");
==>null
gremlin> ConfiguredGraphFactory.createConfiguration(new MapConfiguration(map));
Must provide vertex id
Type ':help' or ':h' for help.
Display stack trace? [yN]y
java.lang.IllegalArgumentException: Must provide vertex id
Thanks
Sai
On Thu, Apr 29, 2021 at 1:04 AM Vinayak Bali <vinayakbali16@...> wrote:
|
|
Re: Transaction Cache vs. DB Cache Questions
Boxuan Li
Hi Joe, Thus, I believe the “adjacency lists” wording used in https://docs.janusgraph.org/basics/cache/ actually refers to vertices together with vertex properties (and of course, meta-properties), and edges (and of course, edge properties). If you refactor your code and use multiple threads sharing a common transaction, then yes, the properties will be stored in transaction cache. That cache is not based on thread-local objects, so using multi-threading does not harm the cache here. Regarding the performance, you may need to tune your configs, e.g. try increasing cache.db-cache-size, to reduce the chance of frequent cache eviction. Best regards, Boxuan
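To make the thread-local versus shared distinction concrete, here is a toy sketch in plain Java. It is not JanusGraph code: the "transaction cache" is modelled as an ordinary map, "backend reads" are just a counter, and all names are hypothetical. In TinkerPop itself, to my knowledge, a transaction shared between threads is obtained with graph.tx().createThreadedTx(); that call is not used here so the example stays self-contained.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Toy model only: counts simulated backend reads when worker threads either
// share one cache or each keep a private one.
class SharedCacheDemo {
    static long backendReads(int threads, int keys, boolean shared) {
        AtomicLong reads = new AtomicLong();
        Map<Integer, String> sharedCache = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                // Private cache per worker models "separate transactions".
                Map<Integer, String> cache = shared ? sharedCache : new HashMap<>();
                for (int k = 0; k < keys; k++) {
                    cache.computeIfAbsent(k, key -> {
                        reads.incrementAndGet(); // cache miss: simulated storage read
                        return "property-" + key;
                    });
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return reads.get();
    }

    public static void main(String[] args) {
        System.out.println("separate caches: " + backendReads(4, 1000, false));
        System.out.println("shared cache:    " + backendReads(4, 1000, true));
    }
}
```

With private caches every worker loads each property itself, so the read count grows with the number of threads. With one shared cache each property is loaded once, because ConcurrentHashMap.computeIfAbsent applies the loading function at most once per key.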
|
|
Re: Configured graph factory not working after making changes to gremlin-server.yaml
hadoopmarc@...
Hi Sai,
"ConfigurationManagementGraph" is not meant to be opened. Please follow the exact instructions described in: https://docs.janusgraph.org/basics/configured-graph-factory/#configurationmanagementgraph Best wishes, Marc
|
|
Re: Transaction Cache vs. DB Cache Questions
hadoopmarc@...
Hi Joe,
Good question and I do not know the answer. Indeed, the documentation suggests that the DB cache stores less information than the transaction cache, but it is not explicit about vertex properties. It is not explicit about vertex properties in the transaction cache either, but I cannot remember users having problems with missing vertex properties there. TinkerPop/JanusGraph support multi-threaded transactions. When using these (maybe, you already suggested this in your final line), you are sure that vertices are available from the transaction cache, provided its configs match your traversal. Best wishes, Marc
|
|
Re: Configured graph factory not working after making changes to gremlin-server.yaml
Vinayak Bali
Hi, To investigate the issue, please share the recent logs, the gremlin-server.yaml, and the janusgraph.sh that is used to start the service. Thanks & Regards Vinayak
On Thu, 29 Apr 2021, 4:13 am Sai Supraj R, <suprajratakonda@...> wrote:
|
|
Re: Configured graph factory not working after making changes to gremlin-server.yaml
Sai Supraj R
Hi, But I am not starting a gremlin server with gremlin-server-cql-es.yaml. I am starting with gremlin-server.yaml and I made the changes as suggested in the janusgraph documentation w.r.t configured graph factory. Thanks Sai
On Wed, Apr 28, 2021 at 3:21 PM Vinayak Bali <vinayakbali16@...> wrote:
|
|
Transaction Cache vs. DB Cache Questions
Joseph Kesting
Hello! I am currently working on a project that computes a 2-hop query for several million vertices. In order to speed up these queries I would like to utilize caching, but I am having some trouble finding exact documentation on what is stored by the DB cache vs. what is stored in the transaction cache. The query that I am executing traverses all nodes within a two-hop network and then extracts a property from all vertices in that network. Currently these queries run in different threads that share the DB cache but execute separate transactions, and I am not seeing the cache performance that I had hoped for. Is the property I am trying to fetch cached in the DB cache, or is the DB cache only used to maintain adjacency lists? Additionally, if I refactored these threads to share a common transaction, would that property be cached in the transaction cache? Thanks for your assistance! Joe
|
|
Re: Configured graph factory not working after making changes to gremlin-server.yaml
Vinayak Bali
Hi, Make the changes in the gremlin-server-cql-es.yaml file. Thanks
On Wed, 28 Apr 2021, 11:52 pm Sai Supraj R, <suprajratakonda@...> wrote:
|
|
Re: Configured graph factory not working after making changes to gremlin-server.yaml
Sai Supraj R
Hi, 0.5.3 Thanks Sai
On Wed, Apr 28, 2021 at 2:21 PM Vinayak Bali <vinayakbali16@...> wrote:
|
|
Re: Configured graph factory not working after making changes to gremlin-server.yaml
Vinayak Bali
Hi, Which JanusGraph version is being used? Regards, Vinayak
On Wed, 28 Apr 2021, 11:23 pm , <suprajratakonda@...> wrote:
|
|
Configured graph factory not working after making changes to gremlin-server.yaml
Sai Supraj R
I am trying to use the configured graph factory. I made changes to gremlin-server.yaml and configuration-management.properties. I am getting the following error.
gremlin> :remote connect tinkerpop.server conf/remote.yaml session
==>Configured localhost/127.0.0.1:8182-[b1b934d6-3f17-40b6-b6cb-fd735c605c5a]
gremlin> :remote console
==>All scripts will now be sent to Gremlin Server - [localhost/127.0.0.1:8182]-[b1b934d6-3f17-40b6-b6cb-fd735c605c5a] - type ':remote console' to return to local mode
gremlin> ConfiguredGraphFactory.getGraphNames()
gremlin> ConfiguredGraphFactory.open("ConfigurationManagementGraph");
Please create configuration for this graph using the ConfigurationManagementGraph#createConfiguration API.
Type ':help' or ':h' for help.
Display stack trace? [yN]N
gremlin> ConfiguredGraphFactory.create("ConfigurationManagementGraph");
Please create a template Configuration using the ConfigurationManagementGraph#createTemplateConfiguration API.
Type ':help' or ':h' for help.
Display stack trace? [yN]N
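The two messages above ask for a graph configuration and a template configuration respectively. As a minimal sketch, the maps for both could be built as follows. The storage values and the graph name graph1 are the ones used elsewhere in this thread; the class and method names are hypothetical. The rule that a template configuration carries no graph.graphname is my reading of the JanusGraph ConfiguredGraphFactory documentation and should be checked there. In the console, such a map would be wrapped as new MapConfiguration(map) and passed to ConfiguredGraphFactory.createConfiguration or ConfiguredGraphFactory.createTemplateConfiguration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: builds the plain maps only, no JanusGraph calls.
class ConfiguredGraphMaps {
    /** Storage settings common to both maps (values taken from this thread). */
    static Map<String, Object> storageSettings() {
        Map<String, Object> map = new HashMap<>();
        map.put("storage.backend", "cql");
        map.put("storage.hostname", "127.0.0.1");
        return map;
    }

    /** Map for createConfiguration: describes one named graph. */
    static Map<String, Object> namedGraphConfig(String graphName) {
        Map<String, Object> map = storageSettings();
        map.put("graph.graphname", graphName);
        return map;
    }

    /** Map for createTemplateConfiguration: assumed to omit graph.graphname. */
    static Map<String, Object> templateConfig() {
        return storageSettings();
    }

    public static void main(String[] args) {
        System.out.println(namedGraphConfig("graph1"));
        System.out.println(templateConfig());
    }
}
```

Once a configuration for a name such as graph1 exists, that name (not ConfigurationManagementGraph) is what would be passed to ConfiguredGraphFactory.open.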
|
|
Re: Mapreduce index repair job fails in Kerberos+SSL enabled cluster
hadoopmarc@...
Hi Shiva,
This sounds more like a cluster management question than a JanusGraph question, so my suggested steps are:
|
|
Re: Strange behaviors for Janusgraph 0.5.3 on AWS EMR
asivieri@...
Hi Marc,
yes, the deployMode was specified in the Gremlin Console and not in the properties file, as in the TinkerPop example, so that is why it was not explicit here. I am not sure why EMR would be limiting anything, since any other Spark application spawns more executors. I am still investigating this; I will compare the entire properties list (which is also reported in the Spark UI), as maybe something is different there. As for the output folder, yes, it is working correctly in a way: I tried executing the CloneVertexProgram and it creates 768 files, all empty (and by zero I mean 0), while any other query (such as valueMap()) returns just nothing. Best regards, Alessandro
|
|
Re: P.neq() predicate uses wrong ES mapping
hadoopmarc@...
https://github.com/JanusGraph/janusgraph/issues/2588
For further explicitness I added the following example:
gremlin> g.V().has('x', neq('lion')).elementMap()
==>[id:4264,label:Some,x:x2,y:??]
==>[id:4224,label:Some,x:x1,y:y1]
==>[id:4192,label:Some,x:watch the dog]
On Sun, Apr 25, 2021 at 09:42 AM, <hadoopmarc@...> wrote:
gremlin> g.V().has('x', neq('watch the dog')).elementMap()
|
|