Re: FW: Edge Index Creation Error

hadoopmarc@...
 

What JanusGraph version do you use? Recent TinkerPop versions use Order.asc instead of Order.incr.
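For example, on a JanusGraph release built against TinkerPop 3.5+ (where Order.incr was removed), the index creation from the question below would become (a sketch, reusing the same labels and keys):

    mgmt = graph.openManagement()
    gate_to = mgmt.getEdgeLabel('gate_to')
    stargate_id = mgmt.getPropertyKey('stargate_id')
    mgmt.buildEdgeIndex(gate_to, 'GateToEdges', Direction.BOTH, Order.asc, stargate_id)
    mgmt.commit()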

Best wishes,    Marc


FW: Edge Index Creation Error

pd.vanlill@...
 

Hi,

 

I am trying to create an Edge Index from the gremlin console.

 

I am executing the following

 

mgmt = graph.openManagement()

gate_to = mgmt.getEdgeLabel('gate_to')

stargate_id = mgmt.makePropertyKey("stargate_id").dataType(Long.class).make()

mgmt.buildEdgeIndex(gate_to, 'GateToEdges', Direction.BOTH, Order.incr, stargate_id)

mgmt.commit()

 

And I am receiving this error

 

No such property: incr for class: org.apache.tinkerpop.gremlin.process.traversal.Order

 

I have checked the JavaDoc and this static property does exist, and the docs specify using it here: https://docs.janusgraph.org/v0.3/index-management/index-performance/ under “Vertex-centric Indexes”.


Issue #2181: Could not find type for id

Umesh Gade
 

Hi,
    There is an issue that has been reported for a long time: https://github.com/JanusGraph/janusgraph/issues/2181
We have hit this issue twice so far. It is a very rarely occurring issue, but its impact is major. I have commented more details on the link above.

Does anybody know about this issue and its cause ?

--
Sincerely,
Umesh Gade


Re: JanusGraph database cache on distributed setup

Boxuan Li
 

Thanks Marc for making it clear.

@Wasantha, how did you implement your void invalidate(StaticBuffer key, List<CachableStaticBuffer> entries) method? Make sure you evict this key from your Redis cache. The default implementation in JanusGraph does not evict it immediately. Rather, it records this key in a local HashMap called expiredKeys and evicts the entry after a timeout. If you use this approach, and you don’t store expiredKeys on Redis, then your other instance could still read stale data. I personally think the usage of expiredKeys is not necessary in your case - you could simply evict the entry from Redis in the invalidate call.
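For illustration, a minimal sketch of such an override (the redisDelete and encodeKey helpers are assumptions backed by some Redis client, not actual JanusGraph code):

    @Override
    public void invalidate(StaticBuffer key, List<CachableStaticBuffer> entries) {
        // Evict the key from the shared Redis cache right away instead of
        // recording it in the local expiredKeys map.
        redisDelete(encodeKey(key));  // hypothetical helpers
    }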

If you still have a problem, probably a better way is to share your code so that we could take a look at your implementation.

Best,
Boxuan

On Feb 20, 2022, at 6:23 AM, hadoopmarc@... wrote:

If you do not use sessions, remote requests to Gremlin Server are committed automatically, see: https://tinkerpop.apache.org/docs/current/reference/#considering-transactions .

Are you sure that committing a modification is sufficient to move the change over from the transaction cache to the database cache, both in the current and in your new Redis implementation? Maybe you can test by having a remote modification request followed by a retrieval request for the same vertex from the same client, so that the database cache is filled explicitly (before the second client attempts to retrieve it).

Marc


Re: JanusGraph database cache on distributed setup

Boxuan Li
 

Hi Wasantha,

I am not familiar with the transaction scope when using a remote Gremlin server, so I could be wrong, but could you try rolling back the transaction explicitly on JG instance B? Just to make sure you are not accessing the stale data cached in a local transaction.

Best,
Boxuan

On Feb 19, 2022, at 11:51 AM, washerath@... wrote:

Hi Boxuan,

I was not using a session on the gremlin console, so I guess it does not need an explicit commit. Anyway, I have tried committing the transaction [ g.tx().commit() ] after opening a session, but the same behaviour is observed.

Thanks

Wasantha



Re: JanusGraph database cache on distributed setup

Boxuan Li
 

Hi Wasantha,

In your example, it looks like you didn't commit your transaction on JG instance A. Uncommitted changes are only visible to the local transaction on the local instance. Can you try committing it first on A and then query on B?

Best,
Boxuan


Re: JanusGraph database cache on distributed setup

washerath@...
 

Hi Boxuan,

I was able to change the ExpirationKCVSCache class to persist the cache in a Redis DB.

But I can still see some data anomalies between the two JG instances. For example, when I change a property of a vertex from one JG server [ g.V(40964200).property('data', 'some_other_value') ]

[screenshot: JG instance A]

it does not reflect on the other JG instance.

[screenshot: JG instance B]

When debugging the flow, we could identify that when we trigger a vertex property modification, it gets persisted in the Guava cache via the GuavaVertexCache add method, and when retrieving, data is read using the get method of the same class. This could be the reason for the above observation.

It feels like we might need to modify the transaction-wise cache as well. Correct me if I am missing something here; I am happy to contribute the implementation to the community once this is done.

Thanks

Wasantha


Re: JanusGraph database cache on distributed setup

Boxuan Li
 

Hi Wasantha,


It's great to see that you have made some progress. If possible, it would be awesome if you could contribute your implementation to the community!

Yes, modifying `ExpirationKCVSCache` is enough. `GuavaSubqueryCache` and `GuavaVertexCache` are transaction-wise caches, so you don't want to make them global. `RelationQueryCache` and `SchemaCache` are graph-wise caches; you could make them global, but that is not necessary since they only store schema rather than real data. Actually, I would recommend not doing so, because JanusGraph already has a mechanism for evicting stale schema cache entries.

Best,
Boxuan


Re: JanusGraph database cache on distributed setup

washerath@...
 

Hi Boxuan,

I am evaluating the approach of rewriting ExpirationKCVSCache as suggested. There I could replace the existing Guava cache implementation with a connection to a remote Redis DB, so that Redis acts as a centralized cache shared by all JG instances.

While going through the JG source, I found that the same Guava cache implementation (cachebuilder = CacheBuilder.newBuilder()) is used in several other places, e.g. GuavaSubqueryCache, GuavaVertexCache, ...

Will it be sufficient to modify only ExpirationKCVSCache, or do we need to look at modifications in several other places as well?

Thanks

Wasantha


Re: Removed graphs still open in muti node cluster

hadoopmarc@...
 

Hi Lixu,

JanusGraph-0.6.0 had various changes to the ConfiguredGraphFactory which might have solved your issue:

https://github.com/JanusGraph/janusgraph/issues/2236
https://github.com/JanusGraph/janusgraph/blob/v0.5.3/janusgraph-core/src/main/java/org/janusgraph/core/ConfiguredGraphFactory.java
https://github.com/JanusGraph/janusgraph/blob/v0.6.1/janusgraph-core/src/main/java/org/janusgraph/core/ConfiguredGraphFactory.java

Can you recheck with version 0.6.1?

BTW, the release notes of v0.6.0 form an impressive list! Merely reading them takes minutes.

Best wishes,

Marc


Re: Preserve IDs when importing graphml

hadoopmarc@...
 

Hi Laura,

No answer but some relevant search results:

https://groups.google.com/g/gremlin-users/c/jUBuhhKuf0M/m/kiKMY0eHAwAJ
The graph.set-vertex-id property at: https://docs.janusgraph.org/configs/configuration-reference/#graph

In general, when working with JanusGraph, it is better to first transform the input graphml and make the id into a property.
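For reference, a minimal sketch of the configuration option mentioned above; note that with this setting you must supply a valid JanusGraph vertex id yourself for every vertex you add, so transforming the original id into a property is usually the simpler route:

    # janusgraph.properties
    graph.set-vertex-id = true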

Best wishes,   Marc


Preserve IDs when importing graphml

Laura Morales <lauretas@...>
 

I think I've read once that it's possible to preserve the IDs when importing graphml data. Unfortunately, I cannot remember where I read that. All my IDs are integers.
How do I do that?


Re: Importing a schema

hadoopmarc@...
 

Hi Laura,

JanusGraph only allows configuring a custom SchemaMaker via the schema.default property. Googling for SchemaMaker turns up some (unmaintained?) projects that could help:

https://github.com/graph-lab/janusgraph-schema-manager

Best wishes,   Marc


Importing a schema

Laura Morales <lauretas@...>
 

Is there a way to import a schema, instead of creating it with a script?
For example importing this file:

<?xml version='1.0' ?>
<graphml xmlns='http://graphml.graphdrawing.org/xmlns'>
<key id='type' for='node' attr.name='type' attr.type='string'></key>
<key id='code' for='node' attr.name='code' attr.type='string'></key>
<key id='icao' for='node' attr.name='icao' attr.type='string'></key>
<key id='desc' for='node' attr.name='desc' attr.type='string'></key>
<key id='region' for='node' attr.name='region' attr.type='string'></key>
<key id='runways' for='node' attr.name='runways' attr.type='int'></key>
<key id='longest' for='node' attr.name='longest' attr.type='int'></key>
<key id='elev' for='node' attr.name='elev' attr.type='int'></key>
<key id='country' for='node' attr.name='country' attr.type='string'></key>
<key id='city' for='node' attr.name='city' attr.type='string'></key>
<key id='lat' for='node' attr.name='lat' attr.type='double'></key>
<key id='lon' for='node' attr.name='lon' attr.type='double'></key>
<key id='dist' for='edge' attr.name='dist' attr.type='int'></key>
<key id='labelV' for='node' attr.name='labelV' attr.type='string'></key>
<key id='labelE' for='edge' attr.name='labelE' attr.type='string'></key>
</graphml>

like this: graph.io(graphml()).readGraph("file.graphml")

Is there a way to make Janus create the schema from the XML file? This would be very convenient because it means I don't have to write Groovy scripts for creating the schema.
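(For reference, the Groovy this would replace is a management-API script along these lines; a sketch covering only the first few keys above:)

    mgmt = graph.openManagement()
    mgmt.makePropertyKey('type').dataType(String.class).make()
    mgmt.makePropertyKey('runways').dataType(Integer.class).make()
    mgmt.makePropertyKey('lat').dataType(Double.class).make()
    // ... and so on for the remaining keys
    mgmt.commit()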


Re: JanusGraph database cache on distributed setup

Boxuan Li
 

Hi Wasantha,

A centralized cache is a good idea in many use cases. What you could do is to maintain a centralized cache by yourself. This, however, requires some changes to your application code (e.g. your app might need to do a look up in cache and then query JanusGraph). A more advanced approach is to rewrite ExpirationKCVSCache (https://javadoc.io/doc/org.janusgraph/janusgraph-core/latest/org/janusgraph/diskstorage/keycolumnvalue/cache/ExpirationKCVSCache.html) by yourself and let it store cache in a centralized cache rather than the local cache. Then, the db.cache feature should still work except that the cache is synced across JanusGraph instances.
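A rough sketch of that idea (class and helper names are illustrative; the real ExpirationKCVSCache internals differ):

    public class RedisKCVSCache extends KCVSCache {
        // Read-through: look up the slice in Redis first; on a miss, read from
        // the backing store and cache it with a TTL, so every JanusGraph
        // instance shares the same cache entries.
        public EntryList getSlice(KeySliceQuery query, StoreTransaction txh) throws BackendException {
            EntryList cached = redisGet(query);        // hypothetical helper
            if (cached != null) return cached;
            EntryList fromStore = store.getSlice(query, unwrapTx(txh));
            redisPutWithTtl(query, fromStore);         // hypothetical helper
            return fromStore;
        }

        @Override
        public void invalidate(StaticBuffer key, List<CachableStaticBuffer> entries) {
            redisDeleteByPrefix(key);                  // evicts for all instances at once
        }
    }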

Best,
Boxuan

On Feb 10, 2022, at 10:59 PM, washerath@... wrote:

Actually the concern is with the db.cache feature.

Once we enable db.cache, whatever modification is done to a particular vertex is only visible to that JG instance until the cache expires. So if we have multiple JG instances, the modifications done from one instance do not reflect on the others immediately.

If we can have a centralized cache which syncs across all JG instances, this can be avoided.

Thanks, Wasantha


can we dynamically create multiple graphs with customized schema files

Yingjie Li
 

Hello,

Currently we run gremlin.sh with a customized schema file in Groovy (containing the backend config, graph name, cache size, as well as the property keys and vertex/edge indexes) to initialize a graph. It seems that we have to restart the server to make the graph accessible.

Is there a way to dynamically create multiple graphs with customized schema files without having to restart the server?


Removed graphs still open in muti node cluster

lixu
 

Hi, I'm using JanusGraphManager and JanusGraphWsAndHttpChannelizer to manage dynamic graph operations.
When dropping a graph in a multi-node cluster, the removed graph is only closed on one specific node; it is still open on the other nodes.
 
Version: 0.5.3
Storage Backend: HBase
Mixed Index Backend: Elasticsearch
Expected Behavior: all nodes in the cluster should close the removed graphs
Current Behavior: only the node executing the drop script closes the removed graphs
 
Related codes:
JanusGraphManager:
 
    private class GremlinExecutorGraphBinder implements Runnable {
        final JanusGraphManager graphManager;
        final GremlinExecutor gremlinExecutor;
 
        public GremlinExecutorGraphBinder(JanusGraphManager graphManager, GremlinExecutor gremlinExecutor) {
            this.graphManager = graphManager;
            this.gremlinExecutor = gremlinExecutor;
        }
 
        @Override
        public void run() {
            ConfiguredGraphFactory.getGraphNames().forEach(it -> {
                try {
                    final Graph graph = ConfiguredGraphFactory.open(it);
                    updateTraversalSource(it, graph, this.gremlinExecutor, this.graphManager);
                } catch (Exception e) {
                    // cannot open graph, do nothing
                    log.error(String.format("Failed to open graph %s with the following error:\n %s.\n" +
                    "Thus, it and its traversal will not be bound on this server.", it, e.toString()));
                }
            });
        }
    }
 
In the code above, the run() method gets all the graph names from HBase, so it can find all the added graphs, but graphs removed on other nodes remain open on the current node.
 
 
 
 
