
Re: Poor read performance on get vertex after creating a new index

Tanroop Dhillon <dhillon...@...>
 

Just to reiterate: first I created a composite index on the ID property for the vertex type, then a composite index on ID for the edge type.
The index on the vertex type seems to have stopped working after that. The problem could exist with Elasticsearch as well.

Maybe the vertex index has been overwritten and is no longer being populated, because both indexes use the same property key (ID).



Re: Poor read performance on get vertex after creating a new index

Tanroop Dhillon <dhillon...@...>
 

I totally agree that Elasticsearch would improve performance, but I am trying to figure out why performance degraded after creating the edge index.
A query on edge ID after creating the edge index works absolutely fine: the 99th percentile is at 29 ms, and that is under a higher load.

The query in which I look up an edge by ID is as follows.

Suppose the edge has the property value "edge1" for ID:
graph.traversal().E().has("e", "ID", "edge1")
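
A quick way to confirm whether the vertex lookup still hits the composite index is to profile the traversal. A minimal sketch (the value "user" is just an example); a step that falls back to a full graph scan in the output would explain the jump from 50 ms to 1 s:

import org.apache.tinkerpop.gremlin.process.traversal.util.TraversalMetrics;

// prints per-step timings; a JanusGraphStep answered by the index should dominate,
// while a full-scan warning would indicate the index is no longer being used
TraversalMetrics metrics = graph.traversal().V().has("ID", "user").profile().next();
System.out.println(metrics);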




Re: Poor read performance on get vertex after creating a new index

sparshneel chanchlani <sparshneel...@...>
 

Actually, if you are using Cassandra as the backend, repair your keyspace to 80%. These queries hit Cassandra on a unique key that is part of your partition key: if you store a row per user ID, Cassandra essentially scans partitions until it finds the key. With an index backend like Elasticsearch, which uses an inverted index, the search becomes fast. We have seen this same issue for our use case and had to query the index backend to improve performance, because as your data grows the storage-backend lookup only gets slower.




Re: Poor read performance on get vertex after creating a new index

Tanroop Dhillon <dhillon...@...>
 

Sparshneel,
Just to check whether there is any problem with simply getting a vertex by ID, I am running the following basic query.
Suppose my ID is "user" and the label is "USER" (the vertex index is defined on ID):

GraphTraversal<Vertex, Vertex> traversal = graph.traversal().V().has("USER", "ID", "user");
if (traversal.hasNext()) {
    Vertex v = traversal.next();
}
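
As a side note, has("USER", "ID", "user") combines a label filter with a property filter; since the "byId" composite index was built on ID alone, the property condition by itself should be answerable from the index. A minimal equivalent sketch:

// the has("ID", ...) step can be served by the "byId" composite index;
// the label check is then applied to the matching vertex afterwards
GraphTraversal<Vertex, Vertex> t = graph.traversal().V().has("ID", "user").hasLabel("USER");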




Re: Poor read performance on get vertex after creating a new index

sparshneel chanchlani <sparshneel...@...>
 

Tanroop,
For these queries on the graph, it is better to use an index backend. What is your query? Can you paste it?



Re: Poor read performance on get vertex after creating a new index

Tanroop Dhillon <dhillon...@...>
 

Sparshneel,
The unique constraint works on the vertex type only.
Currently I am not using any indexing backend.
The problem is that the 50 ms latency should have held even after creating the edge index afterwards.
I suspect the reason could be the same property key being used for both vertex and edge.




Re: Poor read performance on get vertex after creating a new index

sparshneel chanchlani <sparshneel...@...>
 

Tanroop,
50 ms is still on the higher side. I would also like to know which index backend you are using; if it is Elasticsearch, I would suggest creating a mixed index, making your ID data type String, and querying with textContains. Also, I see one difference in your code: the vertex index is unique while the edge one is non-unique.
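
For reference, defining such a mixed index would look roughly like the sketch below. The index name "byIdMixed" and the backing-index name "search" are assumptions ("search" must match a configured index.search.* backend), and the property key is the ID key from this thread:

import org.janusgraph.core.PropertyKey;
import org.janusgraph.core.schema.JanusGraphManagement;

JanusGraphManagement mgmt = graph.openManagement();
PropertyKey id = mgmt.getPropertyKey("ID");
// a mixed index delegates to the external index backend (here assumed to be Elasticsearch)
mgmt.buildIndex("byIdMixed", Vertex.class).addKey(id).buildMixedIndex("search");
mgmt.commit();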



Re: How much time does it take to reindex data

Tanroop Dhillon <dhillon...@...>
 

I ran the reindexing. Sharing for others' knowledge: it took approximately 24 hours to reindex 270 million edges.
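
For anyone looking for the mechanics rather than the timing: a reindex of an existing index is typically triggered through the management API. A minimal sketch, reusing the "byEdgeId" index name from the other thread (for graphs of this size a MapReduce-based reindex job is the usual alternative):

import org.janusgraph.core.schema.JanusGraphManagement;
import org.janusgraph.core.schema.SchemaAction;

JanusGraphManagement mgmt = graph.openManagement();
// runs the reindex job and blocks until it finishes
mgmt.updateIndex(mgmt.getGraphIndex("byEdgeId"), SchemaAction.REINDEX).get();
mgmt.commit();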




Poor read performance on get vertex after creating a new index

Tanroop Dhillon <dhillon...@...>
 

Hi,

I have a property ID on both my vertices and edges.
Initially I had created a composite index on the vertex property ID, and read performance on get-vertex-by-ID was very good. Then I created a composite index on the edge property ID. After this, get-edge-by-ID works great, but get-vertex-by-ID response time has increased drastically (from 50 ms to 1 s). Any help on this?

My code

try {
    JanusGraphIndex index = mgmt.buildIndex("byId", Vertex.class)
            .addKey(mgmt.getPropertyKey("ID"))
            .unique()
            .buildCompositeIndex();
    mgmt.setConsistency(index, ConsistencyModifier.LOCK);
} catch (Exception e) {
    log.info("Composite index: byId on Vertex is already defined");
}

try {
    JanusGraphIndex edgeIndex = mgmt.buildIndex("byEdgeId", Edge.class)
            .addKey(mgmt.getPropertyKey("ID"))
            .buildCompositeIndex();
    mgmt.setConsistency(edgeIndex, ConsistencyModifier.LOCK);
} catch (Exception e) {
    log.info("Composite index: byEdgeId on Edge is already defined");
}
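
Since the symptom is that the vertex index apparently stopped answering queries, a useful first check is whether both indexes are still ENABLED for their key. A minimal sketch, assuming the index names above:

import org.janusgraph.core.PropertyKey;
import org.janusgraph.core.schema.JanusGraphIndex;
import org.janusgraph.core.schema.JanusGraphManagement;

JanusGraphManagement mgmt = graph.openManagement();
for (String name : new String[]{"byId", "byEdgeId"}) {
    JanusGraphIndex idx = mgmt.getGraphIndex(name);
    for (PropertyKey key : idx.getFieldKeys()) {
        // anything other than ENABLED (e.g. INSTALLED or REGISTERED) means the index is not serving queries
        System.out.println(name + " / " + key.name() + ": " + idx.getIndexStatus(key));
    }
}
mgmt.rollback();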


Re: janusgraph how to bulkload to hbase use spark on yarn,i find this method is Deprecated?

Nitin Poddar <hitk.ni...@...>
 

On Tuesday, July 7, 2020 at 1:14:39 PM UTC-4, nat...@... wrote:
I also need a basic example of using addV and addE with Spark. Have you found a solution, or any other way to bulk load vertices and edges?

On Tuesday, January 7, 2020 at 18:23:58 UTC+3, pandagungun wrote:
I want to know: if I use addV and addE in Spark executor code, how do I write the new code?

On Sunday, January 5, 2020 at 9:47:53 PM UTC+8, marc.d...@gmail.com wrote:

This is basically the question: "who will do the work in an open source community?" Apache TinkerPop concluded that a generic BulkLoaderVertexProgram ran into too many implementation-specific issues (see here for the JanusGraph case), so they deprecated the library.

If the deprecated BulkLoaderVertexProgram works for you, it would be easy to copy the existing Java source code into your Scala project and make some minor fixes in case of API changes in future TinkerPop versions. Reworking the BulkLoaderVertexProgram into a general, well-documented tool for JanusGraph would be a significant piece of work. Also note that the current BulkLoaderVertexProgram does not do much to preprocess your data for efficient inserts (JanusGraph cache hits only occur for vertices that have so many edges that the vertex is present in the JanusGraph cache on each executor). I believe this is the reason that most JanusGraph users simply use the Gremlin addV and addE steps in their Spark executor code, close to the more complex code where the data is prepared for ingestion.

So, concluding: if you have little time, using the deprecated BulkLoaderVertexProgram is not a large risk. If resource usage (and thus ingestion speed) is important to you, investing in a targeted solution may be worthwhile (look here for inspiration).

HTH,    Marc
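
To make the "addV and addE on the executors" approach concrete, here is a minimal sketch in Java (chosen for consistency with the rest of this digest). The sample data, the "name" property, and the reuse of the ws-http-jhe-test.properties file are illustrative assumptions, not Marc's exact method:

import java.util.Arrays;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphFactory;

public class BulkInsert {
    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext(new SparkConf().setAppName("janusgraph-bulk-insert"));
        JavaRDD<String> names = sc.parallelize(Arrays.asList("alice", "bob", "carol"));
        names.foreachPartition(rows -> {
            // one JanusGraph instance per executor partition, one commit per partition
            JanusGraph graph = JanusGraphFactory.open("janusgraph/ws-http-jhe-test.properties");
            try {
                while (rows.hasNext()) {
                    graph.addVertex("name", rows.next());
                }
                graph.tx().commit();
            } finally {
                graph.close();
            }
        });
        sc.stop();
    }
}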

Op donderdag 2 januari 2020 03:54:23 UTC+1 schreef pandagungun:
I want to use Scala and Spark on YARN to bulk load data into JanusGraph with HBase as the storage backend, but I find that this method is deprecated: BulkLoaderVertexProgram and OneTimeBulkLoader are "@deprecated As of release 3.2.10, not directly replaced - consider graph provider specific bulk loading methods". How do I write new code? The TinkerPop 3.4.1 docs did not help.
The details follow.
My hadoop-graphson.properties config:
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
# i/o formats for graphs
gremlin.hadoop.graphReader=org.apache.tinkerpop.gremlin.hadoop.structure.io.graphson.GraphSONInputFormat
gremlin.hadoop.graphWriter=org.apache.tinkerpop.gremlin.hadoop.structure.io.graphson.GraphSONOutputFormat
# i/o locations
gremlin.hadoop.inputLocation=data/tinkerpop-modern.json
gremlin.hadoop.outputLocation=output
# if the job jars are not on the classpath of every hadoop node, then they must be provided to the distributed cache at runtime
gremlin.hadoop.jarsInDistributedCache=true
# the vertex program to execute
gremlin.vertexProgram=org.apache.tinkerpop.gremlin.process.computer.ranking.pagerank.PageRankVertexProgram

####################################
# SparkGraphComputer Configuration #
####################################
spark.master=yarn
spark.deploy-mode=client
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoRegistrator

My ws-http-jhe-test.properties config:
gremlin.graph=org.janusgraph.core.JanusGraphFactory
schema.default=none
storage.backend=hbase
storage.batch-loading=true
storage.hbase.table = testgraph
storage.hbase.region-count = 50
storage.buffer-size=102400
storage.hostname=tcd-***:2181,tcd-***:2181,tcd-***:2181
cache.db-cache = true
cache.db-cache-clean-wait = 20
cache.db-cache-time = 180000
cache.db-cache-size = 0.5

index.search.backend=elasticsearch
index.search.index-name=testgraph
index.search.hostname=tcd-***,tcd-***,tcd-***
graph.set-vertex-id=true

ids.block-size=100000000

My Scala and Spark code; using BulkLoaderVertexProgram and OneTimeBulkLoader triggers the deprecation warning:

def bulkLoad(): Unit = {
  val readGraph = GraphFactory.open("janusgraph/hadoop-graphson.properties")
  val blvp = BulkLoaderVertexProgram.build()
    .bulkLoader(classOf[OneTimeBulkLoader])
    .writeGraph("janusgraph/ws-http-jhe-test.properties")
    .create(readGraph)
  readGraph.compute(classOf[SparkGraphComputer]).program(blvp).submit().get()
  readGraph.close()
}


Re: Unable to use ConfiguredGraphFactory from Java Application

Nitin Poddar <hitk.ni...@...>
 

I recently encountered this issue and worked my way out of it, so I thought of writing a few posts to share the information. You can read them; maybe they will be helpful.





On Saturday, December 8, 2018 at 9:36:20 PM UTC-5, rah...@... wrote:
I'm trying to remotely open an existing graph from a Java application, but I keep getting the following error.

"org.janusgraph.graphdb.management.utils.ConfigurationManagementGraphNotEnabledException: Please add a key named "ConfigurationManagementGraph" to the "graphs" property in your YAML file and restart the server to be able to use the functionality of the ConfigurationManagementGraph class."

Here is my Java Code sample:
RemoteGraph.java
---
01| cluster = Cluster.open(conf.getString("gremlin.remote.driver.clusterFile"));
02| client = cluster.connect();
03| JanusGraph airroutes = ConfiguredGraphFactory.open("airroutes");

I always get that error at line 03.

jgex-remote.properties
---
gremlin.remote.remoteConnectionClass=org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteConnection
gremlin.remote.driver.clusterFile=conf/remote-objects.yaml
gremlin.remote.driver.sourceName=g

conf/remote-objects.yaml
---
hosts: [localhost]
port: 8182
serializer: {
  className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0,
  config: {
    ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry]
  }
}

Below is the server configuration.

conf/gremlin-server/gremlin-server.yaml
---
host: 0.0.0.0
port: 8182
scriptEvaluationTimeout: 30000
channelizer: org.apache.tinkerpop.gremlin.server.channel.WebSocketChannelizer
graphManager: org.janusgraph.graphdb.management.JanusGraphManager
graphs: {
  ConfigurationManagementGraph: conf/gremlin-server/janusgraph-cql-es-server.properties
}
plugins:
  - janusgraph.imports
scriptEngines: {
  gremlin-groovy: {
    imports: [java.lang.Math],
    staticImports: [java.lang.Math.PI],
    scripts: [scripts/empty-sample.groovy]}}
...

conf/gremlin-server/janusgraph-cql-es-server.properties
---
gremlin.graph=org.janusgraph.core.JanusGraphFactory
graph.graphname=ConfigurationManagementGraph
storage.backend=cql
storage.hostname=127.0.0.1
storage.cql.keyspace=janusgraph
...
index.search.backend=elasticsearch
index.search.hostname=127.0.0.1
index.search.elasticsearch.client-only=true

I'm not able to figure out what I'm missing. Whatever API I try from ConfiguredGraphFactory, I get the same error. Does anyone have any idea, or a link I can refer to? I have tried everything in the JanusGraph documentation.


Re: Configuring Transaction Log feature

Sandeep Mishra <sandy...@...>
 

Hi,

If you are using the same identifier to start the logProcessor, there is no need to explicitly set a previous time.
The logProcessor keeps a marker of the last record read, so it should be able to recover from that point.

Do check again.

Regards,
Sandeep
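
As a minimal sketch of that resume behavior (the identifier names are reused from the code later in this thread; whether the marker survives every failure mode is worth verifying for your setup):

import org.janusgraph.core.JanusGraphFactory;
import org.janusgraph.core.log.LogProcessorFramework;

LogProcessorFramework logProcessor = JanusGraphFactory.openTransactionLog(graph);
logProcessor.addLogProcessor("addedPerson")
        .setProcessorIdentifier("addedPersonCounter")
        // no setStartTime(...): with the same processor identifier, the framework
        // resumes from the marker of the last record this processor read
        .addProcessor((tx, txId, changeState) -> { /* handle recovered changes */ })
        .build();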



How much time does it take to reindex data

Tanroop Dhillon <dhillon...@...>
 

Hi,

I am introducing a new index on the graph. I already have around 70 GB of data in the graph. Approximately how much time would it take to reindex the existing data?

Regards,
Tanroop


Transaction support of InMemoryStoreManager?

christian...@...
 

Hi.

I am doing some experiments with the InMemoryStoreManager and facing an issue with transactions. When I produce vertices in one thread and commit them in bulk, the commit does not seem to be atomic for other threads. I wrote the following test to reproduce the issue:

import static org.junit.Assert.assertTrue;

import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphFactory;
import org.janusgraph.diskstorage.StandardStoreManager;
import org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration;
import org.junit.Test;

@Test
public void testTxAtomicity() throws InterruptedException {
    final int ITERATIONS = 100;
    final int CHUNK_SIZE = 10;

    final JanusGraph graph = JanusGraphFactory.open(GraphDatabaseConfiguration.buildGraphConfiguration()
            .set(GraphDatabaseConfiguration.STORAGE_BACKEND,
                 StandardStoreManager.IN_MEMORY.getManagerClass())
            .getConfiguration());

    // producer: adds CHUNK_SIZE vertices, then commits them as one transaction
    final Thread producer = new Thread(() -> {
        for (int i = 0; i < ITERATIONS; i++) {
            for (int j = 0; j < CHUNK_SIZE; j++) {
                graph.addVertex("id", i * CHUNK_SIZE + j);
                Thread.yield();
            }
            graph.tx().commit();
        }
    });
    producer.start();

    // reader: the visible vertex count should always be a multiple of CHUNK_SIZE
    try {
        final GraphTraversalSource g = graph.traversal();
        for (int i = 0; i < ITERATIONS; i++) {
            final int count = g.V().toList().size();
            assertTrue("Expecting a multiple of " + CHUNK_SIZE + " but received " + count,
                       count % CHUNK_SIZE == 0);
            Thread.sleep(10);
        }
        graph.tx().rollback(); // releases the reader's thread-local transaction (was cache.rollback(), with "cache" undefined)
    } finally {
        producer.join();
    }
}

The producer creates vertices and commits them in chunks of 10. Hence, I would expect other threads to see either all 10 of them or none. Unfortunately, there seems to be no such atomicity: the reading thread sometimes sees only some of the committed vertices. Am I doing something wrong here, or does the InMemoryStoreManager not offer any atomicity guarantees?
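
One thing worth ruling out (a sketch only; whether it changes the observed behavior depends on the isolation the inmemory store actually provides) is the implicit thread-local transaction: reads can instead be scoped to an explicitly created transaction, so each check runs against one well-defined transaction:

import org.janusgraph.core.JanusGraphTransaction;

JanusGraphTransaction readTx = graph.newTransaction();
try {
    // evaluated inside readTx rather than the caller's thread-local transaction
    long count = readTx.traversal().V().count().next();
    System.out.println("vertices visible to readTx: " + count);
} finally {
    readTx.rollback();
}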

Regards
Christian


Re: Configuring Transaction Log feature

"anj...@gmail.com" <anjani...@...>
 

Hi All,

We are using JanusGraph with Cassandra. I am able to capture events using the logProcessor and can see the table created in Cassandra.

I was trying to figure out: if for some reason the logProcessor stops, how do we get the changes that were made after it stopped?
I tried starting the logProcessor with a previous start time, thinking it would deliver all events after that point, but it did not give the previous changes.


Thanks,
Anjani



On Sunday, 25 February 2018 at 16:01:06 UTC+5:30 sa...@... wrote:
Yeah Jason, I never bothered to look in the JanusGraph table, expecting a new table to be created.
I can find the new column family in my setup too.

Thanks and Regards,
Sandeep


On Wednesday, February 21, 2018 at 12:09:14 AM UTC+8, Jason Plurad wrote:
I suppose it could be just confusion on the terminology:

Cassandra -> Keyspace -> Table
HBase -> Table -> Column Family

On Tuesday, February 20, 2018 at 11:05:10 AM UTC-5, Jason Plurad wrote:
Not sure what else to tell you. I just tried the same script from before against HBase 1.3.1, and it created the column family 'ulog_addedPerson' right after the logProcessor.addLogProcessor("addedPerson")...build() command was issued.

hbase(main):001:0> describe 'janusgraph'
Table janusgraph is ENABLED
janusgraph
COLUMN FAMILIES DESCRIPTION
{NAME => 'e', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'GZ', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'f', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'GZ', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'g', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'GZ', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'h', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'GZ', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'i', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'GZ', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'l', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => '604800 SECONDS (7 DAYS)', COMPRESSION => 'GZ', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'm', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'GZ', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 's', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'GZ', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 't', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'GZ', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'ulog_addedPerson', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'GZ', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
10 row(s) in 0.0230 seconds




On Tuesday, February 20, 2018 at 9:46:26 AM UTC-5, Sandeep Mishra wrote:
Hi Jason,

I tried it with the HBase backend, and I am getting control passed to the change processor.
Appreciate your help; on careful checking I noticed that the mutation was happening under the default transaction initiated by JanusGraph, hence the issue.

However, the problem right now is that I am unable to locate a table for the data.
I took a snapshot of the tables in HBase using the HBase shell before and after processing, but there is no new table created.
Any idea what could be wrong? Is there a possibility that it is saving the log data in the JanusGraph table meant for the actual data?

Thanks and Regards,
Sandeep



On Sunday, February 18, 2018 at 11:56:49 PM UTC+8, Sandeep Mishra wrote:
Both the Groovy and the Java code work with berkeleyje as the backend. Tomorrow in the office I will try with HBase as the backend.
Noted on your point.

Thanks and Regards,
Sandeep

On Sunday, February 18, 2018 at 11:14:48 PM UTC+8, Jason Plurad wrote:
You can use the same exact code in a simple Java program and prove that it works.
I'd think the main thing to watch out for is that your mutations are on a transaction that has the log identifier on it.
Is the Gremlin Server involved in your scenario?

tx = graph.buildTransaction().logIdentifier("addedPerson").start();


On Sunday, February 18, 2018 at 1:00:08 AM UTC-5, Sandeep Mishra wrote:
Hi Jason,

Thanks for the prompt reply.
The sample code attached below works well when executed from the Gremlin Console.
However, the Java version doesn't trigger the callback. Probably something is wrong with my code.
Unfortunately I can't copy code from my office machine.
I will check it again and keep you posted.

Regards,
Sandeep 

On Wednesday, February 7, 2018 at 10:58:41 PM UTC+8, Jason Plurad wrote:
It means that it will use the 'storage.backend' value as the storage. See the code in GraphDatabaseConfiguration.java. It looks like your only choice is 'default', and it seems the option exists for the future possibility of using a different backend.

The code in the docs seemed to work OK, other than a minor change in the setStartTime() parameters. You can cut and paste this code into the Gremlin Console to use with the prepackaged distribution.

import java.util.concurrent.atomic.*;
import org.janusgraph.core.log.*;
import java.util.concurrent.*;

graph = JanusGraphFactory.open('conf/janusgraph-cassandra-es.properties');

totalHumansAdded = new AtomicInteger(0);
totalGodsAdded = new AtomicInteger(0);
logProcessor = JanusGraphFactory.openTransactionLog(graph);
logProcessor.addLogProcessor("addedPerson").
        setProcessorIdentifier("addedPersonCounter").
        setStartTime(Instant.now()).
        addProcessor(new ChangeProcessor() {
            public void process(JanusGraphTransaction tx, TransactionId txId, ChangeState changeState) {
                for (v in changeState.getVertices(Change.ADDED)) {
                    if (v.label().equals("human")) totalHumansAdded.incrementAndGet();
                    System.out.println("total humans = " + totalHumansAdded);
                }
            }
        }).
        addProcessor(new ChangeProcessor() {
            public void process(JanusGraphTransaction tx, TransactionId txId, ChangeState changeState) {
                for (v in changeState.getVertices(Change.ADDED)) {
                    if (v.label().equals("god")) totalGodsAdded.incrementAndGet();
                    System.out.println("total gods = " + totalGodsAdded);
                }
            }
        }).
        build()

tx = graph.buildTransaction().logIdentifier("addedPerson").start();
u = tx.addVertex(T.label, "human");
u.property("name", "proteros");
u.property("age", 36);
tx.commit();

If you inspect the keyspace in Cassandra afterwards, you'll see that a separate table is created for "ulog_addedPerson".

Did you have some example code of what you are attempting?


On Wednesday, February 7, 2018 at 5:55:58 AM UTC-5, Sandeep Mishra wrote:
Hi Guys,

We are trying to use the transaction log feature of JanusGraph, but it is not working as expected. No callback is received at:
public void process(JanusGraphTransaction janusGraphTransaction, TransactionId transactionId, ChangeState changeState) {

The JanusGraph documentation says the value for log.[X].backend is 'default'.
Not sure what exactly that means. Does it mean HBase, which is being used as the backend for the data?

Please let me know if anyone has configured it.

Thanks and Regards,
Sandeep Mishra


Re: How to dynamically load graph with multiple keyspace with remote Janus server

Abhijeet Kumar <searcha...@...>
 

+1


On Friday, June 28, 2019 at 5:05:41 PM UTC+5:30, Saini Datta wrote:
Hi,

I am developing a Java application where, depending on the input, I need to load the graph for a particular keyspace. The Janus server is running remotely. I am unable to do it through ConfiguredGraphFactory because the client does not get the ConfiguredGraphFactory class instance created by the remote server, and hence it throws the exception:
"java.lang.RuntimeException: org.janusgraph.graphdb.management.utils.ConfigurationManagementGraphNotEnabledException: Please add a key named "ConfigurationManagementGraph" to the "graphs" property in your YAML file and restart the server to be able to use the functionality of the ConfigurationManagementGraph class."

I have the following queries:
               1. Is there any way to connect to the remote server and get a graph instance using ConfiguredGraphFactory? I know about withRemote(config), but it works only for traversals.
               2. Is there any alternative approach by which I can load a particular keyspace's graph dynamically from a Java client?

Please guide..

Thanks
Saini


Re: Unable to use ConfiguredGraphFactory from Java Application

sparshneel chanchlani <sparshneel...@...>
 

You need to change janusgraph-cql-es-server.properties to use ConfiguredGraphFactory instead of JanusGraphFactory.
Then, when you open the graph using the Java API, use the code below:
graph = ConfiguredGraphFactory.open("graph.name");
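
Note that ConfiguredGraphFactory lives inside the server's JVM; from a truly remote Java client, a common workaround (a sketch only, not confirmed as the poster's setup; "airroutes" is the graph name from the question) is to submit the call as a script through the driver:

import org.apache.tinkerpop.gremlin.driver.Client;
import org.apache.tinkerpop.gremlin.driver.Cluster;

Cluster cluster = Cluster.open("conf/remote-objects.yaml");
Client client = cluster.connect();
// the script executes server-side, where ConfiguredGraphFactory is available
Object count = client.submit("ConfiguredGraphFactory.open('airroutes').traversal().V().count().next()").one().getObject();
System.out.println(count);
cluster.close();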


On Wed, Jul 8, 2020, 5:18 PM <searcha...@...> wrote:
+1

On Sunday, December 9, 2018 at 8:06:20 AM UTC+5:30, rah...@... wrote:
I'm trying to remotely open an existing graph from Java application. But I keep on getting the following error.

"org.janusgraph.graphdb.management.utils.ConfigurationManagementGraphNotEnabledException: Please add a key named "ConfigurationManagementGraph" to the "graphs" property in your YAML file and restart the server to be able to use the functionality of the ConfigurationManagementGraph class."

Here is my Java Code sample:
RemoteGraph.java
---
01| cluster = Cluster.open(conf.getString("gremlin.remote.driver.clusterFile"));
02| client = cluster.connect();
03| JanusGraph airroutes = ConfiguredGraphFactory.open("airroutes");

I'm always getting that error on the line number 03. 

jgex-remote.properties
---
gremlin.remote.remoteConnectionClass=org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteConnection
gremlin.remote.driver.clusterFile=conf/remote-objects.yaml
gremlin.remote.driver.sourceName=g

conf/remote-objects.yaml
---
hosts: [localhost]
port: 8182
serializer: {
className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0,
config: {
ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry]
}
}

Below is the server configuration.

conf/gremlin-server/gremlin-server.yaml
---
host: 0.0.0.0
 port: 8182
 scriptEvaluationTimeout: 30000
 channelizer: org.apache.tinkerpop.gremlin.server.channel.WebSocketChannelizer
 graphManager: org.janusgraph.graphdb.management.JanusGraphManager
 graphs: {
     ConfigurationManagementGraph: conf/gremlin-server/janusgraph-cql-es-server.properties
 }
 plugins:
   - janusgraph.imports
 scriptEngines: {
   gremlin-groovy: {
     imports: [java.lang.Math],
     staticImports: [java.lang.Math.PI],
     scripts: [scripts/empty-sample.groovy]}}
...

conf/gremlin-server/janusgraph-cql-es-server.properties
---
gremlin.graph=org.janusgraph.core.JanusGraphFactory
graph.graphname=ConfigurationManagementGraph
storage.backend=cql
storage.hostname=127.0.0.1
storage.cql.keyspace=janusgraph
...
index.search.backend=elasticsearch
index.search.hostname=127.0.0.1
index.search.elasticsearch.client-only=true

I'm not able to figure out what I am missing. Whatever API I try from ConfiguredGraphFactory, I get the same error. Does anyone have any idea, or a link I can refer to? I have tried everything from the JanusGraph documentation.



Re: Unable to use ConfiguredGraphFactory from Java Application

searcha...@...
 

+1




exporting as graphson: Unknown compressor type for id

Bharat Dighe <bdi...@...>
 

I am getting the error below when exporting as GraphSON. I am assigning my own vertex IDs; could it be due to that?

java.lang.IllegalArgumentException: Unknown compressor type for id: 5
at org.janusgraph.graphdb.database.serialize.attribute.StringSerializer$CompressionType.getFromId(StringSerializer.java:273)
at org.janusgraph.graphdb.database.serialize.attribute.StringSerializer.read(StringSerializer.java:104)
at org.janusgraph.graphdb.database.serialize.attribute.StringSerializer.read(StringSerializer.java:38)
at org.janusgraph.graphdb.database.serialize.StandardSerializer.readObjectInternal(StandardSerializer.java:250)
at org.janusgraph.graphdb.database.serialize.StandardSerializer.readObject(StandardSerializer.java:238)
at org.janusgraph.graphdb.database.EdgeSerializer.readPropertyValue(EdgeSerializer.java:205)
at org.janusgraph.graphdb.database.EdgeSerializer.readPropertyValue(EdgeSerializer.java:195)
at org.janusgraph.graphdb.database.EdgeSerializer.parseRelation(EdgeSerializer.java:129)
at org.janusgraph.graphdb.database.EdgeSerializer.readRelation(EdgeSerializer.java:73)
at org.janusgraph.graphdb.transaction.RelationConstructor.readRelation(RelationConstructor.java:70)
at org.janusgraph.graphdb.transaction.RelationConstructor$1.next(RelationConstructor.java:57)
at org.janusgraph.graphdb.transaction.RelationConstructor$1.next(RelationConstructor.java:45)
at java.util.Iterator.forEachRemaining(Iterator.java:116)
at org.apache.tinkerpop.gremlin.structure.util.star.StarGraph.of(StarGraph.java:210)
at org.apache.tinkerpop.gremlin.structure.io.graphson.GraphSONWriter.writeVertex(GraphSONWriter.java:82)
at org.apache.tinkerpop.gremlin.structure.io.graphson.GraphSONWriter.writeVertices(GraphSONWriter.java:110)
at org.apache.tinkerpop.gremlin.structure.io.graphson.GraphSONWriter.writeGraph(GraphSONWriter.java:71)
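
For reference, the export code is essentially the following sketch (the class name and output file are just for illustration; graph is the open JanusGraph instance):

import java.io.FileOutputStream;
import java.io.OutputStream;
import org.apache.tinkerpop.gremlin.structure.Graph;
import org.apache.tinkerpop.gremlin.structure.io.graphson.GraphSONWriter;

public class GraphSONExportSketch {
    // Writes every vertex (with its edges and properties) as GraphSON;
    // decoding each property from the backend is what invokes the
    // StringSerializer.read(...) call that fails in the trace above.
    static void export(Graph graph) throws Exception {
        try (OutputStream os = new FileOutputStream("export.json")) {
            GraphSONWriter.build().create().writeGraph(os, graph);
        }
    }
}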

Thanks
Bharat


Server times out running query

Ben Fulton <benmar...@...>
 

I am trying to run a relatively simple query on a remote server that looks like this:

gremlin> g.V().has('Paper', 'year', 2015).inE('AuthorOf').subgraph('sg').cap('sg').next()

This results in a hang. I've increased the timeout to 50 minutes and the query still doesn't return.

The server log looks like this:

2020-07-07 17:01:23,600 [gremlin-server-worker-1] DEBUG org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor  - Preparing to evaluate script - g.V().has('Paper', 'year', 2015).inE('AuthorOf').subgraph('sg').cap('sg').next() - in thread [gremlin-server-worker-1]
2020-07-07 17:01:23,601 [gremlin-server-session-1] DEBUG org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor  - Evaluating script - g.V().has('Paper', 'year', 2015).inE('AuthorOf').subgraph('sg').cap('sg').next() - in thread [gremlin-server-session-1]
2020-07-07 17:01:23,602 [gremlin-server-worker-1] DEBUG log-io  - [id: 0xc7e983e1, L:/127.0.0.1:8182 - R:/127.0.0.1:53390] READ COMPLETE
2020-07-07 17:01:23,602 [gremlin-server-worker-1] DEBUG log-decoder-aggregator  - [id: 0xc7e983e1, L:/127.0.0.1:8182 - R:/127.0.0.1:53390] READ COMPLETE
2020-07-07 17:01:23,602 [gremlin-server-worker-1] DEBUG log-aggregator-encoder  - [id: 0xc7e983e1, L:/127.0.0.1:8182 - R:/127.0.0.1:53390] READ COMPLETE
2020-07-07 17:01:23,602 [gremlin-server-worker-1] DEBUG log-aggregator-encoder  - [id: 0xc7e983e1, L:/127.0.0.1:8182 - R:/127.0.0.1:53390] READ COMPLETE
2020-07-07 17:01:23,602 [gremlin-server-worker-1] DEBUG log-aggregator-encoder  - [id: 0xc7e983e1, L:/127.0.0.1:8182 - R:/127.0.0.1:53390] READ COMPLETE
2020-07-07 17:01:23,610 [gremlin-server-session-1] DEBUG org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine  - Script compilation g.V().has('Paper', 'year', 2015).inE('AuthorOf').subgraph('sg').cap('sg').next() took 8ms
2020-07-07 17:01:23,670 [gremlin-server-session-1] DEBUG org.janusgraph.graphdb.transaction.StandardJanusGraphTx  - Guava vertex cache size: requested=20000 effective=20000 (min=100)
2020-07-07 17:01:23,670 [gremlin-server-session-1] DEBUG org.janusgraph.graphdb.transaction.vertexcache.GuavaVertexCache  - Created dirty vertex map with initial size 32
2020-07-07 17:01:23,670 [gremlin-server-session-1] DEBUG org.janusgraph.graphdb.transaction.vertexcache.GuavaVertexCache  - Created vertex cache with max size 20000

The log is then filled with several gigabytes of the final two lines repeating.

I'm not sure what the problem is or how to debug it. I've tried adding profile() and timeout() steps with no change. What's the best next step in figuring out what's happening here?
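
One thing I have not tried yet is bounding the work that feeds subgraph(), to check whether the subgraph is simply too large to materialize in memory; a sketch (the 1000 is an arbitrary cap):

gremlin> g.V().has('Paper', 'year', 2015).count()
gremlin> g.V().has('Paper', 'year', 2015).inE('AuthorOf').count()
gremlin> g.V().has('Paper', 'year', 2015).inE('AuthorOf').limit(1000).subgraph('sg').cap('sg').next()

If the counts come back large, the hang would be consistent with subgraph() trying to hold the entire result in memory.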

Thanks,
--
Ben Fulton
Indiana University