
Re: janusgraph 0.5.0 release date

Oleksandr Porunov <alexand...@...>
 

You can build your own jars from the source code (the master branch is the 0.5.0 branch) with "mvn clean install -DskipTests=true", then use the jars you need in your project. Just don't include the CI jars in your project.
The "vavr" and "high-scale-lib" dependencies may be necessary as well.
For example, if you put the necessary jars in a "libs" folder in your project and you are using Gradle, you can use the following dependencies:
compile group: 'io.vavr', name: 'vavr', version: '0.9.2'
compile group: 'com.github.stephenc.high-scale-lib', name: 'high-scale-lib', version: '1.1.1'
compile fileTree(dir: 'libs', include: ['*.jar'])

On Friday, October 18, 2019 at 5:19:55 PM UTC+3, Baskar Vangili wrote:
We are using Elasticsearch 7.3.2 as the index backend. The latest JanusGraph release, 0.4.0, does not support this version, but 0.5.0 does. When is the release date for 0.5.0? If it is far off, is there any workaround in 0.4.0 to support ES 7.3.2?


Concurrent TimeoutException on connection to gremlin server remotely

sarthak...@...
 

Hi,
I have a Gremlin Server running v3.3.3.
I am connecting to it remotely to run my Gremlin queries via Java, but recently I've been bombarded with this error:

`org.apache.tinkerpop.gremlin.process.remote.RemoteConnectionException: java.lang.RuntimeException: java.lang.RuntimeException: java.util.concurrent.TimeoutException: Timed out while waiting for an available host - check the client configuration and connectivity to the server if this message persists`

Initially, when I faced this issue, restarting the Gremlin service would make it work again, but that doesn't solve the problem anymore. I'm not sure what the issue is here.

Here is my remote-objects.yaml file
```
hosts: [fci-graph-writer-gremlin]
port: 8182
serializer: { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry,org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV3d0] }}
connectionPool: {
  channelizer: Channelizer.WebSocketChannelizer,
  maxContentLength: 81928192
}
```
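For what it's worth, here is a minimal Java sketch of building the driver connection explicitly, with an enlarged connection pool and a longer wait before the "no available host" timeout fires. The host name mirrors the YAML above; the pool values are illustrative, not a recommendation from this thread:

```java
import org.apache.tinkerpop.gremlin.driver.Cluster;
import org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteConnection;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.structure.util.empty.EmptyGraph;

public class RemoteCheck {
    public static void main(String[] args) throws Exception {
        // Cluster.open("remote-objects.yaml") would load the YAML file above instead.
        Cluster cluster = Cluster.build("fci-graph-writer-gremlin")
                .port(8182)
                .maxConnectionPoolSize(8)       // connections kept per host
                .maxWaitForConnection(16000)    // ms to wait before the driver gives up on a host
                .create();
        GraphTraversalSource g = EmptyGraph.instance().traversal()
                .withRemote(DriverRemoteConnection.using(cluster));
        System.out.println(g.V().limit(1).count().next());  // simple connectivity check
        cluster.close();
    }
}
```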

gremlin-server.yaml
```
host: 0
port: 8182
scriptEvaluationTimeout: 120000
threadPoolWorker: 4
gremlinPool: 16
channelizer: org.apache.tinkerpop.gremlin.server.channel.WebSocketChannelizer
graphs: {
  fci: conf/janusgraph-hbase.properties,
  insights: conf/janusgraph-insights-hbase.properties
}
scriptEngines: {
  gremlin-groovy: {
    plugins: { org.janusgraph.graphdb.tinkerpop.plugin.JanusGraphGremlinPlugin: {},
               org.apache.tinkerpop.gremlin.server.jsr223.GremlinServerGremlinPlugin: {},
               org.apache.tinkerpop.gremlin.tinkergraph.jsr223.TinkerGraphGremlinPlugin: {},
               org.apache.tinkerpop.gremlin.jsr223.ImportGremlinPlugin: {classImports: [java.lang.Math], methodImports: [java.lang.Math#*]},
               org.apache.tinkerpop.gremlin.jsr223.ScriptFileGremlinPlugin: {files: [scripts/empty-sample.groovy]}},
    scripts: [scripts/empty-sample.groovy], 
    staticImports: ['org.opencypher.gremlin.process.traversal.CustomPredicates.*']}}
serializers:
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry,org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV3d0] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { serializeResultToString: true }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  # Older serialization versions for backwards compatibility:
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoLiteMessageSerializerV1d0, config: {ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV2d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
processors:
  - { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }}
  - { className: org.apache.tinkerpop.gremlin.server.op.traversal.TraversalOpProcessor, config: { cacheExpirationTime: 600000, cacheMaxSize: 1000 }}
  - { className: org.opencypher.gremlin.server.op.cypher.CypherOpProcessor, config: { sessionTimeout: 28800000}}
metrics: {
  consoleReporter: {enabled: true, interval: 180000},
  csvReporter: {enabled: false, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv},
  jmxReporter: {enabled: false},
  slf4jReporter: {enabled: false, interval: 180000},
  gangliaReporter: {enabled: false, interval: 180000, addressingMode: MULTICAST},
  graphiteReporter: {enabled: false, interval: 180000}}
threadPoolBoss: 1
maxInitialLineLength: 4096
maxHeaderSize: 8192
maxChunkSize: 8192
maxContentLength: 81928192
maxAccumulationBufferComponents: 1024
resultIterationBatchSize: 64
writeBufferLowWaterMark: 32768
writeBufferHighWaterMark: 65536
ssl: {
enabled: false}
```

pom.xml
```

<dependency>
  <groupId>org.janusgraph</groupId>
  <artifactId>janusgraph-all</artifactId>
  <version>0.3.1</version>
</dependency>
<dependency>
  <groupId>org.apache.tinkerpop</groupId>
  <artifactId>gremlin-driver</artifactId>
  <version>3.3.3</version>
</dependency>
<dependency>
  <groupId>org.apache.tinkerpop</groupId>
  <artifactId>tinkergraph-gremlin</artifactId>
  <version>3.3.3</version>
</dependency>
```

I'm really stuck here. Any help is appreciated. Thanks!!


Option storage.transactions does not work

nicolas...@...
 

Hello,
For my tests, I want to use embedded JanusGraph with in-memory storage but without transaction support. I found the option storage.transactions in https://docs.janusgraph.org/v0.3/basics/configuration-reference/ but it seems to have no effect. Should this option work with the inmemory storage backend?

Note: I do not want transactions because in production I use a remote connection, and thus auto-commit mode is used.
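For context, a minimal sketch of the configuration in question; whether the inmemory backend actually honours storage.transactions is exactly the open question here:

```java
import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphFactory;

public class InMemoryNoTx {
    public static void main(String[] args) {
        // Embedded, in-memory graph; storage.transactions is the option under discussion.
        JanusGraph graph = JanusGraphFactory.build()
                .set("storage.backend", "inmemory")
                .set("storage.transactions", false)
                .open();
        graph.close();
    }
}
```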

Regards,
Nicolas


Janusgraph Hadoop Spark standalone cluster - Janusgraph job always creates constant number 513 of Spark tasks

dimitar....@...
 

Hello,

I have set up JanusGraph 0.4.0 with Hadoop 2.9.0 and Spark 2.4.4 in a K8s cluster.
I connect to JanusGraph from the Gremlin Console and execute:
gremlin> og
==>graphtraversalsource[hadoopgraph[cassandra3inputformat->gryooutputformat], sparkgraphcomputer]
gremlin> g.V().count()
==>1889

It takes 25 min to do the count! It took the same time when there were no vertices at all (count of 0). The Spark job shows that 513 tasks were run, and the number of tasks is always 513 regardless of the number of vertices.
I have set "spark.sql.shuffle.partitions=4" in the Spark job's environment, but the number of Spark tasks was still 513. My assumption is that JanusGraph somehow specifies this number of tasks when it submits the job to Spark.
The questions are:
- Why is the JanusGraph job submitted to Spark always parallelized into 513 tasks?
- How can I control the number of tasks created for a JanusGraph job?
- How can I minimize the execution time of an OLAP query on this small graph (the OLTP query takes less than a second)?

Thanks,
Dimitar


janusgraph 0.5.0 release date

Baskar Vangili <vanb...@...>
 

We are using Elasticsearch 7.3.2 as the index backend. The latest JanusGraph release, 0.4.0, does not support this version, but 0.5.0 does. When is the release date for 0.5.0? If it is far off, is there any workaround in 0.4.0 to support ES 7.3.2?


Upgraded Janus Version to 0.4 and tinkerpop gremlin-server/console version to 3.4.1

Baskar Vangili <vanb...@...>
 

I have upgraded the JanusGraph version to 0.4.0 and the TinkerPop version to 3.4.1.
Index Backend ES version: 7.3.2
Cassandra Version: 3.3.0

After the upgrade, I am getting this error. Any idea what's happening here? 

"error":"com.google.common.util.concurrent.UncheckedExecutionException: org.janusgraph.core.JanusGraphException: StorageBackend version is incompatible with current JanusGraph version: storage [0.4.0] vs. runtime [0.2.0]\n\tat com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2203)\n\tat com.google.common.cache.LocalCache.get(LocalCache.java:3937)\n\tat com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3941)\n\tat com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4824)




Re: [QUESTION] Usage of the cassandraembedded

faro...@...
 

- Did you increase the block size? It should be larger than the number of vertices you want to insert in the bulk load.
  (ids.block-size * ids.renew-percentage should be larger than the number of vertices inserted during bulk loading, to avoid expensive id block allocation.) See the sketch after this list.
- We also have memory issues with cql. Does your GC run cleanups often?
- Without bulk loading it will take longer because many more checks are performed.
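A minimal sketch of how these id-allocation options might be set when opening the graph; the backend, hostname and values are illustrative, not taken from this thread:

```java
import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphFactory;

public class BulkLoadConfig {
    public static void main(String[] args) {
        JanusGraph graph = JanusGraphFactory.build()
                .set("storage.backend", "cql")
                .set("storage.hostname", "127.0.0.1")
                .set("storage.batch-loading", true)
                .set("ids.block-size", 1_000_000)   // roughly >= vertices inserted per bulk load
                .set("ids.renew-percentage", 0.3)   // renew the id block when 30% of it remains
                .open();
        graph.close();
    }
}
```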

An extra question: do you store the data for a long time or only for a short analysis?


On Thursday, 10 October 2019 at 17:38:38 UTC+2, Lilly wrote:

I have now experimented with many different settings for the cql connection and timed how long it took.
My observation is the following:
- Embedded with bulk loading took 16 min
- CQL without bulk loading is extremely slow (> 2 h)
- CQL with bulk loading (same settings as for embedded for parameters: storage.batch.loading, ids.block.size, ids.renew.timeout, cache.db-cache, cache.db-cache-clean-wait, cache.db-cache-time, cache.db-cache-size) took 27 min and took up considerable amounts of my RAM (not the case for embedded mode).
- CQL as above but with additionally storage.cql.batch-statement-size = 500 and storage.batch-loading = true took 24 min and not quite as much RAM.

I honestly do not know what else might be the issue.

On Wednesday, 9 October 2019 at 08:17:13 UTC+2, fa...@... wrote:
For "violation of unique key"  it could be the case that cql checks id's to be unique (JanusGraph could run out of id's in the batch loading mode) but i'm not sure what the embedded backend is doing.


I never used the batch loading mode, see also here: https://docs.janusgraph.org/advanced-topics/bulk-loading/.


On Tuesday, 8 October 2019 at 17:50:23 UTC+2, Lilly wrote:
Hi Jan,

So I tried it again. First of all, I remembered that for cql I need to commit after each step; otherwise I get "violation of unique key" errors, even though I am not actually violating one. Is this supposed to be the case (having to commit each time)?
Now, committing after each function call, I found that with the adaptation in the properties configuration (see my last reply) it is really slow. If I use the "default" configuration for cql, it is a bit faster but still much slower than in the embedded case.

I also tried it with another graph  which I persisted like this:
public void persist(Map<Integer, Map<String,Object>> nodes, Map<Integer,Integer> edges, Map<Integer,Map<String,String>> names) {
    g = graph.traversal();

    int counter = 0;
    for (Map.Entry<Integer, Map<String,Object>> e : nodes.entrySet()) {
        Vertex v = g.addV().property("taxId", e.getKey())
                .property("rank", e.getValue().get("rank"))
                .property("divId", e.getValue().get("divId"))
                .property("genId", e.getValue().get("genId")).next();
        g.tx().commit();
        Map<String,String> n = names.get(e.getKey());
        if (n != null) {
            for (Map.Entry<String,String> vals : n.entrySet()) {
                g.V(v).property(vals.getKey(), vals.getValue()).iterate();
                g.tx().commit();
            }
        }
        if (counter % BULK_CHOP_SIZE == 0) {
            System.out.println(counter);
        }
        counter++;
    }

    counter = 0;
    for (Map.Entry<Integer,Integer> e : edges.entrySet()) {
        g.V().has("taxId", e.getKey()).as("v1").V()
                .has("taxId", e.getValue()).as("v2")
                .addE("has_parent").from("v1").to("v2").iterate();
        g.tx().commit();
        if (counter % BULK_CHOP_SIZE == 0) {
            System.out.println(counter);
        }
        counter++;
    }

    g.V().has("taxId", 1).as("v").outE().filter(__.inV().where(P.eq("v"))).drop().iterate();
    g.tx().commit();
    System.out.println("Done with persistence");
}

And had the same problem in either case.

I am probably using the cql backend wrong somehow and would appreciate any help on what else to do!
Thanks,
Lilly

On Tuesday, 8 October 2019 at 09:05:56 UTC+2, Lilly wrote:
Hi Jan,
Ok then I probably screwed up somewhere. I kind of thought this was to be expected, which is why I did not check it more thoroughly.
Maybe the way I persisted is not working well for cql.
I will try to create a test scenario where I do not have to persist all my data and see how it performs with cql again.

In principle, what I do is call this function :
public void updateEdges(String kmer, int pos, boolean strand, int record, List<SequenceParser.Feature> features) {
    if (features == null) {
        features = Arrays.asList();
    }

    g.withSideEffect("features", features)
        .V().has("prefix", kmer.substring(0, kmer.length()-1)).fold()
        .coalesce(__.unfold(),
                  __.addV("prefix_node").property("prefix", kmer.substring(0, kmer.length()-1))).as("v1")
        .coalesce(__.V().has("prefix", kmer.substring(1, kmer.length())),
                  __.addV("prefix_node").property("prefix", kmer.substring(1, kmer.length()))).as("v2")
        .sideEffect(__.choose(__.select("features").unfold().count().is(P.eq(0)),
                __.addE("suffix_edge").property("record", record)
                  .property("strand", strand).property("pos", pos).from("v1").to("v2"))
            .select("features").unfold()
            .addE("suffix_edge").property("record", record).property("strand", strand).property("pos", pos)
            .property(__.map(t -> ((SequenceParser.Feature) t.get()).category),
                      __.map(t -> ((SequenceParser.Feature) t.get()).feature)).from("v1").to("v2"))
        .iterate();
}
and roughly every 50000 calls I do a commit. As a side remark, all of the above properties have indices. And Feature is a simple class with two attributes, category and feature.

Also I adapted the configuration file in the following way:
storage.batch-loading = true
ids.block-size = 100000
ids.authority.wait-time = 2000 ms
ids.renew-timeout = 1000000 ms

I tried the same with cql and embedded.

I will get back to you once I have tested it once again. But maybe you already spot an issue?
Thanks
Lilly
On Monday, 7 October 2019 at 20:14:29 UTC+2, fa...@... wrote:
We don't see this problem on persistence.
It would be good to know what takes longer. Would you like to give some more information?

Jan



Re: Compatiblity with Spark 2.3

Juraj Polačok <polaco...@...>
 

Hi, 

Currently, I am getting this error: 


java.lang.ClassCastException: org.apache.hadoop.yarn.proto.YarnServiceProtos$GetNewApplicationRequestProto cannot be cast to org.apache.hadoop.hbase.shaded.com.google.protobuf.Message


TinkerPop 3.4+ should support Apache Spark 2.3, so I think it should work. Have you tried running some queries using Spark 2.3? 


On Wednesday, 24 April 2019 15:11:26 UTC+2, HadoopMarc wrote:

Hi,

JanusGraph/TinkerPop include all necessary Spark dependencies in their distributions, so Spark compatibility on Spark/Yarn is not an issue as long as you keep the Yarn cluster's own Spark jars out of the various CLASSPATHs involved. You can get the general idea by combining:


and 



What storage backend do you use for JanusGraph? SparkGraphComputer is known to be painfully slow on JanusGraph/HBase (although I am not sure whether reading from HBase snapshots was ever implemented in JanusGraph; I have never tried that).

Cheers,    Marc


On Wednesday, 24 April 2019 at 12:42:17 UTC+2, Evgeniy Ignatiev wrote:

Hello.

As far as I remember, Spark compatibility is dictated solely by the corresponding spark-gremlin module from TinkerPop, and the 3.3.x release train officially supports only Spark 2.2.
Maybe it is worth duplicating this question on the gremlin-users list? The change that lifted Spark support to 2.3 wasn't large though (https://github.com/apache/tinkerpop/pull/886), so
it will probably work out of the box with properly pinned Netty dependency versions.

Best regards,
Evgeniy Ignatiev.

On 4/24/2019 2:25 PM, pol...@... wrote:
Hi,

is it possible to run OLAP queries using SparkGraphComputer via YARN with Apache Spark 2.3.2?
https://docs.janusgraph.org/latest/version-compat.html states that only 2.2.x is supported. Has anyone tried compatibility with the newer version of Apache Spark?

Thanks.


Re: olap connection with spark standalone cluster

Lilly <lfie...@...>
 

Hi Abhay,

It seems to work fine now unless I am overlooking something. Why do you think it should not?
It also worked beforehand on using the spark.master=local setting.

Thanks,
Lilly

On Wednesday, 16 October 2019 at 08:26:47 UTC+2, Abhay Pandit wrote:

Hi Lilly,

SparkGraphComputer does not support direct Gremlin queries from Java programs.
You can try using something like the code below.

String query = "g.V().count()";
ComputerResult result = graph.compute(SparkGraphComputer.class)

            .result(GraphComputer.ResultGraph.NEW)

            .persist(GraphComputer.Persist.EDGES)

            .program(TraversalVertexProgram.build()

                    .traversal(

                            graph.traversal().withComputer(SparkGraphComputer.class),

                            "gremlin-groovy",

                            query)

                    .create(graph))

            .submit()

            .get();

System.out.println( computerResult.memory().get("gremlin.traversalVertexProgram.haltedTraversers"));


Join my facebook group: https://www.facebook.com/groups/Janusgraph/

Thanks,
Abhay

On Tue, 15 Oct 2019 at 19:25, <ma...@...> wrote:
Hi Lilly,

This error says that there are somehow two versions of the TinkerPop jars in your project. If you use Maven, you can check this with the dependency plugin.

If other problems appear, also be sure that the spark cluster is doing fine by running one of the examples from the spark distribution with spark-submit.

HTH,    Marc

On Tuesday, 15 October 2019 at 09:38:08 UTC+2, Lilly wrote:
Hi everyone,

I downloaded a fresh Spark binary release (spark-2.4.0-hadoop2.7) and set the master to spark://127.0.0.1:7077. I then started all services via $SPARK_HOME/sbin/start-all.sh.
I checked that spark works with the provided example programs.

I am further using the janusgraph-0.4.0-hadoop2 binary.

Now I configured the read-cassandra-3.properties as follows:
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.janusgraph.hadoop.formats.cassandra.Cassandra3InputFormat
gremlin.hadoop.graphWriter=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat
gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.inputLocation=none
gremlin.hadoop.outputLocation=output
gremlin.spark.persistContext=true
janusgraphmr.ioformat.conf.storage.backend=cassandra
janusgraphmr.ioformat.conf.storage.hostname=127.0.0.1
janusgraphmr.ioformat.conf.storage.port=9160
janusgraphmr.ioformat.conf.storage.cassandra.keyspace=janusgraph
cassandra.input.partitioner.class=org.apache.cassandra.dht.Murmur3Partitioner
spark.master=spark://127.0.0.1:7077
spark.executor.memory=8g
spark.executor.extraClassPath=/home/janusgraph-0.4.0-hadoop2/lib/*
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoRegistrator

where the janusgraph libraries are stored in /home/janusgraph-0.4.0-hadoop2/lib/*

In my java application I now tried
Graph graph = GraphFactory.open('...')
GraphTraversalSource g = graph.traversal().withComputer(SparkGraphComputer.class);
and then g.V().count().next()
I get the error message:
ERROR org.apache.spark.scheduler.TaskSetManager - Task 3 in stage 0.0 failed 4 times; aborting job
Exception in thread "main" java.lang.IllegalStateException: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 0.0 failed 4 times, most recent failure: Lost task 3.3 in stage 0.0 (TID 15, 192.168.178.32, executor 0): java.io.InvalidClassException: org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal; local class incompatible: stream classdesc serialVersionUID = -3191185630641472442, local class serialVersionUID = 6523257080464450267

Any ideas as to what might be the problem?
Thanks!
Lilly




Re: olap connection with spark standalone cluster

Lilly <lfie...@...>
 

Hi Marc,

Great, the dependency plugin is precisely what I needed! I tried to figure this out manually via Maven Central, but one goes crazy that way!
It now works perfectly, thanks so much!

Lilly

On Tuesday, 15 October 2019 at 15:55:58 UTC+2, ma...@... wrote:

Hi Lilly,

This error says that there are somehow two versions of the TinkerPop jars in your project. If you use Maven, you can check this with the dependency plugin.

If other problems appear, also be sure that the spark cluster is doing fine by running one of the examples from the spark distribution with spark-submit.

HTH,    Marc

On Tuesday, 15 October 2019 at 09:38:08 UTC+2, Lilly wrote:
Hi everyone,

I downloaded a fresh Spark binary release (spark-2.4.0-hadoop2.7) and set the master to spark://127.0.0.1:7077. I then started all services via $SPARK_HOME/sbin/start-all.sh.
I checked that spark works with the provided example programs.

I am further using the janusgraph-0.4.0-hadoop2 binary.

Now I configured the read-cassandra-3.properties as follows:
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.janusgraph.hadoop.formats.cassandra.Cassandra3InputFormat
gremlin.hadoop.graphWriter=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat
gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.inputLocation=none
gremlin.hadoop.outputLocation=output
gremlin.spark.persistContext=true
janusgraphmr.ioformat.conf.storage.backend=cassandra
janusgraphmr.ioformat.conf.storage.hostname=127.0.0.1
janusgraphmr.ioformat.conf.storage.port=9160
janusgraphmr.ioformat.conf.storage.cassandra.keyspace=janusgraph
cassandra.input.partitioner.class=org.apache.cassandra.dht.Murmur3Partitioner
spark.master=spark://127.0.0.1:7077
spark.executor.memory=8g
spark.executor.extraClassPath=/home/janusgraph-0.4.0-hadoop2/lib/*
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoRegistrator

where the janusgraph libraries are stored in /home/janusgraph-0.4.0-hadoop2/lib/*

In my java application I now tried
Graph graph = GraphFactory.open('...')
GraphTraversalSource g = graph.traversal().withComputer(SparkGraphComputer.class);
and then g.V().count().next()
I get the error message:
ERROR org.apache.spark.scheduler.TaskSetManager - Task 3 in stage 0.0 failed 4 times; aborting job
Exception in thread "main" java.lang.IllegalStateException: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 0.0 failed 4 times, most recent failure: Lost task 3.3 in stage 0.0 (TID 15, 192.168.178.32, executor 0): java.io.InvalidClassException: org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal; local class incompatible: stream classdesc serialVersionUID = -3191185630641472442, local class serialVersionUID = 6523257080464450267

Any ideas as to what might be the problem?
Thanks!
Lilly



Re: olap connection with spark standalone cluster

Abhay Pandit <abha...@...>
 

Hi Lilly,

SparkGraphComputer does not support direct Gremlin queries from Java programs.
You can try using something like the code below.

String query = "g.V().count()";
ComputerResult result = graph.compute(SparkGraphComputer.class)

            .result(GraphComputer.ResultGraph.NEW)

            .persist(GraphComputer.Persist.EDGES)

            .program(TraversalVertexProgram.build()

                    .traversal(

                            graph.traversal().withComputer(SparkGraphComputer.class),

                            "gremlin-groovy",

                            query)

                    .create(graph))

            .submit()

            .get();

System.out.println( computerResult.memory().get("gremlin.traversalVertexProgram.haltedTraversers"));


Join my facebook group: https://www.facebook.com/groups/Janusgraph/

Thanks,
Abhay

On Tue, 15 Oct 2019 at 19:25, <marc.d...@...> wrote:
Hi Lilly,

This error says that there are somehow two versions of the TinkerPop jars in your project. If you use Maven, you can check this with the dependency plugin.

If other problems appear, also be sure that the spark cluster is doing fine by running one of the examples from the spark distribution with spark-submit.

HTH,    Marc

On Tuesday, 15 October 2019 at 09:38:08 UTC+2, Lilly wrote:
Hi everyone,

I downloaded a fresh Spark binary release (spark-2.4.0-hadoop2.7) and set the master to spark://127.0.0.1:7077. I then started all services via $SPARK_HOME/sbin/start-all.sh.
I checked that spark works with the provided example programs.

I am further using the janusgraph-0.4.0-hadoop2 binary.

Now I configured the read-cassandra-3.properties as follows:
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.janusgraph.hadoop.formats.cassandra.Cassandra3InputFormat
gremlin.hadoop.graphWriter=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat
gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.inputLocation=none
gremlin.hadoop.outputLocation=output
gremlin.spark.persistContext=true
janusgraphmr.ioformat.conf.storage.backend=cassandra
janusgraphmr.ioformat.conf.storage.hostname=127.0.0.1
janusgraphmr.ioformat.conf.storage.port=9160
janusgraphmr.ioformat.conf.storage.cassandra.keyspace=janusgraph
cassandra.input.partitioner.class=org.apache.cassandra.dht.Murmur3Partitioner
spark.master=spark://127.0.0.1:7077
spark.executor.memory=8g
spark.executor.extraClassPath=/home/janusgraph-0.4.0-hadoop2/lib/*
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoRegistrator

where the janusgraph libraries are stored in /home/janusgraph-0.4.0-hadoop2/lib/*

In my java application I now tried
Graph graph = GraphFactory.open('...')
GraphTraversalSource g = graph.traversal().withComputer(SparkGraphComputer.class);
and then g.V().count().next()
I get the error message:
ERROR org.apache.spark.scheduler.TaskSetManager - Task 3 in stage 0.0 failed 4 times; aborting job
Exception in thread "main" java.lang.IllegalStateException: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 0.0 failed 4 times, most recent failure: Lost task 3.3 in stage 0.0 (TID 15, 192.168.178.32, executor 0): java.io.InvalidClassException: org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal; local class incompatible: stream classdesc serialVersionUID = -3191185630641472442, local class serialVersionUID = 6523257080464450267

Any ideas as to what might be the problem?
Thanks!
Lilly




Issue creating vertex with a List property having large number of elements

Aswin Karthik P <zas...@...>
 

There is a small change in the last line of the Python code.

Updated code

from gremlin_python.driver import client

client = client.Client('ws://localhost:8182/gremlin', 'g')

mgmtScript = "mgmt = graph.openManagement()\n" + \
"if (mgmt.getPropertyKey('name') != null) return false\n" + \
"mgmt.makePropertyKey('name').dataType(String.class).make()\n" + \
"mgmt.makePropertyKey('vl_prop').dataType(Float.class).cardinality(LIST).make()\n" +\
"mgmt.commit()\n" + \
"return true";

client.submit(mgmtScript).next()

f = open("only_vertex.txt", "r")

create_query = f.read()

client.submit(create_query).next()


Issue creating vertex with a List property having large number of elements

Aswin Karthik P <zas...@...>
 

Hi,
For a use case, I'm trying to create a vertex with some list properties that contain a large number of elements, using gremlin-python.
But the server crashes and I'm getting a java.lang.StackOverflowError in gremlin-server.log.

Since the query is too big, I have attached it as a txt file. Along with it, I have attached the gremlin-server.yaml file for reference, where I have tried manipulating the content size etc.

Server initiation
The default one with Cassandra as backend storage

$JANUSHOME/bin/janusgraph.sh start

Python Code

from gremlin_python.driver import client
 
client =  client.Client('ws://localhost:8182/gremlin', 'g')

mgmtScript = "mgmt = graph.openManagement()\n" + \
"if (mgmt.getPropertyKey('name') != null) return false\n" + \
"mgmt.makePropertyKey('name').dataType(String.class).make()\n" + \
"mgmt.makePropertyKey('vl_prop').dataType(Float.class).cardinality(LIST).make()\n" +\
"mgmt.commit()\n" + \
"return true";

client.submit(mgmtScript).next()

f = open("/home/aswin/Desktop/only_vertex.txt", "r")

create_query = f.read()

client.submit(the_text).next()

The Python code is just a glimpse; I have to create 5 such properties for each node, and there will be a few thousand nodes in the graph.

I'm not sure whether it is a shortcoming of JanusGraph or the Gremlin server. And is it even feasible to have such a graph model in JanusGraph?

I would also like to know if there is an easier/crisper way to create LIST properties than repeating the same property name with different values.
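For comparison, here is a minimal sketch of adding list elements in-process through the Java API instead of one huge submitted script. It assumes vl_prop has already been defined with Cardinality.LIST, as in the mgmtScript above; the backend and values are illustrative, and this says nothing about what the server can handle over a websocket connection:

```java
import org.apache.tinkerpop.gremlin.structure.Vertex;
import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphFactory;

public class ListPropertyExample {
    public static void main(String[] args) {
        JanusGraph graph = JanusGraphFactory.open("inmemory");  // illustrative backend
        Vertex v = graph.addVertex("name", "sample");
        for (float value : new float[]{1.0f, 2.5f, 3.7f}) {
            // With a LIST-cardinality key, each call should append another element.
            v.property("vl_prop", value);
        }
        graph.tx().commit();
        graph.close();
    }
}
```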


Re: olap connection with spark standalone cluster

marc.d...@...
 

Hi Lilly,

This error says that there are somehow two versions of the TinkerPop jars in your project. If you use Maven, you can check this with the dependency plugin (for example "mvn dependency:tree").

If other problems appear, also be sure that the spark cluster is doing fine by running one of the examples from the spark distribution with spark-submit.

HTH,    Marc

On Tuesday, 15 October 2019 at 09:38:08 UTC+2, Lilly wrote:

Hi everyone,

I downloaded a fresh Spark binary release (spark-2.4.0-hadoop2.7) and set the master to spark://127.0.0.1:7077. I then started all services via $SPARK_HOME/sbin/start-all.sh.
I checked that spark works with the provided example programs.

I am further using the janusgraph-0.4.0-hadoop2 binary.

Now I configured the read-cassandra-3.properties as follows:
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.janusgraph.hadoop.formats.cassandra.Cassandra3InputFormat
gremlin.hadoop.graphWriter=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat
gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.inputLocation=none
gremlin.hadoop.outputLocation=output
gremlin.spark.persistContext=true
janusgraphmr.ioformat.conf.storage.backend=cassandra
janusgraphmr.ioformat.conf.storage.hostname=127.0.0.1
janusgraphmr.ioformat.conf.storage.port=9160
janusgraphmr.ioformat.conf.storage.cassandra.keyspace=janusgraph
cassandra.input.partitioner.class=org.apache.cassandra.dht.Murmur3Partitioner
spark.master=spark://127.0.0.1:7077
spark.executor.memory=8g
spark.executor.extraClassPath=/home/janusgraph-0.4.0-hadoop2/lib/*
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoRegistrator

where the janusgraph libraries are stored in /home/janusgraph-0.4.0-hadoop2/lib/*

In my java application I now tried
Graph graph = GraphFactory.open('...')
GraphTraversalSource g = graph.traversal().withComputer(SparkGraphComputer.class);
and then g.V().count().next()
I get the error message:
ERROR org.apache.spark.scheduler.TaskSetManager - Task 3 in stage 0.0 failed 4 times; aborting job
Exception in thread "main" java.lang.IllegalStateException: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 0.0 failed 4 times, most recent failure: Lost task 3.3 in stage 0.0 (TID 15, 192.168.178.32, executor 0): java.io.InvalidClassException: org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal; local class incompatible: stream classdesc serialVersionUID = -3191185630641472442, local class serialVersionUID = 6523257080464450267

Any ideas as to what might be the problem?
Thanks!
Lilly



How roll back works in janus graph, will it roll back the storage write in one transaction

Lighter <yangch...@...>
 

Hi, for the sample code below: the storage backend is HBase and "name" is indexed, so there are at least two row updates. What if the index update succeeds while the vertex update fails (throws an exception)? When we call rollback, will it roll back the index write to storage?

try {

    user = graph.addVertex()
    user.property("name", name)
    graph.tx().commit()
} catch (Exception e) {
    //Recover, retry,  or return error message
    println(e.getMessage())
    graph.tx().rollback()   // <------- Added line 
}


olap connection with spark standalone cluster

Lilly <lfie...@...>
 

Hi everyone,

I downloaded a fresh Spark binary release (spark-2.4.0-hadoop2.7) and set the master to spark://127.0.0.1:7077. I then started all services via $SPARK_HOME/sbin/start-all.sh.
I checked that spark works with the provided example programs.

I am further using the janusgraph-0.4.0-hadoop2 binary.

Now I configured the read-cassandra-3.properties as follows:
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.janusgraph.hadoop.formats.cassandra.Cassandra3InputFormat
gremlin.hadoop.graphWriter=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat
gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.inputLocation=none
gremlin.hadoop.outputLocation=output
gremlin.spark.persistContext=true
janusgraphmr.ioformat.conf.storage.backend=cassandra
janusgraphmr.ioformat.conf.storage.hostname=127.0.0.1
janusgraphmr.ioformat.conf.storage.port=9160
janusgraphmr.ioformat.conf.storage.cassandra.keyspace=janusgraph
cassandra.input.partitioner.class=org.apache.cassandra.dht.Murmur3Partitioner
spark.master=spark://127.0.0.1:7077
spark.executor.memory=8g
spark.executor.extraClassPath=/home/janusgraph-0.4.0-hadoop2/lib/*
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoRegistrator

where the janusgraph libraries are stored in /home/janusgraph-0.4.0-hadoop2/lib/*

In my java application I now tried
Graph graph = GraphFactory.open('...')
GraphTraversalSource g = graph.traversal().withComputer(SparkGraphComputer.class);
and then g.V().count().next()
I get the error message:
ERROR org.apache.spark.scheduler.TaskSetManager - Task 3 in stage 0.0 failed 4 times; aborting job
Exception in thread "main" java.lang.IllegalStateException: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 0.0 failed 4 times, most recent failure: Lost task 3.3 in stage 0.0 (TID 15, 192.168.178.32, executor 0): java.io.InvalidClassException: org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal; local class incompatible: stream classdesc serialVersionUID = -3191185630641472442, local class serialVersionUID = 6523257080464450267

Any ideas as to what might be the problem?
Thanks!
Lilly



New committer: Dmitry Kovalev

"Florian Hockmann" <f...@...>
 

On behalf of the JanusGraph Technical Steering Committee (TSC), I'm pleased to welcome a new committer on the project!

Dmitry Kovalev made a major contribution with the production-ready in-memory backend. He was quite responsive and patient during the review process, and he also contributed to development decisions.


Congratulations, Dmitry!


Re: [QUESTION] Usage of the cassandraembedded

Lilly <lfie...@...>
 

I have now experimented with many different settings for the cql connection and timed how long it took.
My observation is the following:
- Embedded with bulk loading took 16 min
- CQL without bulk loading is extremely slow (> 2 h)
- CQL with bulk loading (same settings as for embedded for parameters: storage.batch.loading, ids.block.size, ids.renew.timeout, cache.db-cache, cache.db-cache-clean-wait, cache.db-cache-time, cache.db-cache-size) took 27 min and took up considerable amounts of my RAM (not the case for embedded mode).
- CQL as above but with additionally storage.cql.batch-statement-size = 500 and storage.batch-loading = true took 24 min and not quite as much RAM.

I honestly do not know what else might be the issue.

On Wednesday, 9 October 2019 at 08:17:13 UTC+2, fa...@... wrote:

For "violation of unique key"  it could be the case that cql checks id's to be unique (JanusGraph could run out of id's in the batch loading mode) but i'm not sure what the embedded backend is doing.


I never used the batch loading mode, see also here: https://docs.janusgraph.org/advanced-topics/bulk-loading/.


On Tuesday, 8 October 2019 at 17:50:23 UTC+2, Lilly wrote:
Hi Jan,

So I tried it again. First of all, I remembered, that for cql I need to commit after each step. Otherwise, I get "violation of unique key" errors, even though I am actually not. Is this supposed to be the case (having to commit each time)?
Now on doing the commit after each function call, I found that with the adaption in the properties configuration (see last reply) it is really super slow. If I use the "default" configuration for cql, it is a bit faster but still much slower than in the embedded case.

I also tried it with another graph  which I persisted like this:
public void persist(Map<Integer, Map<String,Object>> nodes, Map<Integer,Integer> edges, Map<Integer,Map<String,String>> names) {
g = graph.traversal();

int counter = 0;
for(Map.Entry<Integer, Map<String,Object>> e: nodes.entrySet()) {


Vertex v = g.addV().property("taxId",e.getKey()).
property("rank",e.getValue().get("rank")).
property("divId",e.getValue().get("divId")).
property("genId",e.getValue().get("genId")).next();
g.tx().commit();
Map<String,String> n = names.get(e.getKey());
if(n != null) {
for(Map.Entry<String,String> vals: n.entrySet()) {
g.V(v).property(vals.getKey(),vals.getValue()).iterate();
g.tx().commit();
}
}

if(counter % BULK_CHOP_SIZE == 0) {

System.out.println(counter);
}
counter++;

}


counter = 0;
for(Map.Entry<Integer,Integer> e: edges.entrySet()) {
g.V().has("taxId",e.getKey()).as("v1").V().
has("taxId",e.getValue()).as("v2").
addE("has_parent").from("v1").to("v2").iterate();
g.tx().commit();
if(counter % BULK_CHOP_SIZE == 0) {

System.out.println(counter);
}
counter++;
}

g.V().has("taxId",1).as("v").outE().filter(__.inV().where(P.eq("v"))).drop().iterate();
g.tx().commit();
System.out.println("Done with persistence");
}

And had the same problem in either case.

I am probably using the cql backend wrong somehow and would appreciate any help on what else to do!
Thanks,
Lilly

On Tuesday, 8 October 2019 at 09:05:56 UTC+2, Lilly wrote:
Hi Jan,
Ok then I probably screwed up somewhere. I kind of thought this was to be expected, which is why I did not check it more thoroughly.
Maybe the way I persisted is not working well for cql.
I will try to create a test scenario where I do not have to persist all my data and see how it performs with cql again.

In principle, what I do is call this function :
public void updateEdges(String kmer, int pos, boolean strand, int record, List<SequenceParser.Feature> features){

if(features == null) {
features = Arrays.asList();
}

g.withSideEffect("features",features)
.V().has("prefix", kmer.substring(0,kmer.length()-1)).fold().coalesce(__.unfold(),
__.addV("prefix_node").property("prefix",kmer.substring(0,kmer.length()-1)) ).as("v1").
coalesce(__.V().has("prefix", kmer.substring(1,kmer.length())),
__.addV("prefix_node").property("prefix",kmer.substring(1,kmer.length())) ).as("v2").
sideEffect(__.choose(__.select("features").unfold().count().is(P.eq(0)),
__.addE("suffix_edge").property("record",record).
property("strand",strand).property("pos",pos).from("v1").to("v2")).
select("features").unfold().
addE("suffix_edge").property("record",record).property("strand",strand).property("pos",pos)
.property(__.map(t -> ((SequenceParser.Feature)t.get()).category),
__.map(t -> ((SequenceParser.Feature)t.get()).feature)).from("v1").to("v2")).
iterate();

}
and roughly every 50000 calls I do a commit. As a side remark, all of the above properties have indices. And Feature is a simple class with two attributes, category and feature.

Also I adapted the configuration file in the following way:
storage.batch-loading = true
ids.block-size = 100000
ids.authority.wait-time = 2000 ms
ids.renew-timeout = 1000000 ms

I tried the same with cql and embedded.

I will get back to you once I have tested it once again. But maybe you already spot an issue?
Thanks
Lilly
On Monday, 7 October 2019 at 20:14:29 UTC+2, fa...@... wrote:
We don't see this problem on persistence.
It would be good to know what takes longer. Would you like to give some more information?

Jan



Re: index not used for query

Anatoly Belikov <awbe...@...>
 

index.search.backend=elasticsearch
index.search.hostname=127.0.0.1
index.search.elasticsearch.client-only=true

Do you think it is due to Elasticsearch?


On Wednesday, 2 October 2019 14:06:01 UTC+3, arnab kumar pan wrote:
I am facing the same issue while creating a mixed index. Can you share your Elasticsearch configuration?

On Tuesday, September 24, 2019 at 7:26:43 PM UTC+5:30, aw...@... wrote:
Hello

I have made an index for the vertex property "id"; the index is enabled, but it is still not used for the query according to the profiler. Please give me advice on how to make the index work.

gremlin> vindex = mgmt.getGraphIndex("byId")
gremlin> vindex.fieldKeys
==>id

gremlin> mgmt.awaitGraphIndexStatus(graph, vindex.name()).status(SchemaStatus.ENABLED).call()
==>GraphIndexStatusReport[success=true, indexName='byId', targetStatus=[ENABLED], notConverged={}, converged={id=ENABLED}, elapsed=PT0.001S]

gremlin> g.V().has('id', '-9032656531829342390').profile()
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
JanusGraphStep([],[id.eq(-9032656531829342390)])                       1           1        2230.851   100.00
    \_condition=(id = -9032656531829342390)
    \_isFitted=false
    \_query=[]
    \_orders=[]
    \_isOrdered=true
  optimization                                                                                     0.005
  optimization                                                                                     0.026
  scan                                                                                             0.000
    \_condition=VERTEX
    \_query=[]
    \_fullscan=true
                                            >TOTAL                     -           -        2230.851        -
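For reference, a minimal sketch of how a composite index like this is usually defined with the management API. The names mirror the post; this only illustrates the common pattern and is not a diagnosis of why the profiler reports a full scan here:

```java
import org.apache.tinkerpop.gremlin.structure.Vertex;
import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.PropertyKey;
import org.janusgraph.core.schema.JanusGraphManagement;

public class ByIdIndex {
    // Defines a composite index "byId" over the property key "id" (String-typed here).
    static void defineIndex(JanusGraph graph) {
        JanusGraphManagement mgmt = graph.openManagement();
        PropertyKey idKey = (mgmt.getPropertyKey("id") != null)
                ? mgmt.getPropertyKey("id")
                : mgmt.makePropertyKey("id").dataType(String.class).make();
        mgmt.buildIndex("byId", Vertex.class).addKey(idKey).buildCompositeIndex();
        mgmt.commit();
        // A composite index serves exact-equality lookups, and only when the queried
        // value's type matches the key's dataType.
    }
}
```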



Re: [QUESTION] Usage of the cassandraembedded

faro...@...
 

For "violation of unique key"  it could be the case that cql checks id's to be unique (JanusGraph could run out of id's in the batch loading mode) but i'm not sure what the embedded backend is doing.


I never used the batch loading mode, see also here: https://docs.janusgraph.org/advanced-topics/bulk-loading/.


On Tuesday, 8 October 2019 at 17:50:23 UTC+2, Lilly wrote:

Hi Jan,

So I tried it again. First of all, I remembered, that for cql I need to commit after each step. Otherwise, I get "violation of unique key" errors, even though I am actually not. Is this supposed to be the case (having to commit each time)?
Now on doing the commit after each function call, I found that with the adaption in the properties configuration (see last reply) it is really super slow. If I use the "default" configuration for cql, it is a bit faster but still much slower than in the embedded case.

I also tried it with another graph  which I persisted like this:
public void persist(Map<Integer, Map<String,Object>> nodes, Map<Integer,Integer> edges, Map<Integer,Map<String,String>> names) {
g = graph.traversal();

int counter = 0;
for(Map.Entry<Integer, Map<String,Object>> e: nodes.entrySet()) {


Vertex v = g.addV().property("taxId",e.getKey()).
property("rank",e.getValue().get("rank")).
property("divId",e.getValue().get("divId")).
property("genId",e.getValue().get("genId")).next();
g.tx().commit();
Map<String,String> n = names.get(e.getKey());
if(n != null) {
for(Map.Entry<String,String> vals: n.entrySet()) {
g.V(v).property(vals.getKey(),vals.getValue()).iterate();
g.tx().commit();
}
}

if(counter % BULK_CHOP_SIZE == 0) {

System.out.println(counter);
}
counter++;

}


counter = 0;
for(Map.Entry<Integer,Integer> e: edges.entrySet()) {
g.V().has("taxId",e.getKey()).as("v1").V().
has("taxId",e.getValue()).as("v2").
addE("has_parent").from("v1").to("v2").iterate();
g.tx().commit();
if(counter % BULK_CHOP_SIZE == 0) {

System.out.println(counter);
}
counter++;
}

g.V().has("taxId",1).as("v").outE().filter(__.inV().where(P.eq("v"))).drop().iterate();
g.tx().commit();
System.out.println("Done with persistence");
}

And had the same problem in either case.

I am probably using the cql backend wrong somehow and would appreciate any help on what else to do!
Thanks,
Lilly

On Tuesday, 8 October 2019 at 09:05:56 UTC+2, Lilly wrote:
Hi Jan,
Ok then I probably screwed up somewhere. I kind of thought this was to be expected, which is why I did not check it more thoroughly.
Maybe the way I persisted is not working well for cql.
I will try to create a test scenario where I do not have to persist all my data and see how it performs with cql again.

In principle, what I do is call this function :
public void updateEdges(String kmer, int pos, boolean strand, int record, List<SequenceParser.Feature> features){

if(features == null) {
features = Arrays.asList();
}

g.withSideEffect("features",features)
.V().has("prefix", kmer.substring(0,kmer.length()-1)).fold().coalesce(__.unfold(),
__.addV("prefix_node").property("prefix",kmer.substring(0,kmer.length()-1)) ).as("v1").
coalesce(__.V().has("prefix", kmer.substring(1,kmer.length())),
__.addV("prefix_node").property("prefix",kmer.substring(1,kmer.length())) ).as("v2").
sideEffect(__.choose(__.select("features").unfold().count().is(P.eq(0)),
__.addE("suffix_edge").property("record",record).
property("strand",strand).property("pos",pos).from("v1").to("v2")).
select("features").unfold().
addE("suffix_edge").property("record",record).property("strand",strand).property("pos",pos)
.property(__.map(t -> ((SequenceParser.Feature)t.get()).category),
__.map(t -> ((SequenceParser.Feature)t.get()).feature)).from("v1").to("v2")).
iterate();

}
and roughly every 50000 calls I do a commit. As a side remark, all of the above properties have indices. And Feature is a simple class with two attributes, category and feature.

Also I adapted the configuration file in the following way:
storage.batch-loading = true
ids.block-size = 100000
ids.authority.wait-time = 2000 ms
ids.renew-timeout = 1000000 ms

I tried the same with cql and embedded.

I will get back to you once I have tested it once again. But maybe you already spot an issue?
Thanks
Lilly
On Monday, 7 October 2019 at 20:14:29 UTC+2, fa...@... wrote:
We don't see this problem on persistence.
It would be good to know what takes longer. Would you like to give some more information?

Jan

