Re: [QUESTION] Usage of the cassandraembedded


Lilly <lfie...@...>
 

Hi Jan,

So I tried it again. First of all, I remembered that for cql I need to commit after each step. Otherwise, I get "violation of unique key" errors, even though I am not actually violating any. Is this supposed to be the case (having to commit each time)?
Now, committing after each function call, I found that with the adapted properties configuration (see last reply) it is really super slow. If I use the "default" configuration for cql, it is a bit faster but still much slower than in the embedded case.

I also tried it with another graph, which I persisted like this:
public void persist(Map<Integer, Map<String,Object>> nodes, Map<Integer,Integer> edges, Map<Integer,Map<String,String>> names) {
    g = graph.traversal();

    int counter = 0;
    for(Map.Entry<Integer, Map<String,Object>> e: nodes.entrySet()) {

        Vertex v = g.addV().property("taxId",e.getKey()).
                property("rank",e.getValue().get("rank")).
                property("divId",e.getValue().get("divId")).
                property("genId",e.getValue().get("genId")).next();
        g.tx().commit();
        Map<String,String> n = names.get(e.getKey());
        if(n != null) {
            for(Map.Entry<String,String> vals: n.entrySet()) {
                g.V(v).property(vals.getKey(),vals.getValue()).iterate();
                g.tx().commit();
            }
        }

        if(counter % BULK_CHOP_SIZE == 0) {
            System.out.println(counter);
        }
        counter++;
    }

    counter = 0;
    for(Map.Entry<Integer,Integer> e: edges.entrySet()) {
        g.V().has("taxId",e.getKey()).as("v1").V().
                has("taxId",e.getValue()).as("v2").
                addE("has_parent").from("v1").to("v2").iterate();
        g.tx().commit();
        if(counter % BULK_CHOP_SIZE == 0) {
            System.out.println(counter);
        }
        counter++;
    }

    g.V().has("taxId",1).as("v").outE().filter(__.inV().where(P.eq("v"))).drop().iterate();
    g.tx().commit();
    System.out.println("Done with persistence");
}

And had the same problem in either case.

I am probably using the cql backend wrong somehow and would appreciate any help on what else to do!
Thanks,
Lilly

On Tuesday, 8 October 2019 09:05:56 UTC+2, Lilly wrote:

Hi Jan,
Ok then I probably screwed up somewhere. I kind of thought this was to be expected, which is why I did not check it more thoroughly.
Maybe the way I persisted is not working well for cql.
I will try to create a test scenario where I do not have to persist all my data and see how it performs with cql again.

In principle, what I do is call this function:
public void updateEdges(String kmer, int pos, boolean strand, int record, List<SequenceParser.Feature> features){

    if(features == null) {
        features = Arrays.asList();
    }

    g.withSideEffect("features",features)
        .V().has("prefix", kmer.substring(0,kmer.length()-1)).fold().coalesce(__.unfold(),
            __.addV("prefix_node").property("prefix",kmer.substring(0,kmer.length()-1)) ).as("v1").
        coalesce(__.V().has("prefix", kmer.substring(1,kmer.length())),
            __.addV("prefix_node").property("prefix",kmer.substring(1,kmer.length())) ).as("v2").
        sideEffect(__.choose(__.select("features").unfold().count().is(P.eq(0)),
                __.addE("suffix_edge").property("record",record).
                    property("strand",strand).property("pos",pos).from("v1").to("v2")).
            select("features").unfold().
            addE("suffix_edge").property("record",record).property("strand",strand).property("pos",pos)
            .property(__.map(t -> ((SequenceParser.Feature)t.get()).category),
                __.map(t -> ((SequenceParser.Feature)t.get()).feature)).from("v1").to("v2")).
        iterate();

}
and I commit roughly every 50,000 calls. As a side remark, all of the above properties are indexed. Feature is a simple class with two attributes, category and feature.
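
The commit-every-N pattern described above can be sketched as follows. This is a minimal illustration kept free of JanusGraph dependencies so that it runs on its own. BatchCommitter and BatchCommitDemo are hypothetical names, not part of any library. In the setting of this thread the callback would presumably be `() -> g.tx().commit()`; that is an assumption and is not exercised here.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: runs a commit callback once per batchSize recorded writes. */
final class BatchCommitter {
    private final int batchSize;
    // Assumption: with a graph traversal source g this would be () -> g.tx().commit().
    private final Runnable commit;
    private int pending = 0;

    BatchCommitter(int batchSize, Runnable commit) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.commit = commit;
    }

    /** Call once after each write; triggers a commit when the batch is full. */
    void recordWrite() {
        pending++;
        if (pending >= batchSize) {
            flush();
        }
    }

    /** Commits any writes recorded since the last commit; a no-op if there are none. */
    void flush() {
        if (pending > 0) {
            commit.run();
            pending = 0;
        }
    }
}

class BatchCommitDemo {
    public static void main(String[] args) {
        // Stand-in for a real transaction: only counts how often commit is invoked.
        AtomicInteger commits = new AtomicInteger();
        BatchCommitter committer = new BatchCommitter(50_000, commits::incrementAndGet);
        for (int i = 0; i < 120_000; i++) {
            committer.recordWrite(); // a real loader would issue one write here
        }
        committer.flush(); // commit the final, partially filled batch
        System.out.println("commits: " + commits.get());
    }
}
```

The final flush() matters: without it, the last partially filled batch would never be committed. Whether batching like this is compatible with the unique-key behaviour on cql discussed in this thread is an open question here, not something the sketch settles.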

Also I adapted the configuration file in the following way:
storage.batch-loading = true
ids.block-size = 100000
ids.authority.wait-time = 2000 ms
ids.renew-timeout = 1000000 ms

I tried the same with cql and embedded.

I will get back to you once I have tested it again. But maybe you can already spot an issue?
Thanks
Lilly
On Monday, 7 October 2019 20:14:29 UTC+2, fa...@... wrote:
We don't see this problem on persistence.
It would be good to know what takes longer. Would you like to give some more information?
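
One way to find out which part takes longer is to accumulate wall-clock time per phase (for example traversal execution versus commit). The sketch below is only an illustration: PhaseTimer is a hypothetical helper, not part of JanusGraph or TinkerPop, and the phase names used with it are arbitrary labels.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: sums elapsed wall-clock time per named phase. */
final class PhaseTimer {
    // LinkedHashMap keeps phases in the order they were first timed.
    private final Map<String, Long> nanos = new LinkedHashMap<>();

    /** Runs the work and adds its duration to the given phase, even if it throws. */
    void time(String phase, Runnable work) {
        long start = System.nanoTime();
        try {
            work.run();
        } finally {
            nanos.merge(phase, System.nanoTime() - start, Long::sum);
        }
    }

    /** Total nanoseconds recorded for a phase, or 0 if it was never timed. */
    long totalNanos(String phase) {
        return nanos.getOrDefault(phase, 0L);
    }

    /** One line per phase with the accumulated time in milliseconds. */
    String report() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> e : nanos.entrySet()) {
            sb.append(String.format("%-10s %d ms%n", e.getKey(), e.getValue() / 1_000_000));
        }
        return sb.toString();
    }
}
```

Wrapping the write traversal and the commit in separate time(...) calls, e.g. timer.time("commit", () -> g.tx().commit()), and printing timer.report() at the end would show how the total splits between the two for each backend.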

Jan

