Re: [ANNOUNCE] JanusGraph 0.6.0 Release

schwartz@...
 

Nice!! Gonna give this a spin in a few days


Re: [ANNOUNCE] JanusGraph 0.6.0 Release

Oleksandr Porunov
 

Jan Jansen is working on upgrading the docker image to 0.6.0 in this PR: https://github.com/JanusGraph/janusgraph-docker/pull/91
I believe the Docker image should be available soon.


Re: [ANNOUNCE] JanusGraph 0.6.0 Release

schwartz@...
 

This is great! Was looking forward to this. 

Any ETA for the docker image?

Thanks a lot,
Assaf


[ANNOUNCE] JanusGraph 0.6.0 Release

Oleksandr Porunov
 

The JanusGraph Technical Steering Committee is excited to announce the release of JanusGraph 0.6.0.

JanusGraph is an Apache TinkerPop enabled property graph database with support for a variety of storage and indexing backends. Thank you to all of the contributors.

Notable new features in this release include:
  • Upgraded to TinkerPop 3.5.1
  • Java 11 support
  • Spark 3 support
  • Added mixed index usage for count and has("propertyKey") queries
  • Optimized adjacency checks with a unique index
  • Optimized the index selection algorithms and added the possibility to configure them
  • Improved index repair jobs
  • General index construction optimizations
  • Optimized the LevenshteinDistance computation used in Fuzzy predicates
  • Updated the DataStax Cassandra driver to version 4.13.0
  • Updated Lucene / Solr to 8.9.0
  • Improved metrics collection
  • Many general optimizations in core
  • GraphBinary serialization format support
  • Added a new schema maker and improved the previous schema makers
  • Added a DataStax request logger
  • Replaced GremlinServer with JanusGraphServer
  • Added a gRPC server to janusgraph-server for basic schema management
  • Transaction improvements
  • Improved the in-memory storage backend
  • Added support for Amazon Managed Keyspaces
  • Enhanced profiling
  • Added many new configurations to better control storage and index backends
  • Added a configuration to use the barrier size as the batch size limit
  • Added a CacheVertex::refresh method to allow clearing the vertex cache
  • Added negations to all text predicates (see the sketch after this list)
  • Added an exists clause to negated Text predicates
  • Made the ExecutorService configurable for parallel backend queries and the CQL Store Manager
  • Made CQL executor service usage optional
  • MapReduceIndexManagement now accepts an optional custom Hadoop config
  • Added multi-query and pre-fetch options to the transaction builder
  • Added the possibility to configure the internal Cassandra driver
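As an example of the new negated text predicates, a quick Gremlin Console sketch (hypothetical data; Text is org.janusgraph.core.attribute.Text, pre-imported in the JanusGraph console):

    // Sketch: one of the new negated text predicates (hypothetical data)
    g.V().has('name', Text.textNotContains('marko')).toList()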
The release artifacts can be found at this location:
    https://github.com/JanusGraph/janusgraph/releases/tag/v0.6.0

A full binary distribution is provided for user convenience:
        https://github.com/JanusGraph/janusgraph/releases/download/v0.6.0/janusgraph-full-0.6.0.zip
 
A truncated binary distribution is provided:
        https://github.com/JanusGraph/janusgraph/releases/download/v0.6.0/janusgraph-0.6.0.zip

The online docs can be found here:
    https://docs.janusgraph.org
 
To view the resolved issues and commits, check the milestone here:
    https://github.com/JanusGraph/janusgraph/milestone/17?closed=1

Thank you very much,
Oleksandr Porunov


Re: Removing a vertex is not removing recently added properties in different transaction

hadoopmarc@...
 

Hi Priyanka,

The case you describe sounds suspect and might be a JanusGraph issue. Your last remark ("If I add some delay between the two operations then the vertices are removed correctly.") gives an important clue as to what is going on.

A few additional questions:
  • Do you have the JanusGraph database cache disabled? (This is the default setting for JanusGraph 0.5+.)
  • See the tunability section of https://hbase.apache.org/acid-semantics.html. Did you enable any HBase client settings that impact the HBase visibility guarantees (https://hbase.apache.org/book.html#arch.timelineconsistent.reads )? Note that you may have a CLASSPATH that picks up hbase-site.xml configs from your cluster.
If this turns out to be a JanusGraph issue, is it possible for you to do the graph operations in a single transaction as a workaround?
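For a quick check from the Gremlin Console, something like this could work (a sketch; it assumes mgmt.get() can read this option):

    // Sketch: inspect the database cache setting of an open graph
    mgmt = graph.openManagement()
    println mgmt.get('cache.db-cache')   // "false" means the cache is disabled
    mgmt.rollback()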

Best wishes,   Marc


Re: Confused about GraphSON edges definition

Laura Morales <lauretas@...>
 

> People do not want to put effort into explaining graphSON because that is not the way to go.
May I ask why it is not the way to go, and what is the way instead?
I thought my problem was fairly easy: have a graph in a file, load the file. But GraphML is lossy, and GraphSON is not the way to go. What is left other than having to write my own Groovy scripts and use the TinkerPop API?


Re: Confused about GraphSON edges definition

hadoopmarc@...
 

Hi Laura,

https://tinkerpop.apache.org/javadocs/current/full/org/apache/tinkerpop/gremlin/structure/io/graphson/GraphSONReader.html

People do not want to put effort into explaining graphSON because that is not the way to go. As said above, you can just use a TinkerGraph with addV(), addEdge() and property() and export the graph to graphSON.
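A minimal Gremlin Console sketch of that approach (data and file name are just examples):

    // Build a small graph in TinkerGraph, then export it to graphSON
    graph = TinkerGraph.open()
    g = graph.traversal()
    v1 = g.addV('person').property('name', 'marko').next()
    v2 = g.addV('software').property('name', 'lop').next()
    g.addE('created').from(v1).to(v2).property('weight', 0.4d).iterate()
    graph.io(IoCore.graphson()).writeGraph('/tmp/example.json')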

Best wishes,   Marc


Looking for deeper understanding of the systemlog table.

jason.mccarthy@...
 

Hi all,

I'm hoping someone can help me understand something better. I'm curious about the size of the systemlog table for a number of our graphs. On our backend data store this is the only table which reports having large cells. On some nodes there are only a few of them, but on other nodes they number in the hundreds (the large cells, that is).

I have a few basic questions:
a) what is stored in this table?
b) what kind of maintenance can I safely perform on it from the backend, if any?
c) what might cause these large cells to show up in this table (and what could be done to avoid it)?

Thanks,
Jason


Re: Confused about GraphSON edges definition

Laura Morales <lauretas@...>
 

Hi,
I've asked my question over there (here's the thread https://groups.google.com/g/gremlin-users/c/_H3UZyfdvtE) and the possible solution seems to be to use readVertices() instead of read() or readGraph(). But I'm very confused and I'd really appreciate it if you guys could help me make sense of it. I haven't used Gremlin, Groovy, or Janus before, so I'm basically relying on the Janus documentation, but I cannot find any examples for this.
How can I load a GraphSON file using readVertices()?
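From the javadocs, I would guess something along these lines, but I'm not sure it's correct (untested sketch; the attach methods and file name are my assumptions):

    // Sketch: read vertices one by one and attach them to the target graph
    reader = GraphSONReader.build().create()
    new FileInputStream('/tmp/example.json').withCloseable { stream ->
        reader.readVertices(stream,
            Attachable.Method.getOrCreate(graph),   // vertex attach: get or create
            Attachable.Method.getOrCreate(graph),   // edge attach: get or create
            Direction.OUT
        ).forEachRemaining { }   // consume the iterator so everything is read
    }
    graph.tx().commit()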




Sent: Thursday, September 02, 2021 at 8:07 AM
From: hadoopmarc@...
To: janusgraph-users@...
Subject: Re: [janusgraph-users] Confused about GraphSON edges definition
Hi Laura,

If you want to know, you would do better to ask on the TinkerPop users list. Note that graphSON is not designed as a human-readable or standardized interchange format, but rather as an interchange format between TinkerPop-compatible processes. If you want to create or modify a graphSON file, it is easier to instantiate a TinkerGraph and use the TinkerPop API.

Best wishes,   Marc


Re: Removing a vertex is not removing recently added properties in different transaction

Priyanka Jindal
 

Please find my answers inline:

  • do you use CompositeIndex or MixedIndex?
- I am using a composite index
  • is it certain that the two transactions do not overlap in time (as "next" suggests)?
- They do not overlap in time.
  • do the two transactions occur in the same janusgraph instance?
- Yes they do
  • is hbase configured as a single host or as a cluster?
- It's a cluster.

If I add some delay between the two operations then the vertices are removed correctly.


Re: Confused about GraphSON edges definition

hadoopmarc@...
 

Hi Laura,

If you want to know, you would do better to ask on the TinkerPop users list. Note that graphSON is not designed as a human-readable or standardized interchange format, but rather as an interchange format between TinkerPop-compatible processes. If you want to create or modify a graphSON file, it is easier to instantiate a TinkerGraph and use the TinkerPop API.

Best wishes,   Marc


Re: CQL scaling limit?

hadoopmarc@...
 

Nice work!


Confused about GraphSON edges definition

Laura Morales <lauretas@...>
 

I'm looking at this example from TinkerPop https://tinkerpop.apache.org/docs/current/dev/io/#graphson

{"id":{"@type":"g:Int32","@value":1},"label":"person","outE":{"created":[{"id":{"@type":"g:Int32","@value":9},"inV":{"@type":"g:Int32","@value":3},"properties":{"weight":{"@type":"g:Double","@value":0.4}}}],"knows":[{"id":{"@type":"g:Int32","@value":7},"inV":{"@type":"g:Int32","@value":2},"properties":{"weight":{"@type":"g:Double","@value":0.5}}},{"id":{"@type":"g:Int32","@value":8},"inV":{"@type":"g:Int32","@value":4},"properties":{"weight":{"@type":"g:Double","@value":1.0}}}]},"properties":{"name":[{"id":{"@type":"g:Int64","@value":0},"value":"marko"}],"age":[{"id":{"@type":"g:Int64","@value":1},"value":{"@type":"g:Int32","@value":29}}]}}
{"id":{"@type":"g:Int32","@value":2},"label":"person","inE":{"knows":[{"id":{"@type":"g:Int32","@value":7},"outV":{"@type":"g:Int32","@value":1},"properties":{"weight":{"@type":"g:Double","@value":0.5}}}]},"properties":{"name":[{"id":{"@type":"g:Int64","@value":2},"value":"vadas"}],"age":[{"id":{"@type":"g:Int64","@value":3},"value":{"@type":"g:Int32","@value":27}}]}}
{"id":{"@type":"g:Int32","@value":3},"label":"software","inE":{"created":[{"id":{"@type":"g:Int32","@value":9},"outV":{"@type":"g:Int32","@value":1},"properties":{"weight":{"@type":"g:Double","@value":0.4}}},{"id":{"@type":"g:Int32","@value":11},"outV":{"@type":"g:Int32","@value":4},"properties":{"weight":{"@type":"g:Double","@value":0.4}}},{"id":{"@type":"g:Int32","@value":12},"outV":{"@type":"g:Int32","@value":6},"properties":{"weight":{"@type":"g:Double","@value":0.2}}}]},"properties":{"name":[{"id":{"@type":"g:Int64","@value":4},"value":"lop"}],"lang":[{"id":{"@type":"g:Int64","@value":5},"value":"java"}]}}
{"id":{"@type":"g:Int32","@value":4},"label":"person","inE":{"knows":[{"id":{"@type":"g:Int32","@value":8},"outV":{"@type":"g:Int32","@value":1},"properties":{"weight":{"@type":"g:Double","@value":1.0}}}]},"outE":{"created":[{"id":{"@type":"g:Int32","@value":10},"inV":{"@type":"g:Int32","@value":5},"properties":{"weight":{"@type":"g:Double","@value":1.0}}},{"id":{"@type":"g:Int32","@value":11},"inV":{"@type":"g:Int32","@value":3},"properties":{"weight":{"@type":"g:Double","@value":0.4}}}]},"properties":{"name":[{"id":{"@type":"g:Int64","@value":6},"value":"josh"}],"age":[{"id":{"@type":"g:Int64","@value":7},"value":{"@type":"g:Int32","@value":32}}]}}
{"id":{"@type":"g:Int32","@value":5},"label":"software","inE":{"created":[{"id":{"@type":"g:Int32","@value":10},"outV":{"@type":"g:Int32","@value":4},"properties":{"weight":{"@type":"g:Double","@value":1.0}}}]},"properties":{"name":[{"id":{"@type":"g:Int64","@value":8},"value":"ripple"}],"lang":[{"id":{"@type":"g:Int64","@value":9},"value":"java"}]}}
{"id":{"@type":"g:Int32","@value":6},"label":"person","outE":{"created":[{"id":{"@type":"g:Int32","@value":12},"inV":{"@type":"g:Int32","@value":3},"properties":{"weight":{"@type":"g:Double","@value":0.2}}}]},"properties":{"name":[{"id":{"@type":"g:Int64","@value":10},"value":"peter"}],"age":[{"id":{"@type":"g:Int64","@value":11},"value":{"@type":"g:Int32","@value":35}}]}}

I don't understand two things; can anyone help me understand them?

- why do I need an "outE" *and* an "inE" definition for the same edge? Why can't I just define one or the other? If I define both, the edge is created when importing the file; if I only use "outE", the edge is not created

- why is everything given an id, including edges and properties (for example "properties":{"name":[{"id":{"@type":"g:Int64","@value":0},"value":"marko"})? Removing all the "id" fields except for the vertex IDs seems to work fine


Re: Removing a vertex is not removing recently added properties in different transaction

hadoopmarc@...
 

The behavior you describe sounds like the behavior one experiences when transactions occur in parallel. So let us investigate a bit further:
  • do you use CompositeIndex or MixedIndex?
  • is it certain that the two transactions do not overlap in time (as "next" suggests)?
  • do the two transactions occur in the same janusgraph instance?
  • is hbase configured as a single host or as a cluster?

Marc


Re: Index stuck on INSTALLED (single instance of JanusGraph)

fredrick.eisele@...
 

It still does not work for me.

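// commit (and thereby close) every transaction still open on this instance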
graph.getOpenTransactions().forEach { tx -> tx.commit() }


Re: CQL scaling limit?

madams@...
 

Hi Marc,

I tried rerunning the scaling test on a fresh graph with ids.block-size=10000000; unfortunately, I haven't seen any performance gain.

I also tried ids.block-size=10000000 with ids.authority.conflict-avoidance-mode=GLOBAL_AUTO, but there was no performance gain there either.
I used GLOBAL_AUTO as it was the easiest to test. I ran the test twice to make sure the result was not just due to an unlucky random tag assignment. I didn't do the math, but I guess I would have to be very unlucky to get a very bad random tag allocation twice!
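For reference, the id-allocation settings for these runs looked roughly like this (a sketch; backend details elided):

    // Sketch of the id allocation settings used in these test runs
    graph = JanusGraphFactory.build().
        set('storage.backend', 'cql').
        set('storage.hostname', '...').                               // elided
        set('ids.block-size', 10000000).
        set('ids.authority.conflict-avoidance-mode', 'GLOBAL_AUTO').
        open()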

 

I tried something else which turned out to be very successful: instead of inserting all the properties in the graph, I tried inserting only the ones necessary to feed the composite indexes and vertex-centric indexes. The indexes are used to efficiently execute the "get element or create it" logic. This test scaled quite nicely, up to 64 indexers (instead of 4 before)!




Out of all the tests I tried so far, the two most successful ones were:

  1. decreasing the CQL consistency level (from QUORUM to ANY/ONE)
  2. decreasing the number of properties


What's interesting about these two cases is that they didn't significantly increase the performance of a single indexer; they really increased the horizontal scalability we could achieve.

My best guess as to why this is the case: they reduced the amount of work the ScyllaDB coordinators had to do, by:

  1. decreasing the amount of coordination necessary to get a majority answer (QUORUM); see the sketch below
  2. decreasing the size in bytes of the CQL unlogged batches; some of our properties can be quite big (> 1 KB)
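For point 1, the change amounts to the standard CQL consistency options (a sketch, not our exact test configuration):

    // Sketch: lowering the CQL consistency levels from the QUORUM default
    graph = JanusGraphFactory.build().
        set('storage.backend', 'cql').
        set('storage.cql.read-consistency-level', 'ONE').
        set('storage.cql.write-consistency-level', 'ANY').
        open()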

I would happily continue digging into this, but unfortunately other priorities have turned up. We're putting this testing aside for the moment.

I thought I would post my complete findings/guesses anyway, in case they are useful to someone.

 

Thank you so much for your help!
Cheers,
Marc


Removing a vertex is not removing recently added properties in different transaction

Priyanka Jindal
 

I am using the JanusGraph client with HBase as the storage backend.
In my case, I am using index ind1 to fetch vertices from the graph. Upon fetching, I add some properties (e.g., one such property is p1) to the vertices and commit the transaction.
In the next transaction, I fetch the vertices using index ind2, where one key in the index is the property (p1) added in the last transaction. I get the vertices and remove them. The vertices are reported to be removed successfully, but sometimes they are still present, with only the properties (p1) added in the previous transaction, although other properties/edges have been removed. This happens very intermittently.
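In code, the flow is roughly this (a sketch; property keys and values are placeholders):

    // Transaction 1: fetch vertices via index ind1, add property p1, commit
    tx1 = graph.newTransaction()
    tx1.traversal().V().has('k1', 'v1').property('p1', 'x').iterate()
    tx1.commit()

    // Transaction 2: fetch the same vertices via index ind2 (keyed on p1), remove them
    tx2 = graph.newTransaction()
    tx2.traversal().V().has('p1', 'x').drop().iterate()
    tx2.commit()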
It would be really helpful if someone who has an idea about this could explain it to me.


TTL for write-ahead logs not working

Radhika Kundam
 

Hi,

I enabled write-ahead logs to support index recovery for secondary persistence failures. I am trying to set a TTL for the write-ahead logs through the JanusGraphManagement setting "log.tx.ttl".
I tried the use case below:
1. Set the write-ahead log TTL to 10 min.
2. Created a few failed entries by bringing Solr (the index client) down.
3. Waited for more than the TTL time (even waited for 1 hr) and brought Solr up.
The expected behavior is that the failed entries should not be recovered, as the write-ahead log should be gone by then.
The actual behavior is that the failed entries are recovered successfully.
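For reference, the TTL is being set roughly like this (a sketch; the Duration value type for log.tx.ttl is my assumption):

    // Sketch: enable the write-ahead log and set its TTL via management
    mgmt = graph.openManagement()
    mgmt.set('tx.log-tx', true)                                 // write-ahead log on
    mgmt.set('log.tx.ttl', java.time.Duration.ofMinutes(10))    // TTL under test
    mgmt.commit()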

I triaged and was able to see that it updates "root.log.ttl" properly while creating the instance of KCVSLogManager for the tx log.
Please let me know if any additional configuration is required, or if my understanding of the expected behavior is not correct.

Thank you,
Radhika


Re: Not able to enable Write-ahead logs using tx.log-tx for existing JanusGraph setup

Radhika Kundam
 

Thank you, Boxuan, for the confirmation. I have used the same approach of reopening the graph for now.

It would be good if "logTransactions" could be refreshed when tx.log-tx is updated, via a setter method, without reopening the graph.
As per my understanding, reopening the graph is required only for this tx.log-tx management setting (and not for any other management settings), because this property must be reflected in logTransactions.
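For reference, the workaround amounts to this (a sketch; the properties file path is an example):

    // Reopen the graph so logTransactions picks up the new tx.log-tx value
    graph.close()
    graph = JanusGraphFactory.open('conf/janusgraph-hbase.properties')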
