Re: JanusGraph Best Practice to Store the Data


Boxuan Li
 

Hi,

There are a few factors you might want to consider:

1. Increased memory usage in both the transaction-level cache and the database-level cache.
2. Cassandra does not handle large column values well. 100-500 KB is far below the hard limit, but some say that values of this size can still lead to performance issues (disclaimer: I’ve never tried it myself).
3. Serialization and deserialization cost. To reduce storage and network overhead, JanusGraph encodes and compresses your string value (see StringSerializer). That being said, I believe this overhead should (usually) still be much smaller than the cost of an additional network call (if you store docValue somewhere else).
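
If you do want to try the "store docValue somewhere else" option, a rough sketch could look like the one below. It uses only the Java standard library, and several things in it are assumptions for illustration: the in-memory map stands in for whatever external store you would actually use, the "docRef" property name is made up, and deriving the key from a SHA-256 hash of the content is just one possible choice.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

final class DocRefSketch {
    // Hypothetical stand-in for an external document store (object store, separate table, ...).
    static final Map<String, byte[]> externalStore = new HashMap<>();

    // Derives a stable reference key from the document content (hex-encoded SHA-256).
    static String refFor(byte[] docBytes) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(docBytes)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Saves the document externally and returns the small property set for the vertex.
    static Map<String, Object> toVertexProperties(String docId, String docValue) {
        byte[] bytes = docValue.getBytes(StandardCharsets.UTF_8);
        String ref = refFor(bytes);
        externalStore.put(ref, bytes);

        Map<String, Object> props = new LinkedHashMap<>();
        props.put("docId", docId);
        props.put("docRef", ref);           // assumed property name, replaces docValue
        props.put("docSize", bytes.length); // size in bytes of the UTF-8 encoded document
        return props;
    }

    public static void main(String[] args) {
        String json = "{\"name\":\"segment-1\"}";
        Map<String, Object> props = toVertexProperties("doc-1", json);
        System.out.println(props);
    }
}
```

The vertex would then carry only docId, docRef and docSize, and reading the full document costs one extra lookup in the external store, which is the additional network call mentioned in point 3.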

The best option depends on your use case and your testing, of course.

Best,
Boxuan


On Mar 9, 2022, at 8:22 AM, kaintharinder@... wrote:

[Edited Message Follows]

Hi Team,

We are running JanusGraph with Cassandra as the storage backend and writing data through Gremlin commands from the Java API.
We are thinking of saving the full JSON document in the graph alongside the relationships.

The Gremlin query looks like this:
g.addV('Segment').property("docId",docId).property("docValue",docValue).property("docSize",docSize)

The "docValue" value is a JSON document and will be large, in the range of 100-500 KB.
We would like to understand whether it is good practice to save full documents in the graph, or whether we should store only references.
