Backend data model deserialization
At Zeotap we've taken the same route to enable OLAP consumers via Apache Spark. We presented it at the recent JanusGraph meetup: https://lists.lfaidata.foundation/g/janusgraph-users/topic/janusgraph_meetup_4/82939376. We are using ScyllaDB as the backend.
Hi Elliot,

I am not aware of existing utilities for deserialization, but as Marc has suggested, you might want to see if there are old Titan resources regarding it, since the data model hasn't changed since the Titan -> JanusGraph migration.

If you want to resort to the source code, you could check out EdgeSerializer and IndexSerializer. Here is a simple code snippet demonstrating how to deserialize an edge:

    // I retrieved these hex strings from the Cassandra cqlsh console
    final Entry colVal = StaticArrayEntry.of(
        StaticArrayBuffer.of(Bytes.fromHexString("0x70a0802140803800")),
        StaticArrayBuffer.of(Bytes.fromHexString("0x0180a076616c75e5")));
    final StandardSerializer serializer = new StandardSerializer();
    final EdgeSerializer edgeSerializer = new EdgeSerializer(serializer);
    // this is the deserialized edge (tx is an open JanusGraph transaction)
    RelationCache edgeCache = edgeSerializer.readRelation(colVal, false, (StandardJanusGraphTx) tx);

Bytes.fromHexString is a utility method provided by the DataStax Cassandra driver. You might use any other library/code to convert a hex string to bytes.

As you can see, there is no single easy-to-use API to deserialize raw data. If you end up creating one, I think it would be helpful if you could contribute back to the community.

Best regards,
Boxuan

On May 20, 2021, at 8:07 PM, hadoopmarc@... wrote:

Hi Elliot,
There should be some old Titan resources that describe how the data model is binary coded into the row keys and row values. Of course, it is also implicit from the JanusGraph source code.
If you look back at this week's OLAP presentations (https://lists.lfaidata.foundation/g/janusgraph-users/topic/janusgraph_meetup_4/82939376) you will see that one of the presenters did exactly what you propose: they exported rows from ScyllaDB and converted them to Gryo format for import into TinkerPop HadoopGraph. You might want to contact them to coordinate a possible contribution to the JanusGraph project.
Best wishes, Marc
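On Boxuan's remark that any library or code can stand in for Bytes.fromHexString: here is a minimal sketch of a dependency-free converter, assuming the input looks like what cqlsh prints (hex digits with an optional 0x prefix). The class name HexToBytes and its method are hypothetical helpers for illustration, not part of JanusGraph or the DataStax driver.

```java
public class HexToBytes {

    /**
     * Converts a hex string such as "0x70a0" to raw bytes.
     * The "0x" prefix is optional; the digit count must be even.
     * (Hypothetical helper, not a JanusGraph or driver API.)
     */
    public static byte[] fromHexString(String hex) {
        String digits = (hex.startsWith("0x") || hex.startsWith("0X"))
                ? hex.substring(2)
                : hex;
        if (digits.length() % 2 != 0) {
            throw new IllegalArgumentException("Odd number of hex digits: " + hex);
        }
        byte[] out = new byte[digits.length() / 2];
        for (int i = 0; i < out.length; i++) {
            // Character.digit returns -1 for anything that is not a hex digit
            int hi = Character.digit(digits.charAt(2 * i), 16);
            int lo = Character.digit(digits.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("Invalid hex digit in: " + hex);
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        // Column value taken from the snippet earlier in this thread
        byte[] column = fromHexString("0x70a0802140803800");
        System.out.println("Decoded " + column.length + " bytes");
    }
}
```

Where a ByteBuffer is expected, the result can be wrapped with java.nio.ByteBuffer.wrap(bytes).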
Hello,
Is there any supported way (e.g. a class/API) for deserializing raw data model rows, i.e. to get from raw Bigtable bytes to Vertex/edge list objects (in Java)?
https://docs.janusgraph.org/advanced-topics/data-model/
We're on the Cloud Bigtable storage backend, and it has excellent support for bulk exporting Bigtable rows (e.g. to Parquet in GCS), but we're unclear how to deserialize the raw Bigtable row/cell bytes back into usable Vertex objects. If we were to build support for something like this, would it be a candidate for contribution back into the project? Or does this misunderstand the intended API/usage path?
Any thoughts greatly appreciated. Thank you!
- Elliot