Re: [PROPOSAL] Strict Schema


Ted Wilmes <twi...@...>
 

Hello,
That's helpful input, Rainer, and it raises a good question about how far we want to 
go with this. One option would be to keep the PropertyKey type definitions as 
they are now (global), but allow them to be mapped to specific vertex and edge 
labels. The second would be more in line with what you're suggesting, if I'm understanding 
correctly: properties would only be created in the context of a specific vertex 
or edge label. This would be much closer to how folks are used to working with 
an RDBMS, e.g. the "name" property on a "Person" vertex could be of a different type than 
the "name" property on a "Building" vertex. I think this could be particularly helpful 
if we add other constraints later. For example, say we have an "age" property 
on a Person vertex and allow a user to specify a min & a max, or a not-null. 
Ideally, they'd be able to specify a different constraint in the context of another 
vertex/edge label. This could still be done with a global property key definition, but the 
constraints would then be tied to the (element label, property key) tuple rather than just the 
unique property key.
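
To make that last point concrete, here is a rough sketch in plain Python of constraints keyed by the (element label, property key) tuple while the key's data type stays global. Everything in it is hypothetical: `GLOBAL_KEY_TYPES`, `Constraint`, `CONSTRAINTS`, `validate`, and the example limits are made up for illustration and are not part of JanusGraph or of any existing proposal.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical model, not JanusGraph API.
# One data type per property key, shared across the whole graph.
GLOBAL_KEY_TYPES = {"age": int}


@dataclass(frozen=True)
class Constraint:
    minimum: Optional[int] = None
    maximum: Optional[int] = None
    not_null: bool = False

    def check(self, value) -> bool:
        if value is None:
            return not self.not_null
        if self.minimum is not None and value < self.minimum:
            return False
        if self.maximum is not None and value > self.maximum:
            return False
        return True


# Constraints hang off the (element label, property key) tuple,
# so "age" can be limited differently on Person and on Building.
CONSTRAINTS = {
    ("Person", "age"): Constraint(minimum=0, maximum=150, not_null=True),
    ("Building", "age"): Constraint(minimum=0),
}


def validate(label: str, key: str, value) -> bool:
    # The type check uses the global key definition only.
    if value is not None and not isinstance(value, GLOBAL_KEY_TYPES[key]):
        return False
    # The range / not-null check uses the per-label constraint, if any.
    constraint = CONSTRAINTS.get((label, key))
    return constraint is None or constraint.check(value)
```

With these made-up limits, `validate("Person", "age", 200)` fails while `validate("Building", "age", 200)` passes, even though both labels share the single global "age" key.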

I had put together some examples of the first, simpler approach, but now that I 
think about it, I'd like us to decide how far down this rabbit hole we should 
go on the first pass of this schema support work. The high-level options are:

1) Define property keys globally as they are now, but allow the user to map 
them to vertex and edge labels. The implication is that there is only one definition 
of each property key (e.g. name is always a String).

2) Define property keys in the context of a specific vertex or edge label. There 
can be more than one property key with the same name. Think column definitions in an RDBMS.
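
As a rough illustration of how the two options differ, here is a toy model in plain Python. The class and method names (`GlobalKeySchema`, `LabelScopedSchema`, `make_property_key`, `map_key`, `type_of`) are hypothetical and only meant to show the scoping difference; they are not JanusGraph's management API.

```python
# Hypothetical toy registries, not JanusGraph API.

class GlobalKeySchema:
    """Option 1: one definition per key name, mapped onto labels."""

    def __init__(self):
        self.key_types = {}    # key name -> data type
        self.label_keys = {}   # label -> set of mapped key names

    def make_property_key(self, name, data_type):
        existing = self.key_types.get(name)
        if existing is not None and existing is not data_type:
            # A second, conflicting definition of the same name is rejected.
            raise ValueError(f"{name!r} is already defined as {existing.__name__}")
        self.key_types[name] = data_type

    def map_key(self, label, name):
        if name not in self.key_types:
            raise KeyError(name)
        self.label_keys.setdefault(label, set()).add(name)

    def type_of(self, label, name):
        if name not in self.label_keys.get(label, set()):
            raise KeyError((label, name))
        return self.key_types[name]


class LabelScopedSchema:
    """Option 2: keys are defined per label, like columns in a table."""

    def __init__(self):
        self.types = {}        # (label, key name) -> data type

    def make_property_key(self, label, name, data_type):
        self.types[(label, name)] = data_type

    def type_of(self, label, name):
        return self.types[(label, name)]
```

In the first class, "name" can be mapped to both Person and Building but always has the same type; in the second, ("Person", "name") and ("Building", "name") are independent definitions and may differ.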

Historically, the first would be adequate for me in the majority of cases, but the 
flexibility of the second would be quite powerful.

What do you all think would be most helpful based upon your day-to-day modeling work?

Thanks,
Ted

On Tuesday, December 19, 2017 at 10:55:01 AM UTC-6, Rainer Pichler wrote:
We at CELUM also put a custom model on top of JanusGraph that supports a type system and multi-inheritance for vertex/edge types.

The global scope of property key definitions forces us to define every property's data type as Object, because same-named properties on elements of different types might have different value types
(this also revealed the issue https://groups.google.com/forum/#!topic/janusgraph-dev/3KIDmHuTcwo). Overcoming this limitation should reduce storage overhead, since we could then work with concrete property value types.

We solved the traversal-time schema enforcement by having a (compile-time) type-safe query language on top of Gremlin that also implements the type inheritance logic (Intro: https://www.celum.com/en/blog/technology/a-querys-quest). Type inheritance is modelled via additional properties. Soon, I will release a blog article that elaborates on one of our use cases and highlights the benefits of a strict schema and type-safety.

-Rainer Pichler
https://twitter.com/rainerpichler
