Re: Serialization error in JanusGraph libraries for Python (Geo Predicate)


Florian Hockmann <f...@...>
 

There is a difference between Geoshapes and Geo predicates. You are right that Geo predicates are similar to Text predicates and don't need their own serializer: the normal serializer for P (the PSerializer you listed), which is already included in the TinkerPop GLVs like Gremlin-Python, can be used. Geoshapes, however, are objects that represent things like coordinates or a circle around coordinates. See the Geoshape Data Type section of the docs for more information. So, you need one serializer and one deserializer for each Geoshape data type (point, line, circle, and so on).

All the above serializers implement the _GraphSONTypeIO class. So do I need to write another serializer, namely a Geoshape serializer/deserializer, and include it in the list of existing serializers?

Yes, exactly, you need to write similar serializers and deserializers for the Geoshape types.

If that can be done, my next question is how do I register that serializer so that it is used whenever I call my Geo predicates? I.e., how do I make the system know to use the serializer created above?

The TinkerPop docs show how such a new serializer can be registered for Gremlin-Python.
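
As a rough illustration of what "registering" means here, the sketch below uses a simplified stand-in writer that picks a serializer by the Python type of the value. `MiniWriter`, `Circle`, and `CircleSerializer` are hypothetical names and not part of Gremlin-Python; the real GraphSONWriter is understood to accept a similar type-to-serializer map, but the TinkerPop docs remain the reference for the exact signature. The circle layout is the one quoted further down in this thread.

```python
import json


class Circle(object):
    """Hypothetical Python-side Geoshape circle (not a library class)."""

    def __init__(self, latitude, longitude, radius_km):
        self.latitude = latitude
        self.longitude = longitude
        self.radius_km = radius_km


class MiniWriter(object):
    """Stand-in writer: chooses a serializer by the Python type of the value."""

    def __init__(self, serializer_map=None):
        # Passing a map here is the "registration" step in this sketch.
        self.serializers = dict(serializer_map or {})

    def to_dict(self, obj):
        serializer = self.serializers.get(type(obj))
        if serializer is not None:
            return serializer.dictify(obj, self)
        if isinstance(obj, float):
            return {"@type": "g:Double", "@value": obj}
        if isinstance(obj, dict):
            return {key: self.to_dict(value) for key, value in obj.items()}
        if isinstance(obj, (list, tuple)):
            return [self.to_dict(item) for item in obj]
        return obj  # strings and other plain values pass through unchanged

    def write(self, obj):
        return json.dumps(self.to_dict(obj))


class CircleSerializer(object):
    """Turns a Circle into the typed-object layout quoted in this thread."""

    graphson_type = "janusgraph:Geoshape"

    @classmethod
    def dictify(cls, circle, writer):
        geometry = {
            "type": "Circle",
            "coordinates": [float(circle.latitude), float(circle.longitude)],
            "radius": float(circle.radius_km),
            "properties": {"radius_units": "km"},
        }
        # Nested values go back through the writer so doubles get typed.
        return {"@type": cls.graphson_type, "@value": {"geometry": writer.to_dict(geometry)}}


writer = MiniWriter(serializer_map={Circle: CircleSerializer})
print(writer.write({"value": Circle(37, 25, 50)}))
```

The point of the design is that the calling code never names the serializer: once the map contains an entry for the type, any Circle found anywhere inside a value is converted automatically.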

Also, thanks for the idea on insertion. I was doing just read operations till now, but I will also test out write operations using Geoshapes. But then that will also need a serializer to be implemented, right?

Yes, that would require a serializer, but you can't really use Geo predicates without being able to serialize Geoshape types.

On Friday, 10 August 2018 20:28:19 UTC+2, Debasish Kanhar wrote:
Hi Florian,

Thanks for the response. My understanding was that the implementation was going to be similar for all Geoshapes. I guess my understanding was wrong here. Thanks for pointing that out.

If I got it right, do you mean to say that for implementing Geoshapes we will have to write our own serializer and deserializer for Geo predicates? We didn't need to do that while working on Text predicates, though.

Anyway, I was going through the source code of the serializers written in Python (GraphSON 2.0), and I see the following serializers/deserializers implemented:

  1. _BytecodeSerializer
  2. TraversalSerializer
  3. VertexSerializer
  4. EdgeSerializer
  5. VertexPropertySerializer
  6. PropertySerializer
  7. TraversalStrategySerializer
  8. EnumSerializer
  9. PSerializer
  10. BindingSerializer
  11. LambdaSerializer
All the above serializers implement the _GraphSONTypeIO class. So do I need to write another serializer, namely a Geoshape serializer/deserializer, and include it in the list of existing serializers?

If that can be done, my next question is how do I register that serializer so that it is used whenever I call my Geo predicates? I.e., how do I make the system know to use the serializer created above?

Also, thanks for the idea on insertion. I was doing just read operations till now, but I will also test out write operations using Geoshapes. But then that will also need a serializer to be implemented, right?

Thanks

On Friday, 10 August 2018 21:30:46 UTC+5:30, Florian Hockmann wrote:
First of all, great to hear that someone is working on a Python driver for JanusGraph!

At first glance I'd say that the serialization of Geoshape.circle() looks wrong. You serialize it as if it were a predicate when it really is an object. In your stack trace it looks like this:

{
    "predicate": "Geoshape.circle",
    "value": "37.97, 23.72, 50"
}

whereas it should look something like this:

{
    "@type": "janusgraph:Geoshape",
    "@value": {
        "geometry": {
            "type": "Circle",
            "coordinates": [
                {
                    "@type": "g:Double",
                    "@value": 37
                },
                {
                    "@type": "g:Double",
                    "@value": 25
                }
            ],
            "radius": {
                "@type": "g:Double",
                "@value": 50
            },
            "properties": {
                "radius_units": "km"
            }
        }
    }
}
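
A minimal Python sketch of producing that structure with plain dictionaries only. `Circle`, `g_double`, and `circle_to_graphson` are hypothetical names and not part of any library; the field layout simply mirrors the JSON above, and the coordinate order is assumed to be the order shown there.

```python
import json
from collections import namedtuple

# Hypothetical stand-in for a Geoshape circle; not a JanusGraph or TinkerPop class.
Circle = namedtuple("Circle", ["latitude", "longitude", "radius_km"])


def g_double(value):
    """Wrap a number as a GraphSON 2.0 typed double."""
    return {"@type": "g:Double", "@value": float(value)}


def circle_to_graphson(circle):
    """Build the typed-object form shown above (layout assumed from this thread)."""
    return {
        "@type": "janusgraph:Geoshape",
        "@value": {
            "geometry": {
                "type": "Circle",
                "coordinates": [g_double(circle.latitude), g_double(circle.longitude)],
                "radius": g_double(circle.radius_km),
                "properties": {"radius_units": "km"},
            }
        },
    }


payload = circle_to_graphson(Circle(37, 25, 50))
print(json.dumps(payload, indent=4))
```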

The serialization of Geoshapes is really not exactly pretty. I'd say start with the easiest one, namely Geoshape.point. I would also first only insert a Geoshape as a property to JanusGraph and test whether this works. Then, you can retrieve such a property back. (The graph of the gods which is frequently used for integration tests already contains properties for this. I created a Docker image for integration tests that comes already loaded with this graph.) That way, you can be sure that your serialization and deserialization of Geoshapes already work before you use them together with Geo predicates.
You can see how JanusGraph deserializes Geoshapes here.
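
For the Python side of that round trip, a deserializer could look roughly like the sketch below. It assumes the server returns a circle in the same layout as the request form shown earlier in this message, which should be verified against real responses. `Circle`, `unwrap`, and `circle_from_graphson` are hypothetical names, not library APIs.

```python
from collections import namedtuple

# Hypothetical Python-side value type; not a JanusGraph or TinkerPop class.
Circle = namedtuple("Circle", ["latitude", "longitude", "radius", "radius_units"])


def unwrap(node):
    """Strip a GraphSON 2.0 type wrapper such as {"@type": "g:Double", "@value": 50}."""
    if isinstance(node, dict) and "@type" in node and "@value" in node:
        return node["@value"]
    return node


def circle_from_graphson(node):
    """Rebuild a Circle from the typed-object layout (assumed, see lead-in)."""
    if node.get("@type") != "janusgraph:Geoshape":
        raise ValueError("expected janusgraph:Geoshape, got %r" % (node.get("@type"),))
    geometry = node["@value"]["geometry"]
    if geometry.get("type") != "Circle":
        raise ValueError("unsupported shape %r" % (geometry.get("type"),))
    # Coordinate order is assumed to match the order used when serializing.
    latitude, longitude = (float(unwrap(c)) for c in geometry["coordinates"])
    radius = float(unwrap(geometry["radius"]))
    units = geometry.get("properties", {}).get("radius_units")
    return Circle(latitude, longitude, radius, units)


sample = {
    "@type": "janusgraph:Geoshape",
    "@value": {
        "geometry": {
            "type": "Circle",
            "coordinates": [
                {"@type": "g:Double", "@value": 37},
                {"@type": "g:Double", "@value": 25},
            ],
            "radius": {"@type": "g:Double", "@value": 50},
            "properties": {"radius_units": "km"},
        }
    },
}
circle = circle_from_graphson(sample)
print(circle)
```

Rejecting unknown `@type` and shape values early makes it easier to notice when the server sends a layout this sketch does not cover.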

On Friday, 10 August 2018 12:09:34 UTC+2, Debasish Kanhar wrote:
Hi all,

I'm currently building JanusGraph libraries for Python so that JanusGraph features such as indexed lookups and schema management can be used from non-JVM languages.

I was planning a 0.1 release in a few weeks with the following features as a starting point:

1: Implement Text predicates, like textContains etc., for Python. (Done)
2: Implement Geo predicates, like geoWithin etc., for Python.
3: Be able to serialize Edge IDs. (Done)

Once 0.1 is out and the project has become part of official JanusGraph, with docs added, I plan to add a schema management utility for Python, though that is for a later stage.

Is there any big feature that I'm missing? Please point it out.

Now back to the original query: when I try Geo predicates, my queries fail. I feel that I'm doing something silly. Please tell me if I'm doing anything wrong.

We have Gremlin-Python's P (predicate) class, which I'm using and extending to include JanusGraph functionality like Text and Geo predicates.

The usual declaration in the P class for TinkerPop predicates:

@staticmethod
def between(*args):
    return P("between", *args)

# Query is g.V().has("age", between(10,20)).next()

Following the pattern above, I implemented something similar for Text predicates:
@staticmethod
def textContains(value):
    predicate = P("textContains", value)
    return predicate

# Query is: g.V().has("name", textContains("saturn")).next()

The above method works, and I'm able to make Text predicate queries work from Python using the lib I created.

When I try to introduce Geo predicates, everything fails. Maybe it is because of the way I use predicates (I nest them as follows).

# Query for Geo
g.E().has("place", geoWithin(Geoshape.circle(37,25,50))).next()

NOTE: We have two predicates here: the first is the geoWithin predicate, and inside that we have Geoshape.circle(37,25,50). So I use the following method definition for the geoWithin predicate, a nested predicate system:


def geoWithin(self, value):
    shape = value.getShape()

    shapeP = None

    if shape == "CIRCLE":
        shapeP = P("Geoshape.circle", "{}, {}, {}".format(value.getLatitude(), value.getLongitude(), value.getRadius()))
    elif shape == "POINT":
        shapeP = P("Geoshape.point", "{}, {}".format(value.getLatitude(), value.getLongitude()))

    withinP = P("geoWithin", shapeP)

    return withinP

As you can see, I first create a P with "Geoshape.circle" and "37,25,50", then pass that object as the value to a P with "geoWithin".

But the above fails with the following Gremlin Server error:

1447671 [gremlin-server-worker-1] WARN  org.apache.tinkerpop.gremlin.driver.ser.AbstractGraphSONMessageSerializerV2d0  - Request [PooledUnsafeDirectByteBuf(ridx: 394, widx: 394, cap: 428)] could not be deserialized by org.apache.tinkerpop.gremlin.driver.ser.AbstractGraphSONMessageSerializerV2d0.
org.apache.tinkerpop.shaded.jackson.databind.JsonMappingException: Could not deserialize the JSON value as required. Nested exception: org.apache.tinkerpop.shaded.jackson.databind.JsonMappingException: Could not deserialize the JSON value as required. Nested exception: org.apache.tinkerpop.shaded.jackson.databind.JsonMappingException: Could not deserialize the JSON value as required. Nested exception: java.lang.IllegalStateException: org.apache.tinkerpop.gremlin.process.traversal.P.Geoshape.circle(java.lang.Object)
 at [Source: (byte[])"{"requestId":{"@type":"g:UUID","@value":"b053215c-5a60-41d8-bdb7-05b8583ac901"},"processor":"traversal","op":"bytecode","args":{"gremlin":{"@type":"g:Bytecode","@value":{"step":[["E"],["has","place",{"@type":"g:P","@value":{"predicate":"geoWithin","value":{"@type":"g:P","@value":{"predicate":"Geoshape.circle","value":"37.97, 23.72, 50"}}}}],["inV"],["valueMap",true]]}},"aliases":{"g":"gg"}}}"; line: 1, column: 338]
 at [Source: (byte[])"{"requestId":{"@type":"g:UUID","@value":"b053215c-5a60-41d8-bdb7-05b8583ac901"},"processor":"traversal","op":"bytecode","args":{"gremlin":{"@type":"g:Bytecode","@value":{"step":[["E"],["has","place",{"@type":"g:P","@value":{"predicate":"geoWithin","value":{"@type":"g:P","@value":{"predicate":"Geoshape.circle","value":"37.97, 23.72, 50"}}}}],["inV"],["valueMap",true]]}},"aliases":{"g":"gg"}}}"; line: 1, column: 338]
 at [Source: (byte[])"{"requestId":{"@type":"g:UUID","@value":"b053215c-5a60-41d8-bdb7-05b8583ac901"},"processor":"traversal","op":"bytecode","args":{"gremlin":{"@type":"g:Bytecode","@value":{"step":[["E"],["has","place",{"@type":"g:P","@value":{"predicate":"geoWithin","value":{"@type":"g:P","@value":{"predicate":"Geoshape.circle","value":"37.97, 23.72, 50"}}}}],["inV"],["valueMap",true]]}},"aliases":{"g":"gg"}}}"; line: 1, column: 338] (through reference chain: java.util.LinkedHashMap["args"]->java.util.LinkedHashMap["gremlin"])
        at org.apache.tinkerpop.shaded.jackson.databind.JsonMappingException.from(JsonMappingException.java:270)
        at org.apache.tinkerpop.shaded.jackson.databind.DeserializationContext.mappingException(DeserializationContext.java:1711)
        at org.apache.tinkerpop.gremlin.structure.io.graphson.GraphSONTypeDeserializer.deserialize(GraphSONTypeDeserializer.java:194)
        at org.apache.tinkerpop.gremlin.structure.io.graphson.GraphSONTypeDeserializer.deserializeTypedFromAny(GraphSONTypeDeserializer.java:101)
        at org.apache.tinkerpop.shaded.jackson.databind.deser.std.UntypedObjectDeserializer$Vanilla.deserializeWithType(UntypedObjectDeserializer.java:712)
        at org.apache.tinkerpop.shaded.jackson.databind.deser.std.MapDeserializer._readAndBindStringKeyMap(MapDeserializer.java:529)
        at org.apache.tinkerpop.shaded.jackson.databind.deser.std.MapDeserializer.deserialize(MapDeserializer.java:364)
        at org.apache.tinkerpop.shaded.jackson.databind.deser.std.MapDeserializer.deserialize(MapDeserializer.java:29)
        at org.apache.tinkerpop.gremlin.structure.io.graphson.GraphSONTypeDeserializer.deserialize(GraphSONTypeDeserializer.java:219)
        at org.apache.tinkerpop.gremlin.structure.io.graphson.GraphSONTypeDeserializer.deserializeTypedFromAny(GraphSONTypeDeserializer.java:101)
        at org.apache.tinkerpop.shaded.jackson.databind.deser.std.UntypedObjectDeserializer$Vanilla.deserializeWithType(UntypedObjectDeserializer.java:712)
        at org.apache.tinkerpop.shaded.jackson.databind.deser.std.MapDeserializer._readAndBindStringKeyMap(MapDeserializer.java:529)
        at org.apache.tinkerpop.shaded.jackson.databind.deser.std.MapDeserializer.deserialize(MapDeserializer.java:364)
        at org.apache.tinkerpop.shaded.jackson.databind.deser.std.MapDeserializer.deserialize(MapDeserializer.java:29)
        at org.apache.tinkerpop.gremlin.structure.io.graphson.GraphSONTypeDeserializer.deserialize(GraphSONTypeDeserializer.java:212)
        at org.apache.tinkerpop.gremlin.structure.io.graphson.GraphSONTypeDeserializer.deserializeTypedFromObject(GraphSONTypeDeserializer.java:86)
        at org.apache.tinkerpop.shaded.jackson.databind.deser.std.MapDeserializer.deserializeWithType(MapDeserializer.java:400)
        at org.apache.tinkerpop.shaded.jackson.databind.deser.impl.TypeWrappedDeserializer.deserialize(TypeWrappedDeserializer.java:68)
        at org.apache.tinkerpop.shaded.jackson.databind.DeserializationContext.readValue(DeserializationContext.java:759)
        at org.apache.tinkerpop.shaded.jackson.databind.DeserializationContext.readValue(DeserializationContext.java:746)
        at org.apache.tinkerpop.gremlin.structure.io.graphson.AbstractObjectDeserializer.deserialize(AbstractObjectDeserializer.java:48)
        at org.apache.tinkerpop.gremlin.structure.io.graphson.GraphSONTypeDeserializer.deserialize(GraphSONTypeDeserializer.java:212)
        at org.apache.tinkerpop.gremlin.structure.io.graphson.GraphSONTypeDeserializer.deserializeTypedFromAny(GraphSONTypeDeserializer.java:101)
        at org.apache.tinkerpop.shaded.jackson.databind.deser.std.StdDeserializer.deserializeWithType(StdDeserializer.java:136)
        at org.apache.tinkerpop.shaded.jackson.databind.deser.impl.TypeWrappedDeserializer.deserialize(TypeWrappedDeserializer.java:68)
        at org.apache.tinkerpop.shaded.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:4001)
        at org.apache.tinkerpop.shaded.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3079)

Looks like some sort of serialization error. I have set GraphSON 2.0 on both Gremlin Server and my Python driver.

Is there something I'm missing here?
