
Re: Janusgraph evaluation/POC with large semiconductor measurement data advice needed

eric.neufeld@...
 

I forgot:

In that example, parvalue contains 5 double values (a list property) for each parameter, which might be a bit confusing. Still, that PropertyMapStep is slow. When I put similar data into MongoDB, for example, I can query it as a pandas DataFrame in less than 1 s, or even half a second. In JanusGraph it can take up to 60 s.
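
To make the data shape concrete, here is a minimal sketch of how rows shaped like the result of valueMap('name','parvalue') could be flattened on the client side. The sample rows, the parameter names and the column names v0..v4 are made up for illustration. valueMap returns every property as a list, which is why the single-valued name is unwrapped.

```python
from typing import Any, Dict, List


def flatten_value_maps(rows: List[Dict[str, List[Any]]]) -> List[Dict[str, Any]]:
    """Turn valueMap-style rows into flat records, one column per list entry."""
    records = []
    for row in rows:
        # valueMap wraps even single values in a list, so take the first entry.
        record = {"name": row["name"][0]}
        for i, value in enumerate(row.get("parvalue", [])):
            record[f"v{i}"] = value
        records.append(record)
    return records


# Hypothetical sample shaped like the result of valueMap('name', 'parvalue').
sample = [
    {"name": ["vbd"], "parvalue": [1.0, 1.1, 0.9, 1.2, 1.0]},
    {"name": ["rsheet"], "parvalue": [50.2, 49.8, 50.1, 50.0, 49.9]},
]
records = flatten_value_maps(sample)
print(records[0])
```

If pandas is installed, the resulting list of dicts can be passed to pandas.DataFrame(records).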

I run this with JanusGraph 0.6.1 on an old simulation server (32 CPUs, 64 GB memory or something like that).

Greetings, Eric


Janusgraph evaluation/POC with large semiconductor measurement data advice needed

eric.neufeld@...
 

Hi all,

I am working on a proof of concept to see whether JanusGraph could be used for measurement data in the semiconductor industry, and I have reached the point where I need some advice. So far I have compared it, partly in theory and partly in practice, with other NoSQL solutions (MongoDB, Cassandra, HBase, ElasticSearch, TimescaleDB, MariaDB...) in our context and use cases. We know that handling measurement data in graph databases is not that common, but we just want to try it out. The goal is to handle about 30 TB of measurement data in the future (e.g. Process Control Monitor data).

We are looking at graphs mainly because of two different use cases. The first is that we want to query data from up to 25 related measurements. Each measurement captures a different kind and number of parameters (e.g. 2500 double and boolean values). The second is that we want to query one parameter over a given time range across all existing data (whenever a measurement includes it). The problem is that each measurement (or group of 25 measurements) can include totally different parameters. Just imagine you perform a measurement such as a breakdown voltage, and the next time this information is not required because the process looks different, so the measurement is not performed (the parameter won't exist in that measurement). Anyway... the graph now allows us to query quite cool stuff, e.g. we can traverse the graph counting all measurements, process modules... or test structures where most parameters violate limits, and so on. This is really impressive.

It's fast, e.g. when calculating a mean or standard deviation over all values of a given parameter name. But as I somewhat expected, JanusGraph is not fast when retrieving a lot of data, e.g.:


gremlin> lots=g.V().hasLabel('Lot').has('name',"abc").out('lotFile').out('thxxFilePValue').valueMap('name','parvalue').profile()
 
...
  optimization                                                                                 0.027
  backend-query                                                    15540                      10.713
    \_query=thxxFilePValue:SliceQuery[0x74E0,0x74E1)
NoOpBarrierStep(2500)                                              15540       15540          32.170     0.25
PropertyMapStep([name, parvalue],value)                            15540       15540       13011.496    99.38
                                            >TOTAL                     -           -       13092.833 
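
A quick back-of-the-envelope reading of the profile above, using only the numbers it prints: the share of the total time spent in PropertyMapStep and the average cost per vertex. The variable names are chosen here for illustration.

```python
# Numbers copied from the profile() output above.
traversers = 15540
property_map_ms = 13011.496
total_ms = 13092.833

# Share of the total runtime spent in PropertyMapStep, in percent.
share_pct = 100.0 * property_map_ms / total_ms

# Average time PropertyMapStep spends per vertex, in milliseconds.
per_vertex_ms = property_map_ms / traversers

print(f"PropertyMapStep share: {share_pct:.2f} %")
print(f"Cost per vertex: {per_vertex_ms:.3f} ms")
```

This works out to a little under 1 ms per vertex, so the total grows with the number of vertices returned rather than with a single expensive lookup. The profile alone does not show where inside PropertyMapStep that time goes.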


I am using the JanusGraph defaults with a Cassandra and ES backend. It seems that the Cassandra backend is too slow at handling that many queries, right? How could this be improved? Should I install Hadoop/Spark and call SparkGraphComputer?

Thank you,
Eric


Re: Geo Mapping. How to index/query a non-point geo property?

hadoopmarc@...
 

Hi Dmytro,

Can you please present an easily reproducible scenario, preferably using the default "bin/janusgraph.sh start", like I showed, with Gremlin Console output and starting with empty db/cassandra and db/es directories? From your description it is not clear what exactly happened.
And to be sure: if you first added nodes and only then defined the mixed index, did you also make sure that the graph was reindexed?
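
For reference, a minimal sketch of what such a reindex could look like when the script is sent as a string from Python, in the same way as the client.submit(...) calls elsewhere in this thread. It assumes the index is called 'byLocation', that graph is bound on the server, and that SchemaAction is available in the server's script engine. RecordingClient is a hypothetical stand-in so the sketch runs without a server.

```python
# Gremlin/Groovy management script, sent to the server as a plain string.
# 'byLocation' is the index name assumed here; adjust it to your schema.
REINDEX_SCRIPT = """
mgmt = graph.openManagement()
mgmt.updateIndex(mgmt.getGraphIndex('byLocation'), SchemaAction.REINDEX).get()
mgmt.commit()
"""


def reindex(client):
    """Submit the reindex script through any object that offers submit(str)."""
    return client.submit(REINDEX_SCRIPT)


class RecordingClient:
    """Hypothetical stand-in for a real driver client; it only records scripts."""

    def __init__(self):
        self.scripts = []

    def submit(self, script):
        self.scripts.append(script)
        return script


fake = RecordingClient()
reindex(fake)
print(fake.scripts[0])
```

With a real driver client, the same reindex(client) call would send the script to the server. Whether a reindex is needed at all depends on whether data was written before the index existed.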

Regards,   Marc


Re: Geo Mapping. How to index/query a non-point geo property?

dmitryzezix@...
 

Hi Marc,

The same works for me as well, but it stops working once the index is added. Then you can query only indexed points; indexed shapes are not retrievable. Please try adding the index and querying everything again.

Best wishes,

Dmytro


Re: Geo Mapping. How to index/query a non-point geo property?

hadoopmarc@...
 

Hi Dmitry,

Here you go again:

g.addV().property('location', Geoshape.line([[52, 0] as double[], [52, 2] as double[]]))

g.V().elementMap()
21:09:39 WARN  org.janusgraph.graphdb.transaction.StandardJanusGraphTx  - Query requires iterating over all vertices [()]. For better performance, use indexes
==>[id:4240,label:vertex,location:LINESTRING (52 0, 52 2)]

gremlin> g.V().has("location", geoWithin(Geoshape.circle(1, 52, 200.0))).elementMap()
==>[id:4240,label:vertex,location:LINESTRING (52 0, 52 2)]

gremlin> g.V().has("location", geoContains(Geoshape.point(1, 52)))
==>v[4240]

As you see, the coordinate order conventions are really warped. This may have led you to believe things do not work. I had to find this out too ... I still do not know which coordinate is latitude and which one longitude :-)

If you want, you can make an issue of it, because the coordinate order for the geoWithin predicate is different for Geoshape.point and Geoshape.circle!
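
In the output above, circle(1, 52, ...) and point(1, 52) both match a line entered as [52, 0], [52, 2] and printed as LINESTRING (52 0, 52 2). Read as WKT, which lists x before y, this suggests that the line takes [longitude, latitude] pairs while circle and point take (latitude, longitude). That reading is an assumption drawn only from this console output. Under that assumption, a small hypothetical helper can swap (lat, lon) pairs before a line is built:

```python
from typing import List, Sequence, Tuple


def to_lon_lat(pairs: Sequence[Tuple[float, float]]) -> List[List[float]]:
    """Swap (lat, lon) pairs into [lon, lat] order."""
    return [[lon, lat] for lat, lon in pairs]


# Hypothetical input: the two end points used above, written as (lat, lon).
line_lat_lon = [(0.0, 52.0), (2.0, 52.0)]
swapped = to_lon_lat(line_lat_lon)
print(swapped)
```

The swapped pairs match the order used in the Geoshape.line call above.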

Best wishes,    Marc


Re: Geo Mapping. How to index/query a non-point geo property?

dmitryzezix@...
 

Hi, Marc!

This works. But I need to query a complex geoshape, not a point. I am able to query complex geoshapes only if no index has been created. Once the index for the property "location" is created, complex geoshapes can no longer be queried.


Re: Geo Mapping. How to index/query a non-point geo property?

hadoopmarc@...
 

Hi Dmitry,

It is not clear to me whether you have problems getting any geo predicate to work or whether your specific example is the issue.
Can you first confirm that the following works for you (on a clean janusgraph-full-0.6.1):

$ bin/janusgraph.sh start
$ bin/gremlin.sh
    graph = JanusGraphFactory.open('conf/janusgraph-cql-es.properties')
    mgmt = graph.openManagement()
    location = mgmt.makePropertyKey('location').dataType(Geoshape.class).cardinality(Cardinality.SINGLE).make()
    mgmt.buildIndex('byLocation', Vertex.class).addKey(location, Mapping.PREFIX_TREE.asParameter()).buildMixedIndex('search')
    mgmt.commit()

    mgmt = graph.openManagement()
    mgmt.printSchema()
    mgmt.close()    

    g = graph.traversal()
    g.addV().property('location', Geoshape.point(52, 1))
    g.V().has("location", geoWithin(Geoshape.circle(51.9, 1.1, 20.0)))

Note that you may have erred on the radius units: these seem to be in kilometers (not miles, I hope; I did not do the calculation...).
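
For anyone who wants to do that calculation, here is a minimal haversine sketch for the point and circle centre used above. It assumes the first argument of Geoshape.point and Geoshape.circle is the latitude and uses a spherical earth with a radius of 6371 km.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers on a sphere of radius 6371 km."""
    r = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


# Point (52, 1) against the circle centre (51.9, 1.1) from the example above.
distance = haversine_km(52.0, 1.0, 51.9, 1.1)
print(f"{distance:.1f} km")
```

The distance comes out at roughly 13 km, so the point lies inside a radius of 20 whether the unit is kilometers or miles. To tell the two units apart, you would need a point between about 20 and 32 km from the centre.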

Best wishes,  Marc


On Wed, Mar 30, 2022 at 03:57 PM, <dmitryzezix@...> wrote:
g.V().has("location", geoWithin(Geoshape.circle({lat}, {lon}, {radius})))


Re: Integrate CustomVertexProgram to janusgraph

Nikita Pande
 

Hey Marc,

Thanks for the help.
I just ran the sample vertex program and wrote a blog post about it: https://medium.com/@nikita15p/integrating-custom-vertex-program-with-janusgraph-33dce1deffda . I hope it helps others as well.

Regards,
Nikita


Re: Integrate CustomVertexProgram to janusgraph

hadoopmarc@...
 

Answer to your other question:

What you do is certainly allowed, but has drawbacks compared to building a separate java package:
  • building janusgraph-core takes more time (slower development cycle)
  • giving different things the same name will cause confusion sooner or later

Best wishes,    Marc


Re: Integrate CustomVertexProgram to janusgraph

hadoopmarc@...
 

Hi Nikita,

Gremlin Console does a lot of imports for you, but does not cover the full JanusGraph and TinkerPop APIs. So sometimes you have to do an import yourself. In this case:

gremlin> import org.janusgraph.graphdb.olap.computer.*

Best wishes,     Marc


Re: Integrate CustomVertexProgram to janusgraph

Nikita Pande
 

Hi @hadoopmarc,

I actually added a CustomVertexProgram in https://github.com/JanusGraph/janusgraph/tree/master/janusgraph-core/src/main/java/org/janusgraph/graphdb/olap/computer and then built the janusgraph-core jar file. I added this jar file to the lib folder of the JanusGraph release https://github.com/JanusGraph/janusgraph/releases/download/v0.6.0/janusgraph-0.6.0-doc.zip . However, when I try to run this VertexProgram from Gremlin Console, it gives an error:
gremlin> ComputerResult result = graph.compute().program(CustomVertexProgram.build().name("jane").create()).submit().get();
No such property: CustomVertexProgram for class: groovysh_evaluate
Type ':help' or ':h' for help.
Display stack trace? [yN]n

Am I allowed to add the vertex program in https://github.com/JanusGraph/janusgraph/tree/master/janusgraph-core/src/main/java/org/janusgraph/graphdb/olap/computer?
 


Re: Integrate CustomVertexProgram to janusgraph

hadoopmarc@...
 

Yes, I forgot that one: VertexPrograms need to be written in Java. If you are not familiar with setting up Java projects, you can take a look at the examples at:

https://github.com/JanusGraph/janusgraph/tree/master/janusgraph-examples


Re: Integrate CustomVertexProgram to janusgraph

Nikita Pande
 

Hey @hadoopmarc, is Java currently the only supported language for a VertexProgram?


Re: Integrate CustomVertexProgram to janusgraph

hadoopmarc@...
 

The easiest way is to write your custom VertexProgram for TinkerPop. If it runs in TinkerPop, it will also run in JanusGraph.
There are other ways, though, see two examples in:
https://github.com/JanusGraph/janusgraph/tree/master/janusgraph-core/src/main/java/org/janusgraph/graphdb/olap/computer


Geo Mapping. How to index/query a non-point geo property?

dmitryzezix@...
 

Hello, guys!

I've tried to query a complex geoshape, but got an empty response. What am I doing wrong?

 

My query:

transaction = client.submit(
f"""
g.V().has("location", geoWithin(Geoshape.circle({lat}, {lon}, {radius})))
"""
)

 

Index creation:

transaction = client.submit(
"""
mgmt = graph.openManagement()
location = mgmt.makePropertyKey('location').dataType(Geoshape.class).cardinality(Cardinality.SINGLE).make()
mgmt.buildIndex('byLocation', Vertex.class).addKey(location, Mapping.PREFIX_TREE.asParameter()).buildMixedIndex('search')
mgmt.commit()
"""
)

 

Vertex creation:

transaction = client.submit(
f"""
g.addV("place").property("location", {graph_geoshape.to_janus_graph_property()})
"""
)

 

Schema looks like this:

---------------------------------------------------------------------------------------------------
Vertex Label Name              | Partitioned | Static                                             |
---------------------------------------------------------------------------------------------------
place                          | false       | false                                              |
location                       | false       | false                                              |
---------------------------------------------------------------------------------------------------
Edge Label Name                | Directed    | Unidirected | Multiplicity                         |
---------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------
Property Key Name              | Cardinality | Data Type                                          |
---------------------------------------------------------------------------------------------------
location                       | SINGLE      | class org.janusgraph.core.attribute.Geoshape       |
place                          | SINGLE      | class org.janusgraph.core.attribute.Geoshape       |
---------------------------------------------------------------------------------------------------
Graph Index (Vertex)           | Type        | Unique    | Backing        | Key:           Status |
---------------------------------------------------------------------------------------------------
byLocation                     | Mixed       | false     | search         | location:     ENABLED |
---------------------------------------------------------------------------------------------------
Graph Index (Edge)             | Type        | Unique    | Backing        | Key:           Status |
---------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------
Relation Index (VCI)           | Type        | Direction | Sort Key       | Order    |     Status |
---------------------------------------------------------------------------------------------------

 
 
My vertices:
V: [{<T.id: 1>: 4096, <T.label: 4>: 'place', 'location': {'@type': 'janusgraph:Geoshape', '@value': {'geometry': {'type': 'LineString', 'coordinates': [[42.546782, 1.699474], [42.546422, 1.699865], [42.545915, 1.700278], [42.545598, 1.700634], [42.545173, 1.701465], [42.544649, 1.703692], [42.544446, 1.705885], [42.544222, 1.708992], [42.544067, 1.711593], [42.543972, 1.71567], [42.543751, 1.722193], [42.543764, 1.723969], [42.543877, 1.725626], [42.544064, 1.726973], [42.544383, 1.728158], [42.544966, 1.729793], [42.545616, 1.73112], [42.546221, 1.732211], [42.54678, 1.733105]]}}}}, {<T.id: 1>: 4136, <T.label: 4>: 'location', 'place': {'@type': 'janusgraph:Geoshape', '@value': {'geometry': {'type': 'LineString', 'coordinates': [[42.546782, 1.699474], [42.546422, 1.699865], [42.545915, 1.700278], [42.545598, 1.700634], [42.545173, 1.701465], [42.544649, 1.703692], [42.544446, 1.705885], [42.544222, 1.708992], [42.544067, 1.711593], [42.543972, 1.71567], [42.543751, 1.722193], [42.543764, 1.723969], [42.543877, 1.725626], [42.544064, 1.726973], [42.544383, 1.728158], [42.544966, 1.729793], [42.545616, 1.73112], [42.546221, 1.732211], [42.54678, 1.733105]]}}}}]
 
 
Resources used:
https://docs.janusgraph.org/index-backend/text-search/#geo-mapping


Integrate CustomVertexProgram to janusgraph

Nikita Pande
 

What is the current method of integrating a new VertexProgram into JanusGraph? Is it getting the code into TinkerPop and then building the JanusGraph code? Is Java the only supported language?


Re: Kerberos authentication of gremlin console with Janusgraph server

hadoopmarc@...
 

Kerberos has a reputation for being complex. I would try to first get the pure TinkerPop example working, using the TinkerPop Gremlin Server and Gremlin Console distributions. Also check the log output of Gremlin Server in case of exceptions in Gremlin Console. The command graph = JanusGraphFactory.open('') is not the best example to start with in Gremlin Console. Better is g.V().limit(5).


Re: Kerberos authentication of gremlin console with Janusgraph server

Nikita Pande
 

Hi Marc,

In my case it's both: Gremlin Server acts as a client to the kerberized HBase and as a kerberized server to Gremlin Console/clients. I have already tested HBase separately together with JanusGraph, and that works fine. Now I want to add Kerberos authentication for JanusGraph Server on top of this, so I want Gremlin Console to get authenticated.

Thanks,
Nikita


Re: Kerberos authentication of gremlin console with Janusgraph server

hadoopmarc@...
 

You are mixing up two procedures:
  1. Gremlin Server Krb5Authenticator is for authenticating gremlin clients towards Gremlin Server. Apparently, you do not want it, so remove it from your configs.
  2. Apparently you are trying to have Gremlin Server authenticate against HBase. This has nothing to do with Gremlin Server's Krb5Authenticator. If the keytab for Gremlin Server is OK and a kinit was done on the Gremlin Server host with the right user, the HBase client of janusgraph-hbase, running on the Gremlin Server host, should be able to access the TGT and authenticate to HBase.

Best wishes,     Marc


Re: Kerberos authentication of gremlin console with Janusgraph server

Nikita Pande
 

Thanks for recommending this approach. However, I am getting the following error
when running gremlin> def list = client.submit("g.V()").all().get()
>>> CCacheInputStream: readFlags()
get normal credential
org.apache.tinkerpop.gremlin.driver.exception.ResponseException: Failure to initialize security context

Similarly, when I ran the following steps earlier, I got inconsistent responses:
1. :remote connect tinkerpop.server conf/remote.yaml
2. :remote console
3. graph=JanusGraphFactory.open("/root/janusgraph-0.6.0/conf/janusgraph-hbase.properties") sometimes works fine and returns the configured graph. However, sometimes when I repeat steps 1 and 2, it gives the error "Failure to initialize security context".
 
