
Erratic behavior with Cassandra input and SparkGraphComputer OLAP engine

Samik R <sam...@...>
 

Hi,

I am testing SparkGraphComputer for OLAP queries, reading data directly from a JG-Cassandra-ES instance. Everything is running on a single VM, and I built JG on the box by cloning the repo. I am using Hadoop 2.7.1 with Spark 1.6.1 and Cassandra 2.1.9 (same as packaged).

I am using the properties file mentioned in this SO thread, mostly because the setup matches mine. I first tried a smaller graph with ~1K nodes and 1.5K edges, and things seemed to work fine. However, when I try OLAP queries with ~300K nodes, I run into various issues.

  • Initially, I got hit by the exception "java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: Frame size (20784689) larger than max length (15728640)!". After some research, I added the following line to the properties file: cassandra.thrift.framed.size_mb=200
  • In the next try, the Cassandra process died when I tried running the query. The gremlin server and ES processes were running though.

gremlin> graph = GraphFactory.open("conf/hadoop-graph/read-cassandra.properties")
==>hadoopgraph[cassandrainputformat->gryooutputformat]
gremlin> g = graph.traversal().withComputer(SparkGraphComputer)
==>graphtraversalsource[hadoopgraph[cassandrainputformat->gryooutputformat], sparkgraphcomputer]
gremlin> g.V().count()
21:16:23 ERROR org.apache.spark.executor.Executor  - Exception in task 4.0 in stage 0.0 (TID 4)
java.lang.RuntimeException: org.apache.thrift.transport.TTransportException
...

org.janusgraph.core.JanusGraphException: Could not execute operation due to backend exception
...

Caused by: org.janusgraph.diskstorage.PermanentBackendException: Permanent failure in storage backend
...

Caused by: org.apache.thrift.transport.TTransportException

...


  • I restarted janusgraph and retried the same query. This time the query went through, but the same exception reappeared when I tried a groupCount.

gremlin> g.V().count()
==>108156
gremlin> g.V().groupCount().by(T.label)
21:23:49 ERROR org.apache.spark.executor.Executor  - Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.RuntimeException: org.apache.thrift.transport.TTransportException


  • Another restart, and the groupCount() query went through, but the gremlin shell got killed when I tried the count query. All three daemons (gremlin, Cassandra and ES) were still running though.

gremlin> g.V().groupCount().by(T.label)
==>[hotLead:1,proactiveChatInvite:1,chatSession:906,webPage:56921,buttonChatInvite:1,webPurchase:1,visitor:1269,webSession:27378,device:21677,cart:1]
gremlin> g.V().count()
Killed
samik@samik-lap:~/git/janusgraph$ Write failed: Broken pipe


This all seems pretty erratic to me. Any suggestions for getting consistent results?


Regards.

-Samik
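
For the frame-size exception quoted in the message above, the two numbers can be converted to megabytes to see how large the setting needs to be. The following is only a small arithmetic sketch: it assumes the property is expressed in units of 1024 * 1024 bytes (the reported limit of 15728640 is exactly 15 such units), and the class and method names are made up for illustration.

```java
public class ThriftFrameSize {
    // Assumption: the *_mb setting counts units of 1024 * 1024 bytes.
    public static final long BYTES_PER_MB = 1024L * 1024L;

    /** Smallest whole number of megabytes that can hold a frame of the given size. */
    public static long minFrameSizeMb(long frameBytes) {
        return (frameBytes + BYTES_PER_MB - 1) / BYTES_PER_MB;
    }

    public static void main(String[] args) {
        long reportedFrame = 20_784_689L; // "Frame size" from the exception
        long reportedMax = 15_728_640L;   // "max length" from the exception

        System.out.println("current limit (MB): " + reportedMax / BYTES_PER_MB);
        System.out.println("minimum size_mb for this frame: " + minFrameSizeMb(reportedFrame));
    }
}
```

Under that assumption, the limit in effect was 15 MB and this particular frame needs at least 20 MB, so the value of 200 used above leaves ample headroom for this frame. Other frames in the same graph may be larger.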


Re: JanusGraph release roadmap

Ted Wilmes <twi...@...>
 

Definitely.  We're not quite there but getting close: https://github.com/JanusGraph/janusgraph/issues/40.

--Ted


On Wednesday, February 22, 2017 at 9:22:28 AM UTC-6, Manas Bajaj wrote:
Ted - Thanks for the response and the link. It would be best if the team starts posting snapshot zips that can be used. 

On Tuesday, February 21, 2017 at 8:46:52 AM UTC-5, Ted Wilmes wrote:
Hello,
A tentative date has not been set yet.  There has been some discussion on the dev list.
There are a number of prerequisites that we're working through.  For now, I'd recommend
watching the dev list as discussions continue.  You can also track progress of the issues that
are slated to be included by looking at https://github.com/JanusGraph/janusgraph/milestone/1.
That's not necessarily a complete or final list but it should give you some insight along with
the dev chatter.

Thanks,
Ted

On Monday, February 20, 2017 at 8:16:29 PM UTC-6, 吴冉波 wrote:
Want too ~~~~
And hope to make ElasticSearch to be the new storage backend.

On Monday, February 20, 2017 at 10:22:54 AM UTC+8, Manas Bajaj wrote:
Hello JanusGraph committers

We are all very excited to see JanusGraph kickoff in 2017. Is there a release roadmap for this project? What will be the first release and tentatively when?

Thanks
Manas


Re: JanusGraph build and OS compatibility.

HadoopMarc <m.c.d...@...>
 

Hi Manoj,

I do not have a Titan or JanusGraph build at hand here, but I remember there was a zip archive in the titan-dist/titan-dist-hadoop2/target folder.

Cheers,   Marc

On Thursday, February 23, 2017 at 04:57:13 UTC+1, Manoj Waikar wrote:

Hi,

I have a few questions about JanusGraph -

1) The Maven task for creating a build fails on Windows. So, can JanusGraph only be built on Mac / Linux?

2) The build created on Mac / Linux does not work on Windows. For example, to run the JanusGraph server, the build's bin directory contains only one file, janusgraph.sh, and there is no .bat file for Windows. So, does the JanusGraph server only run on Mac / Linux?


Thanks,
Manoj.


Re: JanusGraph server vs. embedded mode.

HadoopMarc <m.c.d...@...>
 


Hi Manoj,

It totally depends on your total system setup: how many end users, application servers, backend nodes, security demands, etc.

For just issuing a gremlin query and getting results back, the deployment mode does not matter. Gremlin server, however, gives you far more control regarding security, scaling, reliability, non-JVM client connections, etc.

Hope this helps; otherwise, please sketch your system setup.

Cheers,   Marc


On Thursday, February 23, 2017 at 07:22:58 UTC+1, Manoj Waikar wrote:

Also, is there any equivalent of the Frames component of the older Tinkerpop stack in the current stack?


Re: JanusGraph build and OS compatibility.

Palash Kulshrestha <tricky...@...>
 

Hi Manoj
Just FYI, there is also a developers list: https://groups.google.com/forum/#!forum/janusgraph-dev


On Thursday, February 23, 2017 at 9:27:13 AM UTC+5:30, Manoj Waikar wrote:
Hi,

I have a few questions about JanusGraph -

1) The Maven task for creating a build fails on Windows. So, can JanusGraph only be built on Mac / Linux?

2) The build created on Mac / Linux does not work on Windows. For example, to run the JanusGraph server, the build's bin directory contains only one file, janusgraph.sh, and there is no .bat file for Windows. So, does the JanusGraph server only run on Mac / Linux?


Thanks,
Manoj.


Re: JanusGraph server vs. embedded mode.

Manoj Waikar <mmwa...@...>
 

Also, is there any equivalent of the Frames component of the older Tinkerpop stack in the current stack?


JanusGraph server vs. embedded mode.

Manoj Waikar <mmwa...@...>
 

Hi,

What is the recommended approach for working with JanusGraph? Should one -
 (a) Set up a Gremlin (JanusGraph) server and submit Gremlin queries to the server? Or,
 (b) Embed JanusGraph inside the application and use the native Java API to execute queries against the graph within the same JVM?
 
So, what are the pros and cons of running a JanusGraph server and connecting to it vs. connecting directly to a data store using the JanusGraphFactory.open() method?

Also, Appendix A. API Documentation (JavaDoc) says:
We strongly encourage all users of JanusGraph to use the Gremlin query language for any queries executed on JanusGraph and to not use JanusGraph’s APIs outside of the management system.

Does this mean that we should use JanusGraph’s management APIs only for creating the schema, and Gremlin for everything else?

Thanks,
Manoj.


Cross instance communication in graphs with custom partitioning

Alain Rodriguez <al...@...>
 

Hi,

The JanusGraph documentation suggests that using custom partitioning schemes can reduce cross-instance communication by placing vertices that are frequently traversed together in the same instance.

However, random partitioning results in less efficient query processing as the JanusGraph cluster grows to accommodate more graph data because of the increasing cross-instance communication required to retrieve the query’s result set.

Can someone clarify what is meant by "instance" in this context? I am assuming it refers to a storage backend instance, e.g. communication across Cassandra instances.

What about placing closely related entities in the same C* machine makes query traversal faster? Does Janus read and cache full ranges of rows from Cassandra at a time, thus increasing the probability that the nearby vertices are preloaded into the cache and used in subsequent iteration steps? Is Janus counting on C* to load and keep in cache a nearby-block of rows?

I was under the impression that Janus executes queries in an iterative fashion, thus issuing one network request to C* per traversal hop.

Any clarifications much appreciated!

Alain


JanusGraph build and OS compatibility.

Manoj Waikar <mmwa...@...>
 

Hi,

I have a few questions about JanusGraph -

1) The Maven task for creating a build fails on Windows. So, can JanusGraph only be built on Mac / Linux?

2) The build created on Mac / Linux does not work on Windows. For example, to run the JanusGraph server, the build's bin directory contains only one file, janusgraph.sh, and there is no .bat file for Windows. So, does the JanusGraph server only run on Mac / Linux?


Thanks,
Manoj.


JanusGraph versus Google Cayley

Miguel Coimbra <miguel....@...>
 

Dear Community,

Apologies if this is evident or has been thoroughly detailed elsewhere.
- What are the main differences (with respect to target audiences, technical details and graph functionalities) between the Google Cayley graph database and JanusGraph (which is not only backed by Google)?

Thank you for your time.

Kind regards,

- Miguel


Re: Installation of JanusGraph

Manoj Waikar <mmwa...@...>
 

We were able to build it on Mac / Linux using the instructions mentioned; however, the build fails on Windows (because of a few Linux commands used therein).


On Wednesday, February 22, 2017 at 3:52:35 AM UTC+5:30, lo...@... wrote:
Hi Misha,

What exactly is involved in building locally?  I can't seem to find any documentation.

Thanks

On Wednesday, February 8, 2017 at 10:07:39 PM UTC+1, Misha Brukman wrote:
Yes, that's correct. For now, please clone and build locally.

https://github.com/JanusGraph/janusgraph/issues/40 is tracking the work of building a release snapshot to simplify the getting started process.

On Wed, Feb 8, 2017 at 2:15 PM, e2t2 <gha...@...> wrote:
Hi All,

I am new to the graph world and have done the basic Titan tutorials, etc. Happy to hear that JanusGraph will continue where the Titan guys left off.

Just a quick question: the Getting Started guide for JanusGraph says to download a zip file and unzip it to get started. However, there is no release available at the download link.

Do I need to clone the repository and build it myself, or am I missing something here?

Thanks

--
You received this message because you are subscribed to the Google Groups "JanusGraph users list" group.
To unsubscribe from this group and stop receiving emails from it, send an email to janusgraph-use...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: JanusGraph release roadmap

Manas Bajaj <manas...@...>
 

Ted - Thanks for the response and the link. It would be best if the team starts posting snapshot zips that can be used. 


On Tuesday, February 21, 2017 at 8:46:52 AM UTC-5, Ted Wilmes wrote:
Hello,
A tentative date has not been set yet.  There has been some discussion on the dev list.
There are a number of prerequisites that we're working through.  For now, I'd recommend
watching the dev list as discussions continue.  You can also track progress of the issues that
are slated to be included by looking at https://github.com/JanusGraph/janusgraph/milestone/1.
That's not necessarily a complete or final list but it should give you some insight along with
the dev chatter.

Thanks,
Ted

On Monday, February 20, 2017 at 8:16:29 PM UTC-6, 吴冉波 wrote:
Want too ~~~~
And hope to make ElasticSearch to be the new storage backend.

On Monday, February 20, 2017 at 10:22:54 AM UTC+8, Manas Bajaj wrote:
Hello JanusGraph committers

We are all very excited to see JanusGraph kickoff in 2017. Is there a release roadmap for this project? What will be the first release and tentatively when?

Thanks
Manas


Re: Installation of JanusGraph

Misha Brukman <mbru...@...>
 

On Tue, Feb 21, 2017 at 4:41 PM, <loui...@...> wrote:
Hi Misha,

What exactly is involved in building locally?  I can't seem to find any documentation.

Thanks

On Wednesday, February 8, 2017 at 10:07:39 PM UTC+1, Misha Brukman wrote:
Yes, that's correct. For now, please clone and build locally.

https://github.com/JanusGraph/janusgraph/issues/40 is tracking the work of building a release snapshot to simplify the getting started process.

On Wed, Feb 8, 2017 at 2:15 PM, e2t2 <gha...@...> wrote:
Hi All,

I am new to the graph world and have done the basic Titan tutorials, etc. Happy to hear that JanusGraph will continue where the Titan guys left off.

Just a quick question: the Getting Started guide for JanusGraph says to download a zip file and unzip it to get started. However, there is no release available at the download link.

Do I need to clone the repository and build it myself, or am I missing something here?

Thanks



Re: Installation of JanusGraph

loui...@...
 

Hi Misha,

What exactly is involved in building locally?  I can't seem to find any documentation.

Thanks


On Wednesday, February 8, 2017 at 10:07:39 PM UTC+1, Misha Brukman wrote:
Yes, that's correct. For now, please clone and build locally.

https://github.com/JanusGraph/janusgraph/issues/40 is tracking the work of building a release snapshot to simplify the getting started process.

On Wed, Feb 8, 2017 at 2:15 PM, e2t2 <gha...@...> wrote:
Hi All,

I am new to the graph world and have done the basic Titan tutorials, etc. Happy to hear that JanusGraph will continue where the Titan guys left off.

Just a quick question: the Getting Started guide for JanusGraph says to download a zip file and unzip it to get started. However, there is no release available at the download link.

Do I need to clone the repository and build it myself, or am I missing something here?

Thanks



Re: JanusGraph release roadmap

Ted Wilmes <twi...@...>
 

Hello,
A tentative date has not been set yet.  There has been some discussion on the dev list.
There are a number of prerequisites that we're working through.  For now, I'd recommend
watching the dev list as discussions continue.  You can also track progress of the issues that
are slated to be included by looking at https://github.com/JanusGraph/janusgraph/milestone/1.
That's not necessarily a complete or final list but it should give you some insight along with
the dev chatter.

Thanks,
Ted


On Monday, February 20, 2017 at 8:16:29 PM UTC-6, 吴冉波 wrote:
Want too ~~~~
And hope to make ElasticSearch to be the new storage backend.

On Monday, February 20, 2017 at 10:22:54 AM UTC+8, Manas Bajaj wrote:
Hello JanusGraph committers

We are all very excited to see JanusGraph kickoff in 2017. Is there a release roadmap for this project? What will be the first release and tentatively when?

Thanks
Manas


Build and first use.

Manoj Waikar <mmwa...@...>
 

Hi,

I was able to build JanusGraph after cloning the Git repo.

I included the janusgraph-core and janusgraph-cassandra jar files in my project. Additionally, I had to specify the following dependencies in my pom file -

<dependency>
    <groupId>org.apache.tinkerpop</groupId>
    <artifactId>gremlin-core</artifactId>
    <version>3.2.4</version>
</dependency>

<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>21.0</version>
</dependency>

<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-lang3</artifactId>
    <version>3.4</version>
</dependency>

<dependency>
    <groupId>com.netflix.astyanax</groupId>
    <artifactId>astyanax-core</artifactId>
    <version>3.9.0</version>
</dependency>

<dependency>
    <groupId>com.netflix.astyanax</groupId>
    <artifactId>astyanax-cassandra</artifactId>
    <version>3.9.0</version>
</dependency>

<dependency>
    <groupId>com.netflix.astyanax</groupId>
    <artifactId>astyanax-thrift</artifactId>
    <version>3.9.0</version>
</dependency>

<dependency>
    <groupId>com.codahale.metrics</groupId>
    <artifactId>metrics-core</artifactId>
    <version>3.0.2</version>
</dependency>

<dependency>
    <groupId>com.spatial4j</groupId>
    <artifactId>spatial4j</artifactId>
    <version>0.5</version>
</dependency>

Is this expected, or am I missing something?

Thanks,
Manoj.


Re: JanusGraph release roadmap

吴冉波 <wur...@...>
 

I want this too.
I also hope ElasticSearch can become a new storage backend.


On Monday, February 20, 2017 at 10:22:54 AM UTC+8, Manas Bajaj wrote:
Hello JanusGraph committers

We are all very excited to see JanusGraph kickoff in 2017. Is there a release roadmap for this project? What will be the first release and tentatively when?

Thanks
Manas


JanusGraph release roadmap

Manas Bajaj <manas...@...>
 

Hello JanusGraph committers

We are all very excited to see JanusGraph kickoff in 2017. Is there a release roadmap for this project? What will be the first release and tentatively when?

Thanks
Manas


Wrapper Class Support

Barry Hill <barry...@...>
 

Oops, it's only for arrays.

The workaround for me is to use Kotlin's IntArray, which compiles to Java's int[].


Wrapper Class Support

Barry Hill <barry...@...>
 

Is it possible to add support for Arrays of wrapper classes such as Integer in JanusGraph? 

Exception in thread "main" java.lang.IllegalArgumentException: Property value [[Ljava.lang.Integer;@1583741e] is of type class [Ljava.lang.Integer; is not supported
at org.apache.tinkerpop.gremlin.structure.Property$Exceptions.dataTypeOfPropertyValueNotSupported(Property.java:163)
at org.apache.tinkerpop.gremlin.structure.Property$Exceptions.dataTypeOfPropertyValueNotSupported(Property.java:159)
at org.janusgraph.graphdb.transaction.StandardJanusGraphTx.verifyAttribute(StandardJanusGraphTx.java:578)
at org.janusgraph.graphdb.query.QueryUtil.addConstraint(QueryUtil.java:233)
at org.janusgraph.graphdb.query.QueryUtil.constraints2QNF(QueryUtil.java:223)
at org.janusgraph.graphdb.query.graph.GraphCentricQueryBuilder.constructQueryWithoutProfile(GraphCentricQueryBuilder.java:213)
at org.janusgraph.graphdb.query.graph.GraphCentricQueryBuilder.constructQuery(GraphCentricQueryBuilder.java:202)
at org.janusgraph.graphdb.query.graph.GraphCentricQueryBuilder.vertices(GraphCentricQueryBuilder.java:165)
at org.janusgraph.graphdb.tinkerpop.optimize.JanusGraphStep.lambda$new$0(JanusGraphStep.java:62)
at org.apache.tinkerpop.gremlin.process.traversal.step.map.GraphStep.processNextStart(GraphStep.java:123)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:126)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:37)
at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.next(DefaultTraversal.java:157)

I'm using Kotlin for my project, and Kotlin uses wrapper classes for types such as Int (Integer).

https://kotlinlang.org/docs/reference/basic-types.html


I'm new to TinkerPop/JanusGraph so pardon my ignorance.
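
The exception above names the rejected type as [Ljava.lang.Integer;, which is the JVM's name for Integer[]. The follow-up message reports that switching to int[] (Kotlin's IntArray) avoids the error. A minimal Java sketch of that conversion is shown here; the class and method names are hypothetical, and it only demonstrates the type difference, not any JanusGraph call.

```java
import java.util.Arrays;

public class BoxedArrayExample {
    /** Unboxes an Integer[] into an int[]; throws NullPointerException if an element is null. */
    public static int[] toPrimitive(Integer[] boxed) {
        return Arrays.stream(boxed).mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        Integer[] boxed = {1, 2, 3};
        int[] primitive = toPrimitive(boxed);

        System.out.println(boxed.getClass().getName());     // [Ljava.lang.Integer;
        System.out.println(primitive.getClass().getName()); // [I
        System.out.println(Arrays.toString(primitive));     // [1, 2, 3]
    }
}
```

In Kotlin terms, Array<Int> corresponds to the boxed form and IntArray to the primitive one, so the same distinction applies there.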
