Re: [BLOG] Configuring JanusGraph for spark-yarn
Joe Obernberger <joseph.o...@...>
Could this be a networking issue? Maybe a firewall is enabled, or SELinux is preventing a connection? I've been able to get this to work, but running a simple count, g.V().count(), on anything but a very small graph takes a very long time (hours). Are there any cache settings or other resources that could be modified to improve performance? The YARN container logs are filled with debug lines about 'Created dirty vertex map with initial size 32', 'Created vertex cache with max size 20000', and 'Generated HBase Filter ColumnRange Filter'. Can any of these things be adjusted in the properties file? Thank you! -Joe On 7/24/2017 4:12 AM,
spirit...@... wrote:
|
|
Janusgraph with ES as index backend
Suny <sahithiy...@...>
Hi, I am using JanusGraph with ES as the index backend. I created a mixed index on some vertex attributes, and this mixed index is stored in ES. If I now query for those vertices using the indexed attributes, will JanusGraph use ES internally to perform the search? Or do I need to use IndexQuery to search ES directly?
|
|
janus cassandra limitations
mirosla...@...
Hi all, from what I know, Janus is a fork of Titan, which means that unless it has a different storage implementation it could have problems with larger data volumes. "janusgraph/titan can store up to a quintillion edges (2^60) and half as many vertices. " "The maximum number of cells (rows x columns) in a single partition is 2 billion." 2 billion is about 2^31. In the Cassandra schema we always have 2 columns per table, so you could store about 2^30 values per key. So, if I'm not mistaken, "half as many vertices" does not hold for the Cassandra storage backend? I'm using Titan 0.4.4, and after reaching 50M+ vertices I noticed that Cassandra started to complain about "Compacting large partition titan/vertexindex:00". So my question is: what is the real JanusGraph/Titan limit for the Cassandra backend that will not "kill" Cassandra? By the way, I also noticed that some keys in the "edgestore" table for "supernodes" are bigger than 1GB with my current graph. Could anyone explain how JanusGraph stores data in Cassandra and how to configure it to prevent storing huge rows?
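The arithmetic in the question above can be written out as a small Python sketch. It only restates the figures quoted in the message (2^60 edges, half as many vertices, 2 billion cells per partition); the "2 columns per row" value is the poster's own assumption about the schema, and nothing here measures a real cluster or confirms an actual JanusGraph limit.

```python
# Restates the figures quoted in the message; not a measurement of any real cluster.
MAX_EDGES = 2 ** 60                  # "a quintillion edges (2^60)", as quoted
MAX_VERTICES = MAX_EDGES // 2        # "half as many vertices", as quoted
CELLS_PER_PARTITION = 2 * 10 ** 9    # "2 billion" cells per partition, as quoted
COLUMNS_PER_ROW = 2                  # the poster's assumption about the schema

# Values that fit under a single key if every value costs COLUMNS_PER_ROW cells.
values_per_key = CELLS_PER_PARTITION // COLUMNS_PER_ROW

# How far the per-partition figure falls short of the quoted vertex capacity.
shortfall_factor = MAX_VERTICES // values_per_key

print("values per key:", values_per_key)
print("quoted vertex capacity:", MAX_VERTICES)
print("capacity / values per key:", shortfall_factor)
```

Under these assumptions a single partition holds roughly 2^30 values, many orders of magnitude below 2^59, which is the gap the question is asking about.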
|
|
open source graph meetup NYC Aug 22
Jason Plurad <plu...@...>
We're getting another open source graph meetup organized in New York City later this month, Tuesday August 22 https://www.meetup.com/graphs/events/241136321/ If anybody in the community is doing something with graphs that you'd like to present, let me know (pluradj/gmail) if you don't want to post up here. We picked the date and location to be close to the first ever JupyterCon conference, so if you have a topic with data science, machine learning, Python, etc, those might work too. Thanks! -- Jason
|
|
I'm starting a new startup big project, should I use Janus as main database to store all my data?
Augusto Will <pw...@...>
I'm thinking about learning Janus to use in my new big project, but there are some things I can't understand. Janus can be used like any database and supports "insert", "update", and "delete" operations, so Janus will write data into Cassandra or another database to store it, right? Wherever Janus stores the nodes, edges, attributes, etc., it writes them into that database, right? Is this data loaded into memory by Janus, or is it read from Cassandra every time? Must the data be loaded into Janus for every query, or will it run selects against the database to retrieve the data I need? Is only the data I need retrieved from the database, or will Janus read all records every time? Should I use Janus in production for my project, or should I wait until it becomes production ready? I'm developing a kind of social network that needs to store friendships, posts, comments, and user blocks, and to do some Elasticsearch searching too. In this case, what database backend should I use? Thank you.
|
|
Re: Failed to load many nodes & edges
Jason Plurad <plu...@...>
Robert Dale posted an answer over on gremlin-users https://groups.google.com/d/msg/gremlin-users/PGtuWvG8UNs/AKtvy9ipAwAJ
On Tuesday, August 1, 2017 at 10:02:04 AM UTC-4, Ohad Pinchevsky wrote:
|
|
Savvi and The Graph
Jay F <na...@...>
(x-post to gremlin-users) Fellow traversers! For the last 3 years at Savvi, we've been doing our bit towards reifying the Universal Graph Theory[1] by building extremely high-fidelity Graphs for our telecoms customers. These graphs span the gamut from active physical infrastructure (the modem at your home, copper cables, sheaths, segments, pits, pillars and so forth, the DSLAM, the rack, the chassis, the line card, the SFP, the ethernet cable... on and on it goes...), to the logical constructs on top (virtual circuits, trunks/LAGs, satellite beam apertures etc.). We then annotate that structural graph with time-series data (link utilisation, for example) - yes, we store this in graph, although hopefully not for too much longer! - as well as rendering more ephemeral data such as incident tickets and alarms/alerts. We've been running Titan 0.5.4 in Production since January 2015, and we even worked with the Aurelius team in the background in the early days (at least until they were snarfed up by Datastax!). Since then we've been progressively adding more and more fidelity to the graph as new datasets became available to us. Despite initial teething problems with Titan (principally around HA/DR), Titan has been rock solid for us, running in a mission critical environment and powering a number of applications. One of the more interesting ones is real-time field force optimisation, whereby our automatons will identify common infrastructural elements across incidents, and migrate a field engineer to look at that element as opposed to visiting a number of households. This kind of application of the graph is extremely valuable, and yet took us just 2 months from concept to Production - possible only because of the expressivity of the Gremlin/Groovy language (80% of that application is an - admittedly very complex - Gremlin query).
A big thank you to all of the Tinkerpop, Titan and now JanusGraph developers for contributing to this awesome project and keeping it stable amidst a large amount of change! We are naturally now in the process of migrating to JanusGraph, and expanding our use of Graph into a number of other areas as well. I'm poking my head out as we're now looking for more Graph wizards to join our team, and I thought some of you may find the above interesting even if you're not looking for work - hopefully I won't be hung, drawn and quartered for advertising as a result ;) If you'd like to come play with us, please take a look at the role[2] and either apply or fire an email to firstc...@.... Happy to discuss any of the above on here too. Thanks! [1] https://www.youtube.com/watch?v=aRNWhpEPOOA [2] https://savvi.workable.com/jobs/113135 Best Regards, Jay Fenton (@jfenton / skype:jfenton) Founder & CTO, Savvi Inc. Level 3, 455 Bourke Street Melbourne CBD, VIC, Australia
|
|
Re: janusgraph solr cassandra GraphOfTheGodsFactory
Adam Holley <holl...@...>
I deleted my previous post as it was not correct. You do not need to create a core for each mixedIndex. Assuming you are using Solr cloud mode, and following the instructions for Option 1 (http://docs.janusgraph.org/latest/solr.html#_solr_collections) you just need to manually copy the configset, and then add the initial core. Here's the relevant section from my janusgraph-cassandra-solr.properties file:
index.search.backend=solr
index.search.solr.mode=cloud
index.search.solr.zookeeper-url=localhost:2181
index.search.solr.configset=janusgraph
On Thursday, July 13, 2017 at 10:35:47 PM UTC-5, s...@... wrote:
|
|
Re: Janus as an RDF store
Jason Plurad <plu...@...>
JanusGraph supports 3 file formats that are provided via Apache TinkerPop -- Gryo, GraphML, and GraphSON. http://tinkerpop.apache.org/docs/current/reference/#_gremlin_i_o You can load it like this:
If you have a file in some other format, you'll need to write code to read in the data file and then construct the graph elements based on the data.
On Thursday, July 27, 2017 at 9:40:30 AM UTC-4, 谭宇超 wrote:
|
|
Re: Janus as an RDF store
谭宇超 <archu...@...>
Hello Jason, I'm new to JanusGraph, and I didn't find any method that could load a data file into HBase. So, how can I load my data file into HBase? ps: backend=hbase&caching On Thursday, March 30, 2017 at 2:25:26 PM UTC-7, Jason Plurad wrote:
|
|
Re: [BLOG] Configuring JanusGraph for spark-yarn
Joe Obernberger <joseph.o...@...>
Marc - thank you for posting this. I'm trying to get this to work with our CDH 5.10.0 distribution, but have run into an issue; but first some questions. I'm using a 5 node cluster, and I think I do not need to set the zookeeper.zone.parent since the hbase configuration is in /etc/conf/hbase. Is that correct? The error that I'm getting is:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 0.0 failed 4 times, most recent failure: Lost task 1.3 in stage 0.0 (TID 10, host002, executor 1): java.lang.ClassCastException: cannot assign instance of java.lang.invoke.SerializedLambda to field org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.x$330 of type org.apache.spark.api.java.function.PairFunction in instance of org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1
Given this post: It looks like I'm not including a necessary jar, but I'm at a loss as to which one. Any ideas? For reference, here is part of the config: # Thank you! -Joe On 7/6/2017 4:15 AM, HadoopMarc wrote:
|
|
Re: JanusGraph support for Cassandra 3.x
Vladyslav Kosulin <vkos...@...>
Yes, Thrift was an essential part of our model.
On Friday, July 14, 2017 at 1:06:00 PM UTC-4, Ted Wilmes wrote:
|
|
Why is partitioned vertex not supported when using Janus-hadoop
spirit...@...
I found the following code in JanusGraph:
|
|
Re: when release 0.2.0?
Ranger Tsao <cao....@...>
Glad to On Saturday, July 22, 2017 at 3:55:53 AM UTC+8, Jason Plurad wrote:
|
|
Re: Geoshape property in remote gremlin query, GraphSON
rosen...@...
Dear Robert, thank you for your prompt reply! The given patch solves my problem. For the reference of future viewers of this post, don't forget to include the `janusgraph` namespace in your GraphSON for `@type`:
On Wednesday, July 19, 2017 at 1:16:37 PM UTC-6, Robert Dale wrote:
|
|
Re: Not able to connect when 1 of 3 nodes is down in the Cassandra cluster
Jason Plurad <plu...@...>
This is more of a Cassandra question than JanusGraph/Titan. If you have two nodes in DC1 and the read/write consistency settings are LOCAL_QUORUM, you can't reach a local quorum in DC1 when one node is down. You could try either LOCAL_ONE or QUORUM. http://docs.datastax.com/en/cassandra/2.1/cassandra/dml/dml_config_consistency_c.html
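The quorum reasoning above can be checked with a short Python sketch. Cassandra computes a quorum as floor(replication_factor / 2) + 1. The layout used below is an assumption for illustration only: a replication factor of 2 in DC1 and 1 in a second data center, matching the "1 of 3 nodes" in the thread title. The function and variable names are hypothetical, not part of any Cassandra client API.

```python
def quorum(replication_factor: int) -> int:
    """Quorum size as Cassandra defines it: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def satisfiable(required_replicas: int, live_replicas: int) -> bool:
    """A consistency level can be met only if enough replicas are up."""
    return live_replicas >= required_replicas


# Assumed layout (not stated in the thread): RF 2 in DC1, RF 1 in DC2.
rf_dc1, rf_dc2 = 2, 1
live_dc1, live_dc2 = 1, 1  # one DC1 node is down

local_quorum_ok = satisfiable(quorum(rf_dc1), live_dc1)               # needs 2 of 2 in DC1
local_one_ok = satisfiable(1, live_dc1)                               # needs 1 in DC1
quorum_ok = satisfiable(quorum(rf_dc1 + rf_dc2), live_dc1 + live_dc2) # needs 2 of 3 overall

print("LOCAL_QUORUM:", local_quorum_ok)
print("LOCAL_ONE:", local_one_ok)
print("QUORUM:", quorum_ok)
```

With these assumed replication factors, LOCAL_QUORUM cannot be met while LOCAL_ONE and QUORUM can, which is consistent with the suggestion above. A different replication setup would change the numbers.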
On Sunday, July 23, 2017 at 9:14:12 AM UTC-4, Bharat Dighe wrote:
|
|
Re: Make HttpChannelizer enabled while using external cassandra in Titan(1.0.0)
Jason Plurad <plu...@...>
Hi Manoj, There are directions for JanusGraph in the documentation here: http://docs.janusgraph.org/latest/server.html#_janusgraph_server_as_a_rest_style_endpoint If you're using the default gremlin-server.yaml, which uses the janusgraph-cassandra-es.properties, you just need to change the channelizer:
channelizer: org.apache.tinkerpop.gremlin.server.channel.HttpChannelizer
-- Jason
On Monday, July 24, 2017 at 9:16:47 AM UTC-4, manoj nainwal wrote:
|
|
Re: [BLOG] Configuring JanusGraph for spark-yarn
spirit...@...
Hi, thanks for your post. I followed the steps in the post, but I ran into a problem.
On Thursday, July 6, 2017 at 4:15:37 PM UTC+8, HadoopMarc wrote:
|
|
Make HttpChannelizer enabled while using external cassandra in Titan(1.0.0)
manoj92...@...
Hi all, Could you please tell me how I can enable HttpChannelizer while using an external Cassandra? Thank you, Manoj
|
|
Re: how can i remove the index
李平 <lipin...@...>
BaseConfiguration baseConfiguration = new BaseConfiguration();
The index cannot reach REGISTERED status; the exception is a timeout:
10:54:19.148 [main] DEBUG org.janusgraph.graphdb.database.management.GraphIndexStatusWatcher - Key phone has status INSTALLED
10:54:19.148 [main] INFO org.janusgraph.graphdb.database.management.GraphIndexStatusWatcher - Some key(s) on index phoneIndex do not currently have status REGISTERED: phone=INSTALLED
10:54:19.148 [main] INFO org.janusgraph.graphdb.database.management.GraphIndexStatusWatcher - Timed out (PT1M) while waiting for index phoneIndex to converge on status REGISTERED
On Saturday, July 22, 2017 at 12:15:20 AM UTC+8, David Pitera wrote:
|
|