
Re: janus cassandra limitations

mirosla...@...
 

OK, so I got it a bit wrong in my initial assumption.

1.
"vertexindex" stores values for all properties for all vertices.
In my case key=0x00 is 'false' and this value is stored in 90% of my vertices.

So in theory you could still have as many vertices as the Titan schema allows, but you could not store the same value for any given property more than 2^30 times.

2.
"edgestorage" contains information about all vertices with all properties values references and all edges per vertex
this means one vertex could have in theory maximum of 2^30 edges

3.
Request to janusgraph designers: 
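
For reference, supernodes like these can be spread across multiple Cassandra partitions by declaring their vertex label as partitioned. A minimal sketch against the JanusGraph management API from the Gremlin console (label name and partition count are illustrative, not taken from this thread):

// cluster.max-partitions controls how many partitions a partitioned
// vertex is split across; it has to be set when the graph is first created
graph = JanusGraphFactory.build().
    set('storage.backend', 'cassandrathrift').
    set('storage.hostname', '127.0.0.1').
    set('cluster.max-partitions', 32).
    open()

mgmt = graph.openManagement()
// 'tweet' is a hypothetical label expected to collect a huge number of edges;
// partition() spreads its adjacency list over several rows instead of one
mgmt.makeVertexLabel('tweet').partition().make()
mgmt.commit()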


On Thursday, August 3, 2017 at 12:58:29 AM UTC+2, Kelvin Lawrence wrote:

Hi Mirosław,

JanusGraph uses an adjacency list model for storing vertices and edges. A vertex, its properties, and all of its adjacent edges are stored in a single Cassandra row.

The JanusGraph documentation goes into these issues in some detail.

You are using a very old version of Titan BTW. It would be worth upgrading if you can.

Cheers,
Kelvin

On Wednesday, August 2, 2017 at 10:36:39 AM UTC-5, Mirosław Głusiuk wrote:

Hi all,


From what I know, Janus is a fork of Titan, which means that if it does not have a different storage implementation it could have problems with larger data volumes.


"janusgraph/titan can store up to a quintillion edges (2^60) and half as many vertices. "

"The maximum number of cells (rows x columns) in a single partition is 2 billion."


2 billion is about 2^31.

In the Cassandra schema we always have 2 columns per table, so you could store about 2^30 values per key.

So if I'm not mistaken, "half as many vertices" does not apply to the Cassandra storage backend?


I'm using Titan 0.4.4, and after reaching 50M+ vertices I noticed Cassandra started to complain about "Compacting large partition titan/vertexindex:00".
As I understand it, the partition for key 0x00 is already too big and is starting to cause performance problems during compaction.
I also noticed that it contains one value for each created vertex (8+8 = 16 bytes), so it is already bigger than 500 MB, which exceeds the Cassandra recommendation.
http://docs.datastax.com/en/landing_page/doc/landing_page/planning/planningPartitionSize.html


So my question is: what is the real JanusGraph/Titan limit for the Cassandra backend that will not "kill" Cassandra?

BTW, I also noticed that some keys in the "edgestore" table for "supernodes" are bigger than 1 GB in my current graph.


Could anyone explain how JanusGraph stores data in Cassandra and how to configure it to prevent storing huge rows?


Hi, how can I use the JanusGraph API to connect to Gremlin Server?

李平 <lipin...@...>
 

I want to use the JanusGraph API to connect to my Gremlin Server.
Another question: how can I create a vertex only if it does not already exist, and otherwise return the existing vertex?
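
For what it's worth, here are two small sketches assuming the TinkerPop driver that ships with JanusGraph (host, port, label, and property names are illustrative): the first connects a client to a running Gremlin Server, the second is the usual get-or-create idiom for a unique vertex.

import org.apache.tinkerpop.gremlin.driver.Cluster

// connect to a running Gremlin Server
cluster = Cluster.build('localhost').port(8182).create()
client = cluster.connect()

// get-or-create ("upsert"): return the matching vertex if it exists,
// otherwise add it, all inside one submitted traversal
rs = client.submit(
    "g.V().has('person','name',name).fold()." +
    "coalesce(unfold(), addV('person').property('name',name))",
    [name: 'alice'])
println rs.all().get()

cluster.close()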


Re: janus cassandra limitations

Kelvin Lawrence <kelvin....@...>
 


Hi Mirosław,

JanusGraph uses an adjacency list model for storing vertices and edges. A vertex, its properties, and all of its adjacent edges are stored in a single Cassandra row.

The JanusGraph documentation goes into these issues in some detail.
http://docs.janusgraph.org/latest/index.html

You are using a very old version of Titan BTW. It would be worth upgrading if you can.

Cheers,
Kelvin

On Wednesday, August 2, 2017 at 10:36:39 AM UTC-5, Mirosław Głusiuk wrote:

Hi all,


From what I know, Janus is a fork of Titan, which means that if it does not have a different storage implementation it could have problems with larger data volumes.


"janusgraph/titan can store up to a quintillion edges (2^60) and half as many vertices. "

"The maximum number of cells (rows x columns) in a single partition is 2 billion."


2 billion is about 2^31.

In the Cassandra schema we always have 2 columns per table, so you could store about 2^30 values per key.

So if I'm not mistaken, "half as many vertices" does not apply to the Cassandra storage backend?


I'm using Titan 0.4.4, and after reaching 50M+ vertices I noticed Cassandra started to complain about "Compacting large partition titan/vertexindex:00".
As I understand it, the partition for key 0x00 is already too big and is starting to cause performance problems during compaction.
I also noticed that it contains one value for each created vertex (8+8 = 16 bytes), so it is already bigger than 500 MB, which exceeds the Cassandra recommendation.
http://docs.datastax.com/en/landing_page/doc/landing_page/planning/planningPartitionSize.html


So my question is: what is the real JanusGraph/Titan limit for the Cassandra backend that will not "kill" Cassandra?

BTW, I also noticed that some keys in the "edgestore" table for "supernodes" are bigger than 1 GB in my current graph.


Could anyone explain how JanusGraph stores data in Cassandra and how to configure it to prevent storing huge rows?


Re: Janusgraph with ES as index backend

Kelvin Lawrence <kelvin....@...>
 

If you tell Janus about the indexed properties using the management API, it will use them automatically when you run Gremlin queries. You only need to use IndexQuery for cases where you want to read from the index directly for other reasons.
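
For example, a sketch of registering a mixed index through the management API (property and index names are illustrative; 'search' must match the index backend name configured for ES in the graph's properties):

mgmt = graph.openManagement()
desc = mgmt.makePropertyKey('description').dataType(String.class).make()
// register the key with the ES-backed index named 'search'
mgmt.buildIndex('byDescriptionMixed', Vertex.class).
     addKey(desc).
     buildMixedIndex('search')
mgmt.commit()

// plain Gremlin is then answered from the ES-backed index automatically
g.V().has('description', textContains('fear'))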

HTH

Kelvin


On Wednesday, August 2, 2017 at 12:46:37 PM UTC-5, Suny wrote:
Hi,

I am using JanusGraph with ES as the index backend. I created a mixed index on some vertex attributes. This mixed index is saved in ES.

Now if I query for those vertices based on the index, will JanusGraph use ES internally to perform the search operation? Or do I need to use IndexQuery to perform the search on ES directly?


Re: I'm starting a big new startup project; should I use Janus as the main database to store all my data?

Kelvin Lawrence <kelvin....@...>
 

Hi there,

I don't think it would be appropriate to make definitive recommendations as to whether or not to use Janus in production for your needs. The best way to decide on that is to install it and run some tests. What I do know is that on this list a number of people have indicated they either already are or plan to build solutions that include Janus Graph.

As to your other questions here are some answers.

JanusGraph supports the Gremlin query and traversal language, which lets you add, update, and delete vertices and edges in a graph.

Janus supports numerous backend stores, including Cassandra, HBase, and Berkeley DB, and it can also run purely in memory, which is good for testing. The graph data is persisted to the backend store.

Deciding which back end store to use will depend on many factors. You will want to consider things like number of users and whether you care more about consistency or availability when making that choice. 

I would encourage you to install Janus and run some tests and see what works best for your needs. I'm sure people on this list can help if you encounter issues as you experiment.
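
For a quick first test drive, a minimal sketch of the in-memory option from the Gremlin console (nothing is persisted across restarts; configuration key as documented):

graph = JanusGraphFactory.build().set('storage.backend', 'inmemory').open()
g = graph.traversal()
g.addV('person').property('name', 'alice').iterate()
g.V().valueMap()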

HTH

Kelvin


On Wednesday, August 2, 2017 at 8:56:53 AM UTC-5, Augusto Will wrote:
I'm thinking about learning Janus to use in my new big project, but I can't understand some things.

Janus can be used like any database and supports "insert", "update", and "delete" operations, so Janus will write data into Cassandra or another database to store it, right?

Where does Janus store the nodes, edges, attributes, etc.? It will write these into the database, right?

Will this data be loaded into memory by Janus, or will it be read from Cassandra all the time?

Must the data Janus reads be loaded into Janus on every query, or will it do selects against the database to retrieve only the data I need?

Is the data retrieved from the database only what I need, or will Janus read all records in the database all the time?

Should I use Janus in my project in production, or should I wait until it becomes production-ready?

I'm developing a kind of social network that needs to store friendships, posts, comments, and user blocks, and do some Elasticsearch searching too. In this case, what database backend should I use?


Thank you.


Re: [BLOG] Configuring JanusGraph for spark-yarn

Joe Obernberger <joseph.o...@...>
 

Could this be a networking issue?  Maybe a firewall is enabled, or selinux is preventing a connection?

I've been able to get this to work, but running a simple count - g.V().count() - on anything but a very small graph takes a very, very long time (hours).  Are there any cache settings or other resources that could be modified to improve the performance?

The YARN container logs are filled with debug lines about 'Created dirty vertex map with initial size 32', 'Created vertex cache with max size 20000', and 'Generated HBase Filter ColumnRange Filter'.  Can any of these things be adjusted in the properties file?  Thank you!
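
For reference, the standard JanusGraph cache knobs can be set either in the properties file or programmatically, as sketched below (values are illustrative, and whether they influence the HBaseInputFormat/OLAP read path rather than just normal OLTP reads is an assumption worth verifying):

graph = JanusGraphFactory.build().
    set('storage.backend', 'hbase').
    set('storage.hostname', '127.0.0.1').
    // database-level cache shared across all transactions on this instance
    set('cache.db-cache', true).
    // fraction of the heap given to that cache, and how long (ms) entries live
    set('cache.db-cache-size', 0.25).
    set('cache.db-cache-time', 180000).
    // per-transaction vertex cache; 20000 is the default seen in the log lines above
    set('cache.tx-cache-size', 50000).
    open()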

-Joe


On 7/24/2017 4:12 AM, spirit...@... wrote:

Hi, thanks for your post.
I followed it, but I ran into a problem.
15:58:49,110  INFO SecurityManager:58 - Changing view acls to: rc
15:58:49,110  INFO SecurityManager:58 - Changing modify acls to: rc
15:58:49,110  INFO SecurityManager:58 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(rc); users with modify permissions: Set(rc)
15:58:49,111  INFO Client:58 - Submitting application 25 to ResourceManager
15:58:49,320  INFO YarnClientImpl:274 - Submitted application application_1500608983535_0025
15:58:49,321  INFO SchedulerExtensionServices:58 - Starting Yarn extension services with app application_1500608983535_0025 and attemptId None
15:58:50,325  INFO Client:58 - Application report for application_1500608983535_0025 (state: ACCEPTED)
15:58:50,326  INFO Client:58 -
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1500883129115
final status: UNDEFINED
user: rc
15:58:51,330  INFO Client:58 - Application report for application_1500608983535_0025 (state: ACCEPTED)
15:58:52,333  INFO Client:58 - Application report for application_1500608983535_0025 (state: ACCEPTED)
15:58:53,335  INFO Client:58 - Application report for application_1500608983535_0025 (state: ACCEPTED)
15:58:54,337  INFO Client:58 - Application report for application_1500608983535_0025 (state: ACCEPTED)
15:58:55,340  INFO Client:58 - Application report for application_1500608983535_0025 (state: ACCEPTED)
15:58:56,343  INFO Client:58 - Application report for application_1500608983535_0025 (state: ACCEPTED)
15:58:56,802  INFO YarnSchedulerBackend$YarnSchedulerEndpoint:58 - ApplicationMaster registered as NettyRpcEndpointRef(null)
15:58:56,822  INFO YarnClientSchedulerBackend:58 - Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> dl-rc-optd-ambari-master-v-test-1.host.dataengine.com,dl-rc-optd-ambari-master-v-test-2.host.dataengine.com, PROXY_URI_BASES -> http://dl-rc-optd-ambari-master-v-test-1.host.dataengine.com:8088/proxy/application_1500608983535_0025,http://dl-rc-optd-ambari-master-v-test-2.host.dataengine.com:8088/proxy/application_1500608983535_0025), /proxy/application_1500608983535_0025
15:58:56,824  INFO JettyUtils:58 - Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
15:58:57,346  INFO Client:58 - Application report for application_1500608983535_0025 (state: RUNNING)
15:58:57,347  INFO Client:58 -
client token: N/A
diagnostics: N/A
ApplicationMaster host: 10.200.48.154
ApplicationMaster RPC port: 0
queue: default
start time: 1500883129115
final status: UNDEFINED
user: rc
15:58:57,348  INFO YarnClientSchedulerBackend:58 - Application application_1500608983535_0025 has started running.
15:58:57,358  INFO Utils:58 - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 47514.
15:58:57,358  INFO NettyBlockTransferService:58 - Server created on 47514
15:58:57,360  INFO BlockManagerMaster:58 - Trying to register BlockManager
15:58:57,363  INFO BlockManagerMasterEndpoint:58 - Registering block manager 10.200.48.112:47514 with 2.4 GB RAM, BlockManagerId(driver, 10.200.48.112, 47514)
15:58:57,366  INFO BlockManagerMaster:58 - Registered BlockManager
15:58:57,585  INFO EventLoggingListener:58 - Logging events to hdfs:///spark-history/application_1500608983535_0025
15:59:07,177  WARN YarnSchedulerBackend$YarnSchedulerEndpoint:70 - Container marked as failed: container_e170_1500608983535_0025_01_000002 on host: dl-rc-optd-ambari-slave-v-test-1.host.dataengine.com. Exit status: 1. Diagnostics: Exception from container-launch.
Container id: container_e170_1500608983535_0025_01_000002
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:576)
at org.apache.hadoop.util.Shell.run(Shell.java:487)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:753)
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:371)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:303)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Shell output: main : command provided 1
main : run as user is rc
main : requested yarn user is rc


Container exited with a non-zero exit code 1
Display stack trace? [yN]
15:59:57,702  WARN TransportChannelHandler:79 - Exception in connection from 10.200.48.155/10.200.48.155:50921
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:192)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:313)
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:242)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:748)
15:59:57,704 ERROR TransportResponseHandler:132 - Still have 1 requests outstanding when connection from 10.200.48.155/10.200.48.155:50921 is closed
15:59:57,706  WARN NettyRpcEndpointRef:91 - Error sending message [message = RequestExecutors(0,0,Map())] in 1 attempts
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:192)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:313)
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:242)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:748)

I am confused about that. Could you please help me?



On Thursday, July 6, 2017 at 4:15:37 PM UTC+8, HadoopMarc wrote:

Readers wanting to run OLAP queries on a real spark-yarn cluster might want to check my recent post:

http://yaaics.blogspot.nl/2017/07/configuring-janusgraph-for-spark-yarn.html

Regards,  Marc


Janusgraph with ES as index backend

Suny <sahithiy...@...>
 

Hi,

I am using JanusGraph with ES as the index backend. I created a mixed index on some vertex attributes. This mixed index is saved in ES.

Now if I query for those vertices based on the index, will JanusGraph use ES internally to perform the search operation? Or do I need to use IndexQuery to perform the search on ES directly?


janus cassandra limitations

mirosla...@...
 

Hi all,


From what I know, Janus is a fork of Titan, which means that if it does not have a different storage implementation it could have problems with larger data volumes.


"janusgraph/titan can store up to a quintillion edges (2^60) and half as many vertices. "

"The maximum number of cells (rows x columns) in a single partition is 2 billion."


2 billion is about 2^31.

In the Cassandra schema we always have 2 columns per table, so you could store about 2^30 values per key.

So if I'm not mistaken, "half as many vertices" does not apply to the Cassandra storage backend?


I'm using Titan 0.4.4, and after reaching 50M+ vertices I noticed Cassandra started to complain about "Compacting large partition titan/vertexindex:00".
As I understand it, the partition for key 0x00 is already too big and is starting to cause performance problems during compaction.
I also noticed that it contains one value for each created vertex (8+8 = 16 bytes), so it is already bigger than 500 MB, which exceeds the Cassandra recommendation.
http://docs.datastax.com/en/landing_page/doc/landing_page/planning/planningPartitionSize.html


So my question is: what is the real JanusGraph/Titan limit for the Cassandra backend that will not "kill" Cassandra?

BTW, I also noticed that some keys in the "edgestore" table for "supernodes" are bigger than 1 GB in my current graph.


Could anyone explain how JanusGraph stores data in Cassandra and how to configure it to prevent storing huge rows?


open source graph meetup NYC Aug 22

Jason Plurad <plu...@...>
 

We're getting another open source graph meetup organized in New York City later this month, Tuesday August 22
https://www.meetup.com/graphs/events/241136321/

If anybody in the community is doing something with graphs that you'd like to present, let me know (pluradj/gmail if you don't want to post up here). We picked the date and location to be close to the first ever JupyterCon conference, so if you have a topic involving data science, machine learning, Python, etc., that might work too.

Thanks!
-- Jason


I'm starting a big new startup project; should I use Janus as the main database to store all my data?

Augusto Will <pw...@...>
 

I'm thinking about learning Janus to use in my new big project, but I can't understand some things.

Janus can be used like any database and supports "insert", "update", and "delete" operations, so Janus will write data into Cassandra or another database to store it, right?

Where does Janus store the nodes, edges, attributes, etc.? It will write these into the database, right?

Will this data be loaded into memory by Janus, or will it be read from Cassandra all the time?

Must the data Janus reads be loaded into Janus on every query, or will it do selects against the database to retrieve only the data I need?

Is the data retrieved from the database only what I need, or will Janus read all records in the database all the time?

Should I use Janus in my project in production, or should I wait until it becomes production-ready?

I'm developing a kind of social network that needs to store friendships, posts, comments, and user blocks, and do some Elasticsearch searching too. In this case, what database backend should I use?


Thank you.


Re: Failed to load many nodes & edges

Jason Plurad <plu...@...>
 

Robert Dale posted an answer over on gremlin-users https://groups.google.com/d/msg/gremlin-users/PGtuWvG8UNs/AKtvy9ipAwAJ


On Tuesday, August 1, 2017 at 10:02:04 AM UTC-4, Ohad Pinchevsky wrote:
Hi,

I am calling submit many times in order to add nodes from a CSV file, and it is failing.
On the server I see the following WARN message:

org.apache.tinkerpop.gremlin.server.op.AbstractEvalOpProcessor - Pausing response writing as writeBufferHighWaterMark exceeded on RequestMessage{, requestId=b7647c81-0648-40b2-a5aa-edc6d1ec196e, op='eval', processor='', args={gremlin=g.addV('group').property('uuid',uuid).property('name',name), bindings={name=GROUP-26, uuid=GR-3A5F386A663A4892BAC7900E4444EDDF}, batchSize=64}} - writing will continue once client has caught up


Client code via driver:
public void addGroups() {
    List<String[]> groupsInput = getParsedData("groups.csv");
    Map params = new HashMap<String, String>();

    for (String[] line : groupsInput) {
        params.put("uuid", line[3]);
        params.put("name", line[5]);

        ResultSet x = client.submit("g.addV('group')"
                + ".property('uuid',uuid)"
                + ".property('name',name)", params);
    }
}


How can I fix this, and what do you suggest I use for bulk adds?

Thanks,
Ohad
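
One hedged reading of that warning is that responses are being produced faster than the client consumes them, so the server's outgoing write buffer fills up. Waiting on each ResultSet before submitting the next request keeps the buffer drained; a sketch in Gremlin-Groovy against the same TinkerPop driver API used in the snippet above (only the blocking call is new):

groupsInput.each { line ->
    def params = [uuid: line[3], name: line[5]]
    def rs = client.submit(
        "g.addV('group').property('uuid',uuid).property('name',name)", params)
    // block until the server has answered before queuing the next request
    rs.all().get()
}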


Savvi and The Graph

Jay F <na...@...>
 


(x-post to gremlin-users)

Fellow traversers!

For the last 3 years at Savvi, we've been doing our bit towards reifying the Universal Graph Theory[1] by building extremely high-fidelity Graphs for our telecoms customers. These graphs span the gamut from active physical infrastructure (the modem at your home, copper cables, sheaths, segments, pits, pillars and so forth, the DSLAM, the rack, the chassis, the line card, the SFP, the ethernet cable... on and on it goes...), to the logical constructs on top (virtual circuits, trunks/LAGs, satellite beam apertures etc.). We then annotate that structural graph with time-series data (link utilisation, for example) - yes, we store this in graph, although hopefully not for too much longer! - as well as rendering more ephemeral data such as incident tickets, alarms/alerts as well.

We've been running Titan 0.5.4 in Production since January 2015, and we even worked with the Aurelius team in the background in the early days (at least until they were snarfed up by Datastax!). Since then we've been progressively adding more and more fidelity to the graph as new datasets became available to us.

Despite initial teething problems with Titan (principally around HA/DR), Titan has been rock solid for us, running in a mission critical environment and powering a number of applications. 

One of the more interesting ones is real-time field force optimisation, whereby our automatons will identify common infrastructural elements across incidents, and migrate a field engineer to look at that element as opposed to visiting a number of households. This kind of application of the graph is extremely valuable, and yet took us just 2 months from concept to Production - possible only because of the expressivity of the Gremlin/Groovy language (80% of that application is an - admittedly very complex - Gremlin query).

A big thank you to all of the Tinkerpop, Titan and now JanusGraph developers for contributing to this awesome project and keeping it stable amidst a large amount of change!

We are naturally now in the process of migrating to JanusGraph, and expanding our use of Graph into a number of other areas as well.

I'm poking my head out as we're now looking for more Graph wizards to join our team, and I thought some of you may find the above interesting even if you're not looking for work - hopefully I won't be hung, drawn and quartered for advertising as a result ;)

If you'd like to come play with us, please take a look at the role[2] and either apply or fire an email to firstc...@....

Happy to discuss any of the above on here too. Thanks!

[1] https://www.youtube.com/watch?v=aRNWhpEPOOA
[2] https://savvi.workable.com/jobs/113135

Best Regards,

Jay Fenton (@jfenton / skype:jfenton)
Founder & CTO, Savvi Inc.
Level 3, 455 Bourke Street
Melbourne CBD, VIC, Australia


Re: janusgraph solr cassandra GraphOfTheGodsFactory

Adam Holley <holl...@...>
 

I deleted my previous post as it was not correct.  You do not need to create a core for each mixedIndex.  Assuming you are using Solr cloud mode, and following the instructions for Option 1 (http://docs.janusgraph.org/latest/solr.html#_solr_collections) you just need to manually copy the configset, and then add the initial core.

Here's the relevant section from my janusgraph-cassandra-solr.properties file

index.search.backend=solr
index.search.solr.mode=cloud
index.search.solr.zookeeper-url=localhost:2181
index.search.solr.configset=janusgraph

On Thursday, July 13, 2017 at 10:35:47 PM UTC-5, s...@... wrote:
It looks like that is the case. I'm not a regular Solr user so maybe others can chime in here if they know otherwise.

The docs reference an "index.search.solr.configset" configuration property that would allow configset/core reuse, but it looks like that's only used in SolrCloud configurations, not HTTP.

On Wednesday, July 12, 2017 at 2:52:04 PM UTC-5, mahendiran chandrasekar wrote:

I am still running into the same kind of trouble with Solr when I try to build a search index.


Does section 24.1.2.3 mean that I need to create a Solr core for every index I create? For example, if I create an index on a property key of a vertex called "foo", do I need to create a core for that index?


On Saturday, 8 July 2017 05:41:17 UTC-7, s...@... wrote:
When I've done this test before I've needed to create a core for both edges and vertices.

http://localhost:8983/solr/admin/cores?action=CREATE&name=edges&instanceDir=/opt/solr/server/solr/configsets/janusgraph&config=solrconfig.xml&dataDir=/tmp/edges_data
http://localhost:8983/solr/admin/cores?action=CREATE&name=vertices&instanceDir=/opt/solr/server/solr/configsets/janusgraph&config=solrconfig.xml&dataDir=/tmp/vertices_data

To load GraphOfTheGods you probably will also need to add the JTS jar to your Solr server. There's an open PR now that will remove this requirement in the future.

http://docs.janusgraph.org/0.1.1/solr.html#_jts_classnotfoundexception_with_geo_data

On Friday, July 7, 2017 at 10:11:04 PM UTC-5, mahendiran chandrasekar wrote:
Error trying to load GraphOfTheGodsFactory into janusgraph-cassandra-solr

Steps: 
1) download cassandra, copy cassandra.yaml from janusgraph/cassandra/cassandra.yaml -> cassandra_installationdir/conf/cassandra.yaml
2) download solr 5.2.1, copy all conf files from janusgraph/conf/solr/* -> solr_installation/server/solr/config_sets/basic_configs/*
3) Started cassandra and solr
4)janusgraph/bin/gremlin.sh

graph = JanusGraphFactory.open('conf/janusgraph-cassandra-solr.properties')
==>standardjanusgraph[cassandrathrift:[127.0.0.1]] 

gremlin> GraphOfTheGodsFactory.load(graph)

causes this stack trace

15:07:39 ERROR org.janusgraph.diskstorage.solr.SolrIndex  - Unable to save documents to Solr as one of the shape objects stored were not compatible with Solr.
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://localhost:8983/solr: Expected mime type application/octet-stream but got text/html. <html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
<title>Error 404 Not Found</title>
</head>
<body><h2>HTTP ERROR 404</h2>
<p>Problem accessing /solr/edges/update. Reason:
<pre>    Not Found</pre></p><hr><i><small>Powered by Jetty://</small></i><hr/>

</body>
</html>

at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:529)
at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:235)
at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:227)
at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:526)
at org.janusgraph.diskstorage.solr.SolrIndex.commitDocumentChanges(SolrIndex.java:407)
at org.janusgraph.diskstorage.solr.SolrIndex.mutate(SolrIndex.java:318)
at org.janusgraph.diskstorage.indexing.IndexTransaction$1.call(IndexTransaction.java:137)
at org.janusgraph.diskstorage.indexing.IndexTransaction$1.call(IndexTransaction.java:134)
at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:69)
at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:55)
at org.janusgraph.diskstorage.indexing.IndexTransaction.flushInternal(IndexTransaction.java:134)
at org.janusgraph.diskstorage.indexing.IndexTransaction.commit(IndexTransaction.java:115)
at org.janusgraph.diskstorage.BackendTransaction.commitIndexes(BackendTransaction.java:140)
at org.janusgraph.graphdb.database.StandardJanusGraph.commit(StandardJanusGraph.java:743)
at org.janusgraph.graphdb.transaction.StandardJanusGraphTx.commit(StandardJanusGraphTx.java:1363)
at org.janusgraph.example.GraphOfTheGodsFactory.load(GraphOfTheGodsFactory.java:144)
at org.janusgraph.example.GraphOfTheGodsFactory.load(GraphOfTheGodsFactory.java:63)
at org.janusgraph.example.GraphOfTheGodsFactory$load.call(Unknown Source)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
at groovysh_evaluate.run(groovysh_evaluate:3)
at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
at org.codehaus.groovy.tools.shell.Interpreter.evaluate(Interpreter.groovy:70)
at org.codehaus.groovy.tools.shell.Groovysh.execute(Groovysh.groovy:190)
at org.apache.tinkerpop.gremlin.console.GremlinGroovysh.super$3$execute(GremlinGroovysh.groovy)
at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1215)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:132)
at org.apache.tinkerpop.gremlin.console.GremlinGroovysh.execute(GremlinGroovysh.groovy:72)
at org.codehaus.groovy.tools.shell.Shell.leftShift(Shell.groovy:122)
at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$work(InteractiveShellRunner.groovy)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1215)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:132)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:152)
at org.codehaus.groovy.tools.shell.ShellRunner.run(ShellRunner.groovy:59)
at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$run(InteractiveShellRunner.groovy)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1215)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:132)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:152)
at org.codehaus.groovy.tools.shell.InteractiveShellRunner.run(InteractiveShellRunner.groovy:83)
at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
at org.apache.tinkerpop.gremlin.console.Console.<init>(Console.groovy:152)
at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
at org.apache.tinkerpop.gremlin.console.Console.main(Console.groovy:455)
15:07:39 ERROR org.janusgraph.diskstorage.solr.SolrIndex  - Details in failed document batch:
15:07:39 ERROR org.janusgraph.diskstorage.solr.SolrIndex  - id:5y1-6h4-9hx-38w
15:07:39 ERROR org.janusgraph.diskstorage.solr.SolrIndex  - place_g:POINT(23.7 38.1)
15:07:39 ERROR org.janusgraph.diskstorage.solr.SolrIndex  - id:2s2-39s-b2t-cqw
15:07:39 ERROR org.janusgraph.diskstorage.solr.SolrIndex  - reason_t:no fear of death
15:07:39 ERROR org.janusgraph.diskstorage.solr.SolrIndex  - id:5xt-39k-b2t-6fs
15:07:39 ERROR org.janusgraph.diskstorage.solr.SolrIndex  - reason_t:loves waves
15:07:39 ERROR org.janusgraph.diskstorage.solr.SolrIndex  - id:6qh-6h4-9hx-6eo
15:07:39 ERROR org.janusgraph.diskstorage.solr.SolrIndex  - place_g:POINT(22.0 39.0)
15:07:39 ERROR org.janusgraph.diskstorage.solr.SolrIndex  - id:3yx-3bc-b2t-3cg
15:07:39 ERROR org.janusgraph.diskstorage.solr.SolrIndex  - reason_t:loves fresh breezes
15:07:39 ERROR org.janusgraph.diskstorage.solr.SolrIndex  - id:6c9-6h4-9hx-9l4
15:07:39 ERROR org.janusgraph.diskstorage.solr.SolrIndex  - place_g:POINT(23.9 37.7)
15:07:52 ERROR org.janusgraph.diskstorage.solr.SolrIndex  - Unable to save documents to Solr as one of the shape objects stored were not compatible with Solr.
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://localhost:8983/solr: Expected mime type application/octet-stream but got text/html. <html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
<title>Error 404 Not Found</title>
</head>
<body><h2>HTTP ERROR 404</h2>
<p>Problem accessing /solr/edges/update. Reason:
<pre>    Not Found</pre></p><hr><i><small>Powered by Jetty://</small></i><hr/>

</body>
</html>

at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:529)
at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:235)
at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:227)
at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:526)
at org.janusgraph.diskstorage.solr.SolrIndex.commitDocumentChanges(SolrIndex.java:407)
at org.janusgraph.diskstorage.solr.SolrIndex.mutate(SolrIndex.java:318)
at org.janusgraph.diskstorage.indexing.IndexTransaction$1.call(IndexTransaction.java:137)
at org.janusgraph.diskstorage.indexing.IndexTransaction$1.call(IndexTransaction.java:134)
at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:69)
at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:55)
at org.janusgraph.diskstorage.indexing.IndexTransaction.flushInternal(IndexTransaction.java:134)
at org.janusgraph.diskstorage.indexing.IndexTransaction.commit(IndexTransaction.java:115)
at org.janusgraph.diskstorage.BackendTransaction.commitIndexes(BackendTransaction.java:140)
at org.janusgraph.graphdb.database.StandardJanusGraph.commit(StandardJanusGraph.java:743)
at org.janusgraph.graphdb.transaction.StandardJanusGraphTx.commit(StandardJanusGraphTx.java:1363)
at org.janusgraph.example.GraphOfTheGodsFactory.load(GraphOfTheGodsFactory.java:144)
at org.janusgraph.example.GraphOfTheGodsFactory.load(GraphOfTheGodsFactory.java:63)
at org.janusgraph.example.GraphOfTheGodsFactory$load.call(Unknown Source)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
at groovysh_evaluate.run(groovysh_evaluate:3)
at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
at org.codehaus.groovy.tools.shell.Interpreter.evaluate(Interpreter.groovy:70)
at org.codehaus.groovy.tools.shell.Groovysh.execute(Groovysh.groovy:190)
at org.apache.tinkerpop.gremlin.console.GremlinGroovysh.super$3$execute(GremlinGroovysh.groovy)
at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1215)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:132)
at org.apache.tinkerpop.gremlin.console.GremlinGroovysh.execute(GremlinGroovysh.groovy:72)
at org.codehaus.groovy.tools.shell.Shell.leftShift(Shell.groovy:122)
at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$work(InteractiveShellRunner.groovy)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1215)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:132)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:152)
at org.codehaus.groovy.tools.shell.ShellRunner.run(ShellRunner.groovy:59)
at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$run(InteractiveShellRunner.groovy)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1215)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:132)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:152)
at org.codehaus.groovy.tools.shell.InteractiveShellRunner.run(InteractiveShellRunner.groovy:83)
at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
at org.apache.tinkerpop.gremlin.console.Console.<init>(Console.groovy:152)
at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
at org.apache.tinkerpop.gremlin.console.Console.main(Console.groovy:455)
15:07:52 ERROR org.janusgraph.diskstorage.solr.SolrIndex  - Details in failed document batch:
15:07:52 ERROR org.janusgraph.diskstorage.solr.SolrIndex  - id:5y1-6h4-9hx-38w
15:07:52 ERROR org.janusgraph.diskstorage.solr.SolrIndex  - place_g:POINT(23.7 38.1)
15:07:52 ERROR org.janusgraph.diskstorage.solr.SolrIndex  - id:2s2-39s-b2t-cqw
15:07:52 ERROR org.janusgraph.diskstorage.solr.SolrIndex  - reason_t:no fear of death
15:07:52 ERROR org.janusgraph.diskstorage.solr.SolrIndex  - id:5xt-39k-b2t-6fs
15:07:52 ERROR org.janusgraph.diskstorage.solr.SolrIndex  - reason_t:loves waves
15:07:52 ERROR org.janusgraph.diskstorage.solr.SolrIndex  - id:6qh-6h4-9hx-6eo
15:07:52 ERROR org.janusgraph.diskstorage.solr.SolrIndex  - place_g:POINT(22.0 39.0)
15:07:52 ERROR org.janusgraph.diskstorage.solr.SolrIndex  - id:3yx-3bc-b2t-3cg
15:07:52 ERROR org.janusgraph.diskstorage.solr.SolrIndex  - reason_t:loves fresh breezes
15:07:52 ERROR org.janusgraph.diskstorage.solr.SolrIndex  - id:6c9-6h4-9hx-9l4
15:07:52 ERROR org.janusgraph.diskstorage.solr.SolrIndex  - place_g:POINT(23.9 37.7)



Re: Janus as an RDF store

Jason Plurad <plu...@...>
 

JanusGraph supports 3 file formats that are provided via Apache TinkerPop -- Gryo, GraphML, and GraphSON.
http://tinkerpop.apache.org/docs/current/reference/#_gremlin_i_o

You can load it like this:

gremlin> graph = JanusGraphFactory.open("conf/janusgraph-hbase.properties")
==>standardjanusgraph[hbase:[127.0.0.1]]
gremlin> graph.io(gryo()).readGraph("data/tinkerpop-modern.kryo")
gremlin> graph.io(graphml()).readGraph("data/tinkerpop-modern.xml")
gremlin> graph.io(graphson()).readGraph("data/tinkerpop-modern.json")

If you have a file in some other format, you'll need to write code to read the data file and then construct the graph elements based on the data.
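
As a rough sketch of that approach, assuming a made-up two-column CSV of edges ("src,dst") and an illustrative 'ident' property (you would want an index on it for anything beyond toy sizes):

graph = JanusGraphFactory.open('conf/janusgraph-hbase.properties')
g = graph.traversal()

new File('data/edges.csv').eachLine { line ->
    def parts = line.split(',')
    def src = parts[0]
    def dst = parts[1]
    // look up each endpoint by 'ident', creating it if it does not exist yet
    def v1 = g.V().has('ident', src).tryNext().orElseGet { g.addV().property('ident', src).next() }
    def v2 = g.V().has('ident', dst).tryNext().orElseGet { g.addV().property('ident', dst).next() }
    v1.addEdge('link', v2)
}
graph.tx().commit()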


On Thursday, July 27, 2017 at 9:40:30 AM UTC-4, 谭宇超 wrote:
Hello Jason, I'm new to JanusGraph, and I didn't find any method that could load a data file into HBase. So, how can I load my data file into HBase? PS: backend=hbase&caching

On Thursday, March 30, 2017 at 2:25:26 PM UTC-7, Jason Plurad wrote:
JanusGraph provides native support for the property graph data model exposed by Apache TinkerPop and uses Gremlin as its query language. It does not have native support for RDF or SPARQL. That being said, you could write custom scripts to ingest RDF and transform it into a property graph model. Daniel Kuppitz started work on transforming SPARQL queries into Gremlin, and that effort continues on at https://github.com/LITMUS-Benchmark-Suite/sparql-to-gremlin

If you are looking for something that does both, you might want to consider something like Stardog or BlazeGraph.

On Thursday, March 30, 2017 at 9:35:11 AM UTC-4, Laura Morales wrote:
Hi all,
 
I'm relatively new to graph databases, and there seems to be a significant difference between "graph/property stores" and "RDF stores". Not a big difference in theory, but quite a big difference in practice in how these databases are implemented. So my question is: is Janus a graph/property store or an RDF store? Or can Janus do both? Can I, for example, use Janus to load a bunch of RDF dumps such as DBPedia/Wikidata and perform SPARQL queries over it?


Re: Janus as an RDF store

谭宇超 <archu...@...>
 

Hello Jason, I'm new to JanusGraph, and I didn't find any method that could load a data file into HBase. So, how can I load my data file into HBase? PS: backend=hbase&caching

On Thursday, March 30, 2017 at 2:25:26 PM UTC-7, Jason Plurad wrote:

JanusGraph provides native support for the property graph data model exposed by Apache TinkerPop and uses Gremlin as its query language. It does not have native support for RDF or SPARQL. That being said, you could write custom scripts to ingest RDF and transform it into a property graph model. Daniel Kuppitz started work on transforming SPARQL queries into Gremlin, and that effort continues on at https://github.com/LITMUS-Benchmark-Suite/sparql-to-gremlin

If you are looking for something that does both, you might want to consider something like Stardog or BlazeGraph.

On Thursday, March 30, 2017 at 9:35:11 AM UTC-4, Laura Morales wrote:
Hi all,
 
I'm relatively new to graph databases, and there seems to be a significant difference between "graph/property stores" and "RDF stores". Not a big difference in theory, but quite a big difference in practice in how these databases are implemented. So my question is: is Janus a graph/property store or an RDF store? Or can Janus do both? Can I, for example, use Janus to load a bunch of RDF dumps such as DBPedia/Wikidata and perform SPARQL queries over it?


Re: [BLOG] Configuring JanusGraph for spark-yarn

Joe Obernberger <joseph.o...@...>
 

Marc - thank you for posting this.  I'm trying to get this to work with our CDH 5.10.0 distribution but have run into an issue; first, though, some questions.  I'm using a 5-node cluster, and I think I do not need to set zookeeper.znode.parent since the HBase configuration is in /etc/conf/hbase.  Is that correct?

The error that I'm getting is:

org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 0.0 failed 4 times, most recent failure: Lost task 1.3 in stage 0.0 (TID 10, host002, executor 1): java.lang.ClassCastException: cannot assign instance of java.lang.invoke.SerializedLambda to field org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.x$330 of type org.apache.spark.api.java.function.PairFunction in instance of org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1
        at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2133)
        at java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1305)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2238)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2156)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2014)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1536)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2232)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2156)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2014)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1536)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2232)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2112)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2014)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1536)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2232)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2156)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2014)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1536)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:423)
        at scala.collection.immutable.$colon$colon.readObject(List.scala:362)
        at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2123)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2014)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1536)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2232)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2156)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2014)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1536)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2232)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2156)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2014)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1536)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:423)
        at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:76)
        at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:115)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:64)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        at org.apache.spark.scheduler.Task.run(Task.scala:89)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

Given this post:
https://stackoverflow.com/questions/28186607/java-lang-classcastexception-using-lambda-expressions-in-spark-job-on-remote-ser

It looks like I'm not including a necessary jar, but I'm at a loss as to which one.  Any ideas?

For reference, here is part of the config:

#
# Hadoop Graph Configuration
#
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphInputFormat=org.janusgraph.hadoop.formats.hbase.HBaseInputFormat
gremlin.hadoop.graphOutputFormat=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat
gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.inputLocation=none
gremlin.hadoop.outputLocation=output

#
# JanusGraph HBase InputFormat configuration
#
janusgraphmr.ioformat.conf.storage.backend=hbase
#janusgraphmr.ioformat.conf.storage.hostname=fqdn1,fqdn2,fqdn3
janusgraphmr.ioformat.conf.storage.hostname=10.22.5.63:2181,10.22.5.64:2181,10.22.5.65:2181
janusgraphmr.ioformat.conf.storage.hbase.table=TEST0.2.0
janusgraphmr.ioformat.conf.storage.hbase.region-count=5
janusgraphmr.ioformat.conf.storage.hbase.regions-per-server=18
janusgraphmr.ioformat.conf.storage.hbase.short-cf-names=false
#zookeeper.znode.parent=/hbase-unsecure
# Security configs are needed in case of a secure cluster
#zookeeper.znode.parent=/hbase-secure
#hbase.rpc.protection=privacy
#hbase.security.authentication=kerberos

#
# SparkGraphComputer with Yarn Configuration
#

spark.master=yarn-client
spark.executor.memory=512m
spark.serializer=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerializer
spark.yarn.dist.archives=/home/graph/janusgraph-0.2.0-SNAPSHOT-hadoop2.JOE/lib.zip
spark.yarn.dist.files=/opt/cloudera/parcels/CDH/jars/janusgraph-hbase-0.2.0-SNAPSHOT.jar
spark.yarn.dist.jars=/opt/cloudera/parcels/CDH/jars/janusgraph-hbase-0.2.0-SNAPSHOT.jar,/opt/cloudera/parcels/CDH/jars/spark-core_2.10-1.6.0-cdh5.10.0.jar
#spark.yarn.appMasterEnv.CLASSPATH=/etc/hadoop/conf:./lib.zip/*:
spark.yarn.appMasterEnv.CLASSPATH=/etc/haddop/conf:/etc/hbase/conf:./lib.zip/*:/opt/cloudera/parcels/CDH/jars/spark-core_2.10-1.6.0-cdh5.10.0.jar
#spark.executor.extraClassPath=/etc/hadoop/conf:/etc/hbase/conf:/home/graph/janusgraph-0.2.0-SNAPSHOT-hadoop2/janusgraph-hbase-0.2.0-SNAPSHOT.jar:./lib.zip/*
spark.driver.extraLibraryPath=/opt/cloudera/parcels/CDH/lib/hadoop/native:/opt/cloudera/parcels/CDH/lib/hadoop-0.20-mapreduce/lib/native/Linux-amd64-64
spark.executor.extraLibraryPath=/opt/cloudera/parcels/CDH/lib/hadoop/native:/opt/cloudera/parcels/CDH/lib/hadoop-0.20-mapreduce/lib/native/Linux-amd64-64

Thank you!

-Joe


On 7/6/2017 4:15 AM, HadoopMarc wrote:


Readers wanting to run OLAP queries on a real spark-yarn cluster might want to check my recent post:

http://yaaics.blogspot.nl/2017/07/configuring-janusgraph-for-spark-yarn.html

Regards,  Marc


Re: JanusGraph support for Cassandra 3.x

Vladyslav Kosulin <vkos...@...>
 

Yes, Thrift was an essential part of our model.


On Friday, July 14, 2017 at 1:06:00 PM UTC-4, Ted Wilmes wrote:
Interesting, if you have a record of what your issues were, could you create a ticket? I haven't seen issues with Cassandra 3.10 
and the Astyanax adapter but maybe your setup or usage patterns were different. I'm assuming you enabled Thrift on your Cassandra cluster?

Thanks,
Ted

On Thursday, July 13, 2017 at 7:14:37 PM UTC-5, Vladyslav Kosulin wrote:
Astyanax with Cassandra 3?
I tried it in another project; this combo did not work.

On Friday, July 7, 2017 at 3:09:23 PM UTC-4, Ted Wilmes wrote:
Hi Robert,
I've used the thrift and astyanax adapters and as of late, I've been dipping my toes into the new CQL adapter. So far that has worked well too against Apache Cassandra and Scylla 1.6.

--Ted

On Fri, Jul 7, 2017 at 1:55 PM, Robert Dale <r...@...> wrote:
Ted, what driver do you use?

Robert Dale

On Fri, Jul 7, 2017 at 2:28 PM, Ted Wilmes <t...@...> wrote:
Hi Vijaya,
I haven't had any issues running Janus in OLTP mode against Cassandra 3. Issue 172 is a problem if you want to run analytic queries using the TinkerPop SparkGraphComputer against your Janus data.

--Ted

On Friday, July 7, 2017 at 9:55:43 AM UTC-5, vijaya bhaskar Peddinti wrote:
Dear All,

Can we use JanusGraph with Cassandra 3.x (per the docs, only 2.1.z is mentioned)? If it is not supported, is a new release planned with support for Cassandra 3.x, and when can it be expected?


I have gone through issues #172 and #267, where this is extensively discussed, but they are still in the Open state.


thanks and regards,
Vijaya Bhaskar Peddinti




Why aren't partitioned vertices supported when using Janus-Hadoop?

spirit...@...
 

17:52:45,251  INFO RemoteActorRefProvider$RemotingTerminator:74 - Remote daemon shut down; proceeding with flushing remote transports.
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, 10.200.48.158): java.lang.IllegalStateException: Read partitioned vertex (ID=8202), but partitioned vertex filtering is disabled.
at com.google.common.base.Preconditions.checkState(Preconditions.java:176)
at org.janusgraph.hadoop.formats.util.JanusGraphVertexDeserializer.readHadoopVertex(JanusGraphVertexDeserializer.java:84)
at org.janusgraph.hadoop.formats.util.GiraphRecordReader.nextKeyValue(GiraphRecordReader.java:60)
at org.apache.spark.rdd.NewHadoopRDD$$anon$1.hasNext(NewHadoopRDD.scala:168)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at scala.collection.convert.Wrappers$IteratorWrapper.hasNext(Wrappers.scala:29)
at org.apache.tinkerpop.gremlin.util.iterator.IteratorUtils$4.advance(IteratorUtils.java:298)
at org.apache.tinkerpop.gremlin.util.iterator.IteratorUtils$4.hasNext(IteratorUtils.java:269)
at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:126)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

I found the following code in JanusGraph:
// Read a single row from the edgestore and create a TinkerVertex corresponding to the row
// The neighboring vertices are represented by DetachedVertex instances
public TinkerVertex readHadoopVertex(final StaticBuffer key, Iterable<Entry> entries) {

    // Convert key to a vertex ID
    final long vertexId = idManager.getKeyID(key);
    Preconditions.checkArgument(vertexId > 0);

    // Partitioned vertex handling
    if (idManager.isPartitionedVertex(vertexId)) {
        Preconditions.checkState(setup.getFilterPartitionedVertices(),
                "Read partitioned vertex (ID=%s), but partitioned vertex filtering is disabled.", vertexId);
        log.debug("Skipping partitioned vertex with ID {}", vertexId);
        return null;
    }

    // Create TinkerVertex
    TinkerGraph tg = TinkerGraph.open();

    boolean foundVertexState = !verifyVertexExistence;

    TinkerVertex tv = null;

    // Iterate over edgestore columns to find the vertex's label relation
    for (final Entry data : entries) {
        RelationReader relationReader = setup.getRelationReader(vertexId);
        final RelationCache relation = relationReader.parseRelation(data, false, typeManager);
        if (systemTypes.isVertexLabelSystemType(relation.typeId)) {
            // Found vertex label
            long vertexLabelId = relation.getOtherVertexId();
            VertexLabel vl = typeManager.getExistingVertexLabel(vertexLabelId);
            // Create TinkerVertex with this label
            //tv = (TinkerVertex) tg.addVertex(T.label, vl.label(), T.id, vertexId);
            tv = getOrCreateVertex(vertexId, vl.name(), tg);
        }
    }

    // Added this following testing
    if (null == tv) {
        //tv = (TinkerVertex) tg.addVertex(T.id, vertexId);
        tv = getOrCreateVertex(vertexId, null, tg);
    }

    Preconditions.checkState(null != tv, "Unable to determine vertex label for vertex with ID %s", vertexId);

    // Iterate over and decode edgestore columns (relations) on this vertex
    for (final Entry data : entries) {
        try {
            RelationReader relationReader = setup.getRelationReader(vertexId);
            final RelationCache relation = relationReader.parseRelation(data, false, typeManager);
            if (systemTypes.isVertexExistsSystemType(relation.typeId)) {
                foundVertexState = true;
            }

            if (systemTypes.isSystemType(relation.typeId)) continue; // Ignore system types
            final RelationType type = typeManager.getExistingRelationType(relation.typeId);
            if (((InternalRelationType) type).isInvisibleType()) continue; // Ignore hidden types

            // Decode and create the relation (edge or property)
            if (type.isPropertyKey()) {
                // Decode property
                Object value = relation.getValue();
                Preconditions.checkNotNull(value);
                VertexProperty.Cardinality card = getPropertyKeyCardinality(type.name());
                tv.property(card, type.name(), value, T.id, relation.relationId);
            } else {
                assert type.isEdgeLabel();

                // Partitioned vertex handling
                if (idManager.isPartitionedVertex(relation.getOtherVertexId())) {
                    Preconditions.checkState(setup.getFilterPartitionedVertices(),
                            "Read edge incident on a partitioned vertex, but partitioned vertex filtering is disabled. " +
                            "Relation ID: %s. This vertex ID: %s. Other vertex ID: %s. Edge label: %s.",
                            relation.relationId, vertexId, relation.getOtherVertexId(), type.name());
                    log.debug("Skipping edge with ID {} incident on partitioned vertex with ID {} (and nonpartitioned vertex with ID {})",
                            relation.relationId, relation.getOtherVertexId(), vertexId);
                    continue;
                }

                // Decode edge
                TinkerEdge te;

                // We don't know the label of the other vertex, but one must be provided
                TinkerVertex adjacentVertex = getOrCreateVertex(relation.getOtherVertexId(), null, tg);

                // Handle self-loop edges
                if (tv.equals(adjacentVertex) && isLoopAdded(tv, type.name())) {
                    continue;
                }

                if (relation.direction.equals(Direction.IN)) {
                    te = (TinkerEdge) adjacentVertex.addEdge(type.name(), tv, T.id, relation.relationId);
                } else if (relation.direction.equals(Direction.OUT)) {
                    te = (TinkerEdge) tv.addEdge(type.name(), adjacentVertex, T.id, relation.relationId);
                } else {
                    throw new RuntimeException("Direction.BOTH is not supported");
                }

                if (relation.hasProperties()) {
                    // Load relation (edge) properties
                    for (final LongObjectCursor<Object> next : relation) {
                        assert next.value != null;
                        RelationType rt = typeManager.getExistingRelationType(next.key);
                        if (rt.isPropertyKey()) {
                            // PropertyKey pkey = (PropertyKey) vertex.getTypeManager().getPropertyKey(rt.name());
                            // log.debug("Retrieved key {} for name \"{}\"", pkey, rt.name());
                            // frel.property(pkey.label(), next.value);
                            te.property(rt.name(), next.value);
                        } else {
                            throw new RuntimeException("Metaedges are not supported");
                            // assert next.value instanceof Long;
                            // EdgeLabel el = (EdgeLabel) vertex.getTypeManager().getEdgeLabel(rt.name());
                            // log.debug("Retrieved edge label {} for name \"{}\"", el, rt.name());
                            // frel.setProperty(el, new FaunusVertex(configuration, (Long) next.value));
                        }
                    }
                }
            }

            // // Iterate over and copy the relation's metaproperties
            // if (relation.hasProperties()) {
            //     // Load relation properties
            //     for (final LongObjectCursor<Object> next : relation) {
            //         assert next.value != null;
            //         RelationType rt = typeManager.getExistingRelationType(next.key);
            //         if (rt.isPropertyKey()) {
            //             PropertyKey pkey = (PropertyKey) vertex.getTypeManager().getPropertyKey(rt.name());
            //             log.debug("Retrieved key {} for name \"{}\"", pkey, rt.name());
            //             frel.property(pkey.label(), next.value);
            //         } else {
            //             assert next.value instanceof Long;
            //             EdgeLabel el = (EdgeLabel) vertex.getTypeManager().getEdgeLabel(rt.name());
            //             log.debug("Retrieved edge label {} for name \"{}\"", el, rt.name());
            //             frel.setProperty(el, new FaunusVertex(configuration, (Long) next.value));
            //         }
            //     }
            //     for (JanusGraphRelation rel : frel.query().queryAll().relations())
            //         ((FaunusRelation) rel).setLifeCycle(ElementLifeCycle.Loaded);
            // }
            // frel.setLifeCycle(ElementLifeCycle.Loaded);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
    //vertex.setLifeCycle(ElementLifeCycle.Loaded);

    /* Since we are filtering out system relation types, we might end up with vertices that
       have no incident relations. This is especially true for schema vertices. Those are
       filtered out. */
    if (!foundVertexState) {
        log.trace("Vertex {} has unknown lifecycle state", vertexId);
        return null;
    } else if (!tv.edges(Direction.BOTH).hasNext() && !tv.properties().hasNext()) {
        log.trace("Vertex {} has no relations", vertexId);
        return null;
    }
    return tv;
}



Re: when release 0.2.0?

Ranger Tsao <cao....@...>
 

Glad to

On Saturday, July 22, 2017 at 3:55:53 AM UTC+8, Jason Plurad wrote:

I think it's getting pretty close. I'd guess July is a stretch, but August should be possible.

Several great things are already in place: TinkerPop 3.2.5 support, Cassandra CQL support, ES 5.4.2 support, Solr 6.6.0 support.
* There are a few more PRs in the queue that need to get merged, including this one on OLAP compatibility with Cassandra 3.0+.
* There is also some discussion going on regarding the Cassandra source code tree organization, which also needs to be completed.
* A recently identified migration issue from Titan must be fixed.

I could be missing others. Here's how you and anybody else in the community can help:
* Help triage the issues. Try to reproduce them. Add comments. If there's something critical that's needed for 0.2, let it be known.
* Help review code in the pull requests.
* Test the master branch in your test environments. It already has cassandra-cql and ES 5.x support in place, so you can help us make sure it works especially for your use cases.

On Thursday, July 20, 2017 at 10:27:59 PM UTC-4, Ranger Tsao wrote:
I want to use JanusGraph in production, but I need two features: cassandra-cql and ES 5.x.


Re: Geoshape property in remote gremlin query, GraphSON

rosen...@...
 

Dear Robert,

thank you for your prompt reply! The given patch solves my problem. For reference for future viewers of this post, don't forget to use the `janusgraph` namespace in the `@type` of your GraphSON:

{"@value": {"type": "point", "coordinates": [{"@value": 1.1, "@type": "g:Double"}, {"@value": 2.2, "@type": "g:Double"}]}, "@type": "janusgraph:Geoshape"}


On Wednesday, July 19, 2017 at 1:16:37 PM UTC-6, Robert Dale wrote:
It seems Geoshape GraphSON support is hardcoded to v1, although I couldn't get it to work with that either. If you have to use GraphSON instead of Gryo, then you could check out master, apply this patch, and rebuild. I created an issue to support multiple versions of serializers: https://github.com/JanusGraph/janusgraph/issues/420

diff --git a/janusgraph-core/src/main/java/org/janusgraph/graphdb/tinkerpop/io/graphson/JanusGraphSONModule.java b/janusgraph-core/src/main/java/org/janusgraph/graphdb/tinkerpop/io/graphson/JanusGraphSONModule.java
index 6ef907b..8168309 100644
--- a/janusgraph-core/src/main/java/org/janusgraph/graphdb/tinkerpop/io/graphson/JanusGraphSONModule.java
+++ b/janusgraph-core/src/main/java/org/janusgraph/graphdb/tinkerpop/io/graphson/JanusGraphSONModule.java
@@ -50,10 +50,10 @@ public class JanusGraphSONModule extends TinkerPopJacksonModule {
     private JanusGraphSONModule() {
         super("janusgraph");
         addSerializer(RelationIdentifier.class, new RelationIdentifierSerializer());
-        addSerializer(Geoshape.class, new Geoshape.GeoshapeGsonSerializerV1d0());
+        addSerializer(Geoshape.class, new Geoshape.GeoshapeGsonSerializerV2d0());
 
         addDeserializer(RelationIdentifier.class, new RelationIdentifierDeserializer());
-        addDeserializer(Geoshape.class, new Geoshape.GeoshapeGsonDeserializerV1d0());
+        addDeserializer(Geoshape.class, new Geoshape.GeoshapeGsonDeserializerV2d0());
     }
 
     private static final JanusGraphSONModule INSTANCE = new JanusGraphSONModule();



On Tuesday, July 18, 2017 at 5:47:50 PM UTC-4, Conrad Rosenbrock wrote:
I am trying to assign a value to a property with the native Geoshape type. I have it serialized into JSON as follows (where g is aliased to the traversal on gremlin server):

{"@value": {"type": "point", "coordinates": [{"@value": 1.1, "@type": "g:Double"}, {"@value": 2.2, "@type": "g:Double"}]}, "@type": "g:Geoshape"}

In the gremlin console, I can easily type 

Geoshape.point(1.1, 2.2)

and it works perfectly. I am sure that it is something quite simple. Here is the error:

Request [PooledUnsafeDirectByteBuf(ridx: 653, widx: 653, cap: 687)] could not be deserialized by org.apache.tinkerpop.gremlin.driver.ser.AbstractGraphSONMessageSerializerV2d0.

For reference, I do have the following serializer in the gremlin server config:

{ className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV2d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}

which should direct gremlin server to the relevant deserializer in Janus.

Thanks!
