Hi All, I am pretty new to JanusGraph and would like some suggestions from you. Previously I posted a question about using ES as backend storage and got some good feedback from Jason (thanks!). Now here comes another question: if I want to use JanusGraph with Spark standalone, without Hadoop, for OLAP, can someone point me in the right direction? Basically I have Spark standalone deployed on Kubernetes; how could that be used for OLAP?
Thanks a lot!
Wei
|
|
I don't think it will work. Spark needs an input (to read graph data) and an output (to write graph data). For OLAP, JanusGraph currently only provides reading based on a Hadoop InputFormat. TinkerPop has InputRDD and OutputRDD interfaces, which are used by Spark (SparkGraphComputer); search for TinkerPop InputRDD. Unfortunately, JanusGraph provides no implementations other than the InputFormat-based one at the moment.
Thanks,
Jerry
|
|
That being said, to be clear, you don't need a Hadoop cluster of any kind, if that is what you mean. JanusGraph packages the Hadoop jars it needs. That is all you need to run SparkGraphComputer on JanusGraph.
Thanks
Jerry
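To make the point above concrete, here is a minimal sketch of the HadoopGraph settings that such a setup might use when Spark standalone is the only cluster involved. It is not taken from this thread: the property keys are written as I recall them from the read-cassandra-3.properties example shipped with JanusGraph distributions of this era, so check them against your version. The Cassandra backend, the host names, the keyspace, and the helper names `hadoop_graph_properties` and `render_properties` are all assumptions for illustration.

```python
from typing import Dict


def hadoop_graph_properties(spark_master: str, cassandra_host: str, keyspace: str) -> Dict[str, str]:
    """Build HadoopGraph settings for an OLAP read straight from the storage backend.

    Assumption: Cassandra is the backend and the keys match the example
    properties file bundled with your JanusGraph version.
    """
    return {
        # HadoopGraph is the TinkerPop graph used for OLAP jobs.
        "gremlin.graph": "org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph",
        # The InputFormat-based reader is the only input JanusGraph provides.
        "gremlin.hadoop.graphReader": "org.janusgraph.hadoop.formats.cassandra.Cassandra3InputFormat",
        "gremlin.hadoop.graphWriter": "org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat",
        "gremlin.hadoop.jarsInDistributedCache": "true",
        # No graph data file is read; the data comes from the backend.
        "gremlin.hadoop.inputLocation": "none",
        "gremlin.hadoop.outputLocation": "output",
        "janusgraphmr.ioformat.conf.storage.backend": "cassandra",
        "janusgraphmr.ioformat.conf.storage.hostname": cassandra_host,
        "janusgraphmr.ioformat.conf.storage.cassandra.keyspace": keyspace,
        "cassandra.input.partitioner.class": "org.apache.cassandra.dht.Murmur3Partitioner",
        # Point at the Spark standalone master instead of local[*].
        "spark.master": spark_master,
        "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    }


def render_properties(settings: Dict[str, str]) -> str:
    """Render the settings as the text of a Java-style .properties file."""
    return "\n".join(f"{key}={value}" for key, value in settings.items()) + "\n"


if __name__ == "__main__":
    # Hypothetical Kubernetes service names for the Spark master and Cassandra.
    settings = hadoop_graph_properties("spark://spark-master:7077", "cassandra", "janusgraph")
    print(render_properties(settings), end="")
```

The rendered text would be saved as a properties file and opened from the Gremlin Console with GraphFactory, after which a traversal source can be created with SparkGraphComputer. Whether the job then runs without any Hadoop configuration is exactly what the rest of this thread discusses.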
|
|
Debasish Kanhar <d.k...@...>
@Jerry:
JanusGraph doesn't need a Hadoop cluster to run OLAP, yes, but doesn't JanusGraph need to point to a live Hadoop cluster by setting HADOOP_CONF_DIR in the CLASSPATH? That was my understanding, and it was the missing piece in the docs that made it take me a really long time to crack OLAP using a Spark cluster.
--
You received this message because you are subscribed to the Google Groups "JanusGraph developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to janusgraph-de...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/janusgraph-dev/512224e3-5b20-4e31-afa5-e2fd74591182%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
|
|
The Hadoop conf (only the HDFS conf) is needed to read from or write to graph data files on HDFS. Interacting directly with the JanusGraph backend, without involving any graph data files on HDFS, won't need that, I think.
Thanks
Jerry
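When comparing setups like the two described above, it can help to record what Hadoop configuration the driver process actually sees. The following is a small diagnostic sketch, not part of JanusGraph or TinkerPop: the function name `describe_hadoop_conf` is hypothetical, and the only facts it relies on are that a Hadoop conf directory normally holds core-site.xml and hdfs-site.xml and that CLASSPATH entries are separated by the platform path separator.

```python
import os
from pathlib import Path
from typing import List, Mapping

# Files that carry the HDFS-related settings in a Hadoop conf directory.
HDFS_CONF_FILES = ("core-site.xml", "hdfs-site.xml")


def describe_hadoop_conf(env: Mapping[str, str]) -> List[str]:
    """Report what the given environment says about HADOOP_CONF_DIR."""
    conf_dir = env.get("HADOOP_CONF_DIR")
    if not conf_dir:
        return ["HADOOP_CONF_DIR is not set"]

    path = Path(conf_dir)
    if not path.is_dir():
        return [f"HADOOP_CONF_DIR={conf_dir} is not a directory"]

    findings = []
    for name in HDFS_CONF_FILES:
        state = "present" if (path / name).is_file() else "missing"
        findings.append(f"{name}: {state}")

    # Check for an exact CLASSPATH entry equal to the conf directory.
    classpath_entries = env.get("CLASSPATH", "").split(os.pathsep)
    if conf_dir in classpath_entries:
        findings.append("HADOOP_CONF_DIR is on CLASSPATH")
    else:
        findings.append("HADOOP_CONF_DIR is not on CLASSPATH")
    return findings


if __name__ == "__main__":
    for line in describe_hadoop_conf(os.environ):
        print(line)
```

Running it in the same shell or container that launches the Gremlin Console gives a short report that can be attached next to a stack trace when a job fails.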
|
|
Debasish Kanhar <d.k...@...>
Ah. I think we might be wrong in our understanding there. I was trying to read the graph data from my underlying backend (Cassandra) using JanusGraph's Cassandra3InputFormat class, not from any graph stored on HDFS. That was also failing when I tried to run OLAP using a Spark cluster without setting the HADOOP_CONF_DIR variable.
Ideally that should not have been the case, as TP > 3.3 doesn't need intermediate HDFS storage, but it doesn't look like that's what happens. Well, we can track this if needed. :-)
|
|
Yeah, a stack trace from Gremlin will help us see what is going on. That should not be a dependency in that case.
Thanks,
Jerry
|
|