Storing and reading connected component RDD through OutputFormatRDD & InputFormatRDD
I am using the connected-component vertex program to find all the connected nodes in a graph, and then using the resulting RDD for further processing in the graph. I want to store that RDD at some output location so that I can reuse it and don't have to rerun the connected-component vertex program, which is time-consuming.
I see that the TinkerPop library has OutputFormatRDD for saving data. I tried:
outputFormatRDD.writeGraphRDD(graphComputerConfiguration, uniqueRDD); ## This throws a ClassCastException, because the values in the connected-component output RDD are lists, which cannot be cast to VertexWritable.
outputFormatRDD.writeMemoryRDD(graphComputerConfiguration, "memoryKey", uniqueRDD); ## This saves the RDD, creating a folder named after the memory key at the output location.
However, I am not able to read the RDD back through InputFormatRDD.readMemoryRDD(), as it looks for data files in the form expected by the SequenceFileInputFormat class.
Am I missing anything? Please let me know if you have tried any of these methods. I want to check whether we can use the out-of-the-box methods before proceeding with our own.
The following section of the TinkerPop ref docs gives an example of how to reuse the output RDD of one job in a follow-up gremlin OLAP job.
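For illustration, here is a minimal sketch of what such a setup might look like, assuming the section in question is the one on persisting the graph RDD in the SparkContext with PersistedOutputRDD and PersistedInputRDD. The RDD name connected-components is a made-up placeholder, and the two groups of properties belong to two separate job configurations.

```properties
# Job 1 (runs the connected-component vertex program):
# keep the resulting graph RDD in the SparkContext instead of writing files.
gremlin.hadoop.graphWriter=org.apache.tinkerpop.gremlin.spark.structure.io.PersistedOutputRDD
# With PersistedOutputRDD the output location acts as the RDD name (placeholder value).
gremlin.hadoop.outputLocation=connected-components
gremlin.spark.persistContext=true

# Job 2 (follow-up OLAP job):
# read the persisted RDD back by the same name.
gremlin.hadoop.graphReader=org.apache.tinkerpop.gremlin.spark.structure.io.PersistedInputRDD
gremlin.hadoop.inputLocation=connected-components
gremlin.spark.persistContext=true
```

Note that an RDD persisted this way only lives as long as the SparkContext does, so this would not by itself cover reuse across separate Spark applications.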
Best wishes, Marc