How to load a CSV file into JanusGraph


Elizabeth <hlf...@...>
 

Hi all,

I am new to JanusGraph. I have been digging through the JanusGraph docs for almost two weeks and have found nothing.
I could only gather scattered information, and most of the time what I try produces errors.
Could anyone supply a complete example of bulk loading or loading a CSV file into JanusGraph, please?
Any little help is appreciated!

Best regards,

Elis.


HadoopMarc <m.c.d...@...>
 

Hi Elizabeth,

For JanusGraph you should also take into account the TinkerPop documentation. A relevant pointer for you is:

https://groups.google.com/forum/#!searchin/gremlin-users/csv%7Csort:relevance/gremlin-users/AetuGcLiBxo/KW966WAyAQAJ

Cheers,    Marc

On Wednesday, June 14, 2017 at 18:44:16 UTC+2, Elizabeth wrote:

Hi all,

I am new to JanusGraph. I have been digging through the JanusGraph docs for almost two weeks and have found nothing.
I could only gather scattered information, and most of the time what I try produces errors.
Could anyone supply a complete example of bulk loading or loading a CSV file into JanusGraph, please?
Any little help is appreciated!

Best regards,

Elis.


Elizabeth <hlf...@...>
 

Hi Marc,

Thanks so much for the information. However, I was wondering whether there is a complete code example of how to use
"bulk-loading" in JanusGraph without Hadoop?


Thanks again!
Elis

On Thursday, June 15, 2017 at 9:59:10 PM UTC+8, HadoopMarc wrote:
Hi Elizabeth,

For JanusGraph you should also take into account the TinkerPop documentation. A relevant pointer for you is:
https://groups.google.com/forum/#!searchin/gremlin-users/csv%7Csort:relevance/gremlin-users/AetuGcLiBxo/KW966WAyAQAJ

Cheers,    Marc



HadoopMarc <m.c.d...@...>
 

Hi Elizabeth,

OK, another resource I dug up by searching for CSV on the gremlin user list:

http://www.datastax.com/dev/blog/powers-of-ten-part-i

Translation to JanusGraph should be straightforward.

HTH,   Marc

On Wednesday, June 21, 2017 at 11:15:51 UTC+2, Elizabeth wrote:

Hi Marc,

Thanks so much for the information. However, I was wondering whether there is a complete code example of how to use
"bulk-loading" in JanusGraph without Hadoop?


Thanks again!
Elis



Rohit Jain <rohit.j...@...>
 

Hi Elis,

Did you figure this out?  I am in the same predicament.  I have a simple graph of movies and persons who have acted or directed (edge) in those movies.  I have downloaded a sample of the IMDB database and now have a relational table with movie id, person id, and role in one table.  The role column indicates an actor or a director.  I have to generate the movie vertex entries with the movie id, the person vertex entries with the person id, and then create the edges between the movie and person ids, along with a role edge with a role property of actor or director.  Given I have almost 20,000 rows in that association table, I need to find a way to load the graph using csv files generated from that relational table.  And like you, I can't find any documentation on how to do that easily.  Would be grateful for any help.

Thanks!
Rohit


Robert Dale <rob...@...>
 


There are no ready-to-run tools for this. You will have to write a small script that parses the CSV file and creates the vertices and edges.

You can search for 'titan bulk load csv' for some loading strategies.
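
For the movie/person table Rohit describes above, such a script might look roughly like the sketch below, written for the Gremlin Console. It is only an illustration under assumptions: the column order (movie id, person id, role), the header row, the labels 'movie', 'person' and 'worked_on', the property keys and the name loadRoles are all made up for this example and do not come from the thread. It uses only addVertex, property, addEdge and tx().commit() on the graph.

```groovy
// Sketch only: labels, property keys, column order and the name loadRoles are assumptions.
loadRoles = { graph, lines ->
    def movies = [:]    // movie id -> vertex, so each movie is created once
    def persons = [:]   // person id -> vertex, so each person is created once
    def edges = 0

    // Return the cached vertex for this id, or create it on first sight
    def vertexFor = { cache, vertexLabel, key, id ->
        def v = cache[id]
        if (v == null) {
            v = graph.addVertex(vertexLabel)
            v.property(key, id)   // ids are kept as strings here
            cache[id] = v
        }
        v
    }

    lines.eachWithIndex { line, i ->
        if (i == 0 || !line.trim()) return   // skip the header row and blank lines
        def (movieId, personId, role) = line.split(',')*.trim()
        def movie = vertexFor(movies, 'movie', 'movieId', movieId)
        def person = vertexFor(persons, 'person', 'personId', personId)
        // One edge per row; the role (actor or director) is an edge property
        person.addEdge('worked_on', movie, 'role', role)
        edges++
    }

    graph.tx().commit()
    [movies: movies.size(), persons: persons.size(), edges: edges]
}

// Possible use in the Gremlin Console ('roles.csv' is a placeholder path):
//   graph = JanusGraphFactory.open('inmemory')
//   loadRoles(graph, new File('roles.csv').readLines())
```

With about 20,000 rows, keeping the id-to-vertex maps in memory and committing once at the end should be manageable; for much larger files, committing every few thousand rows is a common variation. The naive split(',') does not handle quoted fields that contain commas.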


On Friday, August 11, 2017 at 4:48:25 PM UTC-4, Rohit Jain wrote:
Hi Elis,

Did you figure this out?  I am in the same predicament.  I have a simple graph of movies and persons who have acted or directed (edge) in those movies.  I have downloaded a sample of the IMDB database and now have a relational table with movie id, person id, and role in one table.  The role column indicates an actor or a director.  I have to generate the movie vertex entries with the movie id, the person vertex entries with the person id, and then create the edges between the movie and person ids, along with a role edge with a role property of actor or director.  Given I have almost 20,000 rows in that association table, I need to find a way to load the graph using csv files generated from that relational table.  And like you, I can't find any documentation on how to do that easily.  Would be grateful for any help.

Thanks!
Rohit


a.mar...@...
 

To add vertices with Gremlin, use something like this:

import groovy.time.TimeCategory

// Converters from CSV strings to typed property values
toInt     = { Integer.parseInt(it) }
toShort   = { Short.parseShort(it) }
toByte    = { Byte.parseByte(it) }
toLong    = { Long.parseLong(it) }
toDate    = { Date.parse("yyyy-MM-dd", it) }   // MM = month; lowercase mm would mean minutes
toBoolean = { Boolean.parseBoolean(it) }
toFloat   = { Float.parseFloat(it) }

// Maximum number of attempts per file (not defined in the original post; 3 is an assumed value)
ATTEMPTS_MAX = 3

importFile = { gr, file, adder ->
    // Track timing stats
    start = new Date()

    // Make sure retries are built in
    attempt = 0

    while (attempt < ATTEMPTS_MAX) {
        attempt++
        try {
            lines = 0
            // Load lines into the DB, skipping the header line
            file.eachLine { lines++; if (lines > 1) adder(gr, it) }
            gr.tx().commit()
            // Success: stop retrying
            attempt = ATTEMPTS_MAX
            // Print timing stats
            stop = new Date()
            println "Lines Processed: [" + lines + "], Time Taken: [" + TimeCategory.minus(stop, start) + "]"
        } catch (ex) {
            // Roll back and retry in the event of an error
            println "Error " + ex.toString()
            println "Retrying after 1 second..."
            gr.tx().rollback()
            sleep(1000)
        }
    }
}

// Parse one CSV line and add it as a 'User' vertex
addUser = { gr, data ->
    ch = data.split(",")
    gr.addVertex(
        label, 'User',
        'property1', ch[0],
        'property2', toShort(ch[1]),
        'property3', toLong(ch[2]),
        'property4', toDate(ch[3])
    )
}

// g is expected to be the graph instance (it is used for addVertex and tx()),
// and file_name a java.io.File (eachLine on a String would iterate over the string itself)
importFile(g, file_name, addUser)


