Re: java.io.EOFException in kryo+blvp error in bulk loading
HadoopMarc <bi...@...>
Hi Eliz,
You are supposed to use TinkerPop's Gryo writer to create Kryo graph input files (I do not believe there is a formal spec of the format). There is also the BulkDumperVertexProgram, which creates a Kryo output file from a HadoopGraph. If you have a large dataset on HDFS and want a distributed load into JanusGraph, you can also use Spark mapPartitions and have each Spark task open its own connection to the same JanusGraph.
HTH, Marc
On Thursday, 6 July 2017 at 04:45:22 UTC+2, Elizabeth wrote:
Hi Biko and Marc,

Below is the method I used to create test.kryo, which was converted from a CSV file: vertices.txt

vi vertices.txt
123116921181412473141801431914667112228

vi CsvToJavaObject.java

    package mypkg;

    import java.io.BufferedReader;
    import java.io.FileNotFoundException;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import com.esotericsoftware.kryo.Kryo;
    import com.esotericsoftware.kryo.io.Input;
    import com.esotericsoftware.kryo.io.Output;
    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.FileInputStream;
    import java.util.Arrays;

    public class CsvToJavaObject {

        public static void main(String[] args) {
            String csvFile = "/home/dev/wanmeng/kryo/mypkg/vertices.txt";
            BufferedReader br = null;
            String line = "";
            String csvSplitBy = "\n";
            List<Long> list = new ArrayList<Long>();
            File file;
            FileOutputStream fop = null;

            try {
                br = new BufferedReader(new FileReader(csvFile));
                while ((line = br.readLine()) != null) {
                    //trim the line breaks at the end of line.
                    line = line.trim();
                    // adding Long objects to a list.
                    list.add(Long.parseLong(line));
                    //System.out.println(list);
                }
            } catch (FileNotFoundException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            } finally {
                if (br != null) {
                    try {
                        br.close();
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                }
            }

            try {
                Kryo kryo = new Kryo();
                file = new File("test.kryo");
                fop = new FileOutputStream(file);
                //if the file doesnt exist, then create it.
                if (!file.exists()) {
                    file.createNewFile();
                }
                Output output = new Output(fop);
                kryo.writeObject(output, list);
                output.close();
            } catch (Exception ex) {
                ex.printStackTrace();
            } finally {
                try {
                    if (fop != null) {
                        fop.close();
                    }
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }

        public void deserialize() {
            File file;
            FileInputStream fip = null;
            // deserialize
            try {
                Kryo kryo = new Kryo();
                file = new File("test.kryo");
                fip = new FileInputStream(file);
                if (!file.exists()) {
                    file.createNewFile();
                }
                Input input = new Input(fip);
                //ArrayList<Long> listin = kryo.readObject(input, Long.class);
                Long listin = kryo.readObject(input, Long.class);
                //System.out.println(Arrays.toString(listin.toArray()));
                System.out.println(listin);
                input.close();
            } catch (Exception ex) {
                ex.printStackTrace();
            } finally {
                try {
                    if (fip != null) {
                        fip.close();
                    }
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }

javac -d . CsvToJavaObject.java
java mypkg.CsvToJavaObject.java

Thanks,
Eliz

2017-07-02 23:14 GMT+08:00 <b...@...>:

Hi Elis,
Did the same sequence of Gremlin statements work fine for the tinkerpop-modern.kryo or grateful-dead.kryo example files?
If so, your test.kryo file is the problem. How did you create it?
Cheers, Marc
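One quick way to act on the question above is to compare the first bytes of test.kryo with those of a file known to load, such as tinkerpop-modern.kryo. The sketch below is a minimal, stdlib-only Java helper for that. The class name LeadingBytes and the 16-byte window are illustrative choices, not part of TinkerPop, and the code makes no claim about what a Gryo header actually contains. If the two prefixes look nothing alike, the file was probably not produced by a Gryo writer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not part of TinkerPop): prints the leading bytes of
// each file given on the command line so two files can be compared by eye.
public class LeadingBytes {

    /** Returns up to count leading bytes of the file as space-separated hex. */
    public static String hexPrefix(Path file, int count) throws IOException {
        byte[] buffer = new byte[count];
        int filled = 0;
        try (InputStream in = Files.newInputStream(file)) {
            // read() may return fewer bytes than requested, so loop until
            // the buffer is full or the end of the file is reached.
            while (filled < count) {
                int n = in.read(buffer, filled, count - filled);
                if (n < 0) {
                    break;
                }
                filled += n;
            }
        }
        StringBuilder hex = new StringBuilder();
        for (int i = 0; i < filled; i++) {
            if (i > 0) {
                hex.append(' ');
            }
            hex.append(String.format("%02x", buffer[i] & 0xff));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws IOException {
        // Example arguments: data/tinkerpop-modern.kryo data/test.kryo
        for (String name : args) {
            Path file = Paths.get(name);
            System.out.println(name + " (" + Files.size(file) + " bytes): "
                    + hexPrefix(file, 16));
        }
    }
}
```

Run it against local copies of both files (before hdfs.copyFromLocal) and compare the two lines of output.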
On Wednesday, 28 June 2017 at 15:02:27 UTC+2, Elizabeth wrote:

Hi all,

I was using the Kryo format and BulkLoaderVertexProgram to load large files into JanusGraph, and encountered an error:

gremlin> hdfs.copyFromLocal('data/test.kryo','data/test.kryo')
==>null
gremlin> graph = GraphFactory.open('conf/hadoop-graph/hadoop-load.properties')
==>hadoopgraph[gryoinputformat->gryooutputformat]
gremlin>
gremlin> blvp = BulkLoaderVertexProgram.build().writeGraph('conf/janusgraph-hbase-es.properties').create(graph)
==>BulkLoaderVertexProgram[bulkLoader=IncrementalBulkLoader, vertexIdProperty=bulkLoader.vertex.id, userSuppliedIds=false, keepOriginalIds=true, batchSize=0]
gremlin>
gremlin> result = graph.compute(SparkGraphComputer).program(blvp).submit().get()
20:21:32 ERROR org.apache.spark.executor.Executor - Exception in task 0.0 in stage 0.0 (TID 0)
java.io.EOFException
    at java.io.DataInputStream.readByte(DataInputStream.java:267)
    at org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoRecordReader.seekToHeader(GryoRecordReader.java:93)
    at org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoRecordReader.initialize(GryoRecordReader.java:85)
    at org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoInputFormat.createRecordReader(GryoInputFormat.java:38)

Has anyone had this error before? Please help me with this last step!
Any ideas are appreciated!

Best,
Meng
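The stack trace above ends in DataInputStream.readByte called from GryoRecordReader.seekToHeader, which suggests the reader ran off the end of the file while looking for a header. The following is a simplified stand-in for that kind of scan, not TinkerPop's actual code: the method name seekToMarker and the marker bytes "HDR!" are hypothetical. It illustrates why input that never contains the expected marker ends in java.io.EOFException rather than a more descriptive error.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Simplified illustration only; this is NOT the GryoRecordReader source.
public class HeaderScanSketch {

    /**
     * Reads byte by byte until the last marker.length bytes equal marker.
     * Returns the number of bytes consumed. If the stream ends first,
     * readByte() throws EOFException.
     */
    public static long seekToMarker(DataInputStream in, byte[] marker) throws IOException {
        if (marker.length == 0) {
            throw new IllegalArgumentException("marker must not be empty");
        }
        byte[] window = new byte[marker.length];
        long consumed = 0;
        while (true) {
            byte next = in.readByte(); // throws EOFException at end of stream
            consumed++;
            // Slide the window left by one byte and append the new byte.
            System.arraycopy(window, 1, window, 0, window.length - 1);
            window[window.length - 1] = next;
            if (consumed >= marker.length && Arrays.equals(window, marker)) {
                return consumed;
            }
        }
    }

    private static DataInputStream open(byte[] data) {
        return new DataInputStream(new ByteArrayInputStream(data));
    }

    public static void main(String[] args) throws IOException {
        // "HDR!" is a made-up marker, chosen only for this demonstration.
        byte[] marker = "HDR!".getBytes(StandardCharsets.US_ASCII);
        byte[] withMarker = "xxHDR!payload".getBytes(StandardCharsets.US_ASCII);
        byte[] withoutMarker = "plain serialized bytes".getBytes(StandardCharsets.US_ASCII);

        System.out.println("marker ends at byte " + seekToMarker(open(withMarker), marker));
        try {
            seekToMarker(open(withoutMarker), marker);
        } catch (EOFException e) {
            System.out.println("no marker found: " + e);
        }
    }
}
```

Running main reports a marker position for the first input and an EOFException for the second, which mirrors the shape of the failure in the trace.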