Re: Issues while iterating over self-loop edges in Apache Spark
hadoopmarc@...
Hi Mladen,
Indeed, the self-loop logic you point to still exists in:
https://github.com/JanusGraph/janusgraph/blob/master/janusgraph-hadoop/src/main/java/org/janusgraph/hadoop/formats/util/JanusGraphVertexDeserializer.java

I guess the intent of filtering these self-loop edges is to prevent a single self-loop edge from appearing twice, once as an IN edge and once as an OUT edge. I also guess that the actual implementation is buggy: it is not the responsibility of the InputFormat to filter any data (as your example shows), but rather to represent all data present faithfully.

Can you report an issue for this at https://github.com/JanusGraph/janusgraph/issues ?

This also means that there is no easy way out, other than starting with a private build that contains a fix (and possibly contributing the fix as a PR).

Best wishes,

Marc
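To illustrate the point about self-loops: here is a minimal, self-contained sketch of the difference between dropping self-loop entries and merely de-duplicating them. This is not the JanusGraph code. The `Entry` class, the `dedupeByEdgeId` helper and the ids are hypothetical, and it assumes that a vertex's adjacency list holds a self-loop twice (once per direction) under the same edge id.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SelfLoopSketch {
    enum Direction { IN, OUT }

    // Hypothetical stand-in for one adjacency entry of a single vertex.
    static final class Entry {
        final long edgeId;
        final long otherVertexId;
        final Direction direction;

        Entry(long edgeId, long otherVertexId, Direction direction) {
            this.edgeId = edgeId;
            this.otherVertexId = otherVertexId;
            this.direction = direction;
        }
    }

    // Keep every edge, but only the first entry seen for each edge id.
    // A self-loop is therefore emitted once instead of being dropped.
    static List<Entry> dedupeByEdgeId(List<Entry> entries) {
        Map<Long, Entry> seen = new LinkedHashMap<>();
        for (Entry e : entries) {
            seen.putIfAbsent(e.edgeId, e);
        }
        return new ArrayList<>(seen.values());
    }

    public static void main(String[] args) {
        long v = 1L; // id of the vertex being deserialized (made up)
        List<Entry> entries = Arrays.asList(
            new Entry(100L, v, Direction.OUT), // self-loop, OUT side
            new Entry(100L, v, Direction.IN),  // same self-loop, IN side
            new Entry(101L, 2L, Direction.OUT) // ordinary edge to vertex 2
        );
        for (Entry e : dedupeByEdgeId(entries)) {
            System.out.println(e.edgeId + " " + e.direction + " -> " + e.otherVertexId);
        }
    }
}
```

With this approach the self-loop with id 100 survives as a single entry, so the reader of the InputFormat still sees all the data that is present in the graph.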