We have loaded 400 million vertices into HBase through the JanusGraph API. Our HBase table is pre-split into 36 regions. We find that some regions are 10 times larger than others, which causes a balancing problem. On the JanusGraph configuration side, we have not set any partition settings.
In the first situation, we use auto-incremented IDs as the vertex IDs. All the data whose column family is edgeStore then goes into a single region. As the picture below shows, all the data goes into the 7th region.
In the second situation, we let JanusGraph generate the vertex IDs. The problem still exists, but it is not as serious as in the first situation.
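
To illustrate what we suspect is happening, here is a minimal sketch of a range-partitioned key space. It is only a simplified model under stated assumptions: the 64-bit key width, the evenly spaced split points, and the names `SPLIT_POINTS`, `region_for`, and `region_histogram` are hypothetical, and it does not reproduce JanusGraph's actual row-key encoding or our real split points. It only shows that contiguous keys tend to fall into one key range, while well-spread keys cover all of them.

```python
import bisect
import random
from collections import Counter

KEY_BITS = 64      # assumption: row keys behave like unsigned 64-bit integers
NUM_REGIONS = 36   # matches the pre-split count described above

# Assumption: the split points divide the key space into equal ranges.
SPLIT_POINTS = [(i * (1 << KEY_BITS)) // NUM_REGIONS for i in range(1, NUM_REGIONS)]


def region_for(key: int) -> int:
    """Return the index of the region whose key range contains `key`."""
    return bisect.bisect_right(SPLIT_POINTS, key)


def region_histogram(keys) -> Counter:
    """Count how many keys land in each region."""
    return Counter(region_for(k) for k in keys)


# Auto-incremented IDs: contiguous keys all fall into the same key range.
seq_hist = region_histogram(range(1, 10_001))

# Well-spread keys (a seeded random generator stands in for a good ID scheme).
rng = random.Random(0)
spread_hist = region_histogram(rng.getrandbits(KEY_BITS) for _ in range(10_000))

print("regions hit by sequential keys:", len(seq_hist))
print("regions hit by spread keys:", len(spread_hist))
```

In this model the sequential keys hit a single region and the spread keys reach every region. Which region receives the sequential keys in a real cluster depends on how the IDs are encoded into row keys, which this sketch does not attempt to model.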