1. If you have keys to your vertices available, either vertex ids or unique values of some vertex property, you can start as many Gremlin clients as your backend can handle and distribute the keys over the clients. This is the easy case, but it is often not applicable.
2. If no keys are available, Gremlin can only offer a full scan, g.V(). If your client machine has many cores, running the traversal with withComputer(), either with or without spark-local, will help you parallelize the scan.
3. You can copy the vertex files from the storage backend and decode them offline. The decoding procedures are implicit in the JanusGraph source code, but I am not aware of any library that does this for you explicitly.
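For the case where keys are available, distributing them over clients could look roughly like the sketch below. It is only an illustration: `split_keys`, `fetch_chunk` and `parallel_fetch` are hypothetical names, and `fetch_chunk` returns dummy records so the code runs without a server. In a real setup each worker would open its own Gremlin connection and look up its chunk of keys there.

```python
from concurrent.futures import ThreadPoolExecutor


def split_keys(keys, n_workers):
    """Deal the keys round-robin into n_workers roughly equal chunks."""
    return [keys[i::n_workers] for i in range(n_workers)]


def fetch_chunk(chunk):
    """Placeholder for the per-client lookup (an assumption, not a real API).

    With a Gremlin client this is where a worker would run a traversal
    such as g.V(<ids in chunk>) over its own connection.
    """
    return [{"id": key} for key in chunk]


def parallel_fetch(keys, n_workers=4, fetch=fetch_chunk):
    """Run one fetch per chunk of keys and flatten the results."""
    chunks = [c for c in split_keys(keys, n_workers) if c]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = pool.map(fetch, chunks)
    return [record for part in parts for record in part]


if __name__ == "__main__":
    vertices = parallel_fetch(list(range(10)), n_workers=3)
    print(len(vertices))
```

Threads are probably enough here because the work is dominated by waiting on the backend. The number of workers should be tuned to what the storage backend can sustain, not to the number of client cores.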
The choice is yours, but I would suggest option 2 with spark-local, since it is the one that works out of the box.