
Re: Janus Graph Performance with Cassandra vs BigTable

hadoopmarc@...
 

Hi Vishal,

Your question is very general. What matters most to you: write performance, simple queries, or complex queries? Do you mean a comparison between managed Cassandra and managed Bigtable in terms of the euros needed for a specific workload? I am not aware of independent benchmark results for the JanusGraph storage backends, and vendors can be skimpy on the circumstances of the benchmarks they present.

Some general notions:
  • Cassandra has a non-Java drop-in replacement, ScyllaDB; large JanusGraph deployments therefore often use ScyllaDB, see the materials on https://janusgraph.org/
  • JanusGraph does not have much code that is specific to any one storage backend, but the adapters were designed with Cassandra in mind
  • Compatibility between JanusGraph and BigTable is only maintained indirectly through JanusGraph-HBase compatibility (I am not aware that this resulted in problems in the past, though)
Best wishes,     Marc


Re: a problem about elasticsearch

Peter Corless
 

Is it ES [the software] that is bottlenecking, or could it be the HW you have it running on? If the HW isn't the issue, have you been able to trace where the issue is in ES?

I hate to be "that guy," but if the underlying storage engine isn't keeping up, you have options with JanusGraph. Of course, if you resolve the issue and can keep running on ES, I am all for least-disruptive solutions.

But if not, I'd be remiss to not put in a plug for Scylla as a better performing option as a JanusGraph data store.

Hope you get it resolved!

On Fri, Jun 11, 2021, 1:37 AM <anjanisingh22@...> wrote:
Hi Anshul,

I am facing the same issue. Did you find a solution?

Thanks,
Anjani


Re: a problem about elasticsearch

anjanisingh22@...
 

Hi Anshul,

I am facing the same issue. Did you find a solution?

Thanks,
Anjani


Janus Graph Performance with Cassandra vs BigTable

Vishal Gupta <vgupta@...>
 

Hi Community/Team, 

I see that JanusGraph can be integrated with multiple storage backends, such as Cassandra and Bigtable.
I am trying to evaluate which storage backend is more performant for JanusGraph.

Does anyone have recommendations here? Has anyone done a performance comparison of JanusGraph + Bigtable vs JanusGraph + Cassandra?

Thanks
Vishal


Transaction Recovery and Bulk Loading

madams@...
 

Hi all,

We've been integrating our pipelines with JanusGraph for some time now and it's been working great, thanks to the developers!
We use the transaction recovery job and enabled batch-loading for performance, and then we realized that the write-ahead transaction log is not used when batch-loading is enabled.
Out of curiosity, is there any reason for this?
At the moment we disabled batch loading and consistency checks. We've thought about replacing the transaction recovery with a reindexing job but reindexing is quite a heavy operation.

Thanks,
Best Regards,
Marc


Issues while iterating over self-loop edges in Apache Spark

Mladen Marović
 

Hello,

while debugging some Apache Spark jobs that process data from a JanusGraph graph, I noticed some issues with self-loop edges (edges that connect a vertex to itself). The data is read using:

javaSparkContext.newAPIHadoopRDD(hadoopConfiguration(), CqlInputFormat.class, NullWritable.class, VertexWritable.class)

When I try to process all outbound edges of a single vertex using:

vertex.edges(Direction.OUT)

and that vertex has multiple self-loop edges with the same edge label, the iterator always returns only one such edge. Edges that are not self-loop are all returned as expected.

To give a specific example, if I have a vertex V0 with edges E1, E2, E3, E4, E5 that lead to vertices V1, V2, V3, V4, V5, the call vertex.edges(Direction.OUT) will return an iterator that iterates over all five edges. However, if I have a vertex V0 with edges E1, E2, E3 that lead to V1, V2, V3, and self-loop edges EL1, EL2, EL3, the iterator will iterate over E1, E2, E3, and only one of (EL1, EL2, EL3), giving a total of four edges instead of the expected six.
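The counts in this example can be reproduced with a toy model. The sketch below is not the JanusGraph deserialization code; the Edge class and dropRepeatedSelfLoops are hypothetical names, and the filter simply assumes that only the first self-loop per edge label survives, which matches the behavior observed above.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class SelfLoopDemo {
    /** Minimal edge model (hypothetical): id, label, source and target vertex. */
    static final class Edge {
        final String id, label, from, to;

        Edge(String id, String label, String from, String to) {
            this.id = id;
            this.label = label;
            this.from = from;
            this.to = to;
        }

        boolean isSelfLoop() {
            return from.equals(to);
        }
    }

    /** Keeps every ordinary edge but only the first self-loop per edge label. */
    static List<Edge> dropRepeatedSelfLoops(List<Edge> outEdges) {
        Set<String> seenLoopLabels = new HashSet<>();
        List<Edge> kept = new ArrayList<>();
        for (Edge e : outEdges) {
            // Set.add returns false when this label was already seen on a self-loop.
            if (!e.isSelfLoop() || seenLoopLabels.add(e.label)) {
                kept.add(e);
            }
        }
        return kept;
    }
}
```

Feeding it E1, E2, E3 (V0 to V1, V2, V3) plus EL1, EL2, EL3 (V0 to V0, same label) keeps four of the six edges.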

After further analysis, I came upon this commit:

https://github.com/JanusGraph/janusgraph/commit/d3006dc939c1b640bb263806abd3fd6bee630d12

which explicitly added code that skips deserializing multiple self-loop edges. The code from the linked commit is still present in org.janusgraph:janusgraph-hadoop:0.5.3 and seems to be the cause of this unexpected behavior.

My questions are as follows:

  1. What is the reason behind implementing the change from the given commit?
  2. Is there another way to iterate on all edges, including (possibly) multiple self-loop edges with the same edge label?

Kind regards,

Mladen Marović


Re: Difference Between JanusGraph Server and Embedded JanusGraph in Java

hadoopmarc@...
 

Hi Zach,

1. For building an API service you do not need Gremlin Server. Gremlin Server does have features that might slightly reduce the complexity of your service, at the cost of maintaining Gremlin Server itself. The main driver for using Gremlin Server is the support for Gremlin Language Variants, which you do not need.
Resource usage should not differ very much for similar workloads and comparable settings; Gremlin Server requires an additional JVM, but might be more optimized than what you build in house.

2. First check whether you can connect to Gremlin Server from the Gremlin Console. If that works, please report more details about which visualization tool you use.

Best wishes,     Marc


Difference Between JanusGraph Server and Embedded JanusGraph in Java

Zach B.
 

I've seen a lot of discussion about the benefits and such of both implementations but I was wondering if there was a big difference in terms of resource usage? I'm building an API service that will be deployed to a low resource virtual machine and I was wondering if there was a big difference between the memory usage of the two implementations.

On a separate note, I have been developing with the Embedded implementation, using HBase as a storage backend. I wanted to use a visualization tool to see if my graph looks the way I want, but all the tools I see require gremlin-server. So I started up the server using exactly the same HBase configuration as Embedded, but it displays an empty graph. Does anyone know why that is the case?

Thank you in advance.


Re: Getting org.janusgraph.graphdb.database.idassigner.IDPoolExhaustedException consistently

hadoopmarc@...
 

Hi,

There does not seem to be much that helps in finding a root cause (no similar questions or issues in history). The most helpful thing I found is the following javadoc:
https://javadoc.io/doc/org.janusgraph/janusgraph-core/latest/org/janusgraph/graphdb/database/idassigner/placement/SimpleBulkPlacementStrategy.html

Assuming that you use this default SimpleBulkPlacementStrategy, what value do you use for ids.num-partitions? The default number might be too small. At the beginning of a Spark job, the tasks can be more or less synchronized, that is, they finish after about the same amount of time and then cause congestion (task number 349 ...). If this is the case, other configs could help too:

  • ids.renew-percentage: if you increase this value, congestion is avoided a bit, but this cannot have a high impact.
  • ids.flush: I assume you did not change the default "true" value.
  • ids.authority.conflict-avoidance-mode: undocumented, but talks about contention during ID block reservation.
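As a rough illustration of how these options could be collected in code, here is a minimal sketch using only java.util.Properties. The class and method names are hypothetical, and the values are assumptions for illustration (one ID partition per parallel writer, a renew percentage of 0.4), not tuned recommendations.

```java
import java.util.Properties;

final class IdAllocationConfig {
    /**
     * Builds overrides for the ID-allocation options discussed above.
     * All values are illustrative assumptions, not recommendations.
     */
    static Properties forParallelWriters(int parallelWriters) {
        if (parallelWriters <= 0) {
            throw new IllegalArgumentException("parallelWriters must be positive");
        }
        Properties p = new Properties();
        // Assumption: roughly one ID partition per concurrently writing task.
        p.setProperty("ids.num-partitions", Integer.toString(parallelWriters));
        // A higher value makes the next ID block get reserved earlier.
        p.setProperty("ids.renew-percentage", "0.4");
        // Left at the value that the thread refers to as the default.
        p.setProperty("ids.flush", "true");
        return p;
    }
}
```

The resulting entries would still have to be merged into whatever configuration is used to open the graph, and some ids.* options can only be changed under specific conditions, so it is worth checking which values are actually in effect.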

Best wishes,    Marc


Getting org.janusgraph.graphdb.database.idassigner.IDPoolExhaustedException consistently

sauverma
 

Hi

I am getting the below exception while ingesting data to an existing graph

Job aborted due to stage failure: Task 349 in stage 2.0 failed 10 times, most recent failure: Lost task 349.9 in stage 2.0 (TID 2524, dproc-connect-graph1-prod-us-sw-xwv9.c.zeotap-prod-datalake.internal, executor 262): org.janusgraph.graphdb.database.idassigner.IDPoolExhaustedException: Could not find non-exhausted partition ID Pool after 1000 attempts

The value of `ids.block-size` is set to 5000000 (50M) and I am using spark for data loading (around 300 executors per run).

Could you please suggest the configuration which can fix this issue?

Thanks


Re: Backend data model deserialization

Elliot Block <eblock@...>
 

Awesome thank you all for the great info and recent presentations!  We are prototyping bulk export + deserialize from Cloud Bigtable over approx. the next week and will try to report back if we can produce something useful to share.  Thanks again, -Elliot
 
On Thu, May 20, 2021 at 6:45 AM sauverma <saurabhdec1988@...> wrote:
At zeotap we've taken the same route to enable OLAP consumers via Apache Spark. We presented it at the recent JanusGraph meetup at https://lists.lfaidata.foundation/g/janusgraph-users/topic/janusgraph_meetup_4/82939376. We are using ScyllaDB as the backend.
  
On Thu, May 20, 2021, 6:12 PM Boxuan Li <liboxuan@...> wrote:
If you want to resort to the source code, you could check out EdgeSerializer and IndexSerializer. Here is a simple code snippet demonstrating how to deserialize an edge:
 
On May 20, 2021, at 8:07 PM, hadoopmarc@... wrote:
If you look back at this week's OLAP presentations (https://lists.lfaidata.foundation/g/janusgraph-users/topic/janusgraph_meetup_4/82939376) you will see that one of the presenters did exactly what you propose: they exported rows from ScyllaDB and converted them to gryo format for import into TinkerPop HadoopGraph. You might want to contact them to coordinate a possible contribution to the JanusGraph project.


Re: ID block allocation exception while creating edge

hadoopmarc@...
 

Hi Anjani,

One thing that does not feel good is that you create and commit a transaction for every row of your dataframe. Although I do not see how this would interfere with ID allocation, best practice is to have partitions of about 10,000 vertices/edges and commit each of these as one batch. In case of an exception, you roll back the transaction and raise your own exception. After that, Spark will retry the partition and your job will still succeed. It is worth a try.
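The batching idea can be sketched without any graph library. In the sketch below, BatchTx is a hypothetical stand-in for a graph transaction (it is not a JanusGraph type), and writePartition, txFactory and writeRow are illustrative names. With a partition of about 10,000 rows and a batch size of 10,000 this amounts to one commit per partition.

```java
import java.util.Iterator;
import java.util.function.BiConsumer;
import java.util.function.Supplier;

/** Hypothetical stand-in for a graph transaction; not a JanusGraph type. */
interface BatchTx {
    void commit();
    void rollback();
}

final class BatchedWriter {
    /**
     * Writes the rows of one partition in batches and commits once per batch.
     * Returns the number of commits. On failure the open batch is rolled back
     * and an exception is raised so that the caller can retry the partition.
     */
    static <T> int writePartition(Iterator<T> rows, int batchSize,
                                  Supplier<BatchTx> txFactory,
                                  BiConsumer<BatchTx, T> writeRow) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int commits = 0;
        while (rows.hasNext()) {
            BatchTx tx = txFactory.get(); // one transaction per batch, not per row
            try {
                for (int i = 0; i < batchSize && rows.hasNext(); i++) {
                    writeRow.accept(tx, rows.next());
                }
                tx.commit();
                commits++;
            } catch (RuntimeException e) {
                tx.rollback();
                // Rethrow so the calling task fails visibly and can be retried.
                throw new IllegalStateException("batch failed; retry the partition", e);
            }
        }
        return commits;
    }
}
```

In a Spark job, something like this would be called from foreachPartition, with txFactory opening a transaction on the shared graph instance. Retrying a partition is only safe if the writes of already committed batches can be repeated without creating duplicates.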

Best wishes,    Marc


Re: Making janus graph client to not use QUORUM

anjanisingh22@...
 

Thanks Marc, I will try that option.


Re: ID block allocation exception while creating edge

anjanisingh22@...
 

Sharing details on how I am creating nodes/edges, to make sure that nothing there is wrong and causing the ID allocation failures.

I am creating one static JanusGraph instance on each Spark worker box and using it to create and commit multiple transactions.

pairRDD.foreachPartition(partIterator -> {
    partIterator.forEachRemaining(tuple -> {
        createNodeAndEdge(tuple, JanusGraphConfig.getJanusGraph(janusConfig));
    });
});

where JanusGraphConfig.getJanusGraph returns the static instance.

 

In the createNodeAndEdge() method I create a GraphTraversalSource from the static JanusGraph instance, create the node and the edge, commit, and then close the GraphTraversalSource, as shown below in pseudocode:

void createNodeAndEdge(Tuple2<K, V> pair, JanusGraph janusGraph) {
    // A new transaction and traversal source for every tuple
    GraphTraversalSource g = janusGraph.buildTransaction().start().traversal();
    try {
        // create node
        // create edge
        g.tx().commit();
    } catch (Exception e) {
        g.tx().rollback();
    } finally {
        g.tx().close();
        g.close();
    }
}

 

Thanks,
Anjani


Re: ID block allocation exception while creating edge

anjanisingh22@...
 

Thanks for the response, Marc. Yes, I also think that for some reason the changes are not getting picked up, but I am not able to figure out why.

ids.block-size is updated in the config file of all JanusGraph nodes, and after that all nodes were restarted.

In the code I have only one method that creates the JanusGraph instance, and the same instance is passed to the method for node/edge creation.

Yes, IDS_BLOCK_SIZE equals "ids.block-size".

Thanks,
Anjani


Re: ID block allocation exception while creating edge

hadoopmarc@...
 

Hi Anjani,

It is still most likely that the modified value of "ids.block-size" somehow does not come through. So, are you sure that:
  • all JanusGraph instances are closed before using the new value ("ids.block-size" has the GLOBAL_OFFLINE mutability level)? The safest approach is to have a fresh keyspace and a single location for the properties used for both graph creation and bulk loading.
  • (sorry for asking) IDS_BLOCK_SIZE equals "ids.block-size"?
Best wishes,    Marc


Re: ID block allocation exception while creating edge

anjanisingh22@...
 

Hi Marc,

I tried setting ids.num-partitions = number of executors through code, not directly in the JanusGraph global config files, but no luck. I added the properties below, but it didn't help.
configProps.set("ids.renew-timeout", "240000");
configProps.set("ids.renew-percentage", "0.4");
configProps.set("ids.num-partitions", "253");

Thanks,
Anjani


Re: MapReduce reindexing with authentication

Boxuan Li
 

Hi Marc,

That is an interesting solution. I was not aware of the mapreduce.application.classpath property. It is not well documented, but from what I understand, this option is used primarily to distribute the MapReduce framework rather than user files. Glad to know it can be used for user files as well.
I am not 100% sure, but it seems to require you to upload the file to HDFS first (if you are using a YARN cluster). The ToolRunner, however, can also add a file from the local filesystem. We prefer not to store keytab files on HDFS permanently. This difference is subtle, though. Also, we don’t use the Gremlin Console anyway, so not being able to do this via the Gremlin Console is not a drawback for us.

Agree with you that the documentation can be enhanced. Right now it simply says “The class starts a Hadoop MapReduce job using the Hadoop configuration and jars on the classpath.”, which is too brief and assumes users have a good knowledge of Hadoop MapReduce.

> One could even think of putting the mapreduce properties in the graph properties file and pass on properties of this namespace to the mapreduce client.

Not sure if it’s possible, but if someone implements it, it would be very helpful for users to get started quickly without worrying about the cumbersome Hadoop configs.

Best regards,
Boxuan

On Mon, May 24, 2021 at 3:48 PM, <hadoopmarc@...> wrote:

Hi Boxuan,

Yes, you are right, I mixed things up by wrongly interpreting GENERIC_OPTIONS as an env variable. I did some additional experiments, though, which bring in new information.

1. It is possible to put a mapred-site.xml file on the JanusGraph classpath that is automatically loaded by the mapreduce client. When using the file below during mapreduce reindexing, I get the following exception (on purpose):

gremlin> mr.updateIndex(i, SchemaAction.REINDEX).get()
java.io.FileNotFoundException: File file:/tera/lib/janusgraph-full-0.5.3/hi.tgz does not exist

The mapreduce config parameters are listed in https://hadoop.apache.org/docs/r2.7.3/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml
The description for mapreduce.application.framework.path suggests that you can pass additional files to the mapreduce workers using this option (without any changes to JanusGraph).

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>local</value>
  </property>
  <property>
    <name>mapreduce.application.classpath</name>
    <value>dummy</value>
  </property>
  <property>
    <name>mapreduce.application.framework.path</name>
    <value>hi.tgz</value>
  </property>
  <property>
    <name>mapred.map.tasks</name>
    <value>2</value>
  </property>
  <property>
    <name>mapred.reduce.tasks</name>
    <value>2</value>
  </property>
</configuration>

2. When using mapreduce reindexing in the documented way, it already issues the following warning:
08:49:55 WARN  org.apache.hadoop.mapreduce.JobResourceUploader  - Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.

If you resolved your keytab issue by modifying the JanusGraph code and calling the Hadoop ToolRunner, you would have the additional advantage of getting rid of this warning. This would not work from the Gremlin Console, though, unless gremlin.sh passes the additional command line options to the java command line (ugly).

So, I think I would prefer the option with mapred-site.xml. It would not hurt to slightly extend the mapreduce reindexing documentation, anyway:
  • when calling from the gremlin console, you need an "import org.janusgraph.hadoop.MapReduceIndexManagement"
  • mapreduce has a default setting mapreduce.framework.name=local. Where do you set mapreduce.framework.name=yarn for using your cluster? One could even think of putting the mapreduce properties in the graph properties file and pass on properties of this namespace to the mapreduce client.
Best wishes,    Marc


Re: MapReduce reindexing with authentication

hadoopmarc@...
 

Hi Boxuan,

Yes, you are right, I mixed things up by wrongly interpreting GENERIC_OPTIONS as an env variable. I did some additional experiments, though, which bring in new information.

1. It is possible to put a mapred-site.xml file on the JanusGraph classpath that is automatically loaded by the mapreduce client. When using the file below during mapreduce reindexing, I get the following exception (on purpose):

gremlin> mr.updateIndex(i, SchemaAction.REINDEX).get()
java.io.FileNotFoundException: File file:/tera/lib/janusgraph-full-0.5.3/hi.tgz does not exist

The mapreduce config parameters are listed in https://hadoop.apache.org/docs/r2.7.3/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml
The description for mapreduce.application.framework.path suggests that you can pass additional files to the mapreduce workers using this option (without any changes to JanusGraph).

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>local</value>
  </property>
  <property>
    <name>mapreduce.application.classpath</name>
    <value>dummy</value>
  </property>
  <property>
    <name>mapreduce.application.framework.path</name>
    <value>hi.tgz</value>
  </property>
  <property>
    <name>mapred.map.tasks</name>
    <value>2</value>
  </property>
  <property>
    <name>mapred.reduce.tasks</name>
    <value>2</value>
  </property>
</configuration>

2. When using mapreduce reindexing in the documented way, it already issues the following warning:
08:49:55 WARN  org.apache.hadoop.mapreduce.JobResourceUploader  - Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.

If you resolved your keytab issue by modifying the JanusGraph code and calling the Hadoop ToolRunner, you would have the additional advantage of getting rid of this warning. This would not work from the Gremlin Console, though, unless gremlin.sh passes the additional command line options to the java command line (ugly).

So, I think I would prefer the option with mapred-site.xml. It would not hurt to slightly extend the mapreduce reindexing documentation, anyway:
  • when calling from the gremlin console, you need an "import org.janusgraph.hadoop.MapReduceIndexManagement"
  • mapreduce has a default setting mapreduce.framework.name=local. Where do you set mapreduce.framework.name=yarn for using your cluster? One could even think of putting the mapreduce properties in the graph properties file and pass on properties of this namespace to the mapreduce client.
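The namespace idea in the last bullet could look roughly like the sketch below. This is hypothetical: JanusGraph does not do this today as far as this thread shows, and NamespaceProperties and select are made-up names. It only shows how entries under a prefix such as "mapreduce." could be picked out of a graph properties file.

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

final class NamespaceProperties {
    /** Returns all entries whose key starts with the given prefix, e.g. "mapreduce.". */
    static Map<String, String> select(Properties graphProps, String prefix) {
        Map<String, String> selected = new TreeMap<>();
        for (String key : graphProps.stringPropertyNames()) {
            if (key.startsWith(prefix)) {
                selected.put(key, graphProps.getProperty(key));
            }
        }
        return selected;
    }
}
```

The selected entries could then be copied one by one into the Hadoop configuration of the reindexing job, so that a setting like mapreduce.framework.name=yarn would live next to the storage settings.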
Best wishes,    Marc


Re: Making janus graph client to not use QUORUM

hadoopmarc@...
 

Hi Anjani,

To see what exactly happens with local configurations, I did the following:
  • from the binary janusgraph distribution I started janusgraph with "bin/janusgraph.sh start" (this implicitly uses conf/janusgraph-cql-es.properties)
  • I made a copy of conf/janusgraph-cql-es.properties in which I added your storage.cql.read-consistency-level=LOCAL_ONE
  • In gremlin console I ran the code below (using JanusGraph in an embedded way, no remote connection):
graph = JanusGraphFactory.open('conf/janusgraph-cql-es-local-one.properties')
conf = graph.getConfiguration().getLocalConfiguration()
ks = conf.getKeys(); null;
while (ks.hasNext()) {
  k = ks.next()
  System.out.print(String.format("%30s: %s\n", k, conf.getProperty(k)))
}
With printed output:
              storage.hostname: 127.0.0.1
storage.cql.read-consistency-level: LOCAL_ONE
                cache.db-cache: true
          storage.cql.keyspace: janusgraph
               storage.backend: cql
         index.search.hostname: 127.0.0.1
           cache.db-cache-size: 0.25
                 gremlin.graph: org.janusgraph.core.JanusGraphFactory

Can you print the configuration in the same way on the client that shows the exception about QUORUM?
That way, we can check whether the problem is in your code or in JanusGraph not properly passing on the local configuration.

Best wishes,    Marc