Operate with JMX metrics and measurement units. What metrics to observe for read queries?


vincent....@...
 

I have set up JanusGraph and enabled JMX. I have a JMX exporter for Prometheus, which allows me to query the metrics and generate graphs in Grafana.

There are probably a hundred different metrics available. What I find hard to understand is:

- what unit are they in (microseconds, milliseconds?)
- which ones are useful to observe for read queries?

One example: I have Cassandra as the backend, a Spring Boot API to run queries, and a UI in front of it.

I have metrics and graphs for Cassandra and the API component. But I would like to know how many queries JanusGraph is handling and how long queries take to run at the 99th percentile.

There is a metric called `metrics_org_janusgraph_query_graph_execute_time_99thPercentile`. Is that the right one to observe? What unit is it in? Microseconds or milliseconds?

I get values such as `400`.

What metrics do you look at to observe read performance?


v.sure...@...
 

Hi Vincent,

We were configuring JMX to report the JanusGraph (JG) metrics to Prometheus. Here is the sample JMX YAML file:

lowercaseOutputName: true
lowercaseOutputLabelNames: true
ssl: false
rules:
- pattern: global.storeManager.openDatabase.calls
  name: "jg_calls_count"
- pattern: org.janusgraph.<type=(.*)>
  name: "jg_test_$1"

I started the Gremlin Server with the Java agent:

exec $JAVA -javaagent:"/data/janusgraph/jmx/jmx_prometheus_javaagent-0.3.1.jar"=9086:/data/janusgraph/jmx/jmx.yml -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.port=5556 -Djanusgraph.logdir="$JANUSGRAPH_LOGDIR" -Dlog4j.configuration=conf/gremlin-server/log4j-server.properties $JAVA_OPTIONS -cp $CP:$CLASSPATH org.apache.tinkerpop.gremlin.server.GremlinServer $ARGS

And added the JMX reporter to the graph properties file:

metrics.enabled = true
metrics.console.interval = 60000
metrics.jmx.enabled = true
metrics.jmx.interval = 60000


All of the JVM-specific metrics show up in Prometheus (targets), but none of the JanusGraph-specific metrics do. Could you share the steps you followed to configure the JMX metrics reporter for Prometheus?

Thanks,
Suresh V
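
A quick way to confirm which renamed metrics actually reach the exporter is to read its text output directly, before involving Prometheus or Grafana. The following is a minimal sketch, not part of the original setup: the port 9086 and the `jg_` prefix come from the config above, while the host `localhost`, the `/metrics` path, and both function names are assumptions made for illustration.

```python
import urllib.request


def janusgraph_metric_names(exposition_text, prefix="jg_"):
    """Return the sorted, de-duplicated sample names that start with `prefix`."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and the "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line looks like "<name>{labels} <value>" or "<name> <value>".
        name = line.split("{", 1)[0].split(" ", 1)[0]
        if name.startswith(prefix):
            names.add(name)
    return sorted(names)


def fetch_metrics(url="http://localhost:9086/metrics", timeout=5):
    """Download the exporter's text output (host and path are assumptions)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `janusgraph_metric_names(fetch_metrics())` against a running Gremlin Server would list whatever `jg_` metrics the rules produced. An empty list while JVM metrics are present would suggest that the rules, rather than the agent itself, are the place to look.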

On Tuesday, June 26, 2018 at 9:50:53 AM UTC+5:30, vin...@... wrote:


Florian Hockmann <f...@...>
 

Did you enable the jmxReporter in the gremlin-server.yaml?

On Tuesday, August 13, 2019 at 13:19:52 UTC+2, v....@... wrote:


v.sure...@...
 

No, I didn't add the jmxReporter to gremlin-server.yaml. However, the following properties were added to the respective graph.properties file:

metrics.enabled = true
metrics.console.interval = 60000
metrics.jmx.enabled = true


BTW, I was able to resolve this by adding the Prometheus JMX Java agent and the JMX VM arguments in gremlin-server.sh:

exec $JAVA -javaagent:"/data/jmx/jmx_prometheus_javaagent-0.3.1.jar"=9086:/data/jmx/jg_jmx.yml -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.port=5556 -Djanusgraph.logdir="$JANUSGRAPH_LOGDIR" -Dlog4j.configuration=conf/gremlin-server/log4j-server.properties $JAVA_OPTIONS -cp $CP:$CLASSPATH org.apache.tinkerpop.gremlin.server.GremlinServer $ARGS


I also added some metrics to the jg_jmx.yml file:

lowercaseOutputName: true
lowercaseOutputLabelNames: true
ssl: false
rules:
- pattern: global.storeManager.openDatabase.calls
  name: "jg_global_storemanager_opendb_calls"
- pattern: global.storeManager.startTransaction.calls
  name: "jg_global_storemanager_start_tx_calls"
- pattern: org.janusgraph.caches.retrievals
  name: "jg_caches_retrievals"
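
One possible reason the literal rules above match where the wildcard rule from the earlier message did not: the Prometheus JMX exporter documents that patterns are unanchored regexes, tested against a string of the form `domain<property=value, ...><keys>attribute: value`. The sketch below mimics that matching with Python's `re` module. The sample bean line is an assumption: the domain `metrics` and the property `name` are guessed from the metric name quoted at the top of this thread, and the last pattern is a hypothetical rule, not one anyone in the thread has tested.

```python
import re

# ASSUMPTION: how the exporter might render one JanusGraph timer attribute
# before applying the rules. The domain "metrics" and the property "name"
# are guesses based on the metric name quoted earlier in this thread.
BEAN_LINE = "metrics<name=org.janusgraph.query.graph.execute.time><>99thPercentile: 400"


def first_match(pattern, line=BEAN_LINE):
    """Unanchored search, mirroring the exporter's unanchored patterns."""
    return re.search(pattern, line)


# The wildcard rule from the first attempt expects a "type" property.
wildcard = first_match(r"org.janusgraph.<type=(.*)>")

# A literal rule in the spirit of the working config.
literal = first_match(r"org.janusgraph.query.graph.execute.time")

# HYPOTHETICAL wildcard rule written against the assumed bean layout.
candidate = first_match(r"metrics<name=org\.janusgraph\.(.+)><>(\w+)")

print("wildcard rule matched:", wildcard is not None)
print("literal rule matched:", literal is not None)
print("candidate rule captured:", candidate.groups() if candidate else None)
```

If the real bean names differ from the assumed line, the same three checks can be rerun against a line copied from a JMX browser such as JConsole.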



On Friday, September 6, 2019 at 1:53:58 PM UTC+5:30, Florian Hockmann wrote:


Florian Hockmann <f...@...>
 

Good to hear that you solved the problem already!

If you have the time and want to contribute to JanusGraph, then you could document the necessary steps to configure JanusGraph to get metrics into Prometheus: https://github.com/JanusGraph/janusgraph/issues/1087

On Friday, September 6, 2019 at 13:57:06 UTC+2, v....@... wrote:


Ronnie
 

Hi,
Could someone please clarify the default unit for the time-related metrics? I couldn't find this in any of the docs or elsewhere on the internet.

Thanks!
Ronnie