Task and Operator Metrics in Flink 1.3


Dail, Christopher

I’m using the Flink 1.3.0 release and am not seeing all of the metrics that I would expect to see. I have Flink configured to write out metrics via StatsD, and I am consuming them with Telegraf. Initially I thought this was an issue with Telegraf parsing the data. I dumped all of the metrics going into Telegraf using tcpdump and found that a lot of the data I expected was missing.
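As a cross-check alongside tcpdump, a throwaway UDP listener can show exactly which metric names arrive. The following is only a minimal sketch: it assumes the usual StatsD line format of `name:value|type`, and the function names and the default port are illustrative choices, not part of Flink or Telegraf.

```python
import socket


def parse_statsd_packet(payload):
    """Split a StatsD UDP payload into (name, value, type) tuples."""
    metrics = []
    for line in payload.decode("utf-8", errors="replace").splitlines():
        if not line.strip():
            continue
        # Assumed line format: name:value|type
        name, _, rest = line.partition(":")
        value, _, kind = rest.partition("|")
        metrics.append((name, value, kind))
    return metrics


def listen(host="0.0.0.0", port=8125):
    """Print every metric received on the given UDP port until interrupted."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        while True:
            payload, sender = sock.recvfrom(65535)
            for name, value, kind in parse_statsd_packet(payload):
                print(f"{sender[0]} {name} = {value} ({kind})")


if __name__ == "__main__":
    listen()
```

Pointing metrics.reporter.statsd.host and metrics.reporter.statsd.port at the machine running this script would show, per sending host, whether any task- or operator-level names are emitted at all.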

 

I’m using this as a reference for what metrics I expect:

https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/metrics.html

 

I see all of the JobManager- and TaskManager-level metrics. Things like Status.JVM.* are coming through. The TaskManager Status.Network metrics are there (but not the task-level buffer metrics). The ‘Cluster’ metrics are there.

The IO section of that page lists task- and operator-level metrics (like what is available on the dashboard). I’m not seeing any of these metrics coming through when using StatsD.

I’m running Flink with this configuration:

 

metrics.reporters: statsd
metrics.reporter.statsd.class: org.apache.flink.metrics.statsd.StatsDReporter
metrics.reporter.statsd.host: hostname
metrics.reporter.statsd.port: 8125

# Customized Scopes
metrics.scope.jm: flink.jm
metrics.scope.jm.job: flink.jm.<job_name>
metrics.scope.tm: flink.tm.<tm_id>
metrics.scope.tm.job: flink.tm.<tm_id>.<job_name>
metrics.scope.task: flink.tm.<tm_id>.<job_name>.<task_name>.<subtask_index>
metrics.scope.operator: flink.tm.<tm_id>.<job_name>.<operator_name>.<subtask_index>

 

I have tried with and without specifically setting the metrics.scope values.
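To make the effect of the scope formats concrete, here is a rough illustration of how a format string and a metric name combine into the identifier a reporter sends. This is not Flink's implementation; the function name and the sample variable values are hypothetical.

```python
import re


def expand_scope(scope_format, variables, metric_name):
    """Fill <placeholders> in a scope format and append the metric name."""

    def substitute(match):
        key = match.group(1)
        # Unknown placeholders are left as-is so that mistakes stay visible.
        return str(variables.get(key, match.group(0)))

    scope = re.sub(r"<([a-z_]+)>", substitute, scope_format)
    return f"{scope}.{metric_name}"


if __name__ == "__main__":
    task_scope = "flink.tm.<tm_id>.<job_name>.<task_name>.<subtask_index>"
    # Sample values are made up for illustration.
    values = {"tm_id": "tm1", "job_name": "My Job",
              "task_name": "Source", "subtask_index": 0}
    print(expand_scope(task_scope, values, "numRecordsIn"))
```

Note that whatever characters appear in the job, task, or operator name end up verbatim in the identifier, so a name such as "My Job" puts a space into the metric name.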

 

Is anyone else having similar issues with metrics in 1.3?

 

Thanks

 

Chris Dail


 

 

 


Re: Task and Operator Metrics in Flink 1.3

Chesnay Schepler
The scopes look OK to me.

Let's try to narrow down the problem areas a bit:
  1. Did this work with the same setup before 1.3?
  2. Are all task/operator metrics available in the metrics tab of the dashboard?
  3. Are there any warnings in the TaskManager logs from the MetricRegistry or StatsDReporter?
My guess would be that the operator/task metrics contain characters that either StatsD or telegraf don't allow,
which causes them to be dropped.

On 12.06.2017 20:32, Dail, Christopher wrote:


Re: Task and Operator Metrics in Flink 1.3

Dail, Christopher

Responses to your questions:

  1. Did this work with the same setup before 1.3?

I have not tested it with another version. I started working on the metrics with a snapshot of 1.3 and then moved to the release.

  2. Are all task/operator metrics available in the metrics tab of the dashboard?

Yes, the metrics are visible in the dashboard.

  3. Are there any warnings in the TaskManager logs from the MetricRegistry or StatsDReporter?

No, I am not seeing any errors in the logs related to metrics.

> My guess would be that the operator/task metrics contain characters that either StatsD or telegraf don't allow,
> which causes them to be dropped.

 

This was my original thought too. I did find two separate issues with the metrics Flink outputs, and I am planning to file JIRA tickets for them:

- Flink does not escape spaces. I had a space in the job name, which messed up the metric names. I have a workaround for this, but it is probably something Flink should escape.

- Flink is outputting a value of “n/a” for the lastCheckpointExternalPath gauge. A gauge value needs to be a float, so Telegraf does not like this. It reports an error and carries on, ignoring the metric.

Note that even with these accounted for, I am still not seeing the task/operator metrics. I ran tcpdump to be sure of exactly what is coming through. Searching through that dump, I don’t see any of the metrics I was looking for.
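For anyone hitting the same two problems, one possible way to account for them before the data reaches the consumer is sketched below. It assumes StatsD lines of the form `name:value|type`; the helper names are hypothetical, and replacing whitespace with underscores is just one choice, not necessarily what Flink does or will do.

```python
def sanitize_name(name):
    """Drop leading/trailing whitespace and turn each inner run into one underscore."""
    return "_".join(name.split())


def is_numeric(value):
    """Return True if a gauge value can be parsed as a float."""
    try:
        float(value)
    except ValueError:
        return False
    return True


def clean_line(line):
    """Return a sanitized StatsD line, or None if its value is not numeric."""
    name, sep, rest = line.partition(":")
    value, _, kind = rest.partition("|")
    if not sep or not is_numeric(value):
        return None
    return f"{sanitize_name(name)}:{value}|{kind}"


if __name__ == "__main__":
    # Sample lines are made up for illustration.
    for raw in ("flink.jm.My Job.lastCheckpointDuration:42|g",
                "flink.jm.My Job.lastCheckpointExternalPath:n/a|g"):
        print(repr(raw), "->", repr(clean_line(raw)))
```

Dropping non-numeric gauges mirrors what the consumer ends up doing anyway, but does so without the error noise.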

 

A few other things to note. This is the application I am running:
https://github.com/chrisdail/pravega-samples/blob/master/flink-examples/src/main/scala/io/pravega/examples/flink/iot/TurbineHeatProcessor.scala

Also, I am running this on DC/OS 1.9 and trying to integrate with DC/OS metrics.

 

Thanks

 

Chris

 

 

From: Chesnay Schepler <[hidden email]>
Date: Tuesday, June 13, 2017 at 5:26 AM
To: "[hidden email]" <[hidden email]>
Subject: Re: Task and Operator Metrics in Flink 1.3

 


Re: Task and Operator Metrics in Flink 1.3

Chesnay Schepler
Both of your suggestions sound good; it would be great if you could create JIRAs for them.

Could you replace the task scope format with the one below and try again?

metrics.scope.task: flink.tm.<tm_id>.<job_id>.<task_id>.<subtask_index>

This scope doesn't contain any special characters, except the periods.
If you receive task metrics with this scope, there are some other special characters we need to filter out.

Filtering characters in the StatsDReporter is always a bit icky though, since it supports many storage
backends with different requirements. The last resort would be to filter out all special characters.

On 13.06.2017 13:41, Dail, Christopher wrote:


Re: Task and Operator Metrics in Flink 1.3

Dail, Christopher

I think I found the root cause of this problem. It has to do with how DC/OS metrics handling works.

 

DC/OS passes special environment variables to any task started by Mesos. These include STATSD_UDP_HOST and STATSD_UDP_PORT. DC/OS sets up a StatsD relay that enriches each StatsD event with extra data, such as the Mesos framework and container the event came from. The host/port it passes in are unique to each Mesos task. What I was doing was replacing the ‘metrics.reporter.statsd.host’ and ‘metrics.reporter.statsd.port’ config values in the flink-conf.yaml file with the values from DC/OS on startup.

The above works fine for the JobManager, which is the main task that Mesos starts. When DC/OS starts the Mesos tasks for the Flink TaskManagers, they cannot reach that same host/port combination to talk to StatsD. This is why I was getting only the metrics down to the JobManager level and not any of the task-level metrics: the TaskManagers could not actually talk to the DC/OS StatsD relay.
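The start-up substitution described above might look roughly like the following sketch. The helper name `apply_statsd_env` and the sample values are made up for illustration; only the two environment variable names and the two config keys come from this thread. For every process to report through its own relay, something like this would presumably have to run inside each container (the JobManager and every TaskManager) with that container's own environment. Whether the TaskManager tasks actually receive usable values is not something I have confirmed.

```python
import os

HOST_KEY = "metrics.reporter.statsd.host"
PORT_KEY = "metrics.reporter.statsd.port"


def apply_statsd_env(conf_text, env):
    """Return conf_text with the StatsD reporter host/port taken from env.

    Lines are left untouched when the corresponding variable is absent.
    """
    overrides = {}
    if "STATSD_UDP_HOST" in env:
        overrides[HOST_KEY] = env["STATSD_UDP_HOST"]
    if "STATSD_UDP_PORT" in env:
        overrides[PORT_KEY] = env["STATSD_UDP_PORT"]

    out = []
    for line in conf_text.splitlines():
        key = line.split(":", 1)[0].strip()
        if key in overrides:
            out.append(f"{key}: {overrides.pop(key)}")
        else:
            out.append(line)
    # Append any override whose key was not already present in the file.
    out.extend(f"{key}: {value}" for key, value in overrides.items())
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = (
        "metrics.reporters: statsd\n"
        "metrics.reporter.statsd.host: hostname\n"
        "metrics.reporter.statsd.port: 8125\n"
    )
    print(apply_statsd_env(sample, os.environ), end="")
```

If the values are resolved only once, in the JobManager's container, and the resulting file is then shared with the TaskManagers, every process ends up pointing at the JobManager's relay endpoint, which matches the symptom described above.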

 

So this is going to be an integration issue for anyone running Flink on DC/OS with the built-in DC/OS metrics. For my setup I have worked around it by writing directly to our Telegraf server and bypassing DC/OS, which loses the extra metadata DC/OS adds.

 

I will file the other issues I mentioned earlier.

 

Thanks

 

Chris

 

From: Chesnay Schepler <[hidden email]>
Date: Tuesday, June 13, 2017 at 9:21 AM
To: "Dail, Christopher" <[hidden email]>, "[hidden email]" <[hidden email]>
Subject: Re: Task and Operator Metrics in Flink 1.3

 


Re: Task and Operator Metrics in Flink 1.3

Dail, Christopher

For reference, the two issues I filed on the metrics:

 

https://issues.apache.org/jira/browse/FLINK-6910 - Metrics value for lastCheckpointExternalPath is not valid

https://issues.apache.org/jira/browse/FLINK-6911 - StatsD Metrics name should escape spaces

 

Thanks

 

Chris

 

From: "Dail, Christopher" <[hidden email]>
Date: Tuesday, June 13, 2017 at 1:58 PM
To: Chesnay Schepler <[hidden email]>, "[hidden email]" <[hidden email]>
Subject: Re: Task and Operator Metrics in Flink 1.3

 
