Flink-derived operator names cause issues in Graphite metrics


Carst Tankink
Hi,

We accidentally forgot to give some operators in our Flink streaming job a custom, unique name, and ran into the following exception in Graphite:
exceptions.IOError: [Errno 36] File name too long: '/<pathToJob>/<jobName>/TriggerWindow_SlidingEventTimeWindows_600000_-600000__-FoldingStateDescriptor_serializer=org-apache-flink-api-common-typeutils-base-IntSerializer_655523dd_-initialValue=0_-foldFunction=<functionQualifiedName>_24e08d59__-EventTimeTrigger___-WindowedStream-fold_AllWindowedStream-java:<lineNo> __/0/buffers/inputQueueLength.wsp'

(I replaced some parts with placeholders because the real values might reveal too much about our platform, sorry. The actual file name is quite a bit longer.)

The problem seems to be that Flink uses the operator's toString() output as its name if no name is set, and the Graphite reporter does not limit the length of that output.
Should this be reported as a bug, or is it a known limitation that we missed in the documentation?
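
To illustrate why Graphite rejects this metric: Whisper stores each metric as a file whose path mirrors the metric name, so the whole operator name ends up as a single directory name. Below is a minimal sketch that finds the offending path components. It assumes a limit of 255 bytes per path component, which is common on Linux but depends on the file system; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class PathComponentCheck {
    // Assumption: 255 bytes per path component, as on many Linux file systems.
    static final int ASSUMED_NAME_MAX_BYTES = 255;

    /** Returns every component of the path whose UTF-8 encoding exceeds maxBytes. */
    static List<String> tooLongComponents(String path, int maxBytes) {
        List<String> result = new ArrayList<>();
        for (String component : path.split("/")) {
            if (component.getBytes(StandardCharsets.UTF_8).length > maxBytes) {
                result.add(component);
            }
        }
        return result;
    }
}
```

Calling `PathComponentCheck.tooLongComponents(path, PathComponentCheck.ASSUMED_NAME_MAX_BYTES)` on a path like the one above would return only the generated operator name, since the other components are short.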

Thanks,
Carst


Chesnay Schepler
So there are two issues here:
  1. The default names for windows are horrible. They are too long, full of special characters, and unstable, as reported in FLINK-6464.
  2. The reporter doesn't filter out metrics it can't report.

For 2) we can do two things:

  • If a fully assembled metric name is too long, the Graphite reporter ignores the metric and logs a warning.
  • When converting the operator name to a string, limit the total size to, say, 40-60 characters. This may not be enough for your use case, though.

I'll create JIRAs for 2) and try to fix them as soon as possible.

A more comprehensive solution will be implemented as part of FLINK-6464, which includes a clean-up/refactoring of operator names.
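
As a rough sketch of the two options for 2), not the actual reporter code: the class name, method names, and limits below are hypothetical, and the warning is printed to stderr only to keep the example self-contained.

```java
import java.util.Optional;

class MetricNameGuard {
    /** Option 2: cut the operator name down to at most maxLength characters. */
    static String limitOperatorName(String operatorName, int maxLength) {
        if (operatorName.length() <= maxLength) {
            return operatorName;
        }
        return operatorName.substring(0, maxLength);
    }

    /**
     * Option 1: return the fully assembled metric name if it is short enough
     * to report; otherwise log a warning and return an empty Optional.
     */
    static Optional<String> filterMetric(String fullMetricName, int maxTotalLength) {
        if (fullMetricName.length() > maxTotalLength) {
            System.err.println("WARN: metric name exceeds " + maxTotalLength
                    + " characters, not reporting it: " + fullMetricName);
            return Optional.empty();
        }
        return Optional.of(fullMetricName);
    }
}
```

Plain truncation can make two different operators share the same shortened name, so a real fix would probably need to keep the names distinguishable in some way.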

In reply to Carst Tankink's message of 12.06.2017 14:45.



Carst Tankink

Thanks for the quick response :-)

I think limiting the names is still good enough for my use case: by default we name our operators properly (it helps when creating dashboards), but if we miss one, we do not want to start hammering our Graphite setup with bad data.

Thanks again,
Carst

In reply to Chesnay Schepler's message of Monday, June 12, 2017 at 15:10.