Checkpoint size growing over time

Daniel Harper
Hi there, 

We are running a streaming application on Flink 1.5.2 with Beam 2.7.0.

We’ve noticed that the checkpoint size has been increasing slowly and steadily over the course of many months (see screenshot), and we are not sure why.

We take a checkpoint every 5 minutes and have an allowed lateness period of 30 minutes. 
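
To make the expectation concrete, here is a minimal sketch (plain Python, not Beam or Flink code) of when window state would normally become eligible for cleanup: roughly once the watermark passes the end of the window plus the allowed lateness. The 30-minute allowed lateness is from our setup; the 5-minute fixed window size and the function names are assumptions made purely for illustration.

```python
from datetime import datetime, timedelta, timezone

ALLOWED_LATENESS = timedelta(minutes=30)  # from the pipeline described above
WINDOW_SIZE = timedelta(minutes=5)        # hypothetical, for illustration only


def window_expiry(window_start, window_size=WINDOW_SIZE,
                  allowed_lateness=ALLOWED_LATENESS):
    """Approximate watermark time after which a window's state can be dropped."""
    return window_start + window_size + allowed_lateness


def live_windows(watermark, window_starts):
    """Return the window starts whose state should still be retained."""
    return [s for s in window_starts if window_expiry(s) > watermark]


if __name__ == "__main__":
    utc = timezone.utc
    starts = [datetime(2019, 9, 5, 10, m, tzinfo=utc) for m in (0, 5, 30)]
    watermark = datetime(2019, 9, 5, 10, 36, tzinfo=utc)
    for start in live_windows(watermark, starts):
        print("still retained:", start.isoformat())
```

Under these assumptions only a bounded number of windows is live at any time, so window state alone should level off rather than grow for months.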

Does anyone have any idea why this is happening, and are there any tools we can use to debug what state is accumulating in the checkpoint? I’m assuming something is supposed to discard prior state, but in this case that does not appear to be happening.

Kind regards,


Attachment: Screen Shot 2019-09-05 at 10.00.06.png (63K)