[ANNOUNCE] Weekly Community Update 2019/33-36

Konstantin Knauf-2
Dear Community,

happy to share this "week's" community update, back after a three-week summer break. It's been a very busy time in the Flink community, as a lot of FLIP discussions and votes for Apache Flink 1.10 are under way. I will try to cover a good part of them in this update, along with bugs in Flink 1.9.0 and more...

Flink Development

* [roadmap] There are currently two great resources to get an overview of Flink's roadmap for 1.10 and beyond. The first one is the recently updated roadmap on the project website [1] and the other one is a discussion thread launched by Gary on the features for Flink 1.10 [2]. Gary and Yu Li stepped up as release managers for Flink 1.10 and proposed a feature freeze around the end of November 2019 and a release at the beginning of January 2020. Most of the FLIP discussions covered in this update are mentioned on these roadmaps.

* [releases] The vote for Apache Flink 1.8.2 RC1 [3] is currently ongoing. Check out the corresponding discussion thread [4] for a list of fixes.

* [development] Following up on the repository split discussion, the community is now looking into other ways to reduce the build time of Apache Flink. Chesnay has proposed several options, some of which are being investigated in more detail as of writing. Among these are sharing JVMs between tests for more modules, moving to Gradle as a build system (better incremental builds) and moving to a different CI system (Azure Pipelines?). [5]

* [state] Yu Li proposes to add a new state backend to Flink, the SpillableHeapStatebackend. [6] State will primarily live on the Java heap, but the coldest state will be spilled to disk if memory becomes scarce. The vote has already passed. [7]

* [python] Jincheng has started a discussion on adding support for user-defined functions in the Python Table API. The high-level architecture follows the approach of Beam's portability framework of executing user-defined functions in a separate language-specific environment. The first FLIP (FLIP-58) will only deal with stateless user-defined functions and will lay the groundwork. [8]

* [sql] Xu Forward has started a discussion on adding functions to construct and query JSON objects in Flink SQL. The proposal has generally been well-received, but there is no FLIP yet. [9]

* [sql] Bowen has started a discussion on reworking the function catalog, which among other goals aims to support external built-in functions (Hive), to revisit the resolution order of function names and to support fully qualified function names. [10]

* [connectors] Yijie Shen proposes to contribute the Apache Pulsar connector (currently in Apache Pulsar) back to Apache Flink. While everyone agrees that a strong Apache Pulsar connector is a valuable contribution to the project, there are concerns about build time, maintainability in the long run and dependencies on FLIP-27 (New Source Interface). The discussion is ongoing. [11]

* [connectors] From Apache Flink 1.10 onwards the Kinesis Connector will be part of the Apache Flink release. In the past this was blocked by the license of its dependencies, which have recently been changed to Apache 2.0. [12]

* [recovery] Till has published two small FLIPs on Flink's restart strategies. The first one, FLIP-61, proposes to change the logic that determines the restart strategy so that restart strategy configuration properties are ignored when the corresponding restart strategy was not set via "restart-strategy". The other one, FLIP-62, proposes to change the default restart delay for all strategies from 0s to 1s. The vote has passed for both of them [13, 14].

* [resource management] Following up on FLIP-49, Xintong Song has started a discussion on FLIP-53 to add fine-grained operator resource management to Flink [15]. If I understand it correctly, the feature will only be available via the Blink Planner of the Table API at first, and might later be extended to the DataStream API. The DataSet API will not be affected. The vote [16] is currently ongoing.

* [configuration] Dawid introduced a FLIP that adds support for configuring ExecutionConfig (and similar classes) from a file or, more generally, from layers above the StreamExecutionEnvironment. Currently, you need access to the StreamExecutionEnvironment to change these configurations. [17]

* [development] Stephan proposed to switch to Java's Duration class instead of Flink's Time class for non-API parts of Flink (API maybe in Flink 2.0). [18]

* [development] Gyula started a discussion to unify the implementation of the Builder pattern in Flink. Following the discussion he will add some guidelines to the code style guide. [19] 

* [releases] Apache Flink-shaded 8.0 has been released. [20]
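
To make the two restart strategy FLIPs in the [recovery] item above more concrete, here is a minimal, self-contained sketch of the idea as I read it from the summary. It is not Flink's actual implementation: the class RestartStrategySketch, the method resolve, the plain Map standing in for Flink's configuration and the returned description strings are all hypothetical. The delay value is parsed as an ISO-8601 duration purely for brevity, which is an assumption of this sketch and not Flink's duration format.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names are Flink classes.
public class RestartStrategySketch {

    // FLIP-62 proposes 1s instead of 0s as the default restart delay.
    static final Duration DEFAULT_DELAY = Duration.ofSeconds(1);

    // Returns a short description of the strategy this sketch would pick.
    static String resolve(Map<String, String> conf) {
        String strategy = conf.get("restart-strategy");
        if (strategy == null) {
            // FLIP-61 idea: strategy-specific properties alone do not
            // select a strategy, so they are ignored here.
            return "default";
        }
        if (strategy.equals("fixed-delay")) {
            String raw = conf.get("restart-strategy.fixed-delay.delay");
            // Assumption: ISO-8601 text such as "PT10S" (not Flink's format).
            Duration delay = (raw == null) ? DEFAULT_DELAY : Duration.parse(raw);
            return "fixed-delay(delay=" + delay + ")";
        }
        return strategy;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("restart-strategy.fixed-delay.delay", "PT10S");
        System.out.println(resolve(conf)); // the delay property alone is ignored

        conf.put("restart-strategy", "fixed-delay");
        System.out.println(resolve(conf)); // now the delay property is honoured
    }
}
```

What the sketch leaves out is what "default" means when no strategy is set; the FLIP documents [13, 14] are the place to check that.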

Notable Bugs

For this update, I will focus on new bugs in Flink 1.9.0.

* [FLINK-13386] [FLINK-13799] [FLINK-13591] [1.9.0] A couple of issues with the new dashboard have already been filed. If you experience any friction with it, check if these tickets already address the issue. Otherwise please create a new issue. [21,22,23]

* [FLINK-13568] [1.9.0] It is currently not possible to create a table with a "String" data type via the SQL DDL. Resolved. [24]

* [FLINK-13940] [1.9.0] [1.8.1] Since Flink 1.8.0, the StreamingFileSink cleans up some temporary files in S3 during recovery. If a job fails during recovery after the cleanup, subsequent recovery attempts also fail, because the files have already been cleaned up. This results in data loss. Fixed with a workaround for 1.9.1 and 1.8.2. [25]

* [FLINK-13526] [1.9.0] When switching to a non-existent catalog or database in the SQL Client, the client crashes. [26]

* [FLINK-13737] flink-examples-table is missing from the binary distribution of Flink 1.9.0. Fixed for 1.9.1. [27]

* [FLINK-13958] [1.9.0] [1.8.1] [1.7.2] A native library can only be loaded by a single classloader per JVM. This may be a problem if a native library is loaded via Flink's user classloader, because after recovery a new user classloader might try to load the library again. The discussion on a possible resolution is ongoing. [28]

Events, Blog Posts, Misc

* Kostas Kloudas is now a member of the Apache Flink PMC. Congratulations! [29]
* Andrey Zagrebin is now an Apache Flink Committer. Congrats! [30]
* Flink Forward Europe training registration closes on September 30th. This time there are four different full-day training options (Dev, Ops, SQL, Tuning & Troubleshooting). [31]
* Upcoming Meetups
    * Enrico Canzonieri of Yelp and David Massart of Tentative will share the Apache Flink user stories of Yelp and BNP Paribas at the next Bay Area Apache Flink Meetup on the 24th of September. [32]
    * On the 23rd of September there will be another edition of the London Flink Meetup with a talk by Yelp on how they run Flink on K8s. [33]




Konstantin Knauf | Solutions Architect