Flink vs. Spark deployment modes on a multi-node cluster

Sheel Pancholi

In Spark, I am familiar with three cluster (non-local) deployment options:

  • Standalone
  • Mesos
  • YARN

There may be more cluster deployment options, but I am only concerned with these three. All three support both client and cluster deploy modes. In client mode, the driver program runs on the edge machine from which the job is submitted; in cluster mode, the driver is launched on one of the worker nodes inside the cluster.
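To make the distinction concrete, here is a minimal sketch of how I understand the two modes map onto `spark-submit` invocations. The `--master` and `--deploy-mode` flags are the real ones; the host names, ports, the `app.jar` path and the helper function itself are placeholders I made up for illustration.

```python
# Hypothetical master URLs; the hosts and ports are placeholders, not a real cluster.
MASTERS = {
    "standalone": "spark://master-host:7077",
    "mesos": "mesos://mesos-host:5050",
    "yarn": "yarn",
}


def spark_submit_argv(cluster_manager, deploy_mode, app="app.jar"):
    """Build the spark-submit command line for a cluster manager and deploy mode."""
    if deploy_mode not in ("client", "cluster"):
        raise ValueError("deploy_mode must be 'client' or 'cluster'")
    return [
        "spark-submit",
        "--master", MASTERS[cluster_manager],
        # client: driver runs on the submitting (edge) machine
        # cluster: driver is launched inside the cluster
        "--deploy-mode", deploy_mode,
        app,
    ]


if __name__ == "__main__":
    # Print one command line per (cluster manager, deploy mode) combination.
    for manager in MASTERS:
        for mode in ("client", "cluster"):
            print(" ".join(spark_submit_argv(manager, mode)))
```

What I am looking for is the Flink equivalent of this picture: which of these combinations exist, and where the program that builds and submits the job actually runs.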

On the Flink side, I only have experience with a single-node setup, which I learned from a tutorial that focused on code rather than on the bigger picture of the ecosystem. To fill that gap, I have been looking at Flink's deployment options. The documentation covers all three options (Standalone, Mesos and YARN), but it is not clear from the docs whether Flink supports what Spark's jargon would call client mode, cluster mode, both, or some other mode altogether.

The goal is to replace a Spark cluster with a Flink one, and I want to understand the steps as I carry them out. The steps themselves are in the docs, but the rationale behind them is either too implicit for me to follow or missing altogether.

Please help me understand.