Re: How to read large amount of data from hive and write to redis, in a batch manner?
You could always buffer records in your sink function/operator until a large enough batch has accumulated, then upload the whole batch at once. Note that if you want at-least-once or exactly-once semantics, you need to account for those buffered records across checkpoints and failures. For example you could:
1. Buffer records in some in-memory data structure (not Flink's state), and make sure they are flushed to the underlying sink on `CheckpointedFunction#snapshotState()` calls. That gives you at-least-once: every record covered by a completed checkpoint has reached the sink.
2. Buffer records in Flink's state (heap state backend or RocksDB - the heap state backend is the fastest with little overhead, but you risk running out of memory), which easily gives you exactly-once. That way your batch can span multiple checkpoints.
3. Buffer/write records to temporary files, keeping in mind that those files need to be persisted and recovered in case of failure and restart.
4. Ignore checkpointing and either always restart the job from scratch or accept some occasional data loss.
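Option 1 can be sketched roughly like this. This is a simplified plain-Java illustration of the buffering logic only, not real Flink code: the class and method names (`BatchingSink`, `flushBatch`) are made up, and in a real job `invoke` and `snapshotState` would be the corresponding `SinkFunction` / `CheckpointedFunction` methods, with `flushBatch` doing e.g. a pipelined Redis write:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Sketch of option 1: buffer records in a plain in-memory list and flush
// either when the batch is full or when a checkpoint is taken.
public class BatchingSink<T> {
    private final int maxBatchSize;
    private final Consumer<List<T>> flushBatch; // e.g. a pipelined Redis write
    private final List<T> buffer = new ArrayList<>();

    public BatchingSink(int maxBatchSize, Consumer<List<T>> flushBatch) {
        this.maxBatchSize = maxBatchSize;
        this.flushBatch = flushBatch;
    }

    // Called once per incoming record (SinkFunction#invoke in a real sink).
    public void invoke(T record) {
        buffer.add(record);
        if (buffer.size() >= maxBatchSize) {
            flush();
        }
    }

    // Called from CheckpointedFunction#snapshotState: flushing here ensures
    // every record covered by the checkpoint has reached the external system,
    // which is what makes this at-least-once.
    public void snapshotState() {
        flush();
    }

    private void flush() {
        if (!buffer.isEmpty()) {
            flushBatch.accept(new ArrayList<>(buffer));
            buffer.clear();
        }
    }
}
```

The key point is the `snapshotState()` flush: without it, records sitting in the buffer at checkpoint time would be silently lost on failure, because the checkpoint already claims they were processed.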
FYI, virtually every connector/sink internally batches writes to some extent, usually via option 1.