Flink job generating small files in Hadoop
I have an Apache Flink application that reads messages from Google PubSub through the PubSubSource connector and writes them to Hadoop/HDFS in Parquet format.