Flink job generating small files in Hadoop

I have an Apache Flink application that reads messages from Google Pub/Sub through the PubSubSource connector and writes them to Hadoop/HDFS in Parquet format.