Relative Content

Tag Archive for apache-spark, apache-spark-sql, cdc, rocksdb

Is it a good idea to merge CDC data with a MongoDB snapshot using RocksDB and generate a final snapshot?

Currently we have an in-house mongo-cdc process that generates JSON files and writes them to S3. We also have an initial snapshot of MongoDB, dumped as Parquet files on S3. An end-of-day (EOD) Spark job merges the CDC JSON files with the snapshot Parquet files and writes a new set of Parquet files to S3.
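
To make the merge step concrete, here is a minimal sketch of the semantics such a job typically needs: for each document key, the latest CDC event wins, and deletes drop the document. It is plain Python rather than Spark, and the event layout (`_id`, `op`, `ts`, `doc`) is an assumption for illustration only, not the actual format of our CDC files.

```python
from typing import Any, Dict, Iterable, List

Row = Dict[str, Any]


def merge_snapshot_with_cdc(snapshot: Iterable[Row],
                            cdc_events: Iterable[Row]) -> List[Row]:
    """Apply CDC events to a snapshot and return the new snapshot.

    Assumed (hypothetical) event shape:
      {"_id": key, "op": "insert" | "update" | "delete", "ts": number, "doc": {...}}
    where "doc" carries the full document for inserts and updates.
    """
    # Index the existing snapshot by document key.
    state: Dict[Any, Row] = {row["_id"]: row for row in snapshot}

    # Apply events in timestamp order so that the latest event per key wins.
    # sorted() is stable, so events with equal ts keep their input order.
    for event in sorted(cdc_events, key=lambda e: e["ts"]):
        key = event["_id"]
        if event["op"] == "delete":
            state.pop(key, None)       # ignore deletes for unknown keys
        else:
            state[key] = event["doc"]  # insert/update as a full-document upsert

    # Return rows in a deterministic order.
    return sorted(state.values(), key=lambda r: r["_id"])


snapshot = [{"_id": 1, "name": "a"}, {"_id": 2, "name": "b"}]
cdc_events = [
    {"_id": 2, "op": "update", "ts": 2, "doc": {"_id": 2, "name": "b2"}},
    {"_id": 1, "op": "delete", "ts": 3},
    {"_id": 3, "op": "insert", "ts": 1, "doc": {"_id": 3, "name": "c"}},
]
new_snapshot = merge_snapshot_with_cdc(snapshot, cdc_events)
```

The in-memory dictionary here plays the role that a keyed store such as RocksDB would play in the proposed design: one current value per key, overwritten or removed as events arrive.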
