This is my first question on this forum, so I am not sure how best to ask it, but here goes.
I have a legacy application written in C. It has multiple batch jobs; some are scheduled and some run on demand. All of these jobs run in parallel and depend on each other's data through a shared database. The data volume is large, around 70 million records. It is a monolithic application, and we are going to migrate it to microservices, but I have a few questions:
- In a microservice architecture, how does a batch job read the data it needs from another service's database? Say batch job b1 is running with its own b1 database, and batch job b2 is running in parallel with its own b2 database. How can data be shared between these parallel jobs?
My current thinking is to use Apache Spark, but I am not sure it is the right choice given the large volume of data we have to process.
- We also have to report the status of running jobs. I am considering Kafka for publishing job status. Is that the right approach?
In short: I am considering an Apache Spark implementation with Kafka streaming, and I need to know how to share data between parallel jobs.
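To make the status question more concrete, here is a minimal sketch of the kind of status event I have in mind. Everything in it is an assumption for illustration: the topic name `batch-job-status`, the `JobStatus` fields, and the record counts are hypothetical, and `InMemoryBus` is only a stand-in for a real Kafka producer so that the sketch runs without a broker.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical topic name, not taken from any existing system.
STATUS_TOPIC = "batch-job-status"


@dataclass
class JobStatus:
    job_id: str     # e.g. "b1" or "b2"
    state: str      # "RUNNING", "COMPLETED" or "FAILED"
    processed: int  # records handled so far
    total: int      # records the job expects to handle

    def to_message(self):
        """Serialize to (key, value) bytes.

        Using job_id as the key means a real Kafka producer with the
        default partitioner would send all events of one job to the same
        partition, which keeps them in order.
        """
        key = self.job_id.encode("utf-8")
        value = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return key, value


class InMemoryBus:
    """Stand-in for a Kafka producer; it just records what was sent."""

    def __init__(self):
        self.messages = []

    def send(self, topic, key, value):
        self.messages.append((topic, key, value))


def latest_status(bus, topic):
    """Fold the event stream into the most recent status per job."""
    latest = {}
    for msg_topic, key, value in bus.messages:
        if msg_topic == topic:
            latest[key.decode("utf-8")] = json.loads(value)
    return latest


bus = InMemoryBus()
events = [
    JobStatus("b1", "RUNNING", 1_000_000, 70_000_000),  # illustrative counts
    JobStatus("b2", "RUNNING", 500, 2_000),
    JobStatus("b1", "COMPLETED", 70_000_000, 70_000_000),
]
for event in events:
    key, value = event.to_message()
    bus.send(STATUS_TOPIC, key, value)

for job_id, status in latest_status(bus, STATUS_TOPIC).items():
    print(job_id, status["state"], f'{status["processed"]}/{status["total"]}')
```

The idea is that each job only publishes small status events, and a separate status service consumes the topic and keeps the latest state per job. Is this a reasonable pattern, or is there a better fit?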