I have the following use case with Celery and Flask.
A user uploads a PDF file. I want one worker to count the pages, then n workers to split the pages into separate PDF files in parallel (one page per task), and finally one worker to generate a report.
                    +----------------+
                    |                |
                    |  Count pages   |
                    |                |
                    +--------|-------+
                             |
         |-------------------|-------------------|
         |                   |                   |
+--------|-------+  +--------|-------+  +--------|-------+
|                |  |                |  |                |
|  Split page 1  |  |  Split page 2  |  |  Split page 3  |
|                |  |                |  |                |
+----------------+  +----------------+  +----------------+
         |                   |                   |
         |                   |                   |
         -----------------------------------------
                             |
                    +----------------+
                    |                |
                    | Generate report|
                    |                |
                    +----------------+
As you can see, the result of count_pages determines how many split_page tasks have to be created. At the same time, count_pages must not block the upload endpoint, so it can't be called synchronously.
So far I have tried combining chord, group and chain, but I have not found a way to trigger n split_page tasks asynchronously after count_pages has finished.
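To make the intended data flow concrete, here is a minimal sketch in plain Python using only `concurrent.futures`, with no Celery involved. Everything in it is a hypothetical stand-in: a "PDF" is modelled as a list of page strings, the bodies of count_pages, split_page and generate_report are placeholders, and the two thread pools only imitate workers. It illustrates the fan-out/fan-in shape I am after, not how Celery would do it.

```python
from concurrent.futures import ThreadPoolExecutor

# Two pools (an assumption of this sketch): one runs whole pipelines,
# the other runs the per-page splits, so a pipeline waiting on its
# splits never starves them of a thread.
pipeline_pool = ThreadPoolExecutor(max_workers=2)
split_pool = ThreadPoolExecutor(max_workers=4)


def count_pages(pdf):
    # Placeholder: a "PDF" is just a list of page strings here.
    return len(pdf)


def split_page(pdf, page_number):
    # Placeholder for writing one page (1-based) to its own PDF file.
    return f"page-{page_number}: {pdf[page_number - 1]}"


def generate_report(split_results):
    # Placeholder report built from the results of all split tasks.
    return f"{len(split_results)} pages split"


def process_pdf(pdf):
    n = count_pages(pdf)  # n is only known once this step has run
    # Fan out: one split_page call per page, executed in parallel.
    futures = [split_pool.submit(split_page, pdf, i) for i in range(1, n + 1)]
    # Fan in: wait for every page, then generate the report.
    return generate_report([f.result() for f in futures])


def upload(pdf):
    # Stand-in for the upload endpoint: schedule the pipeline and
    # return a handle immediately instead of waiting for the result.
    return pipeline_pool.submit(process_pdf, pdf)
```

Here `upload(["a", "b", "c"])` returns a future right away, and calling `.result()` on it later gives `"3 pages split"`. The part I cannot reproduce with Celery is the step inside process_pdf where the number of parallel tasks is decided only after count_pages has returned.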