Why a single celery task is even way faster than synchronous counterpart for I/O bound tasks I’m learning about concurrency in Python