I’m encountering an issue with a multiprocessing script in Python. The script processes flights using a function, `process_one_flight`. Each step of the function works as expected when run on its own, but when the function is executed via multiprocessing workers, the script occasionally gets stuck at a random step of `process_one_flight`. I have been unable to reproduce the bug consistently, which complicates troubleshooting.
Here’s a simplified version of the script:
def _worker(q: Queue, unprocessed: list):
    while not q.empty():
        flight = q.get()
        try:
            process_one_flight(flight, pair=pair, baseline=baseline, force=force)
        except Exception as e:
            unprocessed.append(flight)
            print(f"Error processing {flight}: {e}")

manager = Manager()
q = manager.Queue()
unprocessed = manager.list()

for world in db_path.iterdir():
    if "flight" not in world.name or world_name not in world.name:
        continue
    for split in world.iterdir():
        flight_list = list(split.iterdir())
        flight_idx = [int(flight.name) for flight in flight_list]
        sorted_idx = np.argsort(flight_idx)
        sorted_flights = [flight_list[i] for i in sorted_idx]
        for flight in sorted_flights:
            print(flight)
            flight = Flight_lw(world=world.name, split=split.name, flight_name=flight.name)
            flight.add_db_path(db_path)
            flight.build_folders()
            q.put(flight)

if num_workers > 1:
    workers = [Process(target=_worker, args=(q, unprocessed)) for _ in range(num_workers)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
else:
    _worker(q, unprocessed)
Any idea what’s happening, or how to reproduce the bug?