I am trying to run the VoxelNeXt model for testing. I have tried setting the number of DataLoader workers to 0 and have checked my data (the nuScenes mini version) as well, yet I still receive the following error:
ERROR: Unexpected floating-point exception encountered in worker.
[rank0]: Traceback (most recent call last):
[rank0]: File "/home/palak/miniconda3/envs/cuda118/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1243, in _try_get_data
[rank0]: data = self._data_queue.get(timeout=timeout)
[rank0]: File "/home/palak/miniconda3/envs/cuda118/lib/python3.10/queue.py", line 180, in get
[rank0]: self.not_empty.wait(remaining)
[rank0]: File "/home/palak/miniconda3/envs/cuda118/lib/python3.10/threading.py", line 324, in wait
[rank0]: gotit = waiter.acquire(True, timeout)
[rank0]: File "/home/palak/miniconda3/envs/cuda118/lib/python3.10/site-packages/torch/utils/data/_utils/signal_handling.py", line 73, in handler
[rank0]: _error_if_any_worker_fails()
[rank0]: RuntimeError: DataLoader worker (pid 18031) is killed by signal: Floating point exception.
[rank0]: The above exception was the direct cause of the following exception:
[rank0]: Traceback (most recent call last):
[rank0]: File "/home/palak/lidar/VoxelNeXt/tools/test.py", line 211, in <module>
[rank0]: main()
[rank0]: File "/home/palak/lidar/VoxelNeXt/tools/test.py", line 204, in main
[rank0]: eval_single_ckpt(model, test_loader, args, eval_output_dir, logger, epoch_id, dist_test=dist_test)
[rank0]: File "/home/palak/lidar/VoxelNeXt/tools/test.py", line 66, in eval_single_ckpt
[rank0]: eval_utils.eval_one_epoch(
[rank0]: File "/home/palak/lidar/VoxelNeXt/tools/eval_utils/eval_utils.py", line 188, in eval_one_epoch
[rank0]: for i, batch_dict in enumerate(dataloader):
[rank0]: File "/home/palak/miniconda3/envs/cuda118/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 701, in __next__
[rank0]: data = self._next_data()
[rank0]: File "/home/palak/miniconda3/envs/cuda118/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1448, in _next_data
[rank0]: idx, data = self._get_data()
[rank0]: File "/home/palak/miniconda3/envs/cuda118/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1402, in _get_data
[rank0]: success, data = self._try_get_data()
[rank0]: File "/home/palak/miniconda3/envs/cuda118/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1256, in _try_get_data
[rank0]: raise RuntimeError(
[rank0]: RuntimeError: DataLoader worker (pid(s) 18031) exited unexpectedly
What I have tried so far: changing the number of workers, checking my dataset, and modifying the eval function to disable AMP. None of these fixed the error. Any help would be appreciated.
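Since the traceback only says that a worker was killed by a floating point exception (SIGFPE), not which sample triggered it, one way to narrow it down is to load each sample in its own child process and record which ones die from a signal. The following is a minimal sketch, not part of VoxelNeXt: `find_crashing_indices` and `_load_one` are hypothetical helper names, `dataset` is assumed to be any object that supports `len()` and indexing (such as the dataset the test script passes to its DataLoader), and the `fork` start method assumes Linux.

```python
import multiprocessing as mp
import signal


def _load_one(dataset, index):
    # Load a single sample. If native code crashes here (e.g. with SIGFPE),
    # only this child process dies, not the parent doing the search.
    dataset[index]


def find_crashing_indices(dataset, indices=None):
    """Return (index, signal_name) pairs for samples whose loading kills the process."""
    ctx = mp.get_context("fork")  # assumption: Linux, where "fork" is available
    crashed = []
    for index in (range(len(dataset)) if indices is None else indices):
        child = ctx.Process(target=_load_one, args=(dataset, index))
        child.start()
        child.join()
        # multiprocessing reports death by signal N as exit code -N.
        if child.exitcode is not None and child.exitcode < 0:
            crashed.append((index, signal.Signals(-child.exitcode).name))
    return crashed
```

If this reports an index with `SIGFPE`, that sample (or the preprocessing applied to it) is where to look. Calling the standard library's `faulthandler.enable()` at the top of the test script may also help, since it prints the Python stack when the process receives SIGFPE.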