I have some Python code that compares a given pair of files, and I am using dask for lazy computation. Sometimes one of the two files is empty. When I try to read an empty file while supplying column names, it always fails.
-
Input file is empty (input.csv)
-
Python code:
import dask.dataframe as dd
file='input.csv'
read_source_file_kwargs = {'sep': '|', 'header': None, 'names': ['column1', 'column2']}
df = dd.read_csv(file, **read_source_file_kwargs)
Error message:
ValueError Traceback (most recent call last)
Cell In[27], line 1
----> 1 dd.read_csv('input.csv', **read_source_file_kwargs).compute()
File ~/miniconda3/envs/rs/Lib/site-packages/dask/backends.py:142, in CreationDispatch.register_inplace.<locals>.decorator.<locals>.wrapper(*args, **kwargs)
140 return func(*args, **kwargs)
141 except Exception as e:
--> 142 raise type(e)(
143 f"An error occurred while calling the {funcname(func)} "
144 f"method registered to the {self.backend} backend.\n"
145 f"Original Message: {e}"
146 ) from e
ValueError: An error occurred while calling the read_csv method registered to the pandas backend.
Original Message: All `iterables` must have a non-zero length
Is this expected? How do I fix this?
I was hoping to get an empty dataframe whenever the file being read is empty.
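For reference, here is a minimal sketch of the behaviour I am after, not a confirmed fix. It assumes that "empty" means a zero-byte file (a file containing only whitespace or only a header row would not be caught), and the helper names `is_empty_file` and `read_csv_or_empty` are hypothetical names of my own, not part of dask. When the file is empty, it builds an empty pandas DataFrame with the expected columns and wraps it with `dd.from_pandas`; otherwise it falls through to `dd.read_csv`.

```python
import os


def is_empty_file(path):
    """Return True if the file at `path` is zero bytes long."""
    return os.path.getsize(path) == 0


def read_csv_or_empty(path, names, **kwargs):
    """Read `path` with dask, or return an empty dask DataFrame.

    Hypothetical helper: `names` is the list of expected column names,
    and any other keyword arguments are passed through to dd.read_csv.
    """
    # Imported here so that is_empty_file stays usable without dask installed.
    import pandas as pd
    import dask.dataframe as dd

    if is_empty_file(path):
        # No rows to infer dtypes from, so every column ends up as object dtype.
        empty = pd.DataFrame(columns=names)
        return dd.from_pandas(empty, npartitions=1)
    return dd.read_csv(path, names=names, **kwargs)
```

With the arguments from my code above, the call would be `read_csv_or_empty('input.csv', names=['column1', 'column2'], sep='|', header=None)`. One caveat with this approach is that the empty result has object-typed columns, so a later comparison against the non-empty file may need explicit dtypes.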