This might be an old problem. Let’s say, there is a CSV file, for example:
col1,col2,col3,col4
"Empty String","","None_Value",
"None_Value",,"Empty String",""
As you can see, there maybe (“”) Empty String Data or (,,) missing entries. While reading this CSV file, I am trying to keep them as it is. So, after reading the csv file (“”) –> (“”) and (,,) –> as (,,) or None.
I am using python to read the CSV files and before Python 3.12 version, csv.reader() treated both empty string and missing entry value as “”. In python 3.12, new quoting parameters were added for csv.reader() and csv.writer().
csv.QUOTE_NOTNULL
Instructs writer objects to quote all fields which are not None. This is similar to QUOTE_ALL, except that if a field value is None an empty (unquoted) string is written.
Instructs reader objects to interpret an empty (unquoted) field as None and to otherwise behave as QUOTE_ALL.
However, I noticed that it only works with csv.writer(), not csv.reader(). Apparently, this is a bug for python 3.12.GitHub Issue Link
Does anyone know how to handle this correctly, perhaps with Pandas or another library? I tried with pandas a little bit, but I think even pandas can’t separate them (I might be wrong).
noobLearner is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.