I’m trying to read in and parse a CSV output file from PostgreSQL, but there are some oddities in the data. The data contains vehicle details. One problem is with old Land Rovers which have models measured in inches, e.g. 100″. PostgreSQL outputs this as “100′”” – it uses a single quote as the escape character.
REGXXXX,LAND ROVER,”100′””,1973-11-16,Diesel,2500,Silver
I’ve tried the csv.reader but this throws an error when it gets to “100′”” even if I specify encoding=”utf8″.
I can just open it as a text file (using encoding=”utf8″), take a line at a time and use split(“,”), which handles the inch though it still has speech marks at each end: [‘LAND ROVER’, ‘”100′””‘]
However, I now have a problem that some of the models have a comma in their description – e.g. “R1, STREETFIGHTER” incorrectly gets split up into ‘”R1’ and ‘ STREETFIGHTER”‘ (I haven’t checked if csv.reader does the same).
REGXXXX,YAMAHA,”R1, STREETFIGHTER”,2002-04-02,Petrol,998,White
How can I parse the line, without splitting the text within speech mark? Or do I just need to put together my own procedure?
Thanks
Tim
in_file = open(in_folder + in_file, "r", encoding="utf8") for line in in_file: row_list = line.split(",")
Doc Tim is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.