I have a CSV file with a header row; every line is expected to have 14 fields.
My gawk script checks the file and writes good and bad lines to two separate files.
However, the header line is not written to either file. It is the only missing line.
while read -r line; do
    gawk -k '{if (NF==14) print $0 > "good_file.csv";
              else print "["NR"]", "["NF"]", $0 > "bad_file.csv"}'
done < hd.csv
Sample of the CSV input file, hd.csv:
test_id,vehicle_id,test_date,test_class_id,test_type,test_result,test_mileage,postcode_area,make,model,colour,fuel_type,cylinder_capacity,first_use_date
1645480751,1374211238,2016-01-01,4,NT,P,117033,SM,VOLKSWAGEN,POLO,BLACK,PE,1600,2000-06-23
1393462389,1153769898,2016-01-01,4,NT,P,99292,NE,VOLKSWAGEN,PASSAT,BLUE,DI,1968,2006-11-30
1863202023,1485039300,2016-01-01,7,NT,PRS,170320,E,MERCEDES,SPRINTER 313 CDI LWB,WHITE,DI,2148,2005-01-14
1304292863,1097073904,2016-01-01,4,RT,P,70623,NN,MINI,MINI,GREY,PE,1598,2004-04-08
845810407,1166548800,2016-01-01,4,NT,P,21567,DL,NISSAN,JUKE,BLUE,PE,1612,2011-07-07
Running gawk directly on the file shows that the header row has 14 fields:
gawk -k '{print NF, $0}' hd.csv
14 test_id,vehicle_id,test_date,test_class_id,test_type,test_result,test_mileage,postcode_area,make,model,colour,fuel_type,cylinder_capacity,first_use_date
14 1645480751,1374211238,2016-01-01,4,NT,P,117033,SM,VOLKSWAGEN,POLO,BLACK,PE,1600,2000-06-23
I don’t understand why the header row is getting skipped.
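One thing that may be worth checking, as a minimal sketch rather than a confirmed diagnosis: the loop's `read` may be consuming the first line before gawk, which has no file argument, reads the rest of the loop's standard input. The three-line sample file and the variable names `tmp`, `looped`, and `direct` below are hypothetical, and plain `awk -F,` stands in for `gawk -k` so the sketch does not depend on a gawk version with CSV support.

```shell
# Hypothetical three-line sample standing in for hd.csv.
tmp=$(mktemp)
printf 'id,make,model\n1,MINI,MINI\n2,NISSAN,JUKE\n' > "$tmp"

# Same shape as the loop in the question: read takes one line from the
# redirected file, then awk (no file argument) reads what is left on stdin.
looped=$(while read -r line; do
    awk -F, '{print NR, $0}'
done < "$tmp")

# Without the loop, awk opens the file itself and sees every line.
direct=$(awk -F, '{print NR, $0}' "$tmp")

printf 'loop:\n%s\n\ndirect:\n%s\n' "$looped" "$direct"
rm -f "$tmp"
```

If that assumption holds, the "loop" output starts at the first data row while the "direct" output starts at the header, which would match the symptom described above.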