I’m trying to import the contents of over 100 CSV files (but for simplicity I’ll only ask about a single one for this question).
The CSVs are from a geolocation database, mapping UK Postcodes to map grid references, lat/long coordinates, and a ton of other data that I don’t need.
e.g.:
pcd,pcd2,pcds,dointr,doterm,oscty,ced,oslaua,osward,parish,usertype,oseast1m,osnrth1m,osgrdind,oshlthau,nhser,ctry,rgn,streg,pcon,eer,teclec,ttwa,pct,itl,statsward,oa01,casward,npark,lsoa01,msoa01,ur01ind,oac01,oa11,lsoa11,msoa11,wz11,sicbl,bua11,buasd11,ru11ind,oac11,lat,long,lep1,lep2,pfa,imd,calncv,icb,oa21,lsoa21,msoa21
"AB1 0AA","AB1 0AA","AB1 0AA","198001","199606","S99999999","S99999999","S12000033","S13002843","S99999999","0","385386","0801193","1","S08000020","S99999999","S92000003","S99999999","0","S14000002","S15000001","S09000001","S22000047","S03000012","S30000026","99ZZ00","S00001364","01C30","S99999999","S01000011","S02000007","6","3C2","S00090303","S01006514","S02001237","S34002990","S03000012","S99999999","S99999999","3","1C3",57.101474,-2.242851,"S99999999","","S23000009",6715,"S99999999","S99999999","","",""
"AB1 0AB","AB1 0AB","AB1 0AB","198001","199606","S99999999","S99999999","S12000033","S13002843","S99999999","0","385177","0801314","1","S08000020","S99999999","S92000003","S99999999","0","S14000002","S15000001","S09000001","S22000047","S03000012","S30000026","99ZZ00","S00001270","01C31","S99999999","S01000011","S02000007","6","4B3","S00090303","S01006514","S02001237","S34002990","S03000012","S99999999","S99999999","3","1C3",57.102554,-2.246308,"S99999999","","S23000009",6715,"S99999999","S99999999","","",""
"AB1 0AD","AB1 0AD","AB1 0AD","198001","199606","S99999999","S99999999","S12000033","S13002843","S99999999","0","385053","0801092","1","S08000020","S99999999","S92000003","S99999999","0","S14000002","S15000001","S09000001","S22000047","S03000012","S30000026","99ZZ00","S00001364","01C30","S99999999","S01000011","S02000007","6","3C2","S00090399","S01006514","S02001237","S34003015","S03000012","S99999999","S99999999","3","6A1",57.100556,-2.248342,"S99999999","","S23000009",6715,"S99999999","S99999999","","",""
I only need the pcds, osnrth1m, oseast1m, lat & lng columns, so I’m using this query:
LOAD DATA INFILE 'C:/xampp/htdocs/postcode_updates/csv/ONSPD_MAY_2024_UK_AB.csv'
INTO TABLE geopostcodes
FIELDS TERMINATED BY ','
LINES TERMINATED BY 'n'
IGNORE 1 LINES
(@pcds, @osnrth1m, @oseast1m, @lat, @lng)
SET pcode = @pcds, grid_n = @osnrth1m, grid_e = @oseast1m, lat = @lat, lng = @lng
It pulls in the right data, but the postcodes are added with the quotes, and as the grid reference rows in my db are expecting integers, and the lat/long as decimals, the quotes mean they are imported as zero values.
I’ve tried various versions of trim and string replace functions, but can’t find one that works without breaking the query.
How do I remove the quotes?