I apologise for raising a question that’s been asked in so many forms before, but I’m tearing my hair out…
I am using BULK INSERT to load what I firmly believe to be a UTF-8 file (Notepad++ reports it as UTF-8). It contains many EU accented characters, and these are being lost: each accented character typically comes through as two characters, the first of which often has the code of ASCII 43 (nominally a + sign, though it does not display as one). I have also done an octal dump of the file: “standard” characters occupy one byte each, while the accented ones occupy two bytes, which I take as confirming UTF-8. Rows are terminated with LF only, no CR.
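(In case it helps anyone reproduce this, the raw bytes can also be checked from within SQL Server itself by reading the file as a single blob; the path below is just a placeholder for wherever the file actually sits.)

-- Pull the file in as one VARBINARY blob so the raw bytes are visible in SSMS.
-- 'C:\Data\Fred.csv' is a placeholder path; adjust to the real file location.
SELECT SUBSTRING(src.BulkColumn, 1, 200) AS FirstBytes
FROM OPENROWSET(BULK 'C:\Data\Fred.csv', SINGLE_BLOB) AS src;

In that output the accented characters occupy two bytes each, consistent with what the octal dump showed.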
If I import the file using SSMS into the same database, it is pulled into a table with the accents showing correctly; it is only with BULK INSERT that they are lost.
I have tried using CODEPAGE = 'RAW' and CODEPAGE = 'ACP' to no avail; the data imports, but the accents are lost. I have also tried DATAFILETYPE = 'widechar' and 'widenative' without success. Example:
BULK INSERT #Tmp1 FROM 'Fred.csv'
WITH (FIRSTROW = 2,
      FIELDTERMINATOR = ',',
      ROWTERMINATOR = '0x0a',
      CODEPAGE = 'ACP');
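For what it’s worth, I have seen CODEPAGE = '65001' mentioned as the way to tell BULK INSERT the file is UTF-8, though as I read the documentation it is only supported from SQL Server 2014 SP2 / SQL Server 2016 onward, so I am not sure whether it applies here. Presumably that variant would look like this (with the target columns as NVARCHAR so the converted characters survive):

-- Presumed UTF-8 variant; requires SQL Server 2014 SP2 / 2016 or later.
BULK INSERT #Tmp1 FROM 'Fred.csv'
WITH (FIRSTROW = 2,
      FIELDTERMINATOR = ',',
      ROWTERMINATOR = '0x0a',
      CODEPAGE = '65001');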
Any suggestions would be greatly appreciated, thank you.