Remove invalid bytes, keep valid UTF-8
(I posted a similar problem here, but this new question is not a duplicate.)
Erasing incorrectly encoded byte sequences on reading
I am reading files into Ruby strings, which are then processed further (for instance, with the CSV module). The external encoding of the files is a parameter, and the files to be processed are supposed to be in that encoding.
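To make the setup concrete, here is a minimal sketch of the kind of read-and-clean step in question. It assumes Ruby 2.1 or later (for `String#scrub`); the helper name `read_scrubbed` and the sample data are hypothetical, made up for illustration. It may not be the best approach, but it shows the intended behaviour: valid sequences are kept and invalid bytes are dropped.

```ruby
require "tempfile"

# Hypothetical helper: read the raw bytes of `path`, tag them with the
# expected external encoding, and delete any byte sequences that are
# invalid in that encoding. Valid sequences are left untouched.
def read_scrubbed(path, external_encoding = "UTF-8")
  raw  = File.binread(path)                     # bytes, tagged ASCII-8BIT
  text = raw.force_encoding(external_encoding)  # relabel only, no transcoding
  text.valid_encoding? ? text : text.scrub("")  # replace invalid bytes with ""
end

# Usage: a file that is UTF-8 except for one stray 0xFF byte.
Tempfile.create("sample") do |f|
  f.binmode
  f.write("caf\xC3\xA9,\xFFna\xC3\xAFve\n".b)
  f.flush

  cleaned = read_scrubbed(f.path, "UTF-8")
  puts cleaned.valid_encoding?
  puts cleaned
end
```

Reading in binary mode and relabelling with `force_encoding` avoids any transcoding on read, so the only bytes that change are the ones `scrub` removes. The cleaned string can then be handed to later steps such as `CSV.parse`.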