I am developing an ASP.NET MVC application, which is basically an information system. Our problem is that users sometimes copy text from an email or another source into an input field in the app, and the copied text may contain unwanted characters. For instance, I have seen the byte-order mark (BOM, U+FEFF) and the null character (0x00). These characters are not visible when copying.
We recently added whitelist validation to all user input in the application to solve that problem, but another problem appeared. Users still sometimes copy text from emails, or even from old records that were created before the validation was implemented and may still contain those unwanted characters. We now get bug reports that the validation won't let users save a record containing the copied text, and the users obviously have no idea that the text contains invisible, disallowed characters.
My question is whether there is a nice solution to this. Do we need to show a more descriptive validation message? Or should we strip those unwanted characters from the input automatically? (The problem with stripping is that we don't know which characters might appear, and we don't want to remove something important from the input.) Or is there another approach altogether?
Disclaimer: I don't think we can force users to use the application in a different way.
Besides implementing the whitelist validation, I have tried searching for how other people deal with this, but I did not find anything useful.