I have a CSV file (saved with a .csv extension) that contains non-ASCII characters. Tika is not able to detect the MIME type of the file, so it assigns application/octet-stream, and during parsing it assigns an empty parser, which does not generate any output. The file is not saved as UTF-8.
I tried overriding the MIME type by setting it to text/plain. However, Tika then does not detect the character encoding of the file correctly, and the extracted output is not the same as the original input file. How can I detect the correct encoding of the file?
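To make the encoding part of the question concrete, here is a minimal sketch of one way to narrow down a file's encoding using only the JDK: try a strict decode against a list of candidate charsets and keep the first one that decodes without errors. This is not how Tika detects encodings. The class name `EncodingGuess`, its methods, and the candidate list are all hypothetical and chosen only for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of Tika: guesses an encoding by strict trial decoding.
class EncodingGuess {

    // Returns true only if every byte sequence in data is valid in the given charset.
    static boolean decodesCleanly(byte[] data, Charset charset) {
        try {
            charset.newDecoder()
                   .onMalformedInput(CodingErrorAction.REPORT)
                   .onUnmappableCharacter(CodingErrorAction.REPORT)
                   .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Returns the first candidate that decodes the bytes without errors.
    // Order matters: ISO-8859-1 accepts any byte sequence, so it only works as a last resort.
    static Optional<Charset> guess(byte[] data, List<Charset> candidates) {
        for (Charset candidate : candidates) {
            if (decodesCleanly(data, candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java EncodingGuess <file>");
            return;
        }
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        // Assumed candidate list; adjust it to the encodings the file could plausibly use.
        List<Charset> candidates = List.of(
                StandardCharsets.UTF_8,
                StandardCharsets.UTF_16,
                StandardCharsets.ISO_8859_1);
        System.out.println(guess(data, candidates).map(Charset::name).orElse("unknown"));
    }
}
```

A clean decode only shows that the bytes are valid in that charset, not that the charset is the one the file was written in, so the result is a guess. Many single-byte encodings accept almost any input, which is why the strictest candidates should come first.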