Using python textract with docx files
I am trying to use textract to extract text from docx files in an AWS Lambda function using Python. The textract library is included in the deployment package, as is its dependency, docx2txt. When I try to get the text out of the file, I still get an ExtensionNotSupported error stating that docx is not supported. I also tried putting the docx2txt library in the parsers folder, but that didn't help.
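One way to narrow this down is to check whether the relevant modules can be imported at all inside the Lambda environment. This is only a diagnostic sketch: `check_imports` is a hypothetical helper, and it rests on the assumption that the error comes from a failed import within the package (for example a missing or misplaced docx2txt) rather than from the file itself. The module name `textract.parsers.docx_parser` is also an assumption about how the package is laid out.

```python
import importlib


def check_imports(module_names):
    """Try importing each module; map name -> None on success, or the error text."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = None
        except ImportError as exc:
            # ModuleNotFoundError is a subclass of ImportError, so it lands here too.
            results[name] = f"{type(exc).__name__}: {exc}"
    return results


if __name__ == "__main__":
    # Module names below are assumptions; adjust them to match the package contents.
    candidates = ["docx2txt", "textract", "textract.parsers.docx_parser"]
    for name, error in check_imports(candidates).items():
        print(name, "OK" if error is None else error)
```

Running this from the Lambda handler and logging the result should show which import, if any, fails in the deployed package.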