I am trying to extract text from any document uploaded by a user. This is my extractTextFromFile function:
import { parseOfficeAsync } from "officeparser";

// Parse the file at filePath and return its text content.
export async function extractTextFromFile(filePath: string): Promise<string> {
  try {
    const data = await parseOfficeAsync(filePath);
    return data.toString();
  } catch (error) {
    // Log the failure and fall back to a placeholder message.
    console.error(error);
    return "File has not been parsed";
  }
}
Below is where I am calling extractTextFromFile:
import { Request, Response } from "express";

const handleFileUpload = async (req: Request, res: Response) => {
  if (!req.file) {
    return res.status(400).json({ error: "No file uploaded" });
  }
  // Extract text from the uploaded file
  const extractedText = await extractTextFromFile(req.file.path);
  console.log("extracted text: " + extractedText);
  // Send the text back to the client; a bare return value is ignored by Express
  return res.status(200).json({ text: extractedText });
};
This is me defining the route for the file upload:
router.post("/upload", upload.single('file'), handleFileUpload)
Note that I have already installed officeparser with:
npm i officeparser
Now I am getting the following error in my console:
[OfficeParser]: Error: ENOENT: no such file or directory, copyfile 'C:\Users\Administrator\Documents\My documents\Programming\MERN\Quizme\server\uploads\a3428f2277a7475f4e407753ed5c130a' -> 'C:\Users\Administrator\Documents\My documents\Programming\MERN\Quizme\server\officeParserTemp\tempfiles\172189849682200000.uploads\a3428f2277a7475f4e407753ed5c130a'
extracted text: File has not been parsed
After seeing this error, I confirmed that the uploads directory is being created in my source directory and that the uploaded file is being saved there. Looking at the error message, OfficeParser is trying to copy the uploaded file to a temporary file in a temporary directory, but only part of that directory path gets created:
...\server\officeParserTemp\tempfiles
is created when an upload request is sent, but 172189849682200000.uploads
is not present, and neither is the file into which the uploaded file should be copied.
I tried creating the 172189849682200000.uploads
folder manually, but that does not work either.
This is what I have tried so far.
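One possible reading of the error, which I have not verified against officeparser's source: the destination path has my whole relative upload path (uploads plus the file name) after the dot, which is where a file extension would normally go. That would happen if the temporary file name were built from "everything after the last dot" of a path that has no extension. The sketch below only illustrates that hypothesis; guessExtension and guessTempPath are hypothetical helpers, the timestamp and file names are placeholders, and none of this is officeparser's actual code.

```typescript
// Hypothetical reconstruction, NOT officeparser's real implementation:
// assume the "extension" is whatever follows the last "." in the path.
function guessExtension(filePath: string): string {
  return filePath.split(".").pop() ?? "";
}

// Hypothetical: build a temp-file path shaped like the one in the error message.
function guessTempPath(tempDir: string, stamp: string, filePath: string): string {
  return `${tempDir}/${stamp}.${guessExtension(filePath)}`;
}

// Upload path without an extension (placeholder name): the whole path
// lands after the dot, so the target points into a folder that does not exist.
console.log(guessTempPath("officeParserTemp/tempfiles", "1700000000000", "uploads/abc123"));

// Upload path with an extension: only "docx" follows the dot.
console.log(guessTempPath("officeParserTemp/tempfiles", "1700000000000", "uploads/abc123.docx"));
```

If this guess is right, the thing to check would be whether the upload middleware saves files without their original extension, rather than whether the temporary folders exist.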
I would also be glad if you could suggest other parsing libraries I can use in my Node.js project.