I would like to limit the files loaded by langchain_community.document_loaders.pdf.PyPDFDirectoryLoader
to a specific list. I tried using the glob argument, but it behaves oddly.
For instance, say I have:
from langchain_community.document_loaders import PyPDFDirectoryLoader
files_to_load = ["a.pdf", "b.pdf", "c.pdf"]
This does not work (it loads 0 files):
loader = PyPDFDirectoryLoader("docs/", glob="|".join(files_to_load))
This does not work either:
loader = PyPDFDirectoryLoader("docs/", glob="[" + "|".join(files_to_load) + "]")
However, running glob.glob
with the same string does return the right files.
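For reference, here is a minimal sketch of how the two pattern strings behave under shell-style wildcard matching, using only fnmatch from the standard library. It assumes that the loader's glob argument follows the same wildcard rules (`*`, `?`, `[...]`), which I have not confirmed from its source. The names pipe_pattern and bracket_pattern are just for illustration.

```python
from fnmatch import fnmatchcase

files_to_load = ["a.pdf", "b.pdf", "c.pdf"]

# The two pattern strings built in the attempts above.
pipe_pattern = "|".join(files_to_load)        # "a.pdf|b.pdf|c.pdf"
bracket_pattern = "[" + pipe_pattern + "]"    # "[a.pdf|b.pdf|c.pdf]"

# In shell-style wildcards "|" is a literal character, not alternation,
# so the first pattern only matches a file literally named "a.pdf|b.pdf|c.pdf".
pipe_matches = [f for f in files_to_load if fnmatchcase(f, pipe_pattern)]

# "[...]" is a character class: it matches exactly ONE character from the set,
# so it can match a name like "a" but never a full name like "a.pdf".
bracket_matches = [f for f in files_to_load if fnmatchcase(f, bracket_pattern)]
single_char = fnmatchcase("a", bracket_pattern)

print(pipe_matches, bracket_matches, single_char)
```

Under these rules neither string selects any of the three files, which would be consistent with 0 files being loaded.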
Am I missing something obvious?