I am scraping a website with Selenium and chromedriver. When I download a PDF that contains multiple documents, it downloads fine as a zip. When I try to view a document that is a single PDF, it opens a page that says show_temp.pl, with an ‘open’ button. After I click ‘open’, it says it can’t re-open the document. I am using the latest chromedriver version. I am able to view the document in the normal Chrome browser. I also have Adobe Acrobat installed locally with brew.
Here are my settings:
chrome_options.add_experimental_option("prefs", {
    "download.default_directory": os.path.abspath(self.download_dir),
    "download.prompt_for_download": False,
    "download.directory_upgrade": True,
    "plugins.always_open_pdf_externally": True,
    # "profile.default_content_settings.popups": 0
})
user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36"
chrome_options.add_argument(f'user-agent={user_agent}')
chrome_options.add_argument("disable-gpu")
chrome_options.add_argument("enable-automation")
chrome_options.add_argument("no-sandbox")
chrome_options.add_argument("disable-infobars")
chrome_options.add_argument("disable-dev-shm-usage")
I have tried:
- Setting profile.default_content_settings.popups to 0.
- Uninstalling and reinstalling Adobe Acrobat.
- Uninstalling and reinstalling Chrome.
- Changing the automated browser's own settings to download PDFs automatically and to disable popups and redirects.
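One way to narrow this down is to check what the automated browser actually writes to the download directory. The following is only a minimal sketch, not part of the setup above: `wait_for_download` and `classify_download` are hypothetical helper names, and it assumes Chrome marks in-progress downloads with a `.crdownload` suffix and that the download directory is otherwise empty or only holds older files.

```python
import os
import time


def classify_download(path):
    """Return 'pdf', 'zip', or 'unknown' based on the file's leading bytes."""
    with open(path, "rb") as fh:
        header = fh.read(5)
    if header.startswith(b"%PDF-"):
        return "pdf"
    if header.startswith(b"PK"):
        # Zip archives start with the bytes "PK".
        return "zip"
    return "unknown"


def wait_for_download(download_dir, timeout=30.0, poll=0.5):
    """Poll download_dir until it holds a finished file; return its path or None.

    Assumption: in-progress Chrome downloads end in .crdownload, so they are skipped.
    """
    deadline = time.monotonic() + timeout
    while True:
        finished = [
            os.path.join(download_dir, name)
            for name in os.listdir(download_dir)
            if not name.endswith(".crdownload")
            and os.path.isfile(os.path.join(download_dir, name))
        ]
        if finished:
            # Pick the most recently modified file in case older files exist.
            return max(finished, key=os.path.getmtime)
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll)
```

If `wait_for_download` returns `None` after the single-PDF link is clicked, nothing was saved and the page is being opened in a viewer instead. If a file does appear but `classify_download` returns `unknown`, the server probably sent something other than a PDF, such as an HTML page.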