I’m using Selenium in Python to scrape a lot of web pages on a daily basis and found that it was leaving temporary Edge folders behind in /tmp/, which eventually filled up the disk on my Linux machine.
The folders left in /tmp/ look like this:-
/tmp/.com.microsoft.Edge.BX4XtO/
This is the code I’m using:-
import logging
import time

from selenium import webdriver
from selenium.common.exceptions import WebDriverException
from selenium.webdriver.common.by import By


def scrape_page(web_address):
    driver = None    # stays None if the browser never starts
    response = None  # returned as None if scraping fails
    try:
        edge_options = webdriver.EdgeOptions()
        edge_options.use_chromium = True
        edge_options.add_argument('--remote-debugging-port=0')
        edge_options.add_argument('--no-first-run')
        edge_options.add_argument('--no-default-browser-check')
        edge_options.add_argument('--headless=new')
        edge_options.add_argument('--log-level=3')
        edge_options.add_argument('--disable-logging')
        edge_options.add_argument('--start-maximized')
        edge_options.add_argument('--disable-infobars')
        edge_options.add_experimental_option('excludeSwitches', ['disable-popup-blocking'])
        # Creates 3x .com.microsoft.Edge.* folders in /tmp/
        driver = webdriver.Edge(options = edge_options)
        driver.get(web_address)
        time.sleep(3)
        response = driver.find_element(By.XPATH, "/html/body").text
    except WebDriverException as e:
        # Selenium raises WebDriverException subclasses, not requests exceptions
        print(e)
        logging.warning('Scraping {0} failed - {1}'.format(web_address, e))
    finally:
        # Only shut the browser down if it was actually started
        if driver is not None:
            driver.close()
            driver.quit()
    return response
What I’ve found is that when the following command runs:-
driver = webdriver.Edge(options = edge_options)
it creates three instances of Edge in /tmp/:-
.com.microsoft.Edge.bZsnHM/ .com.microsoft.Edge.tZLIcy/ .com.microsoft.Edge.wPEEWf/
After the quit() command runs, only two of the instances are removed, leaving one:-
.com.microsoft.Edge.bZsnHM/
This is the last of the three instances created by
driver = webdriver.Edge(options = edge_options)
I was able to confirm this by monitoring the /tmp/ folder.
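For anyone who wants to reproduce the monitoring, here is a minimal sketch of how the folders can be listed before and after a scrape. The helper name edge_temp_dirs and the glob pattern are assumptions based on the folder names shown above. The helper only lists folders and does not involve Selenium.

```python
import glob
import os
import tempfile


def edge_temp_dirs(tmp_root=None, pattern=".com.microsoft.Edge.*"):
    """Return the set of Edge temp folders currently present under tmp_root.

    tmp_root defaults to the system temp directory (usually /tmp on Linux).
    The pattern is an assumption taken from the folder names observed above.
    """
    root = tmp_root or tempfile.gettempdir()
    matches = glob.glob(os.path.join(root, pattern))
    # Keep directories only; ignore any plain files that match the pattern
    return {path for path in matches if os.path.isdir(path)}
```

Taking one snapshot with edge_temp_dirs() before calling scrape_page() and another afterwards, then subtracting the first set from the second, should show which folders were left behind.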
This raises a few questions:-
Firstly, should Selenium be opening 3 instances?
Secondly, why doesn’t quit() close all the Edge instances?
Thirdly, how do I remove all the instances?
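As a stopgap for the third question, one option would be to delete any folder that appears while a scrape is running. This is only a sketch under assumptions: the name cleanup_edge_temp_dirs and the glob pattern are hypothetical, it does not explain why the folder is left behind, and it is not safe if other Edge or Selenium processes use the same temp directory at the same time, because their folders would be deleted too.

```python
import contextlib
import glob
import os
import shutil
import tempfile


@contextlib.contextmanager
def cleanup_edge_temp_dirs(tmp_root=None, pattern=".com.microsoft.Edge.*"):
    """Remove Edge temp folders created while the with-block was running.

    Hypothetical helper: the pattern is assumed from the folder names
    observed in /tmp/, and tmp_root defaults to the system temp directory.
    """
    root = tmp_root or tempfile.gettempdir()
    full_pattern = os.path.join(root, pattern)
    before = set(glob.glob(full_pattern))  # folders that already existed
    try:
        yield
    finally:
        # Anything matching the pattern that was not there before is new
        for path in set(glob.glob(full_pattern)) - before:
            shutil.rmtree(path, ignore_errors=True)
```

It would be used by wrapping the call, for example by placing response = scrape_page(web_address) inside a with cleanup_edge_temp_dirs(): block, so that the cleanup runs after quit() has returned.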
For some additional info, exactly the same thing occurs when I use Chrome instead of Edge.
Thanks for your help and time.