I have a Python web scraping app that runs perfectly on my local machine. It deploys successfully on the DigitalOcean App Platform, but at runtime ChromeDriver exits unexpectedly with status code 127. I have a dedicated virtual environment for the app, a requirements.txt file with what I believe are the correct requirements, and a gunicorn_config.py file used for deployment.
My gunicorn_config.py file has:
bind = "0.0.0.0:8080"
workers = 2
My requirements.txt has:
attrs==23.2.0
beautifulsoup4==4.12.3
bidict==0.23.1
blinker==1.8.2
bs4==0.0.2
certifi==2024.6.2
cffi==1.16.0
click==8.1.7
colorama==0.4.6
Flask==3.0.3
Flask-SocketIO==5.3.6
future==1.0.0
gunicorn==22.0.0
h11==0.14.0
idna==3.7
itsdangerous==2.2.0
Jinja2==3.1.4
MarkupSafe==2.1.5
numpy==2.0.0
outcome==1.3.0.post0
pandas==2.2.2
pip==24.1
probableparsing==0.0.1
pycparser==2.22
PySocks==1.7.1
python-crfsuite==0.9.10
python-dateutil==2.9.0.post0
python-engineio==4.9.1
python-socketio==5.11.3
pytz==2024.1
selenium==4.22.0
setuptools==70.1.1
simple-websocket==1.0.0
six==1.16.0
sniffio==1.3.1
sortedcontainers==2.4.0
soupsieve==2.5
splinter==0.21.0
trio==0.25.1
trio-websocket==0.11.1
typing_extensions==4.12.2
tzdata==2024.1
urllib3==2.2.2
usaddress==0.5.10
websocket-client==1.8.0
Werkzeug==3.0.3
wsproto==1.2.0
Start of my Flask app file:
from flask import Flask, render_template, request, redirect, url_for, send_file
import pandas as pd
from splinter import Browser
from bs4 import BeautifulSoup
import time
import datetime
import re
import numpy as np
import io
from selenium.webdriver.chrome.options import Options
import os
import usaddress
from flask_socketio import SocketIO, emit
app = Flask(__name__)
expected_password = "------"
@app.route('/')
def index():
    return render_template('index.html')

@app.route('/scrape', methods=['POST'])
def scrape():
    # Get form data
    realty_trac_url = request.form['url']
    narrpr_email = request.form['email']
    narrpr_password = request.form['password']
    file_city_name = request.form['name']
    access_password = request.form['access_password']
    # Check if access password matches the expected password
    if access_password == expected_password:
        # Proceed with scraping
        # Initialize the browser
        # Set up Chrome options to open in full screen
        chrome_options = Options()
        chrome_options.add_argument("--start-maximized")
        chrome_options.add_argument("--no-sandbox")
        chrome_options.add_argument("--headless=new")
        user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
        chrome_options.add_argument(f"user-agent={user_agent}")
        # Start browser session with Chrome options
        browser = Browser('chrome', options=chrome_options)
        ####rest of code continues here.
End of the Flask app file:
if __name__ == '__main__':
    app.run(debug=True)
At first I did not have a Python environment dedicated to the app, so the requirements.txt file was wrong and the app would not deploy at all. After creating a new environment and regenerating requirements.txt from it, the app deployed successfully. However, it fails with the error described above when I run it on the live site.
I did not set any environment variables on the DigitalOcean App Platform. Are any needed to run Selenium, ChromeDriver, Splinter, etc. correctly?
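To narrow this down, here is a minimal diagnostic sketch that could be run inside the deployed container. It rests on an assumption: status code 127 is what a shell conventionally returns when a command is not found, and it is also commonly seen when a binary cannot load a shared library it depends on. The sketch only covers the first case, reporting which browser and driver binaries are visible on PATH. The candidate names and the `find_binaries` helper are hypothetical and not part of my app; the actual binary names on the platform may differ.

```python
import shutil

# Hypothetical candidate names (an assumption): adjust to whatever the
# deployment image is expected to provide.
CANDIDATE_BINARIES = ("chromedriver", "google-chrome", "chromium", "chromium-browser")


def find_binaries(names=CANDIDATE_BINARIES):
    """Map each binary name to its resolved path on PATH, or None if absent."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # Print one line per candidate so the result shows up in the runtime logs.
    for name, path in find_binaries().items():
        print(f"{name}: {path or 'NOT FOUND on PATH'}")
```

If every entry comes back as not found, the container probably has no Chrome or ChromeDriver installed at all, which requirements.txt alone would not fix since pip installs only Python packages. If the binaries are found, a missing system library would be the next thing to check.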