I am working on a scraper for Sephora.com. When I run the application locally with the flask run command, the scraper works fine. But when I run it in Docker, the application logs the following warning:
WARNING:bs4.dammit:Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.
This is the line of code I am using:
soup = BeautifulSoup(b''.join(content), 'html.parser')
I have tried passing from_encoding="utf-8" and from_encoding="iso-8859-1", but the issue is still not resolved. What could the cause be? I only get this warning when I run the application in Docker.
This is my Dockerfile:
# Use an official Python runtime as a parent image
FROM python:3.9.19
ENV LANG=C.UTF-8
ENV LC_ALL=C.UTF-8
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
# Install any needed Python packages specified in requirements.txt
COPY requirements.txt ./app/
RUN pip install --no-cache-dir -r ./app/requirements.txt
# Copy the rest of the application code into the container
COPY . /app/
WORKDIR /app
# Create the directory for the Docker service configuration
RUN mkdir -p /etc/systemd/system/docker.service.d
EXPOSE 5001
CMD ["gunicorn", "-k", "gevent", "-w", "6", "-b", "0.0.0.0:5001", "app.wsgi:app"]
The issue only occurs when I run the application in Docker, so I am assuming that my code is correct and the problem is in the Dockerfile.
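A minimal sketch of a check that could be run in both environments to narrow this down, assuming content is an iterable of byte chunks as in the BeautifulSoup line above. It tests whether the joined bytes are still gzip-compressed (which can happen if the response body is read raw, without the HTTP client's automatic decompression) or are simply not valid UTF-8. The helper names describe_payload and to_html_bytes are hypothetical, and only gzip is detected here; other compressed formats such as Brotli would be reported as "unknown-binary".

```python
import gzip


def describe_payload(raw: bytes) -> str:
    """Classify raw response bytes before they are handed to BeautifulSoup."""
    # gzip streams always start with the two magic bytes 0x1f 0x8b.
    if raw[:2] == b"\x1f\x8b":
        return "gzip"
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Not gzip and not valid UTF-8: possibly another compression
        # scheme or a different text encoding.
        return "unknown-binary"


def to_html_bytes(raw: bytes) -> bytes:
    """Decompress gzip payloads; return anything else unchanged."""
    if describe_payload(raw) == "gzip":
        return gzip.decompress(raw)
    return raw
```

Logging describe_payload(b''.join(content)) locally and inside the container should show whether the two environments receive the same kind of bytes. If the container reports "gzip", passing to_html_bytes(b''.join(content)) to BeautifulSoup instead of the raw bytes may remove the warning.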