I am trying to launch a SageMaker training job from within a SageMaker Studio Code Editor instance. I have a custom Docker image with a requirements.txt file listing the Python libraries to install. This process relies on the SageMaker Training Toolkit. Unfortunately, that collection of libraries depends on a module that was deprecated in Python 3.10 and later (see this open GitHub issue for more). So I decided to run my container on Python 3.9, but that is now causing its own set of problems related to the toolkit's sagemaker-containers library. When I try to launch the training job, I get this error:
ImportError: cannot import name 'escape' from 'jinja2' (/usr/local/lib/python3.9/site-packages/jinja2/__init__.py)
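To check that the failure comes from the installed jinja2 itself rather than from the training code, the import can be probed directly inside the container. This is only a minimal sketch; the helper name module_exports is made up for illustration and is not part of any SageMaker library.

```python
import importlib


def module_exports(module_name, attribute):
    """Return True/False if the module imports and has/lacks the attribute,
    or None if the module itself cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return hasattr(module, attribute)


if __name__ == "__main__":
    # If this prints False, the import named in the error above fails
    # in this environment too, without starting a training job.
    print("jinja2.escape available:", module_exports("jinja2", "escape"))
```

Running this in the image (for example with docker run and the python interpreter) should show whether the installed jinja2 version still provides escape.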
Some research tells me that the escape function was removed from newer versions of jinja2 (it was deprecated in 3.0 and removed in 3.1). jinja2 is a dependency of flask, and flask is a dependency of sagemaker-containers. The suggested solution is to add flask to the requirements file and pin it to a later version. The problem, however, is that sagemaker-containers depends on an older version, flask==1.1.1 (see below). I have tried not specifying a version for sagemaker-containers, I have tried setting it to an older version, and I have tried various versions of flask. I can't seem to find a combination that will run. I might also add that it's unclear to me why I need sagemaker-containers installed when I already have the training toolkit installed via sagemaker-training (but without it I get another error, which is resolved by installing it, per discussion elsewhere).
94.67 The conflict is caused by:
94.67 The user requested Flask==2.0.3
94.67 sagemaker-containers 2.8.6.post2 depends on flask==1.1.1
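To see which installed packages declare a requirement on flask or jinja2, and with what version pin, the environment's metadata can be inspected with the standard library. This is a rough sketch, assuming it is run inside an environment where the packages did install; find_dependents is a hypothetical helper name, and the name matching is a simplification of real requirement parsing.

```python
import re
from importlib import metadata


def normalize(name):
    """Normalize a distribution name: lowercase, runs of -_. become '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def find_dependents(package):
    """Return {distribution name: requirement string} for installed
    distributions that declare a dependency on `package`."""
    target = normalize(package)
    dependents = {}
    for dist in metadata.distributions():
        for req in dist.requires or []:
            # The requirement name is the leading run of name characters,
            # e.g. "flask" in "flask==1.1.1" or "Jinja2>=2.10.1".
            match = re.match(r"[A-Za-z0-9._-]+", req)
            if match and normalize(match.group(0)) == target:
                dependents[dist.metadata["Name"]] = req
    return dependents


if __name__ == "__main__":
    for name in ("flask", "jinja2"):
        print(name, "is required by:", find_dependents(name))
```

The output lists each dependent package next to its requirement string, which makes it easier to spot which one carries the restrictive pin.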
Python script launching job:
import sagemaker
from sagemaker.estimator import Estimator

role = 'aws-role'
s3_bucket = 's3-bucket-name'
instance_type = ''
category = 'video_games'
output_path = f's3://{s3_bucket}/classification/{category}/output'
image_uri = 'my_image_uri'
source_dir = '/home/sagemaker-user/data-analytics/data_science/classifcation/pipeline/auxilary/source_code/'

estimator = Estimator(image_uri=image_uri,
                      role=role,
                      instance_count=1,
                      instance_type=instance_type,
                      output_path=output_path,
                      entry_point='train.py',
                      source_dir=source_dir)

estimator.set_hyperparameters(category=category,
                              s3_bucket=s3_bucket,
                              target_column='annotation',
                              test_size=0.2)

estimator.fit()
Dockerfile
# Use python image as base
FROM python:3.9
# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
# Install system dependencies
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
        libpq-dev \
        gcc \
    && rm -rf /var/lib/apt/lists/*
# Set working directory in container
WORKDIR /code
# Install Python dependencies
COPY requirements.txt /code/
RUN pip install --no-cache-dir --exists-action i -r requirements.txt
# Copy the rest of application code
COPY . /code/
requirements.txt file
Flask==2.0.3
simpletransformers==0.70.0
pandas==2.1.1
numpy==1.26.0
torch==2.2.1
sklearn-deap==0.3.0
sklearn-genetic-opt==0.10.1
sagemaker==2.215.0
sagemaker-training==4.7.4
boto3==1.33.3
sagemaker-containers