I’ve been working on deploying a simple ML model behind a FastAPI web app. I’m using an AWS SageMaker endpoint and sending requests to it from a Lambda function, but I keep getting the following error:
```json
{
  "errorMessage": "An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (500) from model with message \"Internal Server Error\". See https://us-east-1.console.aws.amazon.com/cloudwatch/home?region=xxxxxxx#logEventViewer:group=/aws/sagemaker/Endpoints/xxxxx in account xxxxxxxxx for more information.",
  "errorType": "ModelError",
  "requestId": "",
  "stackTrace": [
    "  File \"/var/task/lambda_function.py\", line 12, in lambda_handler\n    response = client.invoke_endpoint(EndpointName=ENDPOINT_NAME,\n",
    "  File \"/var/runtime/botocore/client.py\", line 565, in _api_call\n    return self._make_api_call(operation_name, kwargs)\n",
    "  File \"/var/runtime/botocore/client.py\", line 1021, in _make_api_call\n    raise error_class(parsed_response, operation_name)\n"
  ]
}
```
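For context, the Lambda handler is just a thin wrapper around `invoke_endpoint`. A simplified sketch of what it does (the endpoint name, payload shape, and content type here are placeholders for my actual values):

```python
import json
import os

import boto3

# Placeholder: the real endpoint name comes from an environment variable
ENDPOINT_NAME = os.environ["ENDPOINT_NAME"]
client = boto3.client("sagemaker-runtime")


def lambda_handler(event, context):
    # Forward the incoming event to the SageMaker endpoint as JSON
    response = client.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/json",
        Body=json.dumps(event),
    )
    # The response body is a streaming object; decode it before returning
    return json.loads(response["Body"].read().decode("utf-8"))
```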
The web app is containerized in a Docker image that I pushed to an ECR repository. Here is the Dockerfile:
```dockerfile
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN python3 -m pip install -r requirements.txt
COPY . .
EXPOSE 8080
ENTRYPOINT ["gunicorn", "-k", "uvicorn.workers.UvicornWorker", "-b", "0.0.0.0:8080", "--config", "settings.py", "app:app", "-n"]
```
The PyTorch model is stored in a .pth file inside the Docker image, and I load it in the API to run inference.
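Roughly, the app loads the model once at import time and exposes an inference route. A simplified sketch (the model path, route names, and input format are placeholders for what I actually use):

```python
import torch
from fastapi import FastAPI, Request

app = FastAPI()

# Placeholder path: the .pth file is baked into the image at build time.
# Assumes the file contains the full pickled model, not just a state_dict.
MODEL_PATH = "model.pth"
model = torch.load(MODEL_PATH, map_location="cpu")
model.eval()


@app.get("/ping")
def ping():
    # Simple health-check route
    return {"status": "ok"}


@app.post("/invocations")
async def invocations(request: Request):
    # Placeholder input handling: expects {"inputs": [...]} as JSON
    payload = await request.json()
    with torch.no_grad():
        tensor = torch.tensor(payload["inputs"], dtype=torch.float32)
        output = model(tensor)
    return {"predictions": output.tolist()}
```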
Model settings:
(screenshot of the model settings from the SageMaker console)
Endpoint settings:
Type: Serverless
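I configured the endpoint through the console, but in boto3 terms it corresponds roughly to the following (all names, the image URI, and the memory/concurrency values are placeholders for my actual settings):

```python
import boto3

sm = boto3.client("sagemaker")

# Placeholders for my actual resource names, image URI, and role
MODEL_NAME = "my-fastapi-model"
ENDPOINT_CONFIG_NAME = "my-fastapi-endpoint-config"
ENDPOINT_NAME = "my-fastapi-endpoint"
IMAGE_URI = "<account>.dkr.ecr.us-east-1.amazonaws.com/<repo>:latest"
ROLE_ARN = "arn:aws:iam::<account>:role/<sagemaker-execution-role>"

# Model backed by the custom container pushed to ECR
sm.create_model(
    ModelName=MODEL_NAME,
    PrimaryContainer={"Image": IMAGE_URI},
    ExecutionRoleArn=ROLE_ARN,
)

# Serverless endpoint configuration (memory/concurrency are placeholders)
sm.create_endpoint_config(
    EndpointConfigName=ENDPOINT_CONFIG_NAME,
    ProductionVariants=[
        {
            "VariantName": "AllTraffic",
            "ModelName": MODEL_NAME,
            "ServerlessConfig": {
                "MemorySizeInMB": 2048,
                "MaxConcurrency": 5,
            },
        }
    ],
)

sm.create_endpoint(EndpointName=ENDPOINT_NAME, EndpointConfigName=ENDPOINT_CONFIG_NAME)
```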
Because the app works on my local machine (I tested it by running the container and sending prediction requests, roughly as in the sketch below), I suspect the source of the error is in the endpoint settings, but I haven't been able to figure it out. Can anyone help, please?
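For completeness, this is roughly how I tested the container locally (the route and payload are placeholders matching the app sketch above):

```python
import requests

# Container started locally with: docker run -p 8080:8080 <image>
resp = requests.post(
    "http://localhost:8080/invocations",
    json={"inputs": [[0.1, 0.2, 0.3]]},
)
print(resp.status_code, resp.json())
```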