I’m attempting to move some Bayesian inference simulation studies from my local machine onto AWS SageMaker to get the extra processing power. I’m running into a roadblock, however, in simply getting Stan to work. Please assume I know almost nothing about the inner workings of SageMaker and/or Stan. Any input is appreciated!
Sagemaker details:
- I’m running a Jupyter notebook through Amazon SageMaker Studio Classic.
- The notebook kernel is Python 3, running the Data Science 3.0 image, on an ml.t3.medium instance.
What I’ve tried:
!pip install cmdstan
This returns the error:
Looking in indexes: https://company/specific/file/path
ERROR: Could not find a version that satisfies the requirement cmdstan (from versions: none)
ERROR: No matching distribution found for cmdstan
!conda install cmdstan
!conda install cmdstanpy
import cmdstanpy
import cmdstan
This runs for ~1 minute, then finishes. Worth noting: I believe Conda should check for and install dependencies for libraries, so I shouldn’t have to install or import cmdstan. In any case, I’ve tried every variation of installing/importing one or both of these libraries. They lead to the same error when trying to run a model fit that works on my local machine:
21:57:43 - cmdstanpy - INFO - No CmdStan installation found.
21:57:43 - cmdstanpy - INFO - Cannot determine whether version is before 2.27.
.
.
.
ValueError: CmdStan installataion missing binaries in /root/.cmdstan/cmdstan-2.35.0/bin. Re-install cmdstan by running command "install_cmdstan --overwrite", or Python code "import cmdstanpy; cmdstanpy.install_cmdstan(overwrite=True)"
Of course, using the suggested commands and rerunning code leads to the same errors.
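To see what the "missing binaries" message is reacting to, it may help to look inside the install directory directly. The following is a minimal sketch, not cmdstanpy's own check: `inspect_cmdstan_dir` is a hypothetical helper, and the list of expected files is an assumption about what a fully built CmdStan tree usually contains.

```python
from pathlib import Path


def inspect_cmdstan_dir(cmdstan_dir):
    """Report which files of a built CmdStan tree exist under cmdstan_dir."""
    root = Path(cmdstan_dir)
    # Assumption: a built CmdStan tree normally contains these files.
    expected = ["makefile", "bin/stanc", "bin/stansummary", "bin/diagnose"]
    return {rel: (root / rel).exists() for rel in expected}


# Path taken from the error message above; adjust if yours differs.
install_dir = Path.home() / ".cmdstan" / "cmdstan-2.35.0"
for rel, present in inspect_cmdstan_dir(install_dir).items():
    print(f"{rel}: {'present' if present else 'missing'}")
```

If `makefile` is present but everything under `bin/` is missing, the download and unpack probably succeeded and only the build step failed.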
I’ve also attempted !install_cmdstan. This returns:
CmdStan install directory: /root/.cmdstan
Installing CmdStan version: 2.35.0
Downloading CmdStan version 2.35.0
Download successful, file: /tmp/tmp35vjvb20
Extracting distribution
Unpacked download as cmdstan-2.35.0
Building version cmdstan-2.35.0, may take several minutes, depending on your system.
FileNotFoundError: [Errno 2] No such file or directory: 'make'
I’ve also attempted the solution posted on this forum. The urllib and shutil functions work for a while and, when they’re finished, there’s a new folder in my SageMaker file explorer named ‘cmdstan-2.23.0’. Attempting to run some Stan model fitting now begins the process (displays 17:58:55 - cmdstanpy - INFO - compiling stan file /root/helper_files/stan_model.stan to exe file /root/helper_files/stan_model), but then runs into this error:
ValueError: Failed to compile Stan model '/root/helper_files/stan_model.stan'. Console:
Command: ['make', 'STANCFLAGS+=--filename-in-msg=stan_model.stan', '/root/helper_files/stan_model']
failed with error [Errno 2] No such file or directory: 'make'
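Both failures report `[Errno 2] No such file or directory: 'make'`, which is what Python raises when it tries to launch an executable that is not on the PATH. A quick way to confirm this is a sketch like the one below. `find_build_tools` is a hypothetical helper, and the compiler names are assumptions about what a typical Linux image would provide.

```python
import shutil


def find_build_tools(tools=("make", "g++", "clang++")):
    """Map each tool name to its full path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


for tool, path in find_build_tools().items():
    print(f"{tool}: {path or 'not found on PATH'}")
```

If `make` shows as not found, the problem is the missing build toolchain in the image rather than anything inside the cmdstan folder.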
We’ve now seen the No such file or directory: 'make' error twice. There’s a make folder within the cmdstan folder, but the error may instead be referring to the make terminal command. We can try installing the make command on SageMaker with !sudo yum install -y make but that returns the error: /bin/bash: line 1: sudo: command not found. Further reading suggests sudo commands are intentionally not allowed in SageMaker unless you’re running-as user.
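The `/root/...` paths in the errors above hint that the notebook process may already be running as root, in which case `sudo` would not be needed at all. The sketch below checks that and looks for a package manager that is actually present. `install_command` is a hypothetical helper, and which package manager the Data Science 3.0 image ships is an assumption to verify, not a known fact.

```python
import os
import shutil


def install_command(package="make", which=shutil.which):
    """Return an install command for the first package manager found, else None."""
    candidates = {
        "apt-get": ["apt-get", "install", "-y", package],
        "yum": ["yum", "install", "-y", package],
    }
    for manager, command in candidates.items():
        if which(manager):
            return command
    return None


# os.geteuid is only available on Unix-like systems.
is_root = hasattr(os, "geteuid") and os.geteuid() == 0
print("running as root:", is_root)
print("command to try (without sudo):", install_command())
```

If the image turns out to use `apt-get`, an `apt-get update` may be needed before the install succeeds.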
The version from the above example is an outdated version of cmdstan. It’s possible that’s the reason it fails. I’ve tried running the same tar installation code, but with more up-to-date links (replacing the file path suggested there with https://github.com/stan-dev/cmdstan/releases/tag/v2.35.0/colab-cmdstan-2.35.0.tgz which is where the current release of cmdstan resides). This, however, returns the error:
ReadError: colab-cmdstan-2.35.0.tgz is not a compressed or uncompressed tar file
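A `ReadError` like this usually means the downloaded file is not an archive at all. On GitHub, URLs containing `/releases/tag/` normally point at the HTML release page, while downloadable assets are served from `/releases/download/`, so the saved file may simply be a web page. The sketch below checks what was actually downloaded. `describe_download` is a hypothetical helper, and whether an asset with this exact filename exists for v2.35.0 is not something the sketch assumes.

```python
import tarfile


def describe_download(path):
    """Classify a downloaded file as a tar archive, an HTML page, or unknown."""
    with open(path, "rb") as fh:
        head = fh.read(64)  # a small prefix is enough to spot HTML
    if tarfile.is_tarfile(path):
        return "tar archive"
    if head.lstrip().lower().startswith((b"<!doctype", b"<html")):
        return "HTML page"
    return "unknown"


# Example usage (filename taken from the error above):
# print(describe_download("colab-cmdstan-2.35.0.tgz"))
```

If this reports an HTML page, the link is the likely culprit rather than the extraction code.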
Following the cmdstan installation documentation, I’ve also tried running !conda install -c conda-forge cmdstan and !conda create -n stan -c conda-forge cmdstan. These lead to the same errors as described under the earlier !conda install cmdstan attempt. However, they also provide a new suggestion after running: Note: you may need to restart the kernel to use updated packages.
I’ve tried manually refreshing the kernel (doesn’t work) as well as programmatically refreshing it. Interestingly, when trying to create any kind of new environment (even without cmdstan: !conda create --name test), it doesn’t get added to the list of environments the system recognizes. Running !conda env list returns only:
# conda environments:
#
base /opt/conda
Which seems to go directly against SageMaker’s documentation, stating that I should have a default environment as well by… well, by default. Further, that documentation suggests the base environment “is only for tooling and should not be used by customers”. Despite that, it seems to be the only environment I have access to. Perhaps this is the source of problems?
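To double-check the environment list from inside the kernel, conda's JSON output can be read programmatically. This is a sketch under assumptions: `parse_env_paths` and `list_conda_envs` are hypothetical helpers, and it relies on `conda env list --json` returning an object with an `envs` key.

```python
import json
import shutil
import subprocess


def parse_env_paths(conda_json):
    """Extract environment paths from the output of `conda env list --json`."""
    return json.loads(conda_json).get("envs", [])


def list_conda_envs():
    """Return the environment paths conda reports, or None if conda is not on PATH."""
    if shutil.which("conda") is None:
        return None
    result = subprocess.run(
        ["conda", "env", "list", "--json"],
        capture_output=True, text=True, check=True,
    )
    return parse_env_paths(result.stdout)


# Example usage in a notebook cell:
# print(list_conda_envs())
```

One possibility worth ruling out: `conda create` asks for confirmation unless it is given `--yes`, and a `!` cell may not be able to answer that prompt, so the environment might never actually have been created.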
So there’s a wall of text on everything I’ve tried and the errors each attempt throws. Any input, advice, or ideas would be very welcome. Thank you!
Luke Rutten is a new contributor to this site. Take care in asking for clarification, commenting, and answering.