I have been trying for a few days to create a pipeline to run my Python script, which connects to a MySQL Flexible Server database and stores some data.
I’m really new to this and it’s my first time trying it.
I created an ADF instance, a linked service to the DB, and a dataset. Then, in the pipeline, I added a Custom activity. In its “Azure Batch” tab I selected the Azure Batch linked service, and in the Settings tab I entered “python3 calc.py” (without the quotes) as the command, selected the linked service of the storage account where the script (calc.py) is located, and chose the path.
In Azure Batch I created a node pool (canonical -> ubuntu 22.04) and saved it. In the pool's Start Task option I had to put the following on the command line:
bash ./setup_python.sh
because without it, running the pipeline failed with:
ModuleNotFoundError: No module named 'pymysql'
My setup_python.sh script has the following:
#!/bin/bash
sudo apt-get update
sudo apt-get install -y python3-pip
pip3 install psycopg2-binary pymysql
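One way to check whether the packages installed by the start task are visible to the interpreter that actually runs calc.py is a small diagnostic like the sketch below. It only uses the standard library; the function names `missing_modules` and `report` are made up for illustration, and the idea that the start task and the activity might use different users or interpreters is an assumption I have not confirmed.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the names from `names` that this interpreter cannot import."""
    return [n for n in names if importlib.util.find_spec(n) is None]


def report(names=("pymysql",)):
    """Print which interpreter is running and which modules it cannot find."""
    print("interpreter:", sys.executable)
    missing = missing_modules(names)
    print("missing modules:", missing or "none")
    return missing
```

Calling `report()` at the top of calc.py (or running it as its own Custom activity command) would show in the task's stdout whether pymysql is importable on the node.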
But now, when I run the pipeline in ADF, it stays in the running state forever.
Also, since my script is the one that connects to the DB, is it necessary for ADF to have a linked service to the DB?
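To make clear what I mean by the script connecting on its own, here is a minimal sketch of the kind of connection code involved. It is not the real calc.py: the environment variable names (`MYSQL_HOST` and so on) and the helper `build_connect_kwargs` are hypothetical placeholders, and the 10-second `connect_timeout` is an assumed value chosen so that an unreachable server raises an error instead of waiting indefinitely.

```python
import os


def build_connect_kwargs(env=os.environ):
    """Collect pymysql.connect() arguments from environment variables.

    The variable names used here are hypothetical placeholders.
    """
    kwargs = {
        "host": env["MYSQL_HOST"],
        "user": env["MYSQL_USER"],
        "password": env["MYSQL_PASSWORD"],
        "database": env["MYSQL_DATABASE"],
        "port": int(env.get("MYSQL_PORT", "3306")),
        # Fail after 10 seconds if the server cannot be reached.
        "connect_timeout": 10,
    }
    ca_path = env.get("MYSQL_SSL_CA")
    if ca_path:
        # Optional CA bundle, only passed when the variable is set.
        kwargs["ssl"] = {"ca": ca_path}
    return kwargs


def main():
    # Imported here so the helper above works even without pymysql installed.
    import pymysql

    conn = pymysql.connect(**build_connect_kwargs())
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT 1")  # trivial query to confirm connectivity
            print(cur.fetchone())
        conn.commit()
    finally:
        conn.close()
```

In this sketch the credentials come from the script's own environment, so nothing in it depends on an ADF linked service to the database.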
If you need more information, please let me know… I’ve been dealing with this for 3 days now and I can’t find a solution.
My goal is to run the pipeline in ADF without problems: execute my setup_python.sh script and then the calc.py script to insert the data into my DB.