thank you for everyone will help me, I don’t know how resolve.
I’m developing a Airflow Task to ETL Flow like:
MySQL -> Postgres
I’m not able to query the MySQL via Tunnel SSH (it’s mandatory, I can’t change any setting on DB).
I have tried 3 solution (nothing works):
This the code for tunneling, equal for the 3 solutions:
with SSHHook(ssh_conn_id='connection_ssh', password=None, conn_timeout=300, cmd_timeout=300, keepalive_interval=10, auth_timeout=300).get_tunnel(remote_port=22, remote_host=X.X.X.X,local_port=3306) as tunnel:
tunnel.start()
logging.info(f"SSH Tunnel opened")
It’s works, log:
[2024-12-07, 14:59:14 CET] {sshtunnel.py:978} INFO - Connecting to gateway: X.X.X.X:22 as user 'user'
[2024-12-07, 14:59:14 CET] {transport.py:1944} INFO - Connected (version 2.0, client OpenSSH_9.2p1)
[2024-12-07, 14:59:14 CET] {transport.py:1944} INFO - Authentication (publickey) successful!
Then the 3 options tested:
- MySQLHook
mysql_hook = MySqlHook(mysql_conn_id='mysqlconn')
with mysql_hook.get_conn().cursor() as cursor:
query = "select * from order where order_id=1"
df = pd.read_sql(query, mysql_hook.get_conn())
Error:
MySQLdb.OperationalError: (2003, "Can't connect to MySQL server on '127.0.0.1:3306' (111)")
- pymsql
connection = pymysql.connect(host='127.0.0.1',
user=db_conn.login,
password=db_conn.password,
db=db_conn.schema,
charset='utf8mb4',
port=tunnel.local_bind_port,
)
connection.autocommit(1)
query = "select * from order where order_id=1"
df = pd.read_sql(query, connection)
Error:
Could not establish connection from local ('127.0.0.1', 3307) to remote ('X.X.X.X', 3306) side of the tunnel: open new channel ssh error: Timeout opening channel.
pymysql.err.OperationalError: (2013, 'Lost connection to MySQL server during query')
- jdbcHook
{
"driver_class": "com.mysql.cj.jdbc.Driver",
"driver_path": "/airflow/resources/mysql-connector-j-8.2.0/mysql-connector-j-8.2.0.jar",
"conection_url": jdbc:mysql://127.0.0.1:3306/dbname
}
hook = JdbcHook(jdbc_conn_id=jdbcconn)
query = "select 1"
df = pd.read_sql(query, hook.get_conn())
Error:
com.mysql.cj.jdbc.exceptions.com.mysql.cj.jdbc.exceptions.PacketTooBigException: com.mysql.cj.jdbc.exceptions.PacketTooBigException: Packet for query is too large (4,739,923 > 65,535). You can change this value on the server by setting the 'max_allowed_packet' variable.
I don’t know what to do.
Note:
If I use a normal Datagrip or DBeaver, I can connect via tunnel ssh, without any issue.