I’m encountering a performance issue with the PostgreSQL COPY command when running it locally. My remote backend app server and my local PC have comparable specs (both have SSDs and similar RAM and CPU cores). The PostgreSQL database itself is hosted on a different remote server and can only be accessed through a specific whitelisted internet connection.
The issue is that the COPY command runs significantly slower on my local machine than on the remote server, even though both environments are very similar. My local PC is on the whitelisted internet connection, which is also fast.
Here’s a snippet of my code:
    import psycopg2
    from contextlib import contextmanager

    @contextmanager  # needed so the generator below can be used in a with-statement
    def get_connection(env):
        # Open a connection from the env mapping and make sure it is
        # closed when the with-block exits.
        conn = psycopg2.connect(
            f"""postgresql://{env["USER"]}:{env["PASSWORD"]}@{env["HOST"]}:{env["PORT"]}/{env["NAME"]}"""
        )
        try:
            yield conn
        finally:
            conn.close()

    def copy_from(file_path: str, table_name: str, env, column_string: str):
        with get_connection(env) as connection:
            with connection.cursor() as cursor:
                with open(file_path, "r") as f:
                    query = (
                        f"COPY {table_name} ({column_string}) "
                        "FROM STDIN WITH (FORMAT CSV, HEADER FALSE, DELIMITER ',', NULL '')"
                    )
                    cursor.copy_expert(query, f)
                connection.commit()

    for file_path in file_path_list:
        copy_from(file_path, table_name, env, column_string)
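To see where the time goes, here is a minimal sketch (my own addition, not part of the production code) that wraps the loop above with per-file timing, so the effective throughput can be compared between the local and remote runs:

    import os
    import time

    # Time each COPY and report effective throughput; running this on
    # both machines makes the local-vs-remote difference measurable.
    for file_path in file_path_list:
        size_mb = os.path.getsize(file_path) / (1024 * 1024)
        start = time.perf_counter()
        copy_from(file_path, table_name, env, column_string)
        elapsed = time.perf_counter() - start
        print(f"{file_path}: {size_mb:.1f} MB in {elapsed:.2f} s ({size_mb / elapsed:.2f} MB/s)")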
Local Machine
OS: Ubuntu 22.04 LTS (with Seqrite Endpoint Security installed)
Traceroute result:
- Gateway: ~3.5 ms
- Intermediate hops: 6.5 ms to 40.7 ms
- Final hops: 7.7 ms to 12.7 ms
Remote Server
OS: Ubuntu Linux
Traceroute result:
- Gateway: ~0.2 ms to 0.7 ms
- Intermediate hops: all either not available or < 1 ms
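Since the traceroutes above suggest a latency gap (several milliseconds locally vs. sub-millisecond on the remote server), a small probe like the following sketch could quantify the per-statement round-trip time from each machine. It reuses get_connection from the snippet above; measure_rtt is a hypothetical helper name, and the iteration count of 200 is an arbitrary choice:

    import time

    # Estimate average round-trip latency to the DB server by timing
    # many trivial queries on a single connection.
    def measure_rtt(env, iterations=200):
        with get_connection(env) as conn:
            with conn.cursor() as cur:
                start = time.perf_counter()
                for _ in range(iterations):
                    cur.execute("SELECT 1")
                    cur.fetchone()
                elapsed = time.perf_counter() - start
        return elapsed / iterations * 1000  # average ms per round trip

    print(f"Average round trip: {measure_rtt(env):.2f} ms")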
Questions:
- Why might the COPY command be slower locally despite similar specs and a fast internet connection?
- Are there any specific optimizations or configurations I should check on my local machine to improve performance?
- Could network latency or other factors be impacting local performance, and if so, how can I address this?
Any insights or suggestions would be greatly appreciated!