I'm trying to upload a pandas DataFrame to Snowflake using its SQLAlchemy dialect. Let's say the DataFrame has the columns 'AA', 'Ba', and 'bb'.
For some reason, when I use the .to_sql() function it interprets 'bb' as case-insensitive and doesn't put double quotes around it, which causes it to resolve to BB.
final_df.to_sql(table_name, connection, if_exists='append', index=False,
chunksize=1000)
I get this error:
(snowflake.connector.errors.ProgrammingError) 000904 (42000): SQL compilation error: error line 1 at position 183
invalid identifier 'BB'
SQL: INSERT INTO "table" (
"AA",
"Ba",
bb
) VALUES (
%(AA)s,
%(Ba)s,
%(bb)s
)
params = {
'AA': '',
'Ba': '',
'bb': 'GL-GL1',
}
I tried alter session set quoted_identifiers_ignore_case = true;, but even though the code runs without an error, the data doesn't get uploaded.
I tried to upload a DataFrame with one column being all lowercase and it failed.
This appears to be a deficiency in pandas' .to_sql() method. I was able to reproduce the issue using firebird+firebird:// and oracle+oracledb://, both of which interpret the unquoted column bb as "BB". So for an Oracle table
CREATE TABLE "SCOTT"."TABLE_1"
( "ID" NUMBER(*,0),
"Aa" VARCHAR2(50),
"bb" VARCHAR2(50),
PRIMARY KEY ("ID")
USING INDEX PCTFREE 10 INITRANS 2 MAXTRANS 255
TABLESPACE "USERS" ENABLE
) SEGMENT CREATION DEFERRED
PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255
NOCOMPRESS LOGGING
TABLESPACE "USERS" ;
CREATE UNIQUE INDEX "SCOTT"."SYS_C0015641" ON "SCOTT"."TABLE_1" ("ID")
PCTFREE 10 INITRANS 2 MAXTRANS 255
TABLESPACE "USERS" ;
this fails:
engine.echo = True
df = pd.DataFrame([(1, "a", "Alfa"), (2, "b", "Bravo")], columns=["id", "Aa", "bb"])
df.to_sql("table_1", engine, if_exists="append", index=False)
"""
sqlalchemy.exc.DatabaseError: (oracledb.exceptions.DatabaseError) ORA-00904: "BB": invalid identifier
Help: https://docs.oracle.com/error-help/db/ora-00904/
[SQL: INSERT INTO table_1 (id, "Aa", bb) VALUES (:id, :Aa, :bb)]
[parameters: [{'id': 1, 'Aa': 'a', 'bb': 'Alfa'}, {'id': 2, 'Aa': 'b', 'bb': 'Bravo'}]]
"""
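What seems to be going on is SQLAlchemy's identifier-quoting convention: an all-lowercase name is treated as case-insensitive and emitted unquoted, while any name containing uppercase characters is quoted. A rough, simplified sketch of that rule (not pandas or SQLAlchemy code — the real IdentifierPreparer also checks reserved words and dialect-specific legal characters):

```python
import re

# Simplified sketch of SQLAlchemy's identifier-quoting convention:
# all-lowercase "legal" names are considered case-insensitive and
# pass through unquoted; anything else gets double quotes.
LEGAL = re.compile(r"^[a-z_][a-z0-9_]*$")

def quote_identifier(name: str) -> str:
    """Quote `name` unless it is all-lowercase and legal."""
    if LEGAL.match(name):
        return name  # case-insensitive: emit as-is
    return '"' + name.replace('"', '""') + '"'

for col in ("AA", "Aa", "bb"):
    print(col, "->", quote_identifier(col))
# AA -> "AA"
# Aa -> "Aa"
# bb -> bb
```

Since bb comes out unquoted, Oracle (and Snowflake) fold it to BB, which no longer matches the quoted lowercase column in the table.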
However, this seems to work, at least for a simple case:
table_1 = sa.Table("table_1", sa.MetaData(), autoload_with=engine)
with engine.begin() as conn:
    conn.execute(sa.insert(table_1), df.to_dict(orient="records"))
"""
BEGIN (implicit)
INSERT INTO table_1 (id, "Aa", "bb") VALUES (:id, :Aa, :"bb")
[generated in 0.00057s] [{'id': 1, 'Aa': 'a', '"bb"': 'Alfa'}, {'id': 2, 'Aa': 'b', '"bb"': 'Bravo'}]
COMMIT
"""
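Another angle worth trying is forcing quoting with SQLAlchemy's quoted_name, which overrides the all-lowercase convention. A minimal sketch against SQLAlchemy's default dialect (no database needed; the table definition here just mirrors the Oracle one above):

```python
import sqlalchemy as sa
from sqlalchemy.sql.elements import quoted_name

# Mirror of the Oracle table above; quote=True forces SQLAlchemy
# to emit "bb" quoted even though the name is all-lowercase.
table_1 = sa.Table(
    "table_1",
    sa.MetaData(),
    sa.Column("id", sa.Integer),
    sa.Column(quoted_name("bb", quote=True), sa.String(50)),
)

stmt = sa.insert(table_1)
print(stmt.compile())  # the column list now contains "bb" in quotes
```

In principle the same labels could be assigned to the DataFrame before calling .to_sql() (e.g. df.columns = [quoted_name(c, quote=True) for c in df.columns]), since pandas builds its Table from df.columns — though I haven't verified that against Snowflake.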