How do you create a simple UDF in DuckDB that operates on a pyarrow object?
As a minimal reproducible example, here is my attempt to re-implement a sqrt function as a pyarrow-based UDF. DuckDB fails with the cryptic error listed below.
import pyarrow as pa
import duckdb
def test(x):
    return x.compute.sqrt()
duckdb.create_function(
    'test',
    test,
    type = 'arrow',
    parameters = [ int ],
    return_type = float,
)
N = 1000
duckdb.sql(f"""
create table t (a int);
insert into t (a) select * from generate_series(1, {N});
alter table t add column b float;
update table t set b = test(a);
""")
Error
Traceback (most recent call last):
  File "/Users/w/Codin/duckdb_rust_udf/test.py", line 8, in <module>
    duckdb.create_function(
TypeError: create_function(): incompatible function arguments. The following argument types are supported:
1. (name: str, function: function, return_type: object = None, parameters: duckdb.duckdb.typing.DuckDBPyType = None, *, type: duckdb.duckdb.functional.PythonUDFType = <PythonUDFType.NATIVE: 0>, null_handling: duckdb.duckdb.functional.FunctionNullHandling = 0, exception_handling: duckdb.duckdb.PythonExceptionHandling = 0, side_effects: bool = False, connection: duckdb.DuckDBPyConnection = None) -> duckdb.DuckDBPyConnection
Invoked with: 'test', <function test at 0x100237d90>; kwargs: type='arrow', parameters=[<class 'int'>], return_type=<class 'int'>