I am fairly new to PyArrow and Arrow, so this may be a basic question. I have read the documentation of the field()
method, and as far as I understand, I can declare that a field does not allow NULL values by specifying nullable=False.
So I tried this example:
import pyarrow

# unique_id is declared as non-nullable; the other two fields allow NULLs
schema = pyarrow.schema(fields=[
    pyarrow.field(name='unique_id', type=pyarrow.uint64(), nullable=False),
    pyarrow.field(name='age', type=pyarrow.uint8(), nullable=True),
    pyarrow.field(name='favourite_colours', type=pyarrow.list_(pyarrow.string()), nullable=True)
])
# One inner list per column, in schema order
data = [
    [1, None, 3],                        # unique_id (contains a NULL)
    [10, None, 30],                      # age
    [None, ['red', 'blue'], ['green']]   # favourite_colours
]
table = pyarrow.table(data=data, schema=schema)
print(table)
I declare the field unique_id as nullable=False, and therefore I was expecting some sort of error when building the pyarrow.table, where I pass data with a NULL value for unique_id ([1, None, 3] in the code example). However, the table is built without any error.
Am I doing something wrong?
Thank you!
p.s. using pyarrow 17.0.0 with Python 3.8.16