The book Fluent Python describes two kinds of sequences: container and flat. Container sequences can hold items of different types, but they store references to Python objects. Flat sequences such as str, bytes, and array.array hold items of one simple type, and the values are stored directly in the sequence's own memory.
It also states that every Python object in memory has a header with metadata. The
simplest Python object, a float, has a value field and two metadata
fields. On a 64-bit Python build, each of those fields takes 8 bytes:
- ob_refcnt: the object’s reference count
- ob_type: a pointer to the object’s type
- ob_fval: a C double holding the value of the float
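The three-field layout above can be checked with a small sketch. It assumes a standard (non free-threaded) 64-bit CPython build; the variable names are only illustrative, and other builds or implementations may report different sizes.

```python
import struct
import sys

# Native C sizes of the three fields described above:
# "n" = Py_ssize_t (ob_refcnt), "P" = pointer (ob_type), "d" = C double (ob_fval).
refcnt_size = struct.calcsize("n")
type_ptr_size = struct.calcsize("P")
value_size = struct.calcsize("d")

# Size of the three fields laid out together as a native C struct.
expected_float_size = struct.calcsize("nPd")

# Size that the interpreter itself reports for a float object.
actual_float_size = sys.getsizeof(3.14)

print("ob_refcnt:", refcnt_size, "bytes")
print("ob_type:  ", type_ptr_size, "bytes")
print("ob_fval:  ", value_size, "bytes")
print("expected total:", expected_float_size, "bytes")
print("sys.getsizeof(3.14):", actual_float_size, "bytes")
```

On the kind of build assumed here, both totals should agree with the 24 bytes the book mentions.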
The book also says that:
An array of floats is much more compact than a tuple of floats
If I understand that correctly, a single float object should take 24 bytes, and a list with 3 float items should take 24 * 3 = 72 bytes.
# Simple float variable
single_float: float = 3.14
single_float.__sizeof__() # -> 24
# A list with 3 float values
list_of_floats: list[float] = [1.12, 2.13, 3.14]
list_of_floats.__sizeof__() # -> 24 * 3 = 72
So far everything seems right. But when I repeat the same thing with array.array, which is a flat sequence and is supposed to be more compact because its values carry no metadata, I get a surprising result: the size of the array is greater than the size of the list, for the same values.
import array
array_of_floats = array.array("f", (1.12, 2.13, 3.14))
array_of_floats.__sizeof__() # -> 88 bytes, greater than the list of floats
# One single element
array_of_floats[0].__sizeof__() # -> 24 bytes, same as a float object
Given the previous examples, I'm wondering in what way an array of floats is more compact than a list of floats. I can understand that it is more memory-efficient and faster because the values are stored directly in the sequence, but in terms of size, I don't get it.
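One way to probe the comparison might be to measure a larger sequence and to count the float objects a list refers to, not only the list object itself. The sketch below is a rough check under stated assumptions: it runs on CPython, it uses 1000 values instead of 3 so that fixed per-object overhead matters less, and it uses typecode "d" (C double, matching ob_fval) rather than the "f" used above. All variable names are illustrative.

```python
import array
import sys

# 1000 distinct float objects (assumption: enough items to dwarf fixed overhead).
values = [i + 0.5 for i in range(1000)]

float_list = list(values)
float_array = array.array("d", values)  # "d" = C double; "f" would be a C float

# The list object itself only holds references to its items...
list_container = sys.getsizeof(float_list)
# ...so the float objects it points to are counted separately.
list_items = sum(sys.getsizeof(x) for x in float_list)
list_total = list_container + list_items

# The array stores the raw values inside its own buffer.
array_total = sys.getsizeof(float_array)
payload_bytes = float_array.itemsize * len(float_array)

print("list object only:      ", list_container, "bytes")
print("list + float objects:  ", list_total, "bytes")
print("array object (total):  ", array_total, "bytes")
print("array raw payload only:", payload_bytes, "bytes")
```

Note that indexing an array, as in array_of_floats[0], builds a new float object from the stored raw value, so calling __sizeof__() on it measures that temporary object and not the space the element occupies inside the array.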