I don’t understand this behavior of the in operator on NumPy arrays:
>>> import numpy as np
>>> a=[0,1,2]
>>> x=np.array([a])
>>> b=[3,4,5]
>>> np.append(x,[b],axis=0)
array([[0, 1, 2],
       [3, 4, 5]])
>>> a in x
True
>>> b in x
False
I want to build a list of distinct vectors, so I need the in operator and a way to extend my list of vectors so that an appended element is also seen as being in the list. I use a NumPy array because the in operator doesn’t work the way I expect.
- In the above example, why is b not seen as an element of x?
- How can I append a NumPy array to a NumPy array of NumPy arrays so that the in operator works as expected?
- Is there a better way to store vectors (i.e. NumPy arrays of size 3) so I can add new vectors, and such that the in operator sees newly added vectors as elements of the structure?
np.append does not modify the array in place. It creates a new array containing b as the second row, but x remains the same.
>>> np.append(x,[b],axis=0)
array([[0, 1, 2],
       [3, 4, 5]])
>>> x
array([[0, 1, 2]])
>>> b in np.append(x,[b],axis=0)
True
Also, the in operator makes little sense for NumPy arrays in the first place. row in two_d_array doesn’t test whether row is a row of two_d_array. It gives you (row == two_d_array).any(), which, for a 1D row and a 2D two_d_array, tests whether any element of row shows up as an element of the corresponding column of two_d_array.
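To make the difference concrete, here is a small sketch that contrasts in with an explicit whole-row test. The helper name contains_row is made up for this illustration and is not part of NumPy:

```python
import numpy as np

def contains_row(two_d_array, row):
    """Return True only if `row` matches an entire row of `two_d_array`."""
    # Compare element-wise, require every column to match within a row,
    # then ask whether any row satisfied that.
    return bool((two_d_array == row).all(axis=1).any())

x = np.array([[0, 1, 2],
              [3, 4, 5]])
partial = [0, 9, 9]  # shares only its first element with the first row

print(partial in x)                # True: one element matched in its column
print(contains_row(x, partial))    # False: no full row matches
print(contains_row(x, [3, 4, 5]))  # True
```

The helper assumes row has the same length as the rows of two_d_array; for floating-point data an exact == comparison may be too strict.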
I found a workaround for my problem:
- use a plain list to store my vectors (NumPy arrays)
- write a function that checks whether a particular vector is contained in the list:

def contains(vlist, vector):
    # Compare against each stored vector element by element
    for v in vlist:
        if (v == vector).all():
            return True
    return False

if not contains(my_list, new_vector):
    my_list.append(new_vector)

That way I get the functionality I need.
This answer does not compete on performance or optimality; it only addresses how to use the in operator over vectors, as was asked above:
Q: “Is there a better way to store vectors (i.e. NumPy array of size 3) so that I can add new vectors, and such that the in operator sees newly added vectors as being elements of the structure?”
We may convert arrays into a hashable form, namely their raw bytes:
>>> aDict = { x.tobytes(): 1, }
>>> aDict
{'\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00': 1}
and enjoy the syntax of the in operator over (hashable) dictionary keys, with the added benefit of being able to associate additional data with such “vector” keys, such as frequencies of occurrence or any more complex data structure that dictionary values can hold:
>>> x.tobytes() in aDict
True
>>> aDict[ np.array([b]).tobytes() ] = 1
>>> aDict
{'x00x00x00x00x00x00x00x00x01x00x00x00x00x00x00x00x02x00x00x00x00x00x00x00': 1, 'x03x00x00x00x00x00x00x00x04x00x00x00x00x00x00x00x05x00x00x00x00x00x00x00': 1}
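The same idea can be wrapped so that in works on the vectors directly. The sketch below is one possible shape for that, not an established API: the class name VectorSet and its methods are hypothetical, and it assumes every vector can be converted to one fixed dtype (int64 here) so that equal values always produce equal bytes.

```python
import numpy as np

class VectorSet:
    """Hypothetical helper: a set of vectors keyed by shape and raw bytes."""

    def __init__(self, dtype=np.int64):
        self._dtype = dtype
        self._keys = set()

    def _key(self, vector):
        # Normalise dtype and memory layout first; otherwise the same values
        # stored as int32 and int64 would yield different byte strings.
        arr = np.ascontiguousarray(vector, dtype=self._dtype)
        return (arr.shape, arr.tobytes())

    def add(self, vector):
        self._keys.add(self._key(vector))

    def __contains__(self, vector):
        return self._key(vector) in self._keys

    def __len__(self):
        return len(self._keys)

seen = VectorSet()
seen.add([0, 1, 2])

print([0, 1, 2] in seen)            # True
print(np.array([3, 4, 5]) in seen)  # False
seen.add(np.array([3, 4, 5]))
seen.add([3, 4, 5])                 # duplicate, ignored by the set
print(np.array([3, 4, 5]) in seen)  # True
print(len(seen))                    # 2
```

Including the shape in the key keeps a (3,) vector distinct from a (1, 3) array with the same bytes. Byte-wise keys compare exactly, so this is best suited to integer data.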