Matrix multiplication over an axis in NumPy

Suppose I have an array X of shape (B, N, N, 3, 3). I want to vectorise the operation