I’m interested in finding the length of sequences of 1’s along a single axis in a multi-dimensional array.
For a 1D array I have worked my way to a solution using the answers at this older question. E.g. [0,1,0,0,1,1,1,0,1,1] –> [nan,1,nan,nan,3,nan,nan,nan,2,nan]
For a 3D array I can of course write a loop, but I’d rather not. (Background: this is for climate science, and looping over all latitude/longitude grid cells would make this very slow.)
I’m trying to find a solution along the lines of the 1D solution. Help would be much appreciated, ideally in line with the 1D code, but completely different solutions are welcome too of course.
For reference, this was my 1D solution:
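import numpy as np
import xarray as xr

# da: 1D xarray.DataArray of 0/1 values with a 'time' coordinate
n = len(da.time)  # number of time steps
y = da.values[1:] != da.values[:-1]  # True where the value changes
i = np.append(np.where(y), n - 1)  # index of the last element of each run
z = np.diff(np.append(-1, i))  # length of each run (of 0's or 1's)
p = np.cumsum(np.append(0, z))[:-1]  # index of the first element of each run
runs = np.where(da.values[i] == 1)[0]  # keep only the runs of 1's
runs_len = z[runs]  # length of sequence
time_val = da.time[p[runs]]  # date of first day in sequence
da_runs = xr.DataArray(runs_len, coords={'time': time_val}, dims='time')
_, da_runs = xr.align(da, da_runs, join='outer')  # make sure we have full time axis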
And this is the attempt in 3D. I’m stuck on how to shift the valid entries in i to the front/remove NaNs from i.
y = da.values[1:] != da.values[:-1]
y = xr.DataArray(y,coords={'time':da.time[0:-1],'lat':da.lat,'lon':da.lon})
i = y.where(y)*xr.DataArray(np.arange(0,len(da.time[0:-1])),coords={'time':y.time}) -1
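For comparison, here is one possible loop-free direction, as a minimal sketch rather than a definitive answer. It drops the NaN-shifting idea and works on the plain NumPy values instead: every series is padded with a 0 at both ends, so each run of 1's has exactly one start (difference +1) and one end (difference -1), and `np.nonzero` returns both in the same row-major order, which pairs them up without a loop. The function name `run_lengths_at_start` is hypothetical, and the sketch assumes the array holds 0/1 values (anything not equal to 1, including NaN, is treated as 0).

```python
import numpy as np

def run_lengths_at_start(a, axis=0):
    """Length of each run of 1's, stored at the run's first index; NaN elsewhere.

    Hypothetical helper: works along `axis` of an N-D array without a Python loop.
    """
    mask = np.asarray(a) == 1                  # anything other than 1 counts as 0
    b = np.moveaxis(mask, axis, -1)            # put the run axis last
    shape = b.shape
    n = shape[-1]
    flat = b.reshape(-1, n)                    # one row per grid cell

    # Pad each row with a 0 on both sides so every run has a start and an end.
    padded = np.zeros((flat.shape[0], n + 2), dtype=np.int8)
    padded[:, 1:-1] = flat
    d = np.diff(padded, axis=1)                # +1 at run start, -1 just after run end

    # np.nonzero scans row by row, so the k-th start and k-th end belong together.
    rows, starts = np.nonzero(d == 1)
    _, ends = np.nonzero(d == -1)              # exclusive end index of each run

    out = np.full(flat.shape, np.nan)
    out[rows, starts] = ends - starts
    return np.moveaxis(out.reshape(shape), -1, axis)

# The 1D example from the question
x = np.array([0, 1, 0, 0, 1, 1, 1, 0, 1, 1])
print(run_lengths_at_start(x))

# The same series repeated over a small (time, lat, lon) grid
cube = np.broadcast_to(x[:, None, None], (10, 2, 3))
print(run_lengths_at_start(cube, axis=0)[:, 0, 0])
```

For an xarray object, the result could presumably be wrapped back with something like `da.copy(data=run_lengths_at_start(da.values, axis=da.get_axis_num('time')))`. The temporary arrays are about the size of the input, so for very large grids the memory cost may matter more than the speed.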