I am trying to create a slice along the time dimension of an xarray.Dataset. However, if I take a slice that ends at a full hour, the returned slice actually ends just before the next full hour.
Suppose I have an xarray.Dataset called data with the following signature:
<xarray.Dataset> Size: 197GB
Dimensions:  (x: 65049, y: 65049, z: 65049, time: 755855, receiver-ID: 65049)
Coordinates:
  * x        (x) float32 260kB -726.2 -726.2 -726.2 -726.2 ... 718.8 718.8 718.8
  * y        (y) float32 260kB -796.7 -791.7 -786.7 -781.7 ... 768.3 773.3 798.3
  * z        (z) float32 260kB 1.5 1.5 1.5 1.5 1.5 1.5 ... 1.5 1.5 1.5 1.5 1.5
  * time     (time) timedelta64[ns] 6MB 00:00:00 ... 1 days 03:57:54.750000
Dimensions without coordinates: receiver-ID
Data variables:
    p2       (time, receiver-ID) float32 197GB ...
The time coordinate in the dataset has a resolution of 0.125 seconds. Now I try to select a slice from 7:00 to 12:00 along the time dimension by using
data_slice = data.sel(time=slice('7:00:00', '12:00:00'))
The resulting data_slice has the following signature:
<xarray.Dataset> Size: 45GB
Dimensions:  (x: 65049, y: 65049, z: 65049, time: 172800, receiver-ID: 65049)
Coordinates:
  * x        (x) float32 260kB -726.2 -726.2 -726.2 -726.2 ... 718.8 718.8 718.8
  * y        (y) float32 260kB -796.7 -791.7 -786.7 -781.7 ... 768.3 773.3 798.3
  * z        (z) float32 260kB 1.5 1.5 1.5 1.5 1.5 1.5 ... 1.5 1.5 1.5 1.5 1.5
  * time     (time) timedelta64[ns] 1MB 07:00:00 ... 12:59:59.875000
Dimensions without coordinates: receiver-ID
Data variables:
    p2       (time, receiver-ID) float32 45GB ...
As you can see, the slice ends at 12:59:59.875, thus it contains about one hour more data than desired.
If I increase the upper bound for the slice very slightly by using
data_slice = data.sel(time=slice('7:00:00', '12:00:00.00001'))
I obtain the desired result:
<xarray.Dataset> Size: 37GB
Dimensions:  (x: 65049, y: 65049, z: 65049, time: 144001, receiver-ID: 65049)
Coordinates:
  * x        (x) float32 260kB -726.2 -726.2 -726.2 -726.2 ... 718.8 718.8 718.8
  * y        (y) float32 260kB -796.7 -791.7 -786.7 -781.7 ... 768.3 773.3 798.3
  * z        (z) float32 260kB 1.5 1.5 1.5 1.5 1.5 1.5 ... 1.5 1.5 1.5 1.5 1.5
  * time     (time) timedelta64[ns] 1MB 07:00:00 07:00:00.125000 ... 12:00:00
Dimensions without coordinates: receiver-ID
Data variables:
    p2       (time, receiver-ID) float32 37GB ...
Could someone explain this behavior to me? Is my slicing approach wrong? Or is this simply a bug? Thank you very much in advance!
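A possible explanation, offered as a sketch rather than a confirmed diagnosis: xarray hands label-based selection to the underlying pandas index, and a pandas TimedeltaIndex appears to treat string bounds as partial strings whose resolution is inferred from the string itself. Under that reading, '12:00:00' is taken at hour resolution and therefore covers everything up to just before 13:00:00, while '12:00:00.00001' is taken at microsecond resolution and stops right after 12:00:00. Passing exact pd.Timedelta objects as bounds should sidestep the inference. The snippet below uses a small stand-in dataset (14 hours of zeros at 0.125 s resolution, which is an assumption for illustration and not the real data):

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical stand-in for `data`: 14 hours sampled every 0.125 s
time = pd.timedelta_range(start="0s", end="14h", freq="125ms")
data = xr.Dataset(
    {"p2": ("time", np.zeros(time.size, dtype="float32"))},
    coords={"time": time},
)

# String bounds: the upper bound is subject to pandas' string handling
by_string = data.sel(time=slice("7:00:00", "12:00:00"))
print("string bounds:   ", by_string.sizes["time"], by_string["time"].values[-1])

# Timedelta bounds: exact labels, with both endpoints included in the slice
exact = data.sel(time=slice(pd.Timedelta(hours=7), pd.Timedelta(hours=12)))
print("Timedelta bounds:", exact.sizes["time"], exact["time"].values[-1])
```

With exact bounds the slice runs from 07:00:00 to 12:00:00 inclusive, which is 5 h × 3600 s × 8 samples + 1 = 144001 points, the same count as the desired result above.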