I am working with ROMS ocean model outputs, which have curvilinear grids. I would like to save a variable (subset) from those outputs to a NetCDF file so that it is recognised as having a curvilinear grid when the file is accessed again (e.g. from R with the stars package, but I may also want to use tools outside of R, e.g. gdalwarp). Is there a way to construct the NetCDF file appropriately?
Herewith an example:
## load packages
require(stars)
require(ncdf4)
## filenames & directories
filename <- "http://dap.saeon.ac.za/thredds/dodsC/SAEON.EGAGASINI/2019.Penven/DAILY_MEANS/1_12_degree/roms_avg_Y2014M11.nc.1"
outfile <- "woesII_singleDay_temp_vertlevl60.nc"
## loading curvilinear ocean model data file
## first look at the file structure
nc_open(filename) # it is a large and complex file, but we can extract a single variable slice below
Use stars::read_ncdf() to read the “temp” variable, selecting all longitudes and latitudes but only a single depth layer (3rd dimension) and time-step (4th dimension), so that we end up with a variable with two dimensions:
ncvar_wII <- read_ncdf(filename, var = "temp", ncsub = cbind(start = c(1, 1, 60, 1), count = c(NA, NA, 1, 1)), curvilinear = c("lon_rho", "lat_rho"), make_time = FALSE) %>%
  st_set_crs(4326) %>%
  .[,,,,drop = TRUE] # drop the (single unit) depth and time dimensions
ncvar_wII # curvilinear grid of surface temperature
# plot(ncvar_wII) # takes a few seconds
Now, how do we save that 2-D variable in a netcdf file so that it is recognised to have a curvilinear grid?!
## trying with stars package:
write_stars(ncvar_wII, outfile)
## reloading to check:
nc_open(outfile) # look at file contents
ncvar2 <- read_ncdf(outfile, var = "Band1") # fails
ncvar2 <- read_ncdf(outfile, var = "Band1", curvilinear = c("lon", "lat"), make_time = FALSE) # fails with: Specified curvilinear coordinates are not 2-dimensional.
In my limited experience and understanding, I think curvilinear grids require their coordinates to be specified as variables that consist of 2-D matrices, which is not the case in the netcdf file saved with write_stars() above.
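One way to check this is to list how many dimensions each data variable in a file has, and to see whether longitude and latitude appear only as 1-D dimension variables. The sketch below does that with ncdf4 on a small synthetic file that imitates a regular-grid layout. The helper name describe_nc, the variable name Band1 and the grid values are made up for illustration and are not taken from the actual write_stars() output.

```r
library(ncdf4)

# Hypothetical helper: summarise dimension lengths and the number of
# dimensions of each data variable in a NetCDF file.
describe_nc <- function(path) {
  nc <- nc_open(path)
  on.exit(nc_close(nc))
  list(
    dims      = vapply(nc$dim, function(d) d$len, numeric(1)),
    var_ndims = vapply(nc$var, function(v) v$ndims, numeric(1))
  )
}

# Synthetic regular-grid file: lon/lat exist only as 1-D dimension variables.
demo_file <- tempfile(fileext = ".nc")
lon_d <- ncdim_def("lon", "degrees_east",  vals = seq(10, 13, length.out = 4))
lat_d <- ncdim_def("lat", "degrees_north", vals = seq(-35, -33, length.out = 3))
band  <- ncvar_def("Band1", "degC", list(lon_d, lat_d), prec = "double")
nc <- nc_create(demo_file, vars = list(band))
ncvar_put(nc, band, matrix(1:12, nrow = 4, ncol = 3))
nc_close(nc)

info <- describe_nc(demo_file)
print(info)
```

In ncdf4, dimension variables are listed under `$dim` rather than `$var`, so a file in which lon and lat show up only under `dims` has no 2-D coordinate variables to pass to the curvilinear argument.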
Trying to reconstruct a similar structure to the original netcdf file using the ncdf4 package doesn’t get me there either (but as with the stars package, it may be due to my lack of familiarity & skills):
## using ncdf4 package, fetch the relevant variables and dimensions and then rebuild a netcdf file:
dta <- nc_open(filename)
lons <- ncvar_get(dta, 'lon_rho') # fetch the longitude variable (a 2-D matrix of longitudes)
lats <- ncvar_get(dta, 'lat_rho') # fetch the latitude variable (a 2-D matrix of latitudes)
lon_dim <- ncvar_get(dta, "xi_rho") # fetch the x dimension values (an integer vector with an almost-repeating sequence that I do not understand)
lat_dim <- ncvar_get(dta, "eta_rho") # fetch the y dimension values (an integer vector with an almost-repeating sequence that I do not understand)
temperature <- ncvar_get(dta, "temp", start = c(1,1,60,1), count = c(-1,-1,1,1)) # fetch a 2-D surface of temperature
nc_close(dta)
# fields::image.plot(temperature)
# Define dimensions
x_dim <- ncdim_def(name = "xi_rho", units = "", vals = lon_dim)
y_dim <- ncdim_def(name = "eta_rho", units = "", vals = lat_dim)
# define coordinate variables
lon_var <- ncvar_def(name = "lon_rho", units = "degrees_east", dim = list(x_dim, y_dim), missval = NA)
lat_var <- ncvar_def(name = "lat_rho", units = "degrees_north", dim = list(x_dim, y_dim), missval = NA)
# define the data variable
data_var <- ncvar_def(name = "temp", units = "C", dim = list(x_dim, y_dim), missval = NA)
# Create NetCDF file
ncfile <- nc_create(outfile, vars = list(data_var, lon_var, lat_var))
# Write data to the file
ncvar_put(ncfile, data_var, temperature)
ncvar_put(ncfile, lon_var, lons)
ncvar_put(ncfile, lat_var, lats)
# Add global attributes
ncatt_put(ncfile, 0, "title", "Surface temperature data on curvilinear grid")
# Close the file
nc_close(ncfile)
## check with ncdf4
nc_open(outfile)
I noticed that in the original file the variable ‘temp’ has the attribute “coordinates: lat_rho lon_rho”. I don’t know how to specify that in the NetCDF file I create (I tried a few things), and perhaps that is the relevant difference between the two files with respect to the variable being recognised as having a curvilinear grid.
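For reference, a minimal sketch of one way such an attribute can be attached with ncdf4::ncatt_put(), using a small synthetic grid instead of the ROMS data. The grid values, the temporary file and the added standard_name attributes are assumptions made for illustration. Whether the coordinates attribute alone is enough for stars or GDAL to recognise the grid as curvilinear is not verified here.

```r
library(ncdf4)

# Hypothetical 4 x 3 curvilinear grid (values invented for illustration)
nx <- 4; ny <- 3
lon2d  <- outer(seq(10, 13, length.out = nx), seq(0, 0.2, length.out = ny), "+")
lat2d  <- outer(seq(0, 0.3, length.out = nx), seq(-35, -33, length.out = ny), "+")
temp2d <- matrix(seq_len(nx * ny), nrow = nx, ncol = ny)

# Index dimensions, named as in the ROMS file
x_dim <- ncdim_def("xi_rho",  units = "", vals = seq_len(nx))
y_dim <- ncdim_def("eta_rho", units = "", vals = seq_len(ny))

# 2-D coordinate variables and the data variable
lon_var  <- ncvar_def("lon_rho", "degrees_east",  list(x_dim, y_dim), prec = "double")
lat_var  <- ncvar_def("lat_rho", "degrees_north", list(x_dim, y_dim), prec = "double")
temp_var <- ncvar_def("temp", "degC", list(x_dim, y_dim), prec = "double")

demo_file <- tempfile(fileext = ".nc")
nc <- nc_create(demo_file, vars = list(temp_var, lon_var, lat_var))
ncvar_put(nc, temp_var, temp2d)
ncvar_put(nc, lon_var, lon2d)
ncvar_put(nc, lat_var, lat2d)

# Variable-level attributes: the second argument names the variable
# (0 would mean a global attribute, as in the title above).
ncatt_put(nc, "temp", "coordinates", "lat_rho lon_rho")
ncatt_put(nc, "lon_rho", "standard_name", "longitude")
ncatt_put(nc, "lat_rho", "standard_name", "latitude")
nc_close(nc)

# Read the attribute back to confirm it was stored
chk <- nc_open(demo_file)
coord_att <- ncatt_get(chk, "temp", "coordinates")
lon_ndims <- chk$var[["lon_rho"]]$ndims
nc_close(chk)
print(coord_att$value)
```

The same ncatt_put() call could be placed before nc_close(ncfile) in the code above, with "temp" as the variable name.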
## try to load it with stars, specifying curvilinear grid as above:
ncvar2 <- read_ncdf(outfile, var = "temp", curvilinear = c("lon_rho", "lat_rho"), make_time = FALSE)
# fails with: Specified curvilinear coordinate variables not found as X/Y coordinate variables.
If anyone could demonstrate how to construct the NetCDF file in the correct way so that the curvilinear grid is recognised, that would be much appreciated. I need to do this from R as I calculate some statistics over many files (timesteps) before creating the NetCDF file.