I’m attempting to create a plot which shows regions where one regression predictor is more important than all others, specifically where its relative importance (obtained with pingouin’s linear_regression(relimp=True)) is the maximum value for the set of predictors. The data is stored like so:
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
import xarray as xr

data = xr.open_dataset('.../file.nc')
# opens file.nc as an xarray Dataset; file contains v variables on d dimensions
# example variable: relimppct. this variable's dimensions are lat, lon, modelName, and varname.
# lat and lon are simply latitude and longitude; modelName represents the
# climate model the regression data was created with and varname represents the regressor used.
# so, to create a plot with panels for each relimppct regressor averaged across all models:
fig, axs = plt.subplots(1, len(data.varname),
                        subplot_kw={'projection': ccrs.PlateCarree()})
for i in range(len(data.varname)):
    # transform describes the data's coordinate system, so it goes to the plot call
    data.relimppct.mean(dim='modelName').isel(varname=i).plot(
        ax=axs[i], transform=ccrs.PlateCarree())
plt.show()
What I’d like to do is create a single-panel plot with filled contours showing where each regressor is dominant (that is, where its relative importance is the largest of the set), with a single categorical value for each regressor. As an example, see figure 4 from Naud et al. 2023:
[Figure: figure 4 from Naud et al. 2023]
Their plot shows where each of their meteorological variables is most correlated with the presence of clouds. My immediate thoughts are a) that I will need to use xarray’s where function, with something along the lines of data.relimppct.where(data.relimppct.isel(varname=i)==data.relimppct.max(dim='varname')), and b) that I will have to step outside the built-in plotting framework. Please share any suggestions or questions.
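To make idea (a) concrete, here is a minimal sketch on synthetic data. Everything in it is an assumption for illustration: the random array stands in for relimppct, the regressor and model names are made up, and argmax over varname is swapped in as a one-step alternative to looping over where masks. It is not tested against the real file.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for data.relimppct (hypothetical names and values).
rng = np.random.default_rng(0)
relimppct = xr.DataArray(
    rng.random((4, 5, 3, 2)),
    dims=("lat", "lon", "varname", "modelName"),
    coords={
        "lat": np.linspace(-30, 30, 4),
        "lon": np.linspace(0, 80, 5),
        "varname": ["reg_a", "reg_b", "reg_c"],
        "modelName": ["model_1", "model_2"],
    },
    name="relimppct",
)

# Average across models, as in the panel plot.
mean_imp = relimppct.mean(dim="modelName")

# Cells where every regressor is missing have no dominant regressor.
valid = mean_imp.notnull().any(dim="varname")

# Integer index (along varname) of the largest relative importance per cell.
# Missing values are filled with -1 (below any valid importance) so that
# argmax does not fail on all-NaN cells; those cells are masked afterwards.
dominant = mean_imp.fillna(-1).argmax(dim="varname").where(valid)

# The same information expressed with the where-style mask from idea (a):
# for regressor i, the mask is True where it attains the maximum.
masks = [
    mean_imp.isel(varname=i) == mean_imp.max(dim="varname")
    for i in range(mean_imp.sizes["varname"])
]
```

Here dominant is a single lat/lon field holding one category per regressor, which is the shape a single-panel categorical map needs. For the plot itself, xarray's plot methods accept a levels argument that yields a discrete colormap, so boundaries halfway between the integer categories (-0.5, 0.5, 1.5, ...) may be enough without leaving the built-in plotting framework; I have not verified how that looks with this data.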