I have code that runs through a GeoJSON file and plots its columns one by one to show the user the costs of an optimisation. It ran fine until it came across a column containing only nan values.
I looked into this and found that the GeoPandas (v0.14.4) plotting code first builds a boolean mask marking which values are nan:
nan_idx = np.asarray(pd.isna(values), dtype="bool")
then it resets the values variable to contain only the values that are not nan:
values = cat.codes[~nan_idx]
and then it fills nans with a placeholder so that it can properly plot them:
for n in np.where(nan_idx)[0]:
    values = np.insert(values, n, values[0])
and this is where the issue is! As the column contains only nans, values is empty at this point, so values[0] raises an IndexError. I confirmed this by changing one value to a float, after which it worked perfectly, plotting the nans as missing values.
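For reference, the failure can be reproduced without GeoPandas. The following is a minimal sketch that mimics the three quoted steps with plain NumPy and pandas; fill_placeholders is a hypothetical helper name used only for illustration, not a GeoPandas function, and the input arrays are made up:

```python
import numpy as np
import pandas as pd


def fill_placeholders(values):
    """Mimic the three quoted steps on a 1-D object array (illustrative only)."""
    # Step 1: boolean mask of the missing entries
    nan_idx = np.asarray(pd.isna(values), dtype="bool")
    # Step 2: keep the category codes of the non-missing entries only
    cat = pd.Categorical(values)
    codes = cat.codes[~nan_idx]
    # Step 3: re-insert a placeholder at every missing position;
    # codes[0] only exists if at least one entry was not missing
    for n in np.where(nan_idx)[0]:
        codes = np.insert(codes, n, codes[0])
    return codes


# One missing value among real ones: the placeholder step succeeds
mixed = np.array(["a", None, "b"], dtype=object)
print(fill_placeholders(mixed))

# Only missing values: codes is empty, so codes[0] raises IndexError
all_missing = np.array([None, None, None], dtype=object)
try:
    fill_placeholders(all_missing)
except IndexError as exc:
    print(f"IndexError: {exc}")
```

The second call fails on the placeholder lookup itself, before np.insert is ever reached, which matches the traceback described above.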
Can anyone tell me if there is another way to plot columns containing only nan values, or if this issue has been addressed in newer versions of GeoPandas? It’s important to show that the values are all “missing values”, as that shows that the specific column is not feasible.
All the above code is found in the geopandas/plotting.py file for GeoPandas (v0.14.4) and runs within the plot() call in the snippet of my code below (where name is the column name and hexagons refers to a GeoPandas GeoDataFrame):
fig = plt.figure(figsize=figsize)
ax = plt.axes(projection=crs)
ax.set_axis_off()
hexagons.to_crs(crs.proj4_init).plot(
    ax=ax,
    column=name,
    legend=legend,
    cmap=cmap,
    legend_kwds=legend_kwds,
    missing_kwds=missing_kwds,
)
ax.set_title(name)
fig.savefig(output_folder + f"/{name}.png", bbox_inches=bbox_inches)
plt.close()
To handle columns that consist only of NaN values, you can add a check before the plotting step to ensure the column isn’t empty or filled with NaN values. If it is, you can either skip the plot for that column or provide a placeholder value. Here’s an updated version of your code that should help:
import matplotlib.pyplot as plt

def plot_with_nan_check(hexagons, name, crs, figsize=(10, 10), legend=False, cmap='viridis',
                        legend_kwds=None, missing_kwds=None, output_folder='./', bbox_inches='tight'):
    # crs is required here because crs.proj4_init is used below
    fig = plt.figure(figsize=figsize)
    ax = plt.axes(projection=crs)
    ax.set_axis_off()
    projected = hexagons.to_crs(crs.proj4_init)
    # Check if the column contains only NaN values
    if projected[name].isna().all():
        print(f"Column '{name}' contains only NaN values.")
        # Passing column=name would raise here, so draw the geometries alone,
        # styled like missing values. "lightgrey" is an arbitrary fallback colour.
        style = {"color": "lightgrey"}
        style.update(missing_kwds or {})
        style.pop("label", None)  # "label" is only meant for the legend entry
        projected.plot(ax=ax, **style)
    else:
        projected.plot(
            ax=ax,
            column=name,
            legend=legend,
            cmap=cmap,
            legend_kwds=legend_kwds,
            missing_kwds=missing_kwds,
        )
    ax.set_title(name)
    fig.savefig(output_folder + f"/{name}.png", bbox_inches=bbox_inches)
    plt.close(fig)
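Since the columns are plotted one by one, it may also help to find the all-NaN columns up front, so they can be reported or styled separately. A minimal sketch using only pandas, assuming the cost columns are ordinary (non-geometry) columns; split_columns_by_missingness and the column names are hypothetical:

```python
import numpy as np
import pandas as pd


def split_columns_by_missingness(df, columns):
    """Return (plottable, all_missing) column name lists, preserving order."""
    # isna().all() is True for a column only if every entry is missing
    all_nan = df[columns].isna().all()
    plottable = [c for c in columns if not all_nan[c]]
    all_missing = [c for c in columns if all_nan[c]]
    return plottable, all_missing


# Made-up data: one usable column, two columns with nothing but missing values
costs = pd.DataFrame(
    {
        "cost_a": [1.0, np.nan],
        "cost_b": [np.nan, np.nan],
        "cost_c": [None, None],
    }
)
plottable, all_missing = split_columns_by_missingness(costs, list(costs.columns))
print("plottable:", plottable)
print("all missing:", all_missing)
```

The same call should work on a GeoDataFrame as long as the geometry column is left out of the list of columns passed in.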