I’m trying to plot some data using matplotlib and am struggling to get the effect I want without manually tuning the tick labels. I have to generate a lot of these plots and the data changes, so tuning each one by hand would be a lot of manual labor and I would prefer to automate it. Basically, I would like the code to do three things:
- Set the ticks so that the min/max values of the data being shown are encompassed (e.g. if the largest data point is 40, then the largest tick should be at least 40)
- Set a variable number of ticks (for my use case I typically have between 3 and 5)
- If appropriate, have one of the tick labels be 0 (it is appropriate when the min of the data is less than 0 and the max is greater than 0)
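Stated as code, the three conditions look roughly like the check below. This is only a sketch to pin down the requirements: the helper name `ticks_ok` and its `min_count`/`max_count` defaults are illustrative assumptions, not part of matplotlib.

```python
import numpy as np

def ticks_ok(ticks, data, min_count=3, max_count=5):
    """Return True if `ticks` meets the three requirements for `data` (hypothetical helper)."""
    ticks = np.asarray(ticks, dtype=float)
    lo, hi = np.nanmin(data), np.nanmax(data)
    # 1. The outermost ticks must encompass the data
    covers = ticks.min() <= lo and ticks.max() >= hi
    # 2. The number of ticks must be in the allowed range
    count_ok = min_count <= ticks.size <= max_count
    # 3. A tick at 0 is only required when the data straddles 0
    needs_zero = lo < 0 < hi
    has_zero = bool(np.any(np.isclose(ticks, 0.0)))
    return bool(covers and count_ok and (has_zero or not needs_zero))
```

For example, `ticks_ok([-10, 0, 10, 20, 30], [-2, 30])` passes all three conditions, while min-to-max ticks from `np.linspace(-2, 30, 5)` fail the third one because none of them lands on 0.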
I have the first two working somewhat well; it’s the third one that is giving me trouble. If I let matplotlib do its thing, I get the following:
import numpy as np
import matplotlib.pyplot as plt

# Make the data
datax = np.arange(0.0, 11.0, 0.1)
datay1 = 10 + np.sin(datax)
datay2 = np.random.normal(0, 10.0, datax.shape)
datay = [datay1, datay2]
# Matplotlib basic
fig, axs = plt.subplots(1,2)
ax = axs.flatten()
for idx in range(len(ax)):
    ax[idx].scatter(datax, datay[idx])
    ax[idx].grid()
plt.tight_layout()
With this I am able to get the 0 tick label to show up on the right plot, but some of the data lies outside the biggest and smallest ticks, and I cannot set the number of ticks. I do like that the 0 tick label is not there on the left plot though, since adding it would unnecessarily squish everything. Making a small modification:
# Matplotlib set the ticks
num_xticks = 3
num_yticks = 5
fig, axs = plt.subplots(1,2)
ax = axs.flatten()
for idx in range(len(ax)):
    ax[idx].scatter(datax, datay[idx])
    # Set the ticks
    ticks_x = np.linspace(np.nanmin(datax), np.nanmax(datax), num_xticks)
    ticks_y = np.linspace(np.nanmin(datay[idx]), np.nanmax(datay[idx]), num_yticks)
    ax[idx].set_xticks(ticks_x)
    ax[idx].set_yticks(ticks_y)
    ax[idx].grid()
plt.tight_layout()
You can see that I can indeed add the ticks and the data is now well encapsulated. But I no longer get a 0 in the right plot, which is a bit awkward because grounding myself to a middle value of 1.54 is much more difficult than 0. I’m not super sure what the best way to resolve this would be, so I would appreciate any help and tips you may have! Thanks!
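One thing that might be worth trying is matplotlib's own `MaxNLocator`, which picks round step sizes. The sketch below calls its `tick_values` method directly on the data range; the wrapper name `nice_covering_ticks` and the sample values are assumptions for illustration. As far as I can tell, the returned ticks extend to or past both ends of the range, and for a range that straddles 0 they are multiples of the step, so 0 should be among them. The catch is that `nbins` is only an upper bound on the number of intervals inside the data range, so the tick count is approximate rather than exact.

```python
import numpy as np
from matplotlib.ticker import MaxNLocator

def nice_covering_ticks(data, max_intervals=4):
    """Round-numbered ticks spanning the data range (hypothetical wrapper)."""
    lo = float(np.nanmin(data))
    hi = float(np.nanmax(data))
    # nbins is the maximum number of intervals MaxNLocator aims for
    locator = MaxNLocator(nbins=max_intervals)
    # tick_values works on a plain (vmin, vmax) pair, no axes needed
    return locator.tick_values(lo, hi)

# Made-up sample values that straddle 0 unevenly
ticks = nice_covering_ticks(np.array([-3.2, 5.0, 27.5]))
```

The result can be passed to `ax.set_yticks(ticks)` in place of the `np.linspace` ticks. Because the outer ticks may fall outside the data, this can produce one or two more ticks than requested.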
Edit:
I think I’ve found a partial solution. I don’t think it’s ideal because it doesn’t account for uneven distributions over the 0 line. For example, with the data I showed before:
...
for idx in range(len(ax)):
    ...
    # For now, just on the y-axis
    if np.min(ticks_y) < 0.0 and np.max(ticks_y) > 0.0:
        value = np.max(np.array([np.abs(np.nanmin(datay[idx])), np.nanmax(datay[idx])]))
        ticks_y = np.linspace(-value, value, num_yticks)
    ...
It works rather well. But if the min of the data is much closer to 0 than the max is, then the plot looks wonky:
datay3 = datay2 + 10.0
Here I’d prefer there to be more ticks on the positive side of the y-axis than on the negative side, to reduce the white space.
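To put a rough number on "wonky", the sketch below measures what fraction of the tick span is not covered by data. The helper name `white_space_fraction` and the sample values are made up for illustration; they are not the data plotted above.

```python
import numpy as np

def white_space_fraction(ticks, data):
    """Fraction of the tick span that contains no data (hypothetical helper)."""
    ticks = np.asarray(ticks, dtype=float)
    tick_span = ticks.max() - ticks.min()
    data_span = np.nanmax(data) - np.nanmin(data)
    return float((tick_span - data_span) / tick_span)

# Made-up data whose minimum is much closer to 0 than its maximum
data = np.array([-2.0, 10.0, 30.0])

# Symmetric ticks, as in the partial solution
value = max(abs(np.nanmin(data)), np.nanmax(data))
symmetric = np.linspace(-value, value, 5)

# Uneven split around 0: one interval below, three above
uneven = np.array([-10.0, 0.0, 10.0, 20.0, 30.0])

symmetric_waste = white_space_fraction(symmetric, data)
uneven_waste = white_space_fraction(uneven, data)
```

With these values the symmetric ticks leave a noticeably larger share of the axis empty than the uneven split, which is the effect described above.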
There may be an easier or better way to do this, but I found that an exhaustive search that minimizes the total white space on the plot at least returns what I want for this one example. We will see how it fares under my actual use cases, but here is the code:
# Data where this isn't pretty
datay3 = datay2 + 10.0
datay = [datay1, datay3]
# Matplotlib set the ticks and make sure we see the 0
num_xticks = 3
num_yticks = 5
fig, axs = plt.subplots(1,2)
ax = axs.flatten()
for idx in range(len(ax)):
    ax[idx].scatter(datax, datay[idx])
    # Set the ticks
    ticks_x = np.linspace(np.nanmin(datax), np.nanmax(datax), num_xticks)
    ticks_y = np.linspace(np.nanmin(datay[idx]), np.nanmax(datay[idx]), num_yticks)
    # For now, just on the y-axis
    if np.min(ticks_y) < 0.0 and np.max(ticks_y) > 0.0:
        # Figure out how many ticks we want in positive and negative
        # We want to minimize the white space
        # Integer counts, leaving at least one tick on the negative side
        possible_positive_tick_count = np.arange(1, (num_yticks-2)+1)  # +1 is to get the end point
        total_white_space = []
        step_size = []
        for positive_tick_count in possible_positive_tick_count:
            # Need to figure out the biggest step size we need to accommodate since we want
            # all the ticks to be spaced evenly
            positive_step_size = np.nanmax(datay[idx]) / positive_tick_count
            negative_tick_count = num_yticks - 1 - positive_tick_count
            negative_step_size = np.abs(np.nanmin(datay[idx])) / negative_tick_count
            curr_step_size = max(positive_step_size, negative_step_size)
            # Get the white space that we would have from this step size
            positive_max_tick = curr_step_size * positive_tick_count
            negative_max_tick = curr_step_size * negative_tick_count
            total_white_space.append(
                abs(positive_max_tick) - abs(np.nanmax(datay[idx])) + abs(negative_max_tick) - abs(np.nanmin(datay[idx]))
            )
            step_size.append(curr_step_size)
        # Get the minimum
        idx_chosen = total_white_space.index(min(total_white_space))
        chosen_step_size = step_size[idx_chosen]
        positive_tick_count = possible_positive_tick_count[idx_chosen]
        negative_tick_count = num_yticks - 1 - positive_tick_count
        # Create the ticks based on this. Integer multiples of the step avoid the
        # extra or missing end point that np.arange can produce with a float step
        ticks_y = chosen_step_size * np.arange(-negative_tick_count, positive_tick_count + 1)
    ax[idx].set_xticks(ticks_x)
    ax[idx].set_yticks(ticks_y)
    ax[idx].grid()
plt.tight_layout()
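The search above can probably be condensed. With a common step size s, p intervals above 0 and n below, the total white space is s*p - max + s*n - |min| = s*(num_yticks - 1) - (max + |min|). Only s depends on the split, so minimizing the white space is the same as picking the split with the smallest step. Below is a minimal sketch of that idea as a reusable function; the name `zero_anchored_ticks` and the fallback for data that does not cross 0 are assumptions for illustration.

```python
import numpy as np

def zero_anchored_ticks(data, num_ticks):
    """Evenly spaced ticks that include 0 and cover the data (hypothetical helper)."""
    lo = float(np.nanmin(data))
    hi = float(np.nanmax(data))
    if not (lo < 0.0 < hi):
        # Data does not straddle 0: fall back to plain min-to-max spacing
        return np.linspace(lo, hi, num_ticks)
    if num_ticks < 3:
        raise ValueError("need at least 3 ticks to have one on each side of 0")
    intervals = num_ticks - 1
    n_pos = np.arange(1, intervals)   # candidate interval counts above 0
    n_neg = intervals - n_pos         # remaining intervals below 0
    # Step needed so that both the max and the min are covered
    steps = np.maximum(hi / n_pos, -lo / n_neg)
    # Smallest step gives the least white space
    best = int(np.argmin(steps))
    return steps[best] * np.arange(-n_neg[best], n_pos[best] + 1)

# Made-up values: min close to 0, max far from it
example_ticks = zero_anchored_ticks(np.array([-2.0, 10.0, 30.0]), 5)
```

In the plotting loop this would replace the whole `if` block with `ticks_y = zero_anchored_ticks(datay[idx], num_yticks)`. The step sizes it returns are not rounded, so the tick labels can still be awkward numbers; rounding the step up to a nicer value would be a separate refinement.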