I am working with a dataset and want to relate two variables while using a third variable as the weight in a NumPy 2D histogram. However, the output histogram is very noisy, and I am wondering whether there is a way to filter out bins that contain fewer than, say, 5 values. The first part of the code below is just a placeholder for the variables I would be inputting; the latter part produces the histogram itself.
Any help would be appreciated.
import numpy as np
import matplotlib.pyplot as plt

# Placeholder inputs; the real data go here
x1 = []
y1 = []
z1 = []
bins = 40
fig, axs = plt.subplots(1, 1, figsize=(10, 10), constrained_layout=True)
sums, xbins, ybins = np.histogram2d(x1, y1, bins=bins, weights=z1)
counts, _, _ = np.histogram2d(x1, y1, bins=bins)
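# Suppress divide-by-zero warnings for empty bins (0 / 0 gives NaN)
with np.errstate(divide='ignore', invalid='ignore'):
    # histogram2d puts x along the first axis, so transpose for pcolormesh
    img = axs.pcolormesh(xbins, ybins, (sums / counts).T, cmap='inferno')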
fig.colorbar(img, ax=axs, label='z')
axs.set(xlabel='x', ylabel='y', title='weighted by z')
plt.show()
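
One possible way to do the filtering asked about above is to reuse the unweighted `counts` array as a mask and set every bin below the threshold to NaN before plotting. The following is only a sketch under assumptions: the synthetic `x1`, `y1`, `z1` arrays and the names `min_count` and `mean_z` are hypothetical stand-ins, not part of the original code.

```python
import numpy as np

# Hypothetical synthetic data standing in for x1, y1, z1 (assumption, not real data)
rng = np.random.default_rng(0)
x1 = rng.normal(size=2000)
y1 = rng.normal(size=2000)
z1 = x1 + y1 + rng.normal(scale=0.1, size=2000)

bins = 40
min_count = 5  # assumed threshold: bins with fewer samples are blanked out

# Sum of z per bin, and number of samples per bin
sums, xbins, ybins = np.histogram2d(x1, y1, bins=bins, weights=z1)
counts, _, _ = np.histogram2d(x1, y1, bins=bins)

# Mean of z per bin; bins with fewer than min_count samples become NaN
with np.errstate(divide='ignore', invalid='ignore'):
    mean_z = np.where(counts >= min_count, sums / counts, np.nan)
```

The masked array can then be passed to the plotting call in place of `sums / counts`, for example `axs.pcolormesh(xbins, ybins, mean_z.T, cmap='inferno')`. The transpose is there because `np.histogram2d` bins x along the first axis, while `pcolormesh` expects rows to correspond to y. NaN bins should simply be left undrawn.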