Yet another question about what I guess is a fairly common error, but I can neither understand it nor find a correct answer.
Here is the data frame I have:
import pandas as pd
df = pd.DataFrame({
    'author': ['someguy', 'someone', 'again'],
    'created_utc': ['2021-01-30 18:00:38', '2021-01-28 13:40:34', '2021-01-28 21:06:23'],
    'score': ['417276', '317384', '282358']
})
And this is the code I can’t get working:
import matplotlib.pyplot as plt
import numpy as np
# Assuming coms_per_month is your first dataset and best_score is the top scoring authors
plt.figure(figsize=(16, 12))
# Plot the first dataset (comments per month)
ax1 = plt.gca()
coms_per_month.plot(kind='bar', ax=ax1, color='blue', alpha=0.7) # Primary y-axis
plt.title('Frequency Distribution of Comments per Month', fontsize=20)
plt.xlabel('Date', fontsize=20)
plt.ylabel('Number of Comments (Primary)', fontsize=20)
plt.grid(True)
# Create a secondary y-axis
ax2 = ax1.twinx()
top_submissions = main_submissions.nlargest(10, 'score')
top_authors = top_submissions[['author', 'created_utc', 'score']]
top_authors['log_score'] = np.log(top_authors['score'])
plt.scatter(top_authors['created_utc'], top_authors['log_score'], c='red', s=100, label='Top Authors')
# Plot the names of the top scoring authors at the positions of their highest scores
for idx, row in top_authors.iterrows():
    ax2.text(row['created_utc'], row['log_score'], s=row['author'])
plt.tight_layout()
plt.savefig('results/Elites Influence.pdf')
plt.show()
I can’t reproduce the other data set, but I reckon it’s not the one causing the issue, as I’m able to produce the plot with it. The only thing I’m missing is the second y-axis, where I want to plot each author’s name at the coordinates x=top_authors['created_utc'] and y=top_authors['score'].
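One thing that may matter for those coordinates: pandas bar plots draw their bars at integer positions 0, 1, 2, ... rather than at date values, so date-valued x coordinates on the shared axis may land far away from the bars. Below is a minimal sketch of mapping each created_utc onto a bar position. It assumes coms_per_month is a Series indexed by monthly periods; the stand-in values here are hypothetical, since the real data set isn't shown.

```python
import pandas as pd

# Hypothetical stand-in for coms_per_month (assumption: indexed by monthly periods).
coms_per_month = pd.Series(
    [120, 340, 95],
    index=pd.period_range("2020-12", periods=3, freq="M"),
)

# Sample frame from the question.
top_authors = pd.DataFrame({
    "author": ["someguy", "someone", "again"],
    "created_utc": ["2021-01-30 18:00:38", "2021-01-28 13:40:34", "2021-01-28 21:06:23"],
    "score": ["417276", "317384", "282358"],
})

# Convert the timestamp strings to monthly periods ...
months = pd.to_datetime(top_authors["created_utc"]).dt.to_period("M")

# ... and look up the integer bar position of each month in the bar plot's index.
# get_indexer returns -1 for months that have no bar.
top_authors["bar_pos"] = coms_per_month.index.get_indexer(pd.PeriodIndex(months))
```

With such a column, bar_pos rather than created_utc would be the x value passed to ax2.scatter and ax2.text. Whether this removes the error in my case is untested.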
I either get a plot with the main data set plotted correctly, but without the second y-axis and its corresponding values (author, etc.), or I get the following error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/IPython/core/formatters.py in __call__(self, obj)
339 pass
340 else:
--> 341 return printer(obj)
342 # Finally look for special method names
343 method = get_real_method(obj, self.print_method)
8 frames
/usr/local/lib/python3.10/dist-packages/matplotlib/backends/backend_agg.py in __init__(self, width, height, dpi)
82 self.width = width
83 self.height = height
---> 84 self._renderer = _RendererAgg(int(width), int(height), dpi)
85 self._filter_renderers = []
86
ValueError: Image size of 891146x450 pixels is too large. It must be less than 2^16 in each direction.
<Figure size 800x400 with 2 Axes>
That is why I tried taking the log of the score field, but the issue remains the same.
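For reference, in the sample frame above both score and created_utc are stored as strings, so nlargest and np.log would need a type conversion first. A minimal sketch of that conversion on the sample frame follows. It uses n=2 instead of 10 only because the sample has three rows, and the .copy() is added to avoid assigning into a slice. This covers only the dtype side and is not claimed to fix the image size error.

```python
import numpy as np
import pandas as pd

# Sample frame from the question (all columns are strings).
df = pd.DataFrame({
    "author": ["someguy", "someone", "again"],
    "created_utc": ["2021-01-30 18:00:38", "2021-01-28 13:40:34", "2021-01-28 21:06:23"],
    "score": ["417276", "317384", "282358"],
})

# Convert to proper dtypes before ranking or taking logs.
df["score"] = pd.to_numeric(df["score"])
df["created_utc"] = pd.to_datetime(df["created_utc"])

# Take the top rows by score and work on an explicit copy.
top_authors = df.nlargest(2, "score")[["author", "created_utc", "score"]].copy()
top_authors["log_score"] = np.log(top_authors["score"])
```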
I’m surely missing something, but I can’t find what it is. I even asked ChatGPT and Claude, without success.