I have Scrapy set up with Playwright. The scrape is rather small in scale: even running at CONCURRENT_REQUESTS = 1, it completes within a reasonable amount of time.
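For context, the configuration is the usual scrapy-playwright setup, roughly along these lines (a trimmed sketch; the launch options shown are illustrative rather than an exact copy of my settings):

```python
# settings.py -- sketch of a typical scrapy-playwright setup at this scale.
# Launch options are illustrative, not an exact copy of my config.

CONCURRENT_REQUESTS = 1

# Route HTTP(S) requests through the Playwright download handler.
DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}

# scrapy-playwright requires the asyncio-based Twisted reactor.
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"

# Headless Chromium; the browser runs in its own child processes,
# separate from the Scrapy (Python) process.
PLAYWRIGHT_BROWSER_TYPE = "chromium"
PLAYWRIGHT_LAUNCH_OPTIONS = {"headless": True}
```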
The scrape takes up a decent amount of memory on a Heroku dyno: easily over 1 GB after running for a handful of minutes, which is more than the Basic dyno size can handle.
However, running the same spider locally seems to require far less memory at peak: in the scraping stats I see 'memusage/max': 205668352, which I take to be roughly 205 MB.
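As far as I understand, that stat comes from Scrapy's MemoryUsage extension and is reported in bytes, so the conversion (just to double-check my reading of the number) is:

```python
>>> 205668352 / 10**6   # decimal megabytes
205.668352
>>> 205668352 / 2**20   # mebibytes (MiB)
196.140625
```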
Granted, my laptop is much more powerful than the dyno, but what could account for such a difference in memory usage?