I want to find a way to get the contents of a Wikipedia article as of a specific date. For example, I'd like to see an article as it was on January 1, 2022.
- Manually or by scraping: open the page's History, scroll down to the first revision dated before my desired date, and click on it (https://en.wikipedia.org/w/index.php?title=Jupiter&action=history)
- The REST API can retrieve the last 20 revisions (https://en.wikipedia.org/w/rest.php/v1/page/Jupiter/history)
- The REST API can retrieve revisions older than another revision, but this works using a revision ID, not a date (https://en.wikipedia.org/w/rest.php/v1/page/Jupiter/history?older_than=1219856114)
- Theoretically I could get the last 20 revisions and keep asking for the previous 20 until I have gone far enough back in time. In practice this works a few times, but because of rate limits it will not work when browsing many articles within an hour
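
For reference, the paging approach from the last bullet can be sketched roughly as below. This is a minimal sketch, not a tested client. It assumes the history response is JSON with a `revisions` list (newest first, each entry having `id` and `timestamp`) and an `older` URL pointing at the next page. The names `fetch_json`, `parse_ts`, and `revision_before`, the `max_pages` cap, and the User-Agent string are all made up for illustration.

```python
import json
import urllib.parse
import urllib.request
from datetime import datetime, timezone

# History endpoint from the question; {title} is filled in per article.
API = "https://en.wikipedia.org/w/rest.php/v1/page/{title}/history"


def fetch_json(url):
    """Fetch one page of history and decode it as JSON (one HTTP request)."""
    req = urllib.request.Request(
        url,
        # Placeholder User-Agent; replace with something identifying your tool.
        headers={"User-Agent": "revision-by-date-sketch/0.1 (example)"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def parse_ts(ts):
    """Parse a timestamp assumed to look like '2022-01-01T12:34:56Z'."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def revision_before(title, cutoff, fetch=fetch_json, max_pages=50):
    """Return the ID of the newest revision at or before `cutoff`, or None.

    Walks the history one page at a time by following the assumed
    `older` link, so every page of revisions costs one request.
    """
    url = API.format(title=urllib.parse.quote(title, safe=""))
    for _ in range(max_pages):
        page = fetch(url)
        for rev in page["revisions"]:  # assumed to be ordered newest first
            if parse_ts(rev["timestamp"]) <= cutoff:
                return rev["id"]
        url = page.get("older")
        if not url:
            return None  # reached the first revision without a match
    raise RuntimeError("gave up after max_pages requests")
```

Called as `revision_before("Jupiter", datetime(2022, 1, 1, tzinfo=timezone.utc))`, this makes one request per page of revisions, so a frequently edited article can need dozens of requests to get back to 2022. That is where I run into the rate limits.
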
Given that none of these methods works well, what is the correct way to retrieve a Wikipedia article as it was on a given date?