I have recently been trying to use scrapy-splash to scrape data from a website that loads more results when you scroll to the bottom. Website: https://www.openrice.com/zh/hongkong/restaurants/district/%E5%B0%96%E6%B2%99%E5%92%80
So I first tried scrolling from the dev tools console using JavaScript:
    for (let i = 0; i < 5; i++) {
        setTimeout(() => window.scrollTo(0, (document.body.scrollHeight - 1500)), i * 2000);
    }
(I set the y-coordinate to (document.body.scrollHeight - 1500) because the page does not load more results when scrolled to the very end, so the position needs to be a bit higher.)
The JS code works perfectly when I run it in the browser's dev tools, so I put it into a scrapy-splash Lua script. Here is my code:
    import scrapy
    from scrapy_splash import SplashRequest

    from Task2.items import RestaurantItem

    lua_script = """
    function main(splash, args)
        splash:go(args.url)
        splash:wait(5.0)
        local scroll = splash:jsfunc([[
            function scrollWithDelay() {
                for (let i = 0; i < 5; i++) {
                    setTimeout(() => window.scrollTo(0, (document.body.scrollHeight-1500)), i * 2000);
                }
            }
        ]])
        scroll()
        splash:wait(5.0)
        return {html = splash:html()}
    end
    """


    class OpenriceTstSpider(scrapy.Spider):
        name = "openrice_tst"
        allowed_domains = ["www.openrice.com"]

        def start_requests(self):
            url = "https://www.openrice.com/zh/hongkong/restaurants/district/%E5%B0%96%E6%B2%99%E5%92%80"
            yield SplashRequest(url, callback=self.parse, endpoint='execute',
                                args={'wait': 2, 'lua_source': lua_script, 'viewport': '1920x1080',
                                      'url': url})

        def parse(self, response):
            ...
I am only getting the first 20 search results from the website. More results load when I keep scrolling down in a browser, but these 20 are the ones that were already there before any JS ran.
So I want to know whether I missed a detail or wrote something wrong in my code. I do get some results, but none of them come from the scrolling. The JS works perfectly in my browser, so the problem is either that scrapy-splash is not working or that my Lua script is wrong. I have been searching for 2 days and still cannot find a solution, so I would be very grateful for any help!
Also, is it possible to force the page to show all results without scrolling? Thank you so much to everyone who is willing to help.
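One detail in the script above that may be worth checking: the five setTimeout calls are scheduled at 0, 2, 4, 6 and 8 seconds, but the Lua script only waits 5 seconds before returning splash:html(), so the last two scrolls would not have run when the snapshot is taken. A minimal sketch of an alternative, assuming that is the cause, moves the loop into Lua so that splash:wait() pauses between scrolls. The names scroll_count, scroll_pause, initial_wait, build_splash_args and total_wait_seconds are made up for illustration and are not part of scrapy-splash; this has not been run against the site.

```python
# Sketch only: the scroll loop lives in Lua so that splash:wait() blocks
# between scrolls, instead of relying on setTimeout timers inside the page.
LUA_SCROLL_SCRIPT = """
function main(splash, args)
    assert(splash:go(args.url))
    assert(splash:wait(args.initial_wait))
    for _ = 1, args.scroll_count do
        -- same offset as the console snippet: stop 1500px above the bottom
        splash:runjs("window.scrollTo(0, document.body.scrollHeight - 1500);")
        assert(splash:wait(args.scroll_pause))
    end
    return {html = splash:html()}
end
"""


def build_splash_args(scroll_count=5, scroll_pause=2.0, initial_wait=5.0):
    """Build the args dict for SplashRequest(..., endpoint='execute').

    The extra keys are hypothetical names; the Lua script reads them
    back as args.scroll_count, args.scroll_pause and args.initial_wait.
    """
    if scroll_count < 1:
        raise ValueError("scroll_count must be at least 1")
    return {
        "lua_source": LUA_SCROLL_SCRIPT,
        "scroll_count": scroll_count,
        "scroll_pause": scroll_pause,
        "initial_wait": initial_wait,
    }


def total_wait_seconds(splash_args):
    """Sum of all splash:wait() calls, to compare against the Splash timeout."""
    return (splash_args["initial_wait"]
            + splash_args["scroll_count"] * splash_args["scroll_pause"])


# Intended use inside start_requests (not executed here):
#   yield SplashRequest(url, callback=self.parse, endpoint='execute',
#                       args=build_splash_args())
```

With the defaults the script waits 15 seconds in total. If more scrolls are needed, the total should stay under the Splash render timeout, which I believe defaults to 30 seconds and can be raised through the 'timeout' argument up to the server's configured maximum.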