I’m trying to mirror a website using wget but getting a 404 error, even though the site is accessible through a browser.
Command used:
wget --mirror --convert-links --adjust-extension --page-requisites --no-parent --execute robots=off --user-agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36" -P D:\client-sites\maharjanmetal.com.np https://maharjanmetal.com.np/products
Error output:
--2024-10-23 21:12:43--  https://maharjanmetal.com.np/products
Resolving maharjanmetal.com.np (maharjanmetal.com.np)... 149.100.146.116
Connecting to maharjanmetal.com.np (maharjanmetal.com.np)|149.100.146.116|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://maharjanmetal.com.np/products/ [following]
--2024-10-23 21:12:43--  https://maharjanmetal.com.np/products/
Reusing existing connection to maharjanmetal.com.np:443.
HTTP request sent, awaiting response... 404 Not Found
2024-10-23 21:12:44 ERROR 404: Not Found.
What could be causing this 404 error when the site is clearly accessible through a browser? How can I successfully mirror this website using wget?
Environment:
Windows 11
Expected behavior:
wget should download the website content and its assets, creating a local mirror of the site
Actual behavior:
Receiving a 404 error despite the site being accessible through browsers
The command follows a 301 redirect from /products to /products/ but then fails with 404
No files are downloaded
The puzzling part is that the URL is perfectly accessible through browsers but wget consistently gets a 404 error after following the 301 redirect.
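I checked your target URL and confirmed that it returns 404 Not Found, so wget stops there, as it does for any 404 response. A browser still renders whatever body the server sends along with that status, which is probably why the page looks accessible.
If you still want to download this page, use the --content-on-error flag, which tells wget to save the response body even though the server reports a 404 Not Found error.
Example: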
wget --mirror --convert-links --adjust-extension --page-requisites --no-parent --execute robots=off --content-on-error --user-agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36" -P D:client-sitesmaharjanmetal.com.np https://maharjanmetal.com.np/products/
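To confirm what status the server really returns before mirroring, it can help to look at the raw response headers. The following is a minimal sketch, assuming a POSIX shell with awk is available (for example Git Bash or WSL on Windows); `final_status` is a hypothetical helper name, not part of wget. It reads the output of wget's --server-response option and prints the status code of the last response in the redirect chain.

```shell
#!/bin/sh
# final_status: hypothetical helper (not part of wget).
# Reads wget --server-response output on stdin and prints the status
# code of the last HTTP response seen, i.e. the one after any redirects.
final_status() {
    # Status lines look like "  HTTP/1.1 404 Not Found"; field 2 is the code.
    awk '/^ *HTTP\// { code = $2 } END { print code }'
}

# Intended usage (needs network access, so it is left commented out here).
# wget prints the headers on stderr, hence the 2>&1:
#   wget --spider --server-response https://maharjanmetal.com.np/products/ 2>&1 | final_status
```

If this prints 404 for a page that displays fine in a browser, the server is sending content together with an error status, which is the case --content-on-error is meant for.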