I serve a few pages from Apache (v. 2.4.38) and have given the paths to those pages to only one person. He says he has not shared the paths and does not use a proxy server. For the most part, the Apache access log shows requests for those pages only from that one person’s (static) IP (together with the usual denied requests probing for random known weaknesses).
But recently the log shows some requests for a few files on those pages from other IPs. For example, I see:
54.201.227.182 - - [07/Aug/2024:14:34:32 -0700] "GET /local/path/aurora.mkv HTTP/1.1" 200 5025320 "-" "Mozilla/5.0 (Macintosh; PPC Mac OS X 10.10; rv:51.0) Gecko/20100101 Firefox/51.0"
35.167.184.80 - - [07/Aug/2024:14:34:34 -0700] "GET /local/path/aurora.mkv HTTP/1.1" 200 73572232 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:49.0) Gecko/20100101 Firefox/49.0"
Here I’ve obfuscated my local path and filename, but not the originating IPs. The requested files actually exist on my server. The requesting IPs always resolve to something like ec2-IP-address-in-reverse.us-west-2.compute.amazonaws.com.
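For anyone who wants to pull the same kind of entries out of their own log, here is a minimal Python sketch of how the lines above can be filtered. It assumes the standard Apache "combined" log format. The names `KNOWN_IP`, `unexpected_requests`, and `reverse_name` are hypothetical, and `203.0.113.7` is a documentation-range placeholder, not my correspondent's real address.

```python
import re
import socket
from collections import defaultdict

# Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

# Hypothetical placeholder for the one static IP that is expected.
KNOWN_IP = "203.0.113.7"


def unexpected_requests(lines, known_ip=KNOWN_IP):
    """Group successful (status 200) requests by client IP, skipping known_ip."""
    hits = defaultdict(list)
    for line in lines:
        m = LOG_PATTERN.match(line.strip())
        if m is None:
            continue  # not a combined-format line
        if m["ip"] == known_ip or m["status"] != "200":
            continue
        hits[m["ip"]].append((m["time"], m["path"], m["agent"]))
    return dict(hits)


def reverse_name(ip):
    """Return the reverse-DNS name for ip, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


# The two log lines quoted above.
SAMPLE = [
    '54.201.227.182 - - [07/Aug/2024:14:34:32 -0700] "GET /local/path/aurora.mkv HTTP/1.1" 200 5025320 "-" "Mozilla/5.0 (Macintosh; PPC Mac OS X 10.10; rv:51.0) Gecko/20100101 Firefox/51.0"',
    '35.167.184.80 - - [07/Aug/2024:14:34:34 -0700] "GET /local/path/aurora.mkv HTTP/1.1" 200 73572232 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:49.0) Gecko/20100101 Firefox/49.0"',
]

hits = unexpected_requests(SAMPLE)
```

Calling `reverse_name(ip)` on each key of `hits` performs the reverse lookup I did by hand; it needs network access, so it is left out of the sample run.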
My Apache is configured to not allow proxy requests, no other pages link to these paths, and my robots.txt file is:
User-agent: *
Disallow: /
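As a sanity check that these two lines say what I intend, here is a small sketch using Python's standard `urllib.robotparser`. The user-agent string and the path are just the obfuscated values from the log above. It only shows what a crawler that honors robots.txt would conclude; the file is advisory, so it does not by itself stop a client that ignores it.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content exactly as served.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Would a compliant crawler consider itself allowed to fetch this path?
allowed = parser.can_fetch("Mozilla/5.0", "/local/path/aurora.mkv")
print("allowed:", allowed)
```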
I am not being flooded with these mysterious requests, but I am very curious about where they come from, how the originator learned my local paths, and whether I should be concerned. Could this be some sort of caching process? I only serve HTTP, not HTTPS; could someone be sniffing the URLs, as unlikely as that sounds?