I am a .NET developer who has recently started working in a LAMP environment. I know that if I go to www.somedomain.com/files/test.php, then (1) DNS resolves the hostname to my server, (2) my server handles the request on a given port, and (3) the server looks in /files/test.php, somehow runs test.php, and returns the output of the file to the client.
But it would be really great to understand this process in much more detail. For instance, does Apache/nginx actually run the php file or does it pass it to the php interpreter? Does every php file run every time or does the server cache its output? It would be really helpful to know the major details/decisions that a LAMP environment makes during this process. Kind of like this answer, which explains in detail how SSL works…
https://security.stackexchange.com/questions/20803/how-does-ssl-work/20847#20847
What exactly happens depends a lot on the needs and configuration of the individual application or server.
For instance, does Apache/nginx actually run the php file or does it pass it to the php interpreter?
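Both are possible. With Apache, the PHP interpreter is typically embedded in the server process as a module (mod_php) for performance reasons. nginx has no embedded interpreter; it forwards the request over FastCGI to PHP-FPM, a separate pool of long-running PHP worker processes. In either case, PHP does not interpret the source text directly: it compiles the file to bytecode and executes that. An opcode cache (historically APC, OPcache in current PHP versions) keeps the compiled bytecode in memory so that the file does not have to be recompiled on every request.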
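To make the bytecode cache idea concrete, here is a minimal sketch in Python rather than PHP. It assumes Python's built-in compile() as a stand-in for PHP's compiler, and the names load_compiled, run_script and _bytecode_cache are made up for illustration; a real opcode cache lives inside the PHP runtime and works differently in detail.

```python
import os
import tempfile

# Hypothetical cache: script path -> (file mtime at compile time, code object).
_bytecode_cache = {}
compile_count = 0  # counts real compilations, for demonstration only


def load_compiled(path):
    """Return bytecode for path, recompiling only if the file has changed."""
    global compile_count
    mtime = os.stat(path).st_mtime_ns
    cached = _bytecode_cache.get(path)
    if cached is not None and cached[0] == mtime:
        return cached[1]  # cache hit: skip reading, parsing and compiling
    with open(path, encoding="utf-8") as f:
        source = f.read()
    code = compile(source, path, "exec")  # source text -> bytecode
    compile_count += 1
    _bytecode_cache[path] = (mtime, code)
    return code


def run_script(path):
    """Execute the script in a fresh namespace, as happens once per request."""
    namespace = {}
    exec(load_compiled(path), namespace)
    return namespace


# Demo: a throwaway script stands in for test.php.
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
    f.write("result = 6 * 7\n")
    script_path = f.name

first = run_script(script_path)["result"]
second = run_script(script_path)["result"]
os.unlink(script_path)
```

Note that the script still runs on both "requests"; only the compile step is skipped the second time. Caching bytecode and caching output are separate things.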
Does every php file run every time or does the server cache its output?
By default the script runs on every request and nothing caches its output; anything beyond that is very much application-specific. High-traffic websites have to employ aggressive caching strategies and, wherever possible, will cache fully rendered HTML pages in a reverse proxy placed in front of the LAMP server, so the LAMP server itself does not even get involved in most requests.
Within the LAMP environment, it is also common to use memcached to cache partially rendered HTML fragments or the results of SQL queries; typically this requires manual tuning in the code or (if supported via some framework) configuration.
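The pattern usually looks like the sketch below: check the cache, render only on a miss, then store the result with an expiry. This is written in Python with a small in-memory class standing in for a memcached client; TTLCache, render_sidebar and sidebar_html are hypothetical names, not part of any real library.

```python
import time


class TTLCache:
    """In-memory stand-in for a memcached client (hypothetical, illustration only)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # entry has expired, treat it as a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, time.monotonic() + ttl_seconds)


cache = TTLCache()
render_calls = 0  # counts real renders, for demonstration only


def render_sidebar(user_id):
    """Hypothetical expensive step: imagine SQL queries plus templating here."""
    global render_calls
    render_calls += 1
    return f"<ul><li>Recent posts for user {user_id}</li></ul>"


def sidebar_html(user_id):
    """Look in the cache first and fall back to rendering on a miss."""
    key = f"sidebar:{user_id}"
    html = cache.get(key)
    if html is None:
        html = render_sidebar(user_id)
        cache.set(key, html, ttl_seconds=60)
    return html


first = sidebar_html(7)
second = sidebar_html(7)  # served from the cache, no second render
```

Choosing the cache key and the expiry time is the manual tuning part: too short an expiry saves little work, too long and users see stale fragments.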
To understand what's going on, it's best to look at the Common Gateway Interface (CGI). For efficiency reasons almost no one uses it anymore, but the modern substitutes (mod_php, FastCGI) were designed to be backwards compatible with it.
Basically, the web server receives an HTTP request, figures out by looking at the URL that it's supposed to run a PHP script to satisfy it, then runs the PHP interpreter on that script with a set of environment variables describing the request (method, query string, and so on). The output of the PHP interpreter is sent back to the HTTP client.
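The CGI round trip described above can be sketched in a few lines. This is a rough illustration in Python, not how Apache is implemented: handle_request is a made-up helper, and a tiny Python script stands in for test.php so the example runs without PHP installed. The variable names GATEWAY_INTERFACE, REQUEST_METHOD, QUERY_STRING and SCRIPT_NAME are ones defined by the CGI specification.

```python
import os
import subprocess
import sys
import tempfile

# Stand-in for test.php: a CGI script reads request data from environment
# variables and writes headers, a blank line, then the body to stdout.
SCRIPT = '''\
import os
print("Content-Type: text/plain")
print()
print("method=" + os.environ["REQUEST_METHOD"])
print("query=" + os.environ["QUERY_STRING"])
'''


def handle_request(script_path, method, url_path, query_string):
    """Do roughly what a CGI-style web server does for a single request."""
    env = dict(os.environ)
    env["GATEWAY_INTERFACE"] = "CGI/1.1"
    env["REQUEST_METHOD"] = method
    env["SCRIPT_NAME"] = url_path
    env["QUERY_STRING"] = query_string
    # One new interpreter process per request: this is the cost that
    # mod_php and FastCGI were introduced to avoid.
    completed = subprocess.run(
        [sys.executable, script_path],
        env=env, capture_output=True, text=True, check=True,
    )
    # The first blank line separates the response headers from the body.
    headers, _, body = completed.stdout.partition("\n\n")
    return headers, body


with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
    f.write(SCRIPT)
    script_path = f.name

headers, body = handle_request(script_path, "GET", "/files/test.php", "id=42")
os.unlink(script_path)
```

A real server would then turn the header block into HTTP response headers and stream the body to the client.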