I am programming in PHP and I have an include file that I have to, well, include, as part of my script. The include file is 400 MB, and it contains an array of objects which are nothing more than configuration settings for a larger project. Here is an example of the contents:
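$obj = new obj();
$obj->name = "myname";
// .... (more property assignments)
$objs[$obj->name] = $obj; // store the object itself, keyed by its name
// {the block above repeats for each object}
// ....
return $objs;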
This block repeats 40,000 times, and the file ends up 650,000 lines long (it is generated dynamically, of course).
If I am simply trying to load a file that is 400 MB, why would memory usage increase to 6 GB? Even if the whole file is loaded into memory, wouldn’t it only take up 400 MB of RAM?
There’s not just the file. There’s the parse tree that gets generated from parsing the file, the bytecode it gets compiled to, the memory taken up by the variables defined in it, etc. 6GB still sounds like a lot, but for a 650,000 line file it doesn’t surprise me much.
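To see the gap for yourself, here is a rough sketch that generates a scaled-down version of such a file, includes it, and compares the file size with what `memory_get_usage()` and `memory_get_peak_usage()` report. The helper `writeSampleInclude()` is made up for this example, and `stdClass` stands in for the `obj` class from the question, so treat the numbers it prints as illustrative only.

```php
<?php
// Hypothetical helper: writes a small file shaped like the one in the question.
// stdClass is used as a stand-in for the question's own obj class.
function writeSampleInclude(string $path, int $count): void
{
    $code = "<?php\n\$objs = array();\n";
    for ($i = 0; $i < $count; $i++) {
        $code .= "\$obj = new stdClass();\n";
        $code .= "\$obj->name = \"name{$i}\";\n";
        $code .= "\$objs[\$obj->name] = \$obj;\n";
    }
    $code .= "return \$objs;\n";
    file_put_contents($path, $code);
}

$path = tempnam(sys_get_temp_dir(), 'inc');
writeSampleInclude($path, 1000);
$fileSize = filesize($path);

// Measure memory held before and after the include.
$before = memory_get_usage();
$objs   = include $path;          // include hands back the file's return value
$after  = memory_get_usage();
$peak   = memory_get_peak_usage(); // includes memory used while compiling the file

unlink($path);

printf("file size:        %d bytes\n", $fileSize);
printf("retained by data: %d bytes\n", $after - $before);
printf("peak usage:       %d bytes\n", $peak);
```

The "retained" figure covers only the objects and the array that survive the include; the peak figure also reflects the temporary structures built while the file was being compiled.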
At some point along the way, someone should probably have thought of using a database. 🙂 Unless you’re using every item in that file every time, loading half a million lines of stuff to pull out a few things is incredibly wasteful.
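As a minimal sketch of the database idea, assuming the `pdo_sqlite` extension is available: the settings live in a table and only the one you need is fetched. The table name `settings`, its columns, and the function `loadSetting()` are all invented for illustration, and an in-memory database is used only to keep the example self-contained.

```php
<?php
// Assumption: pdo_sqlite is installed. Table and column names are hypothetical.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE settings (name TEXT PRIMARY KEY, value TEXT)');

// Populate some sample rows inside one transaction.
$insert = $db->prepare('INSERT INTO settings (name, value) VALUES (?, ?)');
$db->beginTransaction();
for ($i = 0; $i < 1000; $i++) {
    $insert->execute(["name{$i}", "value{$i}"]);
}
$db->commit();

// Hypothetical helper: fetch a single setting instead of loading everything.
function loadSetting(PDO $db, string $name): ?string
{
    $stmt = $db->prepare('SELECT value FROM settings WHERE name = ?');
    $stmt->execute([$name]);
    $value = $stmt->fetchColumn(); // false when no row matches
    return $value === false ? null : $value;
}

echo loadSetting($db, 'name42'), "\n";
```

With a real project you would point PDO at a file or server instead of `:memory:`, so the script only pays for the rows it actually reads.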
File size is the size of the code that is executed.
Memory usage is the size of the contents of all the variables created by the execution of the code.
An algorithm computing pi may take fewer than 10 lines of code, yet need billions of bytes just to store the digits of the result.
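A quick way to convince yourself of this is a script of a few lines that allocates far more memory than its own source occupies. This is only a sketch; the element count and string length are arbitrary choices.

```php
<?php
$before = memory_get_usage();

// A handful of lines of source, but 100,000 separate 100-byte strings in memory.
$values = array();
for ($i = 0; $i < 100000; $i++) {
    $values[] = str_repeat('x', 100);
}

$after = memory_get_usage();

printf("elements stored: %d\n", count($values));
printf("memory used:     %d bytes\n", $after - $before);
```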
A script is typically a fixed size. It is read in and effectively executed line by line. This is your file size.
While the script is running, it can say to the computer “store this information which I have generated for me”, and the computer dutifully gives it a bit of memory. This is your memory size.
Your script can ask repeatedly for more and more places to store the information it has generated, often storing the same or similar things in many different places. This can take up many megabytes, and even gigabytes. The amount of storage (in memory) is almost totally independent of script size, although the same script will normally use the same amount each time it is run.