I have simulations that evaluate a certain value X. I run the simulations several times and save the value of X in a vector V. When all the runs have finished I evaluate the mean and standard deviation for the vector V.
This approach works, but it requires storing every value of X. As my computer is quite old and has limited RAM, I was wondering if there is a way to update the mean value M and the standard deviation S, knowing the value of X at the (n+1)-th run and the values of M and S after n runs.
How can I update the mean value and the standard deviation as simulations are added to the set?
Please note that this is just a conceptual example: I don't save only one number X but thousands in each simulation, so I really have problems doing a large number of runs if I have to keep all the past values in memory.
For the mean, it is enough to save the sum and a counter of how many values you added. For the standard deviation, you need an online algorithm like this one:
http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Online_algorithm
Or here:
https://stackoverflow.com/questions/11978667/online-algorithm-for-calculating-standard-deviation
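As a rough illustration of the kind of online update those links describe, here is a minimal Python sketch of Welford's method. The class name `RunningStats`, its attribute names, and the `ddof` parameter are my own choices for this example, not something taken from the question. It keeps only three numbers per tracked quantity, no matter how many runs are added.

```python
import math


class RunningStats:
    """Running mean and standard deviation via Welford's update.

    Hypothetical helper for illustration: stores only the count, the
    current mean, and the sum of squared deviations from the mean.
    """

    def __init__(self):
        self.n = 0        # number of values seen so far
        self.mean = 0.0   # running mean M
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one new value x (the result of run n+1) into the state."""
        self.n += 1
        delta = x - self.mean             # deviation from the old mean
        self.mean += delta / self.n       # new mean
        self.m2 += delta * (x - self.mean)  # uses both old and new mean

    def variance(self, ddof=1):
        """Variance; ddof=1 gives the sample variance, ddof=0 the population one."""
        if self.n <= ddof:
            return float("nan")
        return self.m2 / (self.n - ddof)

    def std(self, ddof=1):
        return math.sqrt(self.variance(ddof))


stats = RunningStats()
for x in [2, 4, 4, 4, 5, 5, 7, 9]:   # stand-in for the simulation results
    stats.update(x)
print(stats.mean, stats.std(ddof=0))
```

If each run produces thousands of values, the same update can be applied independently to each of them, so the memory needed stays proportional to the number of tracked quantities rather than to the number of runs.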
Old-time calculators with statistics functions stored only three quantities: the number of items, the sum of the items, and the sum of the squares of the items.
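- mean is the sum of the items divided by their number: $M = \frac{1}{n}\sum_{i=1}^{n} x_i$
- standard deviation is the square root of the variance: $S = \sqrt{\mathrm{Var}}$
- variance is obtained by subtracting the square of the mean from the mean of the squares: $\mathrm{Var} = \frac{1}{n}\sum_{i=1}^{n} x_i^2 - M^2$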
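The three-accumulator approach in the list above could be sketched in Python roughly as follows. The function names `add_value` and `stats_from_sums` are hypothetical, chosen only for this example. Note that this gives the population variance (dividing by n), and that subtracting two large, nearly equal numbers can lose precision when the mean is large compared to the spread, which is why the result is clamped at zero here.

```python
import math


def add_value(state, x):
    """Return the updated (count, sum, sum of squares) after seeing x."""
    n, total, total_sq = state
    return n + 1, total + x, total_sq + x * x


def stats_from_sums(state):
    """Mean and population standard deviation from the three accumulators."""
    n, total, total_sq = state
    mean = total / n
    # Mean of the squares minus the square of the mean.
    variance = total_sq / n - mean * mean
    # Rounding can push a tiny variance slightly below zero; clamp it.
    variance = max(variance, 0.0)
    return mean, math.sqrt(variance)


state = (0, 0.0, 0.0)
for x in [2, 4, 4, 4, 5, 5, 7, 9]:   # stand-in for the simulation results
    state = add_value(state, x)
print(stats_from_sums(state))
```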