Rolling variance algorithm

I’ve run across this problem as well. There are some great posts out there on computing the running cumulative variance, such as John D. Cook’s Accurately computing running variance and the Digital explorations post, Python code for computing sample and population variances, covariance and correlation coefficient. I just could not find any that were adapted to a rolling window.

The Running Standard Deviations post by Subluminal Messages was critical in getting the rolling window formula to work. Jim tracks the Power Sum Average (PSA), which is the running average of the squared values, whereas Welford’s approach tracks the sum of the squared differences from the mean. The formula is as follows:

PSA today = PSA yesterday + ((x today * x today) – PSA yesterday) / n

x = value in your time series

n = number of values you’ve analyzed so far.
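
As a rough sketch of the cumulative (non-windowed) version in Python: each step nudges the average of the squared values by (x * x – PSA) / n. The running mean update and the function name running_variance are my own assumptions for illustration, not something taken from the posts mentioned above.

```python
def running_variance(values):
    """Yield (population_var, sample_var) after each value.

    Uses a cumulative power sum average (PSA) of x*x together with an
    incrementally updated mean (the mean update is an assumption here).
    """
    psa = 0.0   # running average of x * x
    mean = 0.0  # running average of x
    for n, x in enumerate(values, start=1):
        psa += (x * x - psa) / n
        mean += (x - mean) / n
        # Clamp at zero in case rounding makes the difference negative.
        pop_var = max(psa - mean * mean, 0.0)
        # Sample variance is undefined for a single value.
        sample_var = pop_var * n / (n - 1) if n > 1 else float("nan")
        yield pop_var, sample_var
```

Calling list(running_variance(data))[-1] gives the variances over everything seen so far.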

But to convert the Power Sum Average formula to a windowed variety, you need to tweak the formula to the following:

PSA today = PSA yesterday + ((x today * x today) – (x (today – n) * x (today – n))) / n

x = value in your time series (x (today – n) is the value that just dropped out of the window)

n = period used for your rolling window.

You’ll also need the Rolling Simple Moving Average formula:

SMA today = SMA yesterday + (x today – x (today – n)) / n

x = value in your time series

n = period used for your rolling window.
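
A minimal Python sketch of the rolling SMA update, assuming the first full window is seeded with a direct average (the update formula only applies once n values are available). The name rolling_sma and the use of a deque to remember the outgoing value are illustrative choices on my part.

```python
from collections import deque


def rolling_sma(values, n):
    """Yield the simple moving average of the last n values.

    Nothing is yielded until the first n values have been seen.
    """
    window = deque(maxlen=n)
    sma = 0.0
    for x in values:
        if len(window) < n:
            window.append(x)
            if len(window) == n:
                sma = sum(window) / n  # seed with a direct average
                yield sma
        else:
            oldest = window[0]   # x (today - n), about to leave the window
            window.append(x)     # maxlen evicts the oldest value
            sma += (x - oldest) / n
            yield sma
```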

From there you can compute the Rolling Population Variance:

Population Var today = (PSA today * n – n * SMA today * SMA today) / n

Or the Rolling Sample Variance:

Sample Var today = (PSA today * n – n * SMA today * SMA today) / (n – 1)
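
Putting the pieces together, here is a hedged Python sketch of the whole rolling computation. It keeps a windowed PSA and SMA and plugs them into the variance formulas above. The function name rolling_variance, the sample flag, and seeding the first window with direct sums are assumptions of this sketch.

```python
from collections import deque


def rolling_variance(values, n, sample=False):
    """Yield the variance of the last n values once the window is full."""
    if n < 2:
        raise ValueError("window size n must be at least 2")
    window = deque(maxlen=n)
    sma = psa = 0.0
    for x in values:
        if len(window) == n:
            oldest = window[0]  # value leaving the window
            sma += (x - oldest) / n
            psa += (x * x - oldest * oldest) / n
            window.append(x)
        else:
            window.append(x)
            if len(window) < n:
                continue  # still warming up
            # Seed both averages directly from the first full window.
            sma = sum(window) / n
            psa = sum(v * v for v in window) / n
        # Clamp at zero: rounding can push the difference slightly negative.
        numerator = max(psa * n - n * sma * sma, 0.0)
        yield numerator / (n - 1 if sample else n)
```

One caveat: subtracting n * SMA * SMA from PSA * n can lose precision when the values are large relative to their spread, so for long series it may be worth recomputing PSA and SMA from the window every so often.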

I’ve covered this topic along with sample Python code in a blog post a few years back, Running Variance.

Hope this helps.

Please note: I originally provided links to all the blog posts, plus the math formulas as LaTeX images, for this answer. But due to my low reputation (< 10), I’m limited to only 2 hyperlinks and absolutely no images. Sorry about this. I hope it doesn’t take away from the content.
