Saturday, April 16, 2022

Population vs Standard: Deviation and Covariance




Population Deviation vs Standard Deviation


How is the population deviation (the population standard deviation) related to the standard deviation (the sample standard deviation)?


Population Deviation (of a data set x_i):


σx = √( Σ(x_i - mean(x))^2 / n)


where mean(x) is the arithmetic mean of the data set over x_i


Standard Deviation (the sample standard deviation):


sx = √( Σ(x_i - mean(x))^2 / (n - 1))


n is the size of the data set x_i.  


Suppose we can calculate the standard deviation by multiplying the population deviation by a factor (let's call it ß for the purpose of this example).


ß * σx = sx


ß * √( Σ(x_i - mean(x))^2 / n) = √( Σ(x_i - mean(x))^2 / (n - 1))


ß * √( Σ(x_i - mean(x))^2) / √n = √( Σ(x_i - mean(x))^2) / √(n - 1)


ß * √( Σ(x_i - mean(x))^2) / √( Σ(x_i - mean(x))^2) = √n / √(n - 1)


ß  = √n / √(n - 1)


ß  = √(n/(n - 1))


Hence:


sx =  √(n/(n - 1)) * σx


and


σx = sx * √((n-1)/n)



Example:


x = {4, 7, 10, 16, 38}   

n = 5


σx = 12.16552506

sx = 12.16552506 * √(5/4) = 13.60147051
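

The example above can be checked numerically. The following is a minimal Python sketch, assuming the standard library's statistics module (pstdev divides by n, stdev divides by n - 1); the variable names sigma_x, s_x, and factor are only illustrative and are not part of the original calculation.

```python
import math
import statistics

# Data set from the example above
x = [4, 7, 10, 16, 38]
n = len(x)

# Population deviation: divides the sum of squared deviations by n
sigma_x = statistics.pstdev(x)

# Standard (sample) deviation: divides by n - 1
s_x = statistics.stdev(x)

# Conversion factor derived above: sqrt(n / (n - 1))
factor = math.sqrt(n / (n - 1))

print(f"population deviation: {sigma_x:.8f}")
print(f"standard deviation:   {s_x:.8f}")
print(f"sigma_x * factor:     {sigma_x * factor:.8f}")
```

The last two printed values should agree, which is the relationship sx = √(n/(n - 1)) * σx.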



Population Covariance vs Sample Covariance


For the data sets x_i and y_i, population covariance:


cov_σ = 1/n * Σ((x_i - mean(x)) * (y_i - mean(y)))


And the sample covariance:


cov_s = 1/(n - 1) * Σ((x_i - mean(x)) * (y_i - mean(y)))


We will use a tactic similar to the one above to find a relationship between population covariance and sample covariance:


ß * cov_σ = cov_s


ß * 1/n * Σ((x_i - mean(x)) * (y_i - mean(y))) = 

1/(n - 1) * Σ((x_i - mean(x)) * (y_i - mean(y)))


ß * Σ((x_i - mean(x)) * (y_i - mean(y))) / Σ((x_i - mean(x)) * (y_i - mean(y))) =

n/(n - 1)


ß = n/(n - 1)



Hence:


cov_s = n/(n - 1) * cov_σ


and


cov_σ = (n - 1)/n * cov_s



Example:


x = {4, 5, 6, 8}

y = {-2, -1, 2, 0}

n = 4


mean(x) = 5.75

mean(y) = -0.25


cov_σ = 1.1875

cov_s = 1.1875 * 4/3 = 1.5833333333
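

The covariance example can be verified the same way. This is a small Python sketch written from the two formulas above; the helper names mean and covariance and the sample flag are hypothetical, chosen here for illustration only.

```python
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def covariance(xs, ys, sample=False):
    """Covariance of two equal-length data sets.

    sample=False divides by n (population covariance);
    sample=True divides by n - 1 (sample covariance).
    """
    n = len(xs)
    if n != len(ys):
        raise ValueError("data sets must have the same length")
    mx, my = mean(xs), mean(ys)
    total = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    return total / (n - 1 if sample else n)


# Data sets from the example above
x = [4, 5, 6, 8]
y = [-2, -1, 2, 0]
n = len(x)

cov_pop = covariance(x, y)                  # population covariance
cov_sample = covariance(x, y, sample=True)  # sample covariance

print(f"mean(x) = {mean(x)}, mean(y) = {mean(y)}")
print(f"cov_pop               = {cov_pop:.10f}")
print(f"cov_sample            = {cov_sample:.10f}")
print(f"cov_pop * n / (n - 1) = {cov_pop * n / (n - 1):.10f}")
```

The last two printed values should match, confirming cov_s = n/(n - 1) * cov_σ for this data.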


Hope you find this helpful,


Eddie 


All original content copyright, © 2011-2022.  Edward Shore.   Unauthorized use and/or unauthorized distribution for commercial purposes without express and written permission from the author is strictly prohibited.  This blog entry may be distributed for noncommercial purposes, provided that full credit is given to the author. 

