Variance | Standard Deviation (2024)

3.2.4 Variance

Consider two random variables $X$ and $Y$ with the following PMFs.
$$P_X(x) = \left\{\begin{array}{l l}0.5 & \quad \text{for } x=-100\\0.5 & \quad \text{for } x=100\\0 & \quad \text{otherwise}\end{array} \right.\hspace{10pt} (3.3)$$
$$P_Y(y) = \left\{\begin{array}{l l}1 & \quad \text{for } y=0\\0 & \quad \text{otherwise}\end{array} \right.\hspace{20pt} (3.4)$$
Note that $EX=EY=0$. Although both random variables have the same mean value, their distributions are completely different. $Y$ is always equal to its mean of $0$, while $X$ is either $100$ or $-100$, quite far from its mean value. The variance is a measure of how spread out the distribution of a random variable is. Here, the variance of $Y$ is quite small since its distribution is concentrated at a single value, while the variance of $X$ will be larger since its distribution is more spread out.

The variance of a random variable $X$, with mean $EX=\mu_X$, is defined as $$\textrm{Var}(X)=E\big[ (X-\mu_X)^2\big].$$

By definition, the variance of $X$ is the average value of $(X-\mu_X)^2$. Since $(X-\mu_X)^2 \geq 0$, the variance is always larger than or equal to zero. A large value of the variance means that $(X-\mu_X)^2$ is often large, so $X$ often takes values far from its mean. This means that the distribution is very spread out. On the other hand, a low variance means that the distribution is concentrated around its average.

Note that if we did not square the difference between $X$ and its mean, the result would be $0$. That is, $$E[X-\mu_X]=EX-E[\mu_X]=\mu_X-\mu_X=0.$$ $X$ is sometimes below its average and sometimes above its average. Thus, $X-\mu_X$ is sometimes negative and sometimes positive, but on average it is zero.

To compute $\textrm{Var}(X)=E\big[ (X-\mu_X)^2\big]$, note that we need to find the expected value of $g(X)=(X-\mu_X)^2$, so we can use LOTUS. In particular, we can write $$\textrm{Var}(X)=E\big[ (X-\mu_X)^2\big]=\sum_{x_k \in R_X} (x_k-\mu_X)^2 P_X(x_k).$$ For example, for $X$ and $Y$ defined in Equations 3.3 and 3.4, we have $$\textrm{Var}(X)=(-100-0)^2(0.5)+(100-0)^2(0.5)=10{,}000,$$ $$\textrm{Var}(Y)=(0-0)^2(1)=0.$$
As we expect, $X$ has a very large variance while $\textrm{Var}(Y)=0$.
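The LOTUS sum above can be checked numerically. The following is a minimal sketch, assuming a PMF is stored as a Python dictionary mapping each value to its probability; the helper names `mean` and `variance` are illustrative and not part of the text.

```python
def mean(pmf):
    """Expected value of a discrete random variable given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Var(X) = sum over x of (x - mu)^2 * P(x), i.e. LOTUS with g(x) = (x - mu)^2."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

# PMFs from Equations 3.3 and 3.4
pmf_X = {-100: 0.5, 100: 0.5}
pmf_Y = {0: 1.0}

print(mean(pmf_X), mean(pmf_Y))          # both means are 0
print(variance(pmf_X), variance(pmf_Y))  # 10000.0 and 0.0
```

Both distributions have mean $0$, yet the computed variances differ as in the hand calculation.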

Note that $\textrm{Var}(X)$ has a different unit than $X$. For example, if $X$ is measured in meters, then $\textrm{Var}(X)$ is in meters$^2$. To solve this issue, we define another measure, called the standard deviation, usually shown as $\sigma_X$, which is simply the square root of variance.

The standard deviation of a random variable $X$ is defined as $$\textrm{SD}(X)= \sigma_X= \sqrt {\textrm{Var}(X)}.$$

The standard deviation of $X$ has the same unit as $X$. For $X$ and $Y$ defined in Equations 3.3 and 3.4, we have $$\sigma_X=\sqrt{10{,}000}=100,$$ $$\sigma_Y=\sqrt{0}=0.$$

Here is a useful formula for computing the variance.

Computational formula for the variance: $$\hspace{70pt} \textrm{Var}(X)=E\big[X^2\big]-\big[EX\big]^2 \hspace{70pt} (3.5)$$

To prove it, note that
\begin{align}
\nonumber \textrm{Var}(X) &= E\big[ (X-\mu_X)^2\big]\\
\nonumber &= E \big[ X^2-2 \mu_X X + \mu_X^2 \big]\\
\nonumber &= E\big[X^2\big]-2E\big[\mu_X X\big]+E\big[\mu_X^2\big] &\textrm{ by linearity of expectation.}
\end{align}
Note that for a given random variable $X$, $\mu_X$ is just a constant real number. Thus, $E\big[\mu_X X\big]=\mu_X E[X]=\mu_X^2$, and $E\big[\mu_X^2 \big]=\mu_X^2$, so we have

\begin{align}
\nonumber \textrm{Var}(X) &= E\big[X^2\big]-2\mu_X^2+\mu_X^2\\
\nonumber &= E\big[X^2\big]-\mu_X^2.
\end{align}
Equation 3.5 is usually easier to work with than the definition $\textrm{Var}(X)=E\big[ (X-\mu_X)^2\big]$. To use this equation, we can find $E[X^2]$ (also written $EX^2$) using LOTUS, $$E X^2=\sum_{x_k \in R_X} x_k^2 P_X(x_k),$$ and then subtract $\mu_X^2$ to obtain the variance.
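As a quick sanity check of Equation 3.5, the sketch below evaluates both the definition and the computational formula on a small PMF and confirms they agree exactly. The PMF values and the helper `expectation` are made up for illustration; exact rational arithmetic (`fractions.Fraction`) is used so that the comparison is not affected by floating-point rounding.

```python
from fractions import Fraction

def expectation(pmf, g=lambda x: x):
    """E[g(X)] via LOTUS for a PMF given as {value: probability}."""
    return sum(g(x) * p for x, p in pmf.items())

# A hypothetical PMF chosen only for illustration
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 3: Fraction(1, 4)}

mu = expectation(pmf)                                        # 5/4
var_definition = expectation(pmf, lambda x: (x - mu) ** 2)   # E[(X - mu)^2]
var_shortcut = expectation(pmf, lambda x: x ** 2) - mu ** 2  # E[X^2] - (EX)^2

print(var_definition, var_shortcut)  # 19/16 19/16
```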

Example

I roll a fair die and let $X$ be the resulting number. Find $EX$, Var$(X)$, and $\sigma_X$.

Solution
- We have $R_X=\{1,2,3,4,5,6\}$ and $P_X(k)=\frac{1}{6}$ for $k=1,2,...,6$. Thus, we have $$EX=1 \cdot \frac{1}{6}+ 2 \cdot \frac{1}{6}+ 3 \cdot \frac{1}{6}+ 4 \cdot \frac{1}{6}+ 5 \cdot \frac{1}{6}+ 6 \cdot \frac{1}{6}=\frac{7}{2};$$ $$EX^2=1 \cdot \frac{1}{6}+ 4\cdot \frac{1}{6}+ 9\cdot \frac{1}{6}+ 16 \cdot \frac{1}{6}+ 25\cdot \frac{1}{6}+ 36 \cdot \frac{1}{6}=\frac{91}{6}.$$ Thus, $$\textrm{Var}(X)=E\big[X^2\big]-\big(EX\big)^2=\frac{91}{6}-\left(\frac{7}{2}\right)^2=\frac{91}{6}-\frac{49}{4}=\frac{35}{12}\approx 2.92,$$ $$\sigma_X= \sqrt {\textrm{Var}(X)}\approx \sqrt{2.92} \approx 1.71.$$
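The arithmetic in this solution can be reproduced in a few lines. This is only a sketch of the same calculation, using exact fractions for $EX$, $EX^2$, and the variance; the variable names are illustrative.

```python
from fractions import Fraction
import math

faces = range(1, 7)   # R_X = {1, ..., 6}
p = Fraction(1, 6)    # P_X(k) for a fair die

EX = sum(k * p for k in faces)        # 7/2
EX2 = sum(k ** 2 * p for k in faces)  # 91/6
var = EX2 - EX ** 2                   # 35/12, about 2.92
sd = math.sqrt(var)                   # about 1.71

print(EX, EX2, var, round(sd, 2))
```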

Note that variance is not a linear operator. In particular, we have the following theorem.

Theorem
For a random variable $X$ and real numbers $a$ and $b$, $$\hspace{70pt} \textrm{Var}(aX+b)=a^2 \textrm{Var}(X) \hspace{70pt} (3.6)$$

Proof

If $Y=aX+b$, $EY=aEX+b$. Thus,
\begin{align}
\nonumber \textrm{Var} (Y) &= E[ (Y-EY)^2 ]\\
\nonumber &= E[ (aX+b-aEX-b)^2 ]\\
\nonumber &= E[a^2(X-\mu_X)^2]\\
\nonumber &= a^2 E[(X-\mu_X)^2]\\
\nonumber &= a^2 \textrm{Var}(X).
\end{align}
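Equation 3.6 can also be checked on a concrete case. The sketch below, under the assumption that $X$ is the fair die from the earlier example and with arbitrarily chosen constants $a=-3$, $b=10$, builds the PMF of $Y=aX+b$ and compares $\textrm{Var}(Y)$ with $a^2\textrm{Var}(X)$. The helper names `variance` and `affine` are hypothetical.

```python
from fractions import Fraction

def variance(pmf):
    """Var(X) from a PMF given as {value: probability}."""
    mu = sum(x * p for x, p in pmf.items())
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

def affine(pmf, a, b):
    """PMF of Y = a*X + b; probabilities are merged if values coincide (e.g. a = 0)."""
    out = {}
    for x, p in pmf.items():
        y = a * x + b
        out[y] = out.get(y, 0) + p
    return out

die = {k: Fraction(1, 6) for k in range(1, 7)}
a, b = -3, 10  # arbitrary constants for illustration

lhs = variance(affine(die, a, b))  # Var(aX + b)
rhs = a ** 2 * variance(die)       # a^2 Var(X)
print(lhs, rhs)  # 105/4 105/4
```

The shift $b$ has no effect on the result, and the sign of $a$ disappears because it is squared.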

From Equation 3.6, we conclude that, for standard deviation, $\textrm{SD}(aX+b)=|a|\textrm{SD}(X)$. We mentioned that variance is NOT a linear operation. But there is a very important case in which variance behaves like a linear operation, and that is when we look at the sum of independent random variables.

Theorem
If $X_1, X_2,\cdots ,X_n$ are independent random variables and $X=X_1+X_2+\cdots+X_n$, then $$\hspace{70pt} \textrm{Var}(X)=\textrm{Var}(X_1)+\textrm{Var}(X_2)+\cdots+\textrm{Var}(X_n) \hspace{70pt} (3.7)$$
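A small numerical check can make Equation 3.7 plausible. The following sketch assumes two independent fair dice, builds the PMF of their sum from the product of the marginals, and compares its variance with the sum of the individual variances. For contrast, it also computes the variance of $X+X=2X$, where the two summands are fully dependent. The helper names are illustrative.

```python
from fractions import Fraction
from itertools import product

def variance(pmf):
    """Var(X) from a PMF given as {value: probability}."""
    mu = sum(x * p for x, p in pmf.items())
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

def pmf_of_sum(pmf_a, pmf_b):
    """PMF of A + B, ASSUMING A and B are independent (joint PMF = product)."""
    out = {}
    for (x, p), (y, q) in product(pmf_a.items(), pmf_b.items()):
        out[x + y] = out.get(x + y, 0) + p * q
    return out

die = {k: Fraction(1, 6) for k in range(1, 7)}

var_independent = variance(pmf_of_sum(die, die))         # two independent dice
var_dependent = variance({2 * k: p for k, p in die.items()})  # X + X = 2X

print(variance(die) + variance(die))  # 35/6
print(var_independent)                # 35/6, matches Equation 3.7
print(var_dependent)                  # 35/3, equals 4 Var(X) by Equation 3.6
```

The dependent case shows that independence is not a decorative assumption: without it, the variances need not add.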

We will prove this theorem in Chapter 6, but for now we can look at an example to see how we can use it.

Example

If $X \sim Binomial(n,p)$, find $\textrm{Var}(X)$.

Solution
- We know that we can write a $Binomial(n,p)$ random variable as the sum of $n$ independent $Bernoulli(p)$ random variables, i.e., $X=X_1+X_2+\cdots+X_n$. Thus, we conclude $$\textrm{Var}(X)=\textrm{Var}(X_1)+\textrm{Var}(X_2)+\cdots+\textrm{Var}(X_n).$$ If $X_i \sim Bernoulli(p)$, then its variance is $$\textrm{Var}(X_i)=E[X_i^2]-(EX_i)^2=1^2 \cdot p+0^2 \cdot (1-p)-p^2=p(1-p).$$ Thus,
  \begin{align}
  \nonumber \textrm{Var}(X) &= p(1-p)+p(1-p)+\cdots+p(1-p)\\
  \nonumber &= np(1-p).
  \end{align}
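The result $np(1-p)$ can be cross-checked directly against the binomial PMF, without using the decomposition into Bernoulli variables. This is a sketch with arbitrarily chosen parameters $n=10$ and $p=3/10$; `binomial_pmf` is an illustrative helper, and `math.comb` requires Python 3.8 or later.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, p):
    """P(X = k) = C(n, k) p^k (1 - p)^(n - k) for k = 0, ..., n."""
    return {k: comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)}

n, p = 10, Fraction(3, 10)  # arbitrary parameters for illustration
pmf = binomial_pmf(n, p)

mean = sum(k * q for k, q in pmf.items())
var = sum(k ** 2 * q for k, q in pmf.items()) - mean ** 2  # Equation 3.5

print(var, n * p * (1 - p))  # 21/10 21/10
```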

The print version of the book is available on Amazon.
