Variance | Standard Deviation (2024)

3.2.4 Variance

Consider two random variables $X$ and $Y$ with the following PMFs.
$$P_X(x) = \left\{\begin{array}{l l}0.5 & \quad \text{for } x=-100\\0.5 & \quad \text{for } x=100\\0 & \quad \text{otherwise}\end{array} \right.\hspace{10pt} (3.3)$$
$$P_Y(y) = \left\{\begin{array}{l l}1 & \quad \text{for } y=0\\0 & \quad \text{otherwise}\end{array} \right.\hspace{20pt} (3.4)$$
Note that $EX=EY=0$. Although both random variables have the same mean, their distributions are completely different. $Y$ is always equal to its mean of $0$, while $X$ is either $100$ or $-100$, quite far from its mean. The variance is a measure of how spread out the distribution of a random variable is. Here, the variance of $Y$ is quite small since its distribution is concentrated at a single value, while the variance of $X$ is larger since its distribution is more spread out.

The variance of a random variable $X$, with mean $EX=\mu_X$, is defined as $$\textrm{Var}(X)=E\big[ (X-\mu_X)^2\big].$$


By definition, the variance of $X$ is the average value of $(X-\mu_X)^2$. Since $(X-\mu_X)^2 \geq 0$, the variance is always greater than or equal to zero. A large variance means that $(X-\mu_X)^2$ is often large, so $X$ often takes values far from its mean; that is, the distribution is very spread out. On the other hand, a low variance means that the distribution is concentrated around its mean.

Note that if we did not square the difference between $X$ and its mean, the result would be $0$. That is, $$E[X-\mu_X]=EX-E[\mu_X]=\mu_X-\mu_X=0.$$ $X$ is sometimes below its mean and sometimes above it. Thus, $X-\mu_X$ is sometimes negative and sometimes positive, but on average it is zero.

To compute $\textrm{Var}(X)=E\big[ (X-\mu_X)^2\big]$, note that we need to find the expected value of $g(X)=(X-\mu_X)^2$, so we can use LOTUS. In particular, we can write $$\textrm{Var}(X)=E\big[ (X-\mu_X)^2\big]=\sum_{x_k \in R_X} (x_k-\mu_X)^2 P_X(x_k).$$ For example, for $X$ and $Y$ defined in Equations 3.3 and 3.4, we have $$\textrm{Var}(X)=(-100-0)^2(0.5)+(100-0)^2(0.5)=10{,}000,$$ $$\textrm{Var}(Y)=(0-0)^2(1)=0.$$
As we expect, $X$ has a very large variance while Var$(Y)=0$.
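The LOTUS sum above translates almost directly into code. The following is a minimal sketch, assuming a PMF is stored as a Python dictionary mapping each value to its probability; the helper names `mean` and `variance` are illustrative choices, not part of the text.

```python
def mean(pmf):
    """E[X] for a PMF given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """Var(X) = sum of (x - mu)^2 * P(x), i.e. LOTUS with g(x) = (x - mu)^2."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


pmf_x = {-100: 0.5, 100: 0.5}  # Equation 3.3
pmf_y = {0: 1.0}               # Equation 3.4

print(variance(pmf_x))  # 10000.0
print(variance(pmf_y))  # 0.0
```

The two printed values agree with the hand computation of Var$(X)$ and Var$(Y)$.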

Note that Var$(X)$ has a different unit than $X$. For example, if $X$ is measured in meters, then Var$(X)$ is in meters$^2$. To address this, we define another measure, called the standard deviation and usually denoted $\sigma_X$, which is simply the square root of the variance.

The standard deviation of a random variable $X$ is defined as $$\textrm{SD}(X)= \sigma_X= \sqrt {\textrm{Var}(X)}.$$


The standard deviation of $X$ has the same unit as $X$. For $X$ and $Y$ defined in Equations 3.3 and 3.4, we have
$$\sigma_X=\sqrt{10{,}000}= 100,$$
$$\sigma_Y=\sqrt{0}=0.$$

Here is a useful formula for computing the variance.

Computational formula for the variance: $$\hspace{70pt} \textrm{Var}(X)=E\big[X^2\big]-\big[EX\big]^2 \hspace{70pt} (3.5)$$


To prove it, note that
\begin{align}
\nonumber \textrm{Var}(X) &= E\big[ (X-\mu_X)^2\big]\\
\nonumber &= E \big[ X^2-2 \mu_X X + \mu_X^2 \big]\\
\nonumber &= E\big[X^2\big]-2E\big[\mu_X X\big]+E\big[\mu_X^2\big] &\textrm{ by linearity of expectation.}
\end{align}
Note that for a given random variable $X$, $\mu_X$ is just a constant real number. Thus, $E\big[\mu_X X\big]=\mu_X E[X]=\mu_X^2$, and $E\big[\mu_X^2 \big]=\mu_X^2$, so we have

\begin{align}
\nonumber \textrm{Var}(X) &= E\big[X^2\big]-2\mu_X^2+\mu_X^2\\
\nonumber &= E\big[X^2\big]-\mu_X^2.
\end{align}
Equation 3.5 is usually easier to work with than the definition $\textrm{Var}(X)=E\big[ (X-\mu_X)^2\big]$. To use this equation, we can find $E[X^2]$ (also written $EX^2$) using LOTUS, $$E X^2=\sum_{x_k \in R_X} x_k^2 P_X(x_k),$$ and then subtract $\mu_X^2$ to obtain the variance.
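As a quick sanity check, the definition and Equation 3.5 can be compared on a small PMF. This is only a sketch: the PMF below is a hypothetical example chosen for illustration, and `var_definition` and `var_shortcut` are made-up helper names. Exact fractions are used so the comparison is not affected by floating-point rounding.

```python
from fractions import Fraction


def var_definition(pmf):
    """Var(X) = E[(X - mu)^2], computed with LOTUS."""
    mu = sum(x * p for x, p in pmf.items())
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


def var_shortcut(pmf):
    """Var(X) = E[X^2] - (EX)^2, Equation 3.5."""
    mu = sum(x * p for x, p in pmf.items())
    second_moment = sum(x ** 2 * p for x, p in pmf.items())
    return second_moment - mu ** 2


# Hypothetical PMF, not from the text: P(0)=1/4, P(1)=1/2, P(3)=1/4.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 3: Fraction(1, 4)}

print(var_definition(pmf))  # 19/16
print(var_shortcut(pmf))    # 19/16
```

Here $EX=5/4$ and $EX^2=11/4$, so both routes give $11/4-25/16=19/16$.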


Example

I roll a fair die and let $X$ be the resulting number. Find $EX$, Var$(X)$, and $\sigma_X$.

  • Solution
    • We have $R_X=\{1,2,3,4,5,6\}$ and $P_X(k)=\frac{1}{6}$ for $k=1,2,\ldots,6$. Thus, we have $$EX=1 \cdot \frac{1}{6}+ 2 \cdot \frac{1}{6}+ 3 \cdot \frac{1}{6}+ 4 \cdot \frac{1}{6}+ 5 \cdot \frac{1}{6}+ 6 \cdot \frac{1}{6}=\frac{7}{2};$$ $$EX^2=1 \cdot \frac{1}{6}+ 4\cdot \frac{1}{6}+ 9\cdot \frac{1}{6}+ 16 \cdot \frac{1}{6}+ 25\cdot \frac{1}{6}+ 36 \cdot \frac{1}{6}=\frac{91}{6}.$$ Thus, $$\textrm{Var}(X)=E\big[X^2\big]-\big(EX\big)^2=\frac{91}{6}-\left(\frac{7}{2}\right)^2=\frac{91}{6}-\frac{49}{4}=\frac{35}{12}\approx 2.92,$$ $$\sigma_X= \sqrt {\textrm{Var}(X)}\approx \sqrt{2.92} \approx 1.71.$$
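The arithmetic in this solution can be reproduced in a few lines. The sketch below uses Python's `fractions.Fraction` for exact values; the variable names are illustrative.

```python
from fractions import Fraction
import math

faces = range(1, 7)   # R_X = {1, ..., 6}
p = Fraction(1, 6)    # P_X(k) for a fair die

ex = sum(k * p for k in faces)         # EX
ex2 = sum(k ** 2 * p for k in faces)   # E[X^2], by LOTUS
var = ex2 - ex ** 2                    # Equation 3.5
sd = math.sqrt(var)                    # standard deviation, as a float

print(ex, ex2, var)   # 7/2 91/6 35/12
print(round(sd, 2))   # 1.71
```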

Note that variance is not a linear operator. In particular, we have the following theorem.

Theorem
For a random variable $X$ and real numbers $a$ and $b$, $$\hspace{70pt} \textrm{Var}(aX+b)=a^2 \textrm{Var}(X) \hspace{70pt} (3.6)$$


Proof

If $Y=aX+b$, then $EY=aEX+b$. Thus,
\begin{align}
\nonumber \textrm{Var} (Y) &= E[ (Y-EY)^2 ]\\
\nonumber &= E[ (aX+b-aEX-b)^2 ]\\
\nonumber &= E[a^2(X-\mu_X)^2]\\
\nonumber &= a^2 E[(X-\mu_X)^2]\\
\nonumber &= a^2 \textrm{Var}(X).
\end{align}
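Equation 3.6 can also be checked numerically. The sketch below applies a hypothetical transformation $Y=-3X+10$ to a fair die; the constants and the helper name `variance` are assumptions made for illustration only.

```python
from fractions import Fraction


def variance(pmf):
    """Var(X) for a PMF given as {value: probability}."""
    mu = sum(x * p for x, p in pmf.items())
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


die = {k: Fraction(1, 6) for k in range(1, 7)}

a, b = -3, 10  # hypothetical constants; a != 0 keeps the mapped values distinct
transformed = {a * x + b: p for x, p in die.items()}  # PMF of Y = aX + b

print(variance(die))          # 35/12
print(variance(transformed))  # 105/4
print(variance(transformed) == a ** 2 * variance(die))  # True
```

The shift $b$ has no effect on the result, and the negative sign of $a$ disappears because $a$ enters only through $a^2$.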

From Equation 3.6, we conclude that, for the standard deviation, $\textrm{SD}(aX+b)=|a|\textrm{SD}(X)$. We mentioned that variance is NOT a linear operation. But there is a very important case in which variance behaves like a linear operation: the sum of independent random variables.

Theorem
If $X_1, X_2,\cdots ,X_n$ are independent random variables and $X=X_1+X_2+\cdots+X_n$, then $$\hspace{70pt} \textrm{Var}(X)=\textrm{Var}(X_1)+\textrm{Var}(X_2)+\cdots+\textrm{Var}(X_n) \hspace{70pt} (3.7)$$


We will prove this theorem in Chapter 6, but for now we can look at an example to see how we can use it.
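Before the worked example, here is a small numerical illustration of Equation 3.7 (not a proof). The sketch assumes two independent fair dice, builds the PMF of their sum by multiplying the marginal probabilities, and compares its variance with the sum of the individual variances. The dice setup and helper names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product


def variance(pmf):
    """Var(X) for a PMF given as {value: probability}."""
    mu = sum(x * p for x, p in pmf.items())
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


die = {k: Fraction(1, 6) for k in range(1, 7)}

# PMF of X1 + X2: under independence, P(X1=x1, X2=x2) = P(X1=x1) * P(X2=x2).
total = {}
for (x1, p1), (x2, p2) in product(die.items(), die.items()):
    total[x1 + x2] = total.get(x1 + x2, 0) + p1 * p2

print(variance(total))                  # 35/6
print(variance(die) + variance(die))    # 35/6
```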


Example

If $X \sim Binomial(n,p)$, find Var$(X)$.

  • Solution
    • We know that we can write a $Binomial(n,p)$ random variable as the sum of $n$ independent $Bernoulli(p)$ random variables, i.e., $X=X_1+X_2+\cdots+X_n$. Thus, we conclude $$\textrm{Var}(X)=\textrm{Var}(X_1)+\textrm{Var}(X_2)+\cdots+\textrm{Var}(X_n).$$ If $X_i \sim Bernoulli(p)$, then its variance is $$\textrm{Var}(X_i)=E[X_i^2]-(EX_i)^2=1^2 \cdot p+0^2 \cdot (1-p)-p^2=p(1-p).$$ Thus, $$\textrm{Var}(X)=p(1-p)+p(1-p)+\cdots+p(1-p)=np(1-p).$$
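The result $np(1-p)$ can be cross-checked by computing the variance directly from the binomial PMF $P_X(k)=\binom{n}{k}p^k(1-p)^{n-k}$. The sketch below does this for one hypothetical choice, $n=10$ and $p=3/10$; these values and the helper names are assumptions for illustration.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p):
    """PMF of Binomial(n, p) as {k: P(X = k)} for k = 0, ..., n."""
    return {k: comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)}


def variance(pmf):
    """Var(X) for a PMF given as {value: probability}."""
    mu = sum(x * q for x, q in pmf.items())
    return sum((x - mu) ** 2 * q for x, q in pmf.items())


n, p = 10, Fraction(3, 10)  # hypothetical parameters
var_from_pmf = variance(binomial_pmf(n, p))

print(var_from_pmf)      # 21/10
print(n * p * (1 - p))   # 21/10
```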


