STAT 20: Introduction to Probability and Statistics
Lab 2: Class Survey (both parts) due Tuesday at 8am
Summarizing Distributions of Data using:
You can construct a statistical graphic to show the shape, which you can describe in terms of modality and skew
You can calculate a measure of center to convey a sense of a typical (representative) observation
And you can calculate a measure of spread (i.e. scatter, dispersion, variation) to capture how much variability there is in the data
Typical value?
\[ 6 \quad 7 \quad 7 \quad 7 \quad 8 \quad 8 \quad 9 \quad 9 \quad 10 \quad 11 \quad 11\]
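One way to approach this question is to compute a few common measures of center and compare them. The following is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative choices and not part of the course materials.

```python
import statistics

# The eleven observations shown on the slide.
data = [6, 7, 7, 7, 8, 8, 9, 9, 10, 11, 11]

mean = statistics.mean(data)      # arithmetic average: 93 / 11
median = statistics.median(data)  # middle (6th) value of the sorted data
mode = statistics.mode(data)      # most frequently occurring value

print(round(mean, 2), median, mode)
```

For this data set the three measures do not coincide, so which one counts as "typical" depends on what you want the summary to convey.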
How can we express the variability in this data set using a single number?
\[ 6 \quad 7 \quad 7 \quad 7 \quad 8 \quad 8 \quad 9 \quad 9 \quad 10 \quad 11 \quad 11\]
\[ {\Large 6} \quad 7 \quad 7 \quad 7 \quad 8 \quad 8 \quad 9 \quad 9 \quad 10 \quad 11 \quad {\Large 11}\]
\[\textrm{range:} \quad \max - \min\]
\[ 11 - 6 = 5\]
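As a quick check of the arithmetic, the range can be sketched in a couple of lines of Python. Only the largest and smallest observations enter the calculation; `data_range` is an illustrative name.

```python
# The eleven observations shown on the slide.
data = [6, 7, 7, 7, 8, 8, 9, 9, 10, 11, 11]

# Range: largest observation minus smallest observation.
data_range = max(data) - min(data)

print(data_range)  # 5
```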
Characteristics
\[ 6 \quad 7 \quad {\Large 7 \quad 7} \quad 8 \quad {\large 8} \quad 9 \quad {\Large 9 \quad 10} \quad 11 \quad 11\]
The difference between the 3rd quartile, \(Q_3\), and the 1st quartile, \(Q_1\) (i.e. the width of the interval spanned by the middle 50% of the data)
\[\textrm{IQR:} \quad Q_3 - Q_1\]
\[ 9.5 - 7 = 2.5 \]
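Software packages use several different conventions for computing quartiles, so the IQR you get can depend on the tool and its settings. The sketch below uses `statistics.quantiles` from Python's standard library with `method="inclusive"`, which interpolates between neighboring sorted values and reproduces the quartiles above for this data set. Treat that choice of method as an assumption made to match the slide, not as the only valid definition.

```python
import statistics

# The eleven observations shown on the slide.
data = [6, 7, 7, 7, 8, 8, 9, 9, 10, 11, 11]

# n=4 splits the data into quarters and returns the three cut points.
# method="inclusive" treats the data as including its own min and max
# and linearly interpolates between neighboring sorted values.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

iqr = q3 - q1

print(q1, q3, iqr)
```

With the default `method="exclusive"`, the third quartile for these data comes out differently, which is a useful reminder to check which convention your software uses.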
Characteristics
\[ 6 \quad 7 \quad 7 \quad 7 \quad 8 \quad 8 \quad 9 \quad 9 \quad 10 \quad 11 \quad 11\]
Take the differences from each observation, \(x_i\), to the sample mean, \(\bar{x}\), take their absolute values, add them up, and divide by \(n\). Simply put, this is the average distance from the mean.
\[\textrm{MAD:} \quad \frac{1}{n}\sum_{i = 1}^n |x_i - \bar{x}| \]
\[ \textrm{MAD} = 1.4 \]
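The verbal recipe above translates almost word for word into code. A minimal sketch in plain Python, where `x_bar` and `mad` are illustrative names:

```python
# The eleven observations shown on the slide.
data = [6, 7, 7, 7, 8, 8, 9, 9, 10, 11, 11]

n = len(data)
x_bar = sum(data) / n  # sample mean

# Average absolute distance of the observations from the mean.
mad = sum(abs(x - x_bar) for x in data) / n

print(round(mad, 1))  # 1.4
```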
Characteristics
\[ 6 \quad 7 \quad 7 \quad 7 \quad 8 \quad 8 \quad 9 \quad 9 \quad 10 \quad 11 \quad 11\]
Take the differences from each observation, \(x_i\), to the sample mean, \(\bar{x}\), square them, add them up, and divide by \(n - 1\).
\[s^2: \quad \frac{1}{n - 1}\sum_{i = 1}^n (x_i - \bar{x})^2 \]
\[ s^2 = 2.87 \]
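To see where this number comes from, here is a sketch that follows the formula step by step and then compares the result with `statistics.variance` from Python's standard library, which also divides by \(n - 1\). The names `s2_manual` and `s2_library` are illustrative.

```python
import statistics

# The eleven observations shown on the slide.
data = [6, 7, 7, 7, 8, 8, 9, 9, 10, 11, 11]

n = len(data)
x_bar = sum(data) / n  # sample mean

# Sum of squared deviations from the mean, divided by n - 1.
s2_manual = sum((x - x_bar) ** 2 for x in data) / (n - 1)

# Library version of the sample variance (n - 1 in the denominator).
s2_library = statistics.variance(data)

print(round(s2_manual, 2))  # 2.87
```

Note that `statistics.pvariance` divides by \(n\) instead, so it would give a slightly smaller value for the same data.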
Characteristics
\[ 6 \quad 7 \quad 7 \quad 7 \quad 8 \quad 8 \quad 9 \quad 9 \quad 10 \quad 11 \quad 11\]
Take the differences from each observation, \(x_i\), to the sample mean, \(\bar{x}\), square them, add them up, divide by \(n - 1\), then take the square root.
\[s: \quad \sqrt{\frac{1}{n - 1}\sum_{i = 1}^n (x_i - \bar{x})^2} \]
\[ s = 1.69 \]
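Because the standard deviation is the square root of the sample variance, it can be sketched either from the formula or with `statistics.stdev` from Python's standard library. The names below are illustrative.

```python
import math
import statistics

# The eleven observations shown on the slide.
data = [6, 7, 7, 7, 8, 8, 9, 9, 10, 11, 11]

n = len(data)
x_bar = sum(data) / n  # sample mean

# Square root of the sample variance (n - 1 in the denominator).
s_manual = math.sqrt(sum((x - x_bar) ** 2 for x in data) / (n - 1))

# Library version of the sample standard deviation.
s_library = statistics.stdev(data)

print(round(s_manual, 2), round(s_library, 2))
```

Taking the square root puts the result back in the same units as the original observations, which makes it easier to interpret than the variance.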
Characteristics
ggplot2
Demo
Coding Activity: Graphing Numerical Data