The Normal distribution is useful when modeling a continuous random variable that has a bell-shaped distribution.
\(\mu\): mean - determines the center of the distribution
\(\sigma\): standard deviation - determines the spread of the distribution
\[f(x) = \frac{1}{\sqrt{2\pi \sigma^2}}e^{-\frac{(x-\mu)^2}{2\sigma^2}}\] for \(x\) in \((-\infty, \infty)\)
Expectation: \(E(X) = \mu\)
Variance: \(Var(X) = \sigma^2\)
The Standard Normal Distribution is a Normal distribution with mean \(\mu=0\) and standard deviation \(\sigma=1\).
\[Z \sim N(0, 1)\]
For a Normal random variable, \(X\), a z-score represents the number of standard deviations any observation \(x\) is from the mean.
\[z = \frac{x-\mu}{\sigma}\]
For a particular bridge, recorded vehicle speeds are normally distributed with a mean of 58 mph and a standard deviation of 10 mph. Suppose a randomly chosen vehicle is going 40 miles per hour.
How many standard deviations away from the mean is 40 mph?
Calculate the z-score!
\[z = \frac{x-\mu}{\sigma} = \frac{40-58}{10} = -1.8\]
The randomly chosen vehicle is traveling 1.8 standard deviations slower than the average vehicle on the bridge.
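The same z-score calculation can be sketched in R. The variable names `mu`, `sigma`, and `x` are chosen here for illustration only; the values come from the bridge example.

```r
# Bridge example: speeds ~ Normal(mean = 58 mph, sd = 10 mph)
mu    <- 58   # population mean (mph)
sigma <- 10   # population standard deviation (mph)
x     <- 40   # observed speed (mph)

# z-score: number of standard deviations x lies from the mean
z <- (x - mu) / sigma
z   # -1.8
```

A negative z-score means the observation is below the mean.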
For a particular bridge, recorded vehicle speeds are normally distributed with a mean of 58 mph and a standard deviation of 10 mph.
What is the probability of randomly selecting a vehicle going less than 40 mph?
\[P(X < 40) = \int \limits_{-\infty}^{40} \frac{1}{\sqrt{2\pi 10^2}}e^{-\frac{(x-58)^2}{2(10^2)}} dx\]
This integral has no closed-form solution, so we evaluate it numerically using R!
Normal Distribution
\(F(x) = P(X \leq x)\):
pnorm(q, mean, sd, lower.tail = TRUE)
\(p^{th}\) percentile:
qnorm(p, mean, sd, lower.tail = TRUE)
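As a minimal sketch, these two functions can be applied to the bridge example (mean 58 mph, standard deviation 10 mph). The object names `p_below_40`, `p_from_z`, and `p_above_40` are illustrative, not part of the original slides.

```r
# P(X < 40) for X ~ Normal(mean = 58, sd = 10)
p_below_40 <- pnorm(40, mean = 58, sd = 10)
p_below_40   # about 0.0359

# The same probability from the z-score, using the standard Normal defaults
# (pnorm uses mean = 0 and sd = 1 when they are not supplied)
p_from_z <- pnorm(-1.8)

# Upper tail, P(X > 40), by switching lower.tail
p_above_40 <- pnorm(40, mean = 58, sd = 10, lower.tail = FALSE)

# qnorm inverts pnorm: the speed below which that proportion of vehicles falls
speed_back <- qnorm(p_below_40, mean = 58, sd = 10)
speed_back   # 40, up to floating-point rounding
```

Roughly 3.6% of vehicles on the bridge are expected to travel slower than 40 mph.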
Please complete the short Class 7 Activity in Top Hat.
Please do the following:
Answer the one question in the Google Form, which can be accessed in any of the following ways:
Type the following URL into your browser (the URL is case sensitive): https://beav.es/GaJ
Find the Week 4 Survey link on Canvas under the Week 4 module
Scan the QR code
Recall that inferential statistics use information from a sample to estimate or test characteristics of a population of interest.
Typically, we calculate a point estimate from the sample as our best guess of the parameter of interest.
Naturally, our best guess for the population mean, \(\mu\), from a sample is the sample mean, \(\overline{x}\).
Our best guess for the population proportion, \(p\), is the sample proportion, \(\hat{p}\).
Even when robust sampling schemes are used, different samples will yield different point estimates.
Diagram: three samples (Sample 1, Sample 2, Sample 3) drawn from the same population yield the point estimates \(\hat{\theta}_1\), \(\hat{\theta}_2\), and \(\hat{\theta}_3\), respectively.
\(\hat{\theta}\) represents a generic point estimate.
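A small simulation can illustrate this sampling variability. Everything here is an assumption made for the sketch: the population is a hypothetical set of 10,000 vehicle speeds generated from a Normal distribution with mean 58 and standard deviation 10, the sample size of 25 is arbitrary, and the seed is fixed only for reproducibility.

```r
set.seed(314)  # arbitrary seed so the sketch is reproducible

# Hypothetical population of 10,000 vehicle speeds (mph)
population <- rnorm(10000, mean = 58, sd = 10)

# Three independent simple random samples of size n = 25
n <- 25
sample_1 <- sample(population, n)
sample_2 <- sample(population, n)
sample_3 <- sample(population, n)

# Each sample gives its own point estimate of the population mean
estimates <- c(mean(sample_1), mean(sample_2), mean(sample_3))
estimates
```

The three sample means differ from one another even though all three samples come from the same population.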
Population Distribution
Distribution of the entire collection of interest.
SamplED Distribution
Distribution of \(n\) observations obtained from a single sample.
SamplING Distribution
Distribution of a sample statistic, such as \(\overline{x}\) or \(\hat{p}\), from repeated samples of size \(n\) from the population.
Understanding the sampling distribution of commonly used statistics, such as \(\overline{x}\) and \(\hat{p}\), allows us to quantify the uncertainty in our point estimates.
Recall that because of sampling variability, a statistic from a sample is a random variable.
A statistic is called unbiased if its expectation is equal to the corresponding population parameter.
\(\overline{x}\), \(\hat{p}\), and \(s^2\) are unbiased.
\(E(\overline{x}) = \mu\)
\(E(\hat{p}) = p\)
\(E(s^2) = \sigma^2\)
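Unbiasedness can be checked informally by simulation: averaged over many repeated samples, \(\overline{x}\) and \(s^2\) should land close to \(\mu\) and \(\sigma^2\). The sketch below assumes a Normal population with \(\mu = 58\) and \(\sigma = 10\); the sample size, number of repetitions, and seed are arbitrary choices.

```r
set.seed(314)  # arbitrary seed so the sketch is reproducible

mu    <- 58      # assumed population mean
sigma <- 10      # assumed population standard deviation
n     <- 20      # size of each sample
reps  <- 10000   # number of repeated samples

# Each column of `samples` is one sample of size n (an n x reps matrix)
samples <- replicate(reps, rnorm(n, mean = mu, sd = sigma))

xbars <- colMeans(samples)        # sample mean of each sample
s2    <- apply(samples, 2, var)   # sample variance (n - 1 denominator)

mean(xbars)   # close to mu = 58
mean(s2)      # close to sigma^2 = 100
```

A simulation like this suggests unbiasedness but does not prove it; the proof comes from the expectations themselves.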
A point estimate is called consistent if it converges in probability to its corresponding population parameter.
By the Law of Large Numbers, as the sample size, \(n\), increases, the point estimate approaches the population parameter.
\(\overline{x}\), \(\hat{p}\), and \(s^2\) are consistent.
Therefore, as \(n\) increases towards the size of the population:
\(\overline{x} \rightarrow \mu\)
\(\hat{p} \rightarrow p\)
\(s^2 \rightarrow \sigma^2\)
The variability of the point estimate is called the standard error.
The standard error is the standard deviation of the sampling distribution.
As \(n\) increases, the standard error of the point estimate decreases.
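The shrinking standard error can be seen in a rough simulation. The helper `sim_se()` is hypothetical and written only for this sketch; it assumes a Normal population with mean 58 and standard deviation 10 and estimates the standard error of \(\overline{x}\) as the standard deviation of many simulated sample means.

```r
set.seed(314)  # arbitrary seed so the sketch is reproducible

mu    <- 58   # assumed population mean
sigma <- 10   # assumed population standard deviation

# Hypothetical helper: simulate `reps` samples of size n and return
# the standard deviation of their sample means
sim_se <- function(n, reps = 5000) {
  xbars <- replicate(reps, mean(rnorm(n, mean = mu, sd = sigma)))
  sd(xbars)
}

sizes     <- c(10, 40, 160)
simulated <- sapply(sizes, sim_se)

data.frame(n = sizes, simulated_se = simulated)
```

Each fourfold increase in \(n\) should cut the simulated standard error roughly in half.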
When observations are independent and the sample size, \(n\), is sufficiently large, the Central Limit Theorem (CLT) states that the sampling distributions of \(\hat{p}\) and \(\overline{x}\) are approximately Normal.
The sample size conditions (“sufficiently large”) and the details of these Normal distributions differ for \(\hat{p}\) and \(\overline{x}\).
Sample Proportion, \(\hat{p}\)
\[\hat{p}\sim N\bigg(p, \sqrt{\frac{p(1-p)}{n}}\bigg)\] where \(p\) represents the population proportion and the second argument is the standard error of \(\hat{p}\).
Sample Mean, \(\overline{x}\)
\[\overline{x}\sim N\bigg(\mu, \frac{\sigma}{\sqrt{n}}\bigg)\] where \(\mu\) and \(\sigma\) represent the population mean and standard deviation, respectively, and the second argument is the standard error of \(\overline{x}\).
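These sampling distributions plug directly into `pnorm()`, with the standard error supplied as `sd`. The numbers below are hypothetical and chosen only for illustration: a sample of 25 vehicles from the bridge population (mean 58, standard deviation 10), and a population proportion of 0.30 with a sample of 100.

```r
# Sample mean: hypothetical sample of n = 25 vehicles
mu      <- 58
sigma   <- 10
n_mean  <- 25
se_mean <- sigma / sqrt(n_mean)   # standard error of x-bar = 2

# P(x-bar < 55) under x-bar ~ N(mu, sigma / sqrt(n))
p_mean <- pnorm(55, mean = mu, sd = se_mean)
p_mean   # about 0.0668

# Sample proportion: hypothetical p = 0.30 and n = 100
p      <- 0.30
n_prop <- 100
se_prop <- sqrt(p * (1 - p) / n_prop)   # standard error of p-hat

# P(p-hat > 0.40) under p-hat ~ N(p, sqrt(p(1 - p) / n))
p_prop <- pnorm(0.40, mean = p, sd = se_prop, lower.tail = FALSE)
p_prop
```

Note that the standard error, not the population standard deviation, is passed as `sd` when the question is about a statistic rather than a single observation.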
The sample size conditions needed to apply the Central Limit Theorem differ depending on the statistic.
Sample proportion, \(\hat{p}\)
For the CLT to apply to the distribution of the sample proportion, we need the following sample size conditions to be met:
\(np \geq 10\)
\(n(1-p) \geq 10\)
Sample mean, \(\overline{x}\)
Use the sample size and observe the shape of the sampled distribution to determine if the sample size is sufficiently large:
If \(n\geq 30\), we can typically assume the sampling distribution of \(\overline{x}\) is approximately Normal and the CLT applies.
If \(n < 30\), we need to look at the sampled distribution. If there are no clear outliers or strong skewness in the sampled data, we can assume the sampling distribution of \(\overline{x}\) is approximately Normal and the CLT applies.
If the sample size conditions aren’t met, we cannot apply the results of the CLT.
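The sample size conditions for \(\hat{p}\) are simple to check in code. The function `clt_ok_proportion()` is a hypothetical helper written for this sketch, and the inputs are made-up values. In practice \(p\) is unknown, so a value such as \(\hat{p}\) is often substituted; that substitution is an assumption here and is not stated in the conditions above.

```r
# Hypothetical helper: TRUE when both np >= 10 and n(1 - p) >= 10 hold
clt_ok_proportion <- function(n, p) {
  (n * p >= 10) && (n * (1 - p) >= 10)
}

ok_large <- clt_ok_proportion(n = 100, p = 0.30)   # np = 30, n(1 - p) = 70
ok_small <- clt_ok_proportion(n = 40,  p = 0.10)   # np = 4,  n(1 - p) = 36

ok_large   # TRUE
ok_small   # FALSE
```

In the second case only one of the two conditions fails, which is enough to rule out the Normal approximation.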
Question of interest: What proportion of students in ST 314 have attended a career fair this year?
Parameter of interest:
Please complete the Class 8 Activity in Top Hat (can be found under the Assigned tab). Collaboration is encouraged! When you are finished with the activity, you are free to go.