stats_Chapter 10
CHAPTER 10 Wayne Stewart
LABS 3 AND 4 ARE CRITICAL!!
Pvalue song
# 10.72
v <- Intro2R::myreadxl()
helium <- v$HELIUM
names(helium)
ylm <- lm(PROPPASS ~ TEMP, helium)
library(ggplot2)
g <- ggplot(helium, aes(x = TEMP, y = PROPPASS)) + geom_point()
g <- g + geom_smooth(method = "lm", formula = y ~ x)
g
summary(ylm)
Summary information
summary(ylm)
Call:
lm(formula = PROPPASS ~ TEMP, data = helium)

Residuals:
     Min       1Q   Median       3Q      Max
-0.30732 -0.02940  0.03045  0.05943  0.17014
You must know how to interpret ALL of this output!!
Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -13.490347   2.073772  -6.505 0.000187 ***
TEMP         -0.052829   0.007728  -6.836 0.000133 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Note: (-6.836)^2 = 46.73, the F statistic on the last line of the summary.
$\hat{\sigma}$
Residual standard error: 0.1333 on 8 degrees of freedom
Multiple R-squared:  0.8538,  Adjusted R-squared:  0.8356
F-statistic: 46.73 on 1 and 8 DF,  p-value: 0.0001329
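The entries in the coefficient table are tied together by simple arithmetic. The sketch below recomputes the TEMP row using only the rounded numbers printed above, so small rounding differences are expected. The variable names are illustrative and not part of the original slides.

```r
# Values copied from the printed summary(ylm) output (rounded as displayed)
est_temp <- -0.052829   # slope estimate for TEMP
se_temp  <- 0.007728    # standard error of the slope
df_resid <- 8           # residual degrees of freedom (n - 2)

t_temp <- est_temp / se_temp               # t value = estimate / std. error
p_temp <- 2 * pt(-abs(t_temp), df_resid)   # two-sided p-value from the t distribution
f_stat <- t_temp^2                         # with a single predictor, F = t^2

signif(c(t = t_temp, p = p_temp, F = f_stat), 4)
```

The same recipe applies to the (Intercept) row: divide the estimate by its standard error and refer the result to a t distribution on 8 degrees of freedom.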
?
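$\hat{\beta}_1 \sim N\left(\beta_1, \frac{\sigma^2}{SS_{XX}}\right)$
$\hat{\beta}_0 \sim N\left(\beta_0, \frac{\sigma^2 \sum x_i^2}{n\, SS_{XX}}\right)$
$\hat{\sigma}^2 = S^2 = \frac{RSS}{n-2}$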
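The sampling distribution of the slope estimator can be checked by simulation. This is a minimal sketch: the design points and the true values of the intercept, slope, and error standard deviation are all hypothetical choices made for illustration, not values from the slides.

```r
set.seed(1)                               # reproducible simulation
x     <- 1:10                             # fixed design points (hypothetical)
beta0 <- 1; beta1 <- 3; sigma <- 2        # hypothetical true parameters
ssxx  <- sum((x - mean(x))^2)             # SS_XX

# Simulate many data sets and keep the least-squares slope from each one
b1_hat <- replicate(5000, {
  y <- beta0 + beta1 * x + rnorm(length(x), sd = sigma)
  sum((x - mean(x)) * y) / ssxx           # slope estimate = SS_XY / SS_XX
})

c(mean_b1 = mean(b1_hat),                 # should be close to beta1
  var_b1 = var(b1_hat),                   # should be close to sigma^2 / SS_XX
  theory_var = sigma^2 / ssxx)
```

A histogram of b1_hat, for example hist(b1_hat), should look roughly normal and centred on the true slope.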
F STATISTIC LAST LINE OF SUMMARY
WHAT WE NEED TO COVER
SOME DEFINITIONS
THE MODEL
ASSUMPTIONS
ERROR DISTRIBUTIONS
RESIDUALS
NOTICE THE BOOK USES SSE NOT SSR
CAN YOU PROVE THESE?
THE FOLLOWING THEOREM IS PROVED IN MATH 4773 WHERE LA IS USED EXTENSIVELY
$S^2 = \frac{RSS}{n-2}$
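The estimator of the error variance can be computed directly from the residuals. Since the helium data are not reproduced here, this sketch assumes R's built-in cars data set as a stand-in.

```r
fit <- lm(dist ~ speed, data = cars)   # cars: built-in stand-in data (assumption)
n   <- nrow(cars)
rss <- sum(residuals(fit)^2)           # residual sum of squares (the book's SSE)
s2  <- rss / (n - 2)                   # S^2 = RSS / (n - 2)
s   <- sqrt(s2)                        # S, reported by R as "Residual standard error"

c(S = s, from_summary = summary(fit)$sigma)
```

The two numbers agree, which confirms that the "Residual standard error" line of summary() is S on n - 2 degrees of freedom.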
THEOREM 10.1
$S^2$
INTERPRETATION OF 𝑆
RAR MAN!!
CI FOR $\beta_1$
TEDIOUS CALCULATIONS DONE BY R THROUGH lm()
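A confidence interval for the slope is the estimate plus or minus a t quantile times the standard error. The sketch below builds it by hand and compares it with confint(), again assuming the built-in cars data as a stand-in for the course data.

```r
fit   <- lm(dist ~ speed, data = cars)          # stand-in data (assumption)
cf    <- summary(fit)$coefficients              # coefficient table as a matrix
b1    <- cf["speed", "Estimate"]
se_b1 <- cf["speed", "Std. Error"]
tcrit <- qt(0.975, df = df.residual(fit))       # t quantile on n - 2 df for 95%

ci_manual <- c(b1 - tcrit * se_b1, b1 + tcrit * se_b1)
ci_r      <- confint(fit, "speed", level = 0.95)  # R's built-in interval

rbind(manual = ci_manual, confint = as.numeric(ci_r))
```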
CORRELATION
LINEAR ASSOCIATION
CORRELATION ≠ CAUSALITY
STORKS DELIVER BABIES?
• Pioneering statistician George Udny Yule, author of the seminal 1911 textbook Introduction to the Theory of Statistics, explained confounding factors with a pleasing reference to reproduction. He noted that in Alsatian villages numbers of human newborns are correlated with numbers of storks nesting locally. It is tempting to conclude that storks do actually deliver babies, but the real explanation is far more mundane. Larger villages have more houses with chimneys for storks to build nests, and more babies are of course delivered in larger villages. The confounding factor is village size.
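The stork story can be imitated with simulated data. In this sketch every number is invented: village size drives both stork counts and births, and storks have no direct effect on births at all.

```r
set.seed(10)
n_villages <- 200
size   <- runif(n_villages, 50, 500)                 # hypothetical houses per village
storks <- 0.05 * size + rnorm(n_villages, sd = 3)    # nests depend only on size
babies <- 0.20 * size + rnorm(n_villages, sd = 10)   # births depend only on size

r_marginal <- cor(storks, babies)                    # strong, but not causal

# Correlation that remains after removing village size from both variables
r_partial <- cor(resid(lm(storks ~ size)), resid(lm(babies ~ size)))

round(c(marginal = r_marginal, partial = r_partial), 3)
```

The marginal correlation is large while the correlation after adjusting for size is near zero, which is the signature of a confounder.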
cor.test() in R
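A short sketch of cor.test(), assuming the built-in cars data as a stand-in. In simple linear regression the squared Pearson correlation equals the multiple R-squared, and the t statistic from cor.test() matches the t value for the slope.

```r
ct  <- cor.test(cars$speed, cars$dist)     # Pearson correlation test by default
fit <- lm(dist ~ speed, data = cars)       # stand-in data (assumption)

r       <- unname(ct$estimate)             # sample correlation r
t_cor   <- unname(ct$statistic)            # t statistic for H0: rho = 0
t_slope <- summary(fit)$coefficients["speed", "t value"]

c(r = r, r_squared = r^2, R2_lm = summary(fit)$r.squared)
c(t_cor = t_cor, t_slope = t_slope)
ct$conf.int                                # 95% confidence interval for rho
```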
COEFFICIENT OF DETERMINATION OR MULTIPLE $R^2$
SIZE
PRACTICAL INTERPRETATION
ADJUSTED $R^2$ = $R_a^2$, HERE $k = 1$
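The usual definition of adjusted R-squared with k predictors is $R_a^2 = 1 - (1 - R^2)\frac{n-1}{n-(k+1)}$. The sketch below applies it with k = 1 and checks it against summary(), assuming the built-in cars data as a stand-in.

```r
fit <- lm(dist ~ speed, data = cars)   # stand-in data (assumption)
n   <- nrow(cars)
k   <- 1                               # number of predictors in simple regression
r2  <- summary(fit)$r.squared

# Adjusted R^2 penalises R^2 for the number of estimated coefficients
r2_adj <- 1 - (1 - r2) * (n - 1) / (n - (k + 1))

c(R2 = r2, adjusted = r2_adj, from_summary = summary(fit)$adj.r.squared)
```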
cor.test() in R
ESTIMATION AND PREDICTION
PREDICTION INTERVAL
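In R both intervals come from predict(). This sketch assumes the built-in cars data as a stand-in, and the x value of 15 is a hypothetical choice inside the observed range.

```r
fit <- lm(dist ~ speed, data = cars)     # stand-in data (assumption)
new <- data.frame(speed = 15)            # hypothetical x value within the data range

# Interval for the mean response E(y) at speed = 15
conf_int <- predict(fit, newdata = new, interval = "confidence", level = 0.95)

# Interval for a single new observation y at speed = 15
pred_int <- predict(fit, newdata = new, interval = "prediction", level = 0.95)

rbind(confidence = conf_int[1, ], prediction = pred_int[1, ])
```

Both intervals share the same centre, but the prediction interval is wider because it also accounts for the error of an individual observation.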
EXAMPLE 10.11
OUTSIDE THE RANGE – WATCH OUT!!
F-STAT
IN R, $H_0: \beta_1 = 0$
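One way to see the F test of the slope in R is through anova(). The sketch assumes the built-in cars data as a stand-in and shows that the F statistic and its p-value can be reproduced from the slope's t value and pf().

```r
fit <- lm(dist ~ speed, data = cars)     # stand-in data (assumption)
a   <- anova(fit)                        # ANOVA table for the regression

f_stat  <- a["speed", "F value"]         # F statistic for H0: beta1 = 0
p_f     <- pf(f_stat, df1 = 1, df2 = df.residual(fit), lower.tail = FALSE)
t_slope <- summary(fit)$coefficients["speed", "t value"]

c(F = f_stat, t_squared = t_slope^2, p_value = p_f)
```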
How do we combine levels???
We can order the levels by using “factor()”
BASSBUFF first – why?
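A minimal sketch of combining and ordering factor levels. Only the name BASSBUFF comes from the slides; the other level names and the toy vector are assumptions made for illustration. One plausible reason for putting a level first is that lm() uses the first level as the baseline under R's default treatment contrasts, and plots display levels in that order.

```r
# Hypothetical labels; only BASSBUFF appears in the slides
species <- factor(c("CCATFISH", "SMBUFFALO", "LMBASS", "CCATFISH", "LMBASS"))
levels(species)                          # alphabetical by default

# Combine two levels by giving them the same name
combined <- species
levels(combined)[levels(combined) %in% c("LMBASS", "SMBUFFALO")] <- "BASSBUFF"
levels(combined)                         # BASSBUFF is not first yet

# Set the order explicitly so that BASSBUFF becomes the first (baseline) level
ordered_first <- factor(combined, levels = c("BASSBUFF", "CCATFISH"))
levels(ordered_first)
table(ordered_first)
```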
OTHER ISSUES
BE AWARE OF THE ISSUES RELATED TO THE CLASSICAL PARADIGM (P-VALUES)
EASY TO UNDERSTAND WARNINGS
P-VALUE
MORE LESSONS (VERY USEFUL)
FIND A
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.6949     4.6645
x             8.2396             31.360   <2e-16 ***
FIND B
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.6949     4.6645
x             8.2396             31.360   <2e-16 ***
FIND C
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.6949     4.6645  -0.149
x             8.2396     0.2627  31.360   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 12.46 on 28 degrees of freedom
Multiple R-squared:  0.9723,  Adjusted R-squared:  0.9713
p-value: F
FIND D
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.6949     4.6645  -0.149
x             8.2396     0.2627  31.360   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 12.46 on 28 degrees of freedom
Multiple R-squared:  0.9723,  Adjusted R-squared:  0.9713
p-value: F
FIND E
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.6949     4.6645  -0.149
x             8.2396     0.2627  31.360   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 12.46 on 28 degrees of freedom
Multiple R-squared:  0.9723,  Adjusted R-squared:  0.9713
p-value: F
FIND F
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.6949     4.6645  -0.149
x             8.2396     0.2627  31.360   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 12.46 on 28 degrees of freedom
Multiple R-squared:  0.9723,  Adjusted R-squared:  0.9713
p-value: F
FIND THE SAMPLE PEARSON CORRELATION COEFFICIENT
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.6949     4.6645  -0.149
x             8.2396     0.2627  31.360   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 12.46 on 28 degrees of freedom
Multiple R-squared:  0.9723,  Adjusted R-squared:  0.9713
p-value: F
r
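The blanks in these exercises can be recovered from the numbers that remain visible. Which letter labels which blank is not recoverable from the extracted text, so this sketch simply computes every quantity that follows from the printed values. It works from rounded output, so results match only to the displayed precision.

```r
# Values copied from the exercise output (rounded as displayed)
est_int  <- -0.6949; se_int <- 4.6645    # intercept estimate and std. error
est_x    <- 8.2396;  t_x    <- 31.360    # slope estimate and its t value
df_resid <- 28                           # residual df, so n = 30
r2       <- 0.9723                       # multiple R-squared

t_int  <- est_int / se_int                            # t value for the intercept
p_int  <- 2 * pt(-abs(t_int), df_resid)               # its two-sided p-value
se_x   <- est_x / t_x                                 # std. error of the slope
f_stat <- t_x^2                                       # F statistic on 1 and 28 df
p_f    <- pf(f_stat, 1, df_resid, lower.tail = FALSE) # p-value of the F test
r      <- sign(est_x) * sqrt(r2)                      # r takes the sign of the slope

round(c(t_int = t_int, p_int = p_int, se_x = se_x, F = f_stat, r = r), 4)
```

The sample Pearson correlation is the square root of the multiple R-squared with the sign of the slope, which is positive here.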