Shapiro-Wilk test
The test only indicates whether the data are plausibly normally distributed
- H0: data are normally distributed
- Ha: data are not normally distributed
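A minimal sketch of running the test in R with shapiro.test() from base R. The data here are simulated purely for illustration (they are not the data these notes refer to), and the names x, y, sw.x and sw.y are made up for the example.

```r
# Simulated data for illustration only (assumption: not the course data)
set.seed(1)
x <- rnorm(100, mean = 10, sd = 2)  # drawn from a normal distribution
y <- rexp(100, rate = 1)            # drawn from a strongly skewed distribution

sw.x <- shapiro.test(x)  # H0: x comes from a normal distribution
sw.y <- shapiro.test(y)  # H0: y comes from a normal distribution

sw.x$p.value  # a large value here would mean: fail to reject H0
sw.y$p.value  # a small value here means: evidence against H0
```

The returned object also holds the test statistic (sw.x$statistic). In a regression setting the same call would usually be made on the residuals of the fitted model rather than on the raw data.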
Null hypothesis
- Lower p-value: closer to “false” (more evidence against the null)
- Higher p-value: closer to “true” (less evidence against the null)
- A higher p-value means the errors are plausibly normally distributed
Warning
We never say that the null hypothesis is “True”, only that we fail to reject the null.
A high p-value means we don’t have enough evidence to reject the null. It doesn’t mean the null is true, only that the support for Ha isn’t strong enough to reject H0.
p-value
A high p-value only tells you that you had limited evidence against the null. However, if you had a very large sample, it might be reasonable to conclude that the null is either true or that the true value differs only slightly from the null (any true effect is small).
Residuals
Residuals = observed value − predicted value.
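A quick check of this definition in R, as a sketch. The data frame spruce.df is simulated here as a stand-in (the values and the coefficients used to generate them are assumptions); only the column names BHDiameter and Height and the model name spruce.lm follow these notes.

```r
# Simulated stand-in for the spruce data (hypothetical values)
set.seed(2)
spruce.df <- data.frame(BHDiameter = runif(40, 5, 25))
spruce.df$Height <- 5 + 0.6 * spruce.df$BHDiameter + rnorm(40, sd = 1.5)

spruce.lm <- lm(Height ~ BHDiameter, data = spruce.df)

res.manual  <- spruce.df$Height - fitted(spruce.lm)  # observed - predicted
res.builtin <- residuals(spruce.lm)                  # what R stores for the model

# The two should agree up to floating-point error
all.equal(unname(res.manual), unname(res.builtin))
```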
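Adjusted R-squared vs Multiple R-squared
Both come from summary(model), where model is some linear model
- Adjusted R-squared: how well the data fit the model, penalised for the number of predictors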
summary(quad.lm)$r.squared
summary(spruce.lm)$r.squared
Cook's Distance:
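Cook's distance measures how much the fitted model changes when a single observation is left out, so large values flag influential points. A minimal sketch using cooks.distance() from base R; the data are simulated as a stand-in (an assumption), with names chosen to match spruce.lm above.

```r
# Simulated stand-in for the spruce data (hypothetical values)
set.seed(3)
spruce.df <- data.frame(BHDiameter = runif(40, 5, 25))
spruce.df$Height <- 5 + 0.6 * spruce.df$BHDiameter + rnorm(40, sd = 1.5)
spruce.lm <- lm(Height ~ BHDiameter, data = spruce.df)

cd <- cooks.distance(spruce.lm)  # one value per observation
which.max(cd)                    # row with the largest influence
plot(spruce.lm, which = 4)       # base R Cook's distance plot
```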
Piecewise Regression
lowess smoother?
fitted values? fitted()
anova
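One common way to fit a piecewise (broken-stick) regression in R, sketched under assumptions: a single knot at a known position xk (hypothetical, set to 15 here) and simulated data. The hinge term is built with I() so the arithmetic is taken literally, fitted() pulls out the fitted values, and anova() compares the straight line against the piecewise model.

```r
# Simulated data with a change of slope at a hypothetical knot xk = 15
set.seed(4)
xk <- 15
spruce.df <- data.frame(BHDiameter = runif(60, 5, 25))
spruce.df$Height <- 4 + 0.9 * spruce.df$BHDiameter -
  0.6 * pmax(spruce.df$BHDiameter - xk, 0) + rnorm(60, sd = 1)

# Straight line vs. line with an extra slope that switches on after xk
line.lm  <- lm(Height ~ BHDiameter, data = spruce.df)
piece.lm <- lm(Height ~ BHDiameter + I((BHDiameter - xk) * (BHDiameter > xk)),
               data = spruce.df)

head(fitted(piece.lm))     # fitted values from the piecewise model
anova(line.lm, piece.lm)   # F-test: does the extra slope term help?
```

Because the hinge term is zero at xk, the two line segments meet at the knot by construction.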
Proof
Prove using latex that where I() is 1 when and 0 else.
Code
normcheck(plot_1,plot_2)
# plots 2 graphs side by side,
- What are H1 and H2 in normcheck?
- Null hypothesis?
- What is the p-value in the normality check?
I(...)
# "as is" in a formula: do not interpret ^ as a formula operator, treat it as arithmetic
Predict the Height of spruce when the Diameter is 15, 18 and 20 cm (use predict())
predict(quad.lm, data.frame(BHDiameter = c(15, 18, 20)))
Adjusted R-squared determines which model is “better” (whichever has the higher value)
summary(spruce.lm)$adj.r.squared
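Putting the pieces above together, as a sketch: fit the straight-line and quadratic models, predict from the quadratic, and compare adjusted R-squared. The data are simulated (an assumption, including the coefficients used to generate them); spruce.lm, quad.lm, BHDiameter and Height follow these notes, while noI.lm and height.pred are names made up for the example.

```r
# Simulated stand-in for the spruce data (hypothetical values)
set.seed(5)
spruce.df <- data.frame(BHDiameter = runif(40, 5, 25))
spruce.df$Height <- 1 + 1.4 * spruce.df$BHDiameter -
  0.03 * spruce.df$BHDiameter^2 + rnorm(40, sd = 1)

spruce.lm <- lm(Height ~ BHDiameter, data = spruce.df)
quad.lm   <- lm(Height ~ BHDiameter + I(BHDiameter^2), data = spruce.df)

# Without I(), ^ is read as a formula operator and the squared term is dropped
noI.lm <- lm(Height ~ BHDiameter + BHDiameter^2, data = spruce.df)
length(coef(noI.lm))   # 2 coefficients: intercept and BHDiameter only
length(coef(quad.lm))  # 3 coefficients: the quadratic term is kept

# Predicted heights at diameters 15, 18 and 20 cm
height.pred <- predict(quad.lm, data.frame(BHDiameter = c(15, 18, 20)))
height.pred

# Compare the two models; the higher adjusted R-squared is preferred
summary(spruce.lm)$adj.r.squared
summary(quad.lm)$adj.r.squared
```

Multiple R-squared can never go down when a term is added, which is why the adjusted version is the one used for comparing models with different numbers of terms.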