Shapiro-Wilk test

The test tells you whether the data are likely normally distributed.

  • H0: the data are normally distributed
  • Ha: the data are not normally distributed
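In R this is `shapiro.test()` from base stats. A minimal sketch on simulated data (the sample below is my own illustration, not from the notes):

```r
# shapiro.test() is base R; the simulated sample is for illustration only
set.seed(42)
x <- rnorm(50)              # draw a sample that really is normal
res <- shapiro.test(x)      # H0: the data are normally distributed
res$p.value                 # a p-value between 0 and 1
```

A high p-value here means we fail to reject normality; it does not prove the data are normal.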

Null hypothesis

  • Closer to “false” when the p-value is low (more evidence against the null), i.e. the hypothesis looks wrong
  • Closer to “true” when the p-value is high (less evidence against the null)
  • In a residual check, a higher p-value means the errors are plausibly normally distributed

Warning

We never say that the null hypothesis is “True”, only that we fail to reject the null

A high p-value means we don’t have enough evidence to reject the null. It doesn’t mean the null is true; it means support for Ha isn’t strong enough to reject H0.

p-value

A high p-value only tells you that you had limited evidence against the null. However, with a very large sample it may be reasonable to conclude that the null is either true or that the true value differs from it only by a small amount (any true effect is small).

Residuals

Residuals = observed value − predicted value.
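A quick check of this identity using base R’s built-in `cars` data (the dataset and model choice are mine, for illustration):

```r
fit <- lm(dist ~ speed, data = cars)   # simple linear model
r   <- cars$dist - fitted(fit)         # observed - predicted
# this matches what resid() extracts from the fit
all.equal(unname(r), unname(resid(fit)))
```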

Adjusted vs Multiple

where quad.lm and spruce.lm are fitted linear models.

  • Multiple R²: the raw proportion of variance in the response explained by the model
summary(quad.lm)$r.squared
  • Adjusted R²: multiple R² penalized for the number of predictors, so it only increases when an added term genuinely improves the fit
summary(spruce.lm)$adj.r.squared
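A sketch comparing the two on the built-in `cars` data (the quadratic model is my own example): adjusted R² is never larger than multiple R².

```r
fit <- lm(dist ~ speed + I(speed^2), data = cars)  # quadratic model
s <- summary(fit)
s$r.squared      # multiple R^2: raw proportion of variance explained
s$adj.r.squared  # penalized for the number of predictors
```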

Cook’s Distance

Cook’s distance measures how much each observation influences the fitted model.
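`cooks.distance()` in base R returns one influence value per observation. The 4/n cutoff below is a common rule of thumb, not something stated in the notes:

```r
fit <- lm(dist ~ speed, data = cars)  # example model on built-in data
cd  <- cooks.distance(fit)            # one influence value per observation
which(cd > 4 / length(cd))            # flag unusually influential points
```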

Piecewise Regression
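One common way to fit a piecewise (broken-stick) regression in R is with an indicator term built via I(). The knot location and coefficients below are simulated assumptions for illustration:

```r
set.seed(1)
x  <- seq(0, 10, length.out = 100)
xk <- 5                                  # knot location (assumed known here)
y  <- 2 + 1 * x + 3 * pmax(x - xk, 0) + rnorm(100, sd = 0.5)
# (x > xk) plays the role of the 0/1 indicator I(x > xk)
pw <- lm(y ~ x + I((x - xk) * (x > xk)))
coef(pw)  # third coefficient estimates the change in slope after the knot
```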

lowess smoother?
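`lowess()` is base R’s locally weighted scatterplot smoother; a sketch on the built-in `cars` data (dataset choice is mine):

```r
lw <- lowess(cars$speed, cars$dist)  # returns smoothed (x, y) coordinates
# plot(cars); lines(lw)              # would overlay the smoother on the scatter
str(lw)
```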

fitted values? fitted()
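`fitted()` extracts the model’s predicted values at the training data; on those rows it agrees with `predict()` (model below is my own example):

```r
fit <- lm(dist ~ speed, data = cars)  # example model, built-in data
head(fitted(fit))                     # predicted dist for the first rows
all.equal(fitted(fit), predict(fit))  # identical on the training data
```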

anova
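`anova()` on two nested lm fits performs an F-test of whether the extra term improves the fit; the models here are assumptions for illustration:

```r
fit1 <- lm(dist ~ speed, data = cars)               # reduced model
fit2 <- lm(dist ~ speed + I(speed^2), data = cars)  # full model
a <- anova(fit1, fit2)  # F-test on the added quadratic term
a
```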

Proof

Prove, using LaTeX, that [formula omitted in the notes], where I() is 1 when the condition holds and 0 otherwise.

Code

normcheck(spruce.lm)
# from the s20x package: plots two normality graphs for the residuals
# side by side, with a Shapiro-Wilk p-value

  • H0 in normcheck: the residuals are normally distributed
  • Ha: they are not
  • The p-value reported is from the Shapiro-Wilk normality test
I(...)
# use the expression AS IS inside a model formula: I(x^2) means x squared,
# rather than letting formula syntax reinterpret the ^ operator

Predict the Height of spruce when the Diameter is 15, 18, and 20 cm (use predict()):

predict(quad.lm, data.frame(BHDiameter = c(15, 18, 20)))

Adjusted R² can be used to decide which model is “better” (the one with the higher value):

summary(spruce.lm)$adj.r.squared