I'm working with the "iris" dataset in R (version 4.0.3), and I'm trying to find the best model for predicting Sepal.Length. Over the course of my analysis, I've come across two models that I can't compare using anova():
colnames(iris) <- c("sl", "sw", "pl", "pw", "species")
model1 <- lm(sl ~ pl + sw + species + pl:sw + pl:species, iris)
model2 <- lm(sl ~ I(pl^2) + I(sw^2) + species + I(pw^2) + I(pl^2):I(sw^2), iris)
When I run anova(model2, model1), I get this:
Analysis of Variance Table

Model 1: sl ~ I(pl^2) + I(sw^2) + species + I(pw^2) + I(pl^2):I(sw^2)
Model 2: sl ~ pl + sw + species + pl:sw + pl:species
  Res.Df    RSS Df Sum of Sq F Pr(>F)
1    143 12.591
2    142 12.754  1  -0.16296
But this isn't the case for other models, such as anova(model2, lm(sl ~ ., iris)):
Analysis of Variance Table

Model 1: sl ~ I(pl^2) + I(sw^2) + species + I(pw^2) + I(pl^2):I(sw^2)
Model 2: sl ~ sw + pl + pw + species
  Res.Df    RSS Df Sum of Sq      F   Pr(>F)
1    143 12.591
2    144 13.556 -1  -0.96513 10.961 0.001178 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Why can't the first two models be compared using anova()?