R-Squared vs. Adjusted R-Squared

In this lesson, we will demonstrate the differences between the standard r-squared diagnostic measure and the adjusted r-squared measure.

We begin by randomly generating 800 observations of a response variable y as well as 600 potential predictors. The values for each variable are drawn independently from a normal distribution with mean 10 and standard deviation 2, so no predictor has any true relationship with y.

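# Number of observations and number of candidate predictors
n <- 800
iter <- 600
# Each column is an independent N(10, 2) predictor, unrelated to y
df <- data.frame(replicate(iter, rnorm(n, 10, 2)))
y <- rnorm(n, 10, 2)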
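
Because these values are random, every run of the lesson produces different numbers. If reproducible output is wanted, one option is to fix the random seed before generating the data. The sketch below is only an illustration: the seed value 123 and the names `n_demo`, `p_demo`, `first_draw`, and `second_draw` are assumptions, not part of the original lesson, and the sizes are kept small so the check runs instantly.

```r
# Hypothetical smaller sizes, used only for this illustration
n_demo <- 50
p_demo <- 5

set.seed(123)  # arbitrary seed, chosen only for illustration
first_draw <- data.frame(replicate(p_demo, rnorm(n_demo, 10, 2)))

set.seed(123)  # resetting the seed replays the same random stream
second_draw <- data.frame(replicate(p_demo, rnorm(n_demo, 10, 2)))

# The two data frames match exactly because the seed was reset
identical(first_draw, second_draw)

# Rows are observations, columns are predictors
dim(first_draw)
```

Calling `set.seed()` once at the top of the lesson, before `df` and `y` are created, would have the same effect on the full-size data.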

As a visual check that the variables show no apparent relationship to one another, we will create a pairs plot of y and the first 4 predictors.

temp_df <- data.frame(y, df[1:4])
pairs(temp_df)
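
A pairs plot is a visual check only. As a numeric complement, the sample correlations can be inspected; for independently generated variables they should all be close to zero. The sketch below regenerates data in the same way under hypothetical names (`demo_x`, `demo_y`, `cor_mat`, `max_abs_cor`) and an arbitrary seed so that it runs on its own.

```r
set.seed(1)  # arbitrary seed for a repeatable illustration
n_demo <- 800
demo_x <- data.frame(replicate(4, rnorm(n_demo, 10, 2)))
demo_y <- rnorm(n_demo, 10, 2)

# Correlation matrix of the response and the four predictors
cor_mat <- cor(data.frame(y = demo_y, demo_x))

# Largest absolute correlation between two different variables
max_abs_cor <- max(abs(cor_mat[upper.tri(cor_mat)]))

round(cor_mat, 2)
max_abs_cor
```

Small correlations do not prove independence in general, but for jointly normal draws like these they are consistent with it.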

We will now create 600 regression models. Each model will use y as the response. Model number i will use the first i possible predictors. For each model, we will store the r-squared value, as well as the adjusted r-squared value.

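# Preallocate storage for the two diagnostics
r_sqr <- numeric(iter)
adj_r <- numeric(iter)
for (i in seq_len(iter)) {
  # Regress y on the first i predictors
  mod <- lm(y ~ ., data = df[1:i])
  s <- summary(mod)
  r_sqr[i] <- s$r.squared
  adj_r[i] <- s$adj.r.squared
}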
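
The adjusted value penalizes r-squared for the number of predictors in the model. For a model with an intercept, n observations, and p predictors, it equals 1 - (1 - R^2)(n - 1)/(n - p - 1). The sketch below checks this formula against the value reported by `summary()` on a small simulated data set; the names `n_demo`, `p_demo`, `demo_x`, `demo_y`, `s_demo`, and `manual_adj` are hypothetical, and the seed is arbitrary.

```r
set.seed(2)  # arbitrary seed
n_demo <- 100
p_demo <- 10
demo_x <- data.frame(replicate(p_demo, rnorm(n_demo, 10, 2)))
demo_y <- rnorm(n_demo, 10, 2)

# Fit one model with all p_demo noise predictors
s_demo <- summary(lm(demo_y ~ ., data = demo_x))

# Adjusted r-squared computed by hand from the ordinary r-squared
manual_adj <- 1 - (1 - s_demo$r.squared) * (n_demo - 1) / (n_demo - p_demo - 1)

# The hand-computed value should agree with the one from summary()
all.equal(manual_adj, s_demo$adj.r.squared)
```

Since the factor (n - 1)/(n - p - 1) is greater than 1 whenever p is at least 1, the adjusted value is always below the ordinary r-squared, and it can be negative.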

We will now create line plots to see how r-squared and adjusted r-squared change as we add in more predictors.

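# Set up an empty plot, then draw both curves against the number of predictors
plot(r_sqr, type = "n", ylim = c(-0.2, 1),
     xlab = "Number of predictors", ylab = "Value")
abline(h = 0)
lines(1:iter, r_sqr, lwd = 2, col = "salmon")          # r-squared
lines(1:iter, adj_r, lwd = 2, col = "cornflowerblue")  # adjusted r-squared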
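
Two properties of these curves can also be checked numerically: r-squared never decreases when a predictor is added to a least-squares model, even a pure-noise predictor, and adjusted r-squared never exceeds r-squared. The sketch below repeats the experiment at a smaller scale so that it runs quickly on its own; the sizes, the seed, and the names `demo_x`, `demo_y`, `demo_r2`, and `demo_adj` are assumptions made for illustration.

```r
set.seed(3)  # arbitrary seed
n_demo <- 100
p_demo <- 40
demo_x <- data.frame(replicate(p_demo, rnorm(n_demo, 10, 2)))
demo_y <- rnorm(n_demo, 10, 2)

demo_r2 <- numeric(p_demo)
demo_adj <- numeric(p_demo)
for (i in seq_len(p_demo)) {
  # Model i uses the first i noise predictors
  s_i <- summary(lm(demo_y ~ ., data = demo_x[1:i]))
  demo_r2[i] <- s_i$r.squared
  demo_adj[i] <- s_i$adj.r.squared
}

# r-squared never decreases as predictors are added (small tolerance for rounding)
all(diff(demo_r2) >= -1e-12)

# Adjusted r-squared never exceeds r-squared
all(demo_adj <= demo_r2)

# Compare the two diagnostics for the largest model
c(r_squared = demo_r2[p_demo], adjusted = demo_adj[p_demo])
```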
