Question 1
True or False.
The value of R² is only useful when the number of data points is substantially larger than the number of β parameters in the model.
◦ True
◦ False
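One way to explore this statement is to fit the same straight-line model to very few points and then to more points. The sketch below is only an illustration: the data values and the helper names `fit_line` and `r_squared` are hypothetical and do not come from the question.

```python
def fit_line(x, y):
    """Least-squares estimates (b0, b1) for the model y = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def r_squared(x, y):
    """R^2 = 1 - SSE / SS_yy for the fitted straight line."""
    b0, b1 = fit_line(x, y)
    y_bar = sum(y) / len(y)
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - sse / ss_yy


# Hypothetical data: n = 2 points, the same as the number of
# beta parameters (b0 and b1), so the line passes through both points.
r2_two_points = r_squared([1.0, 4.0], [3.7, 0.2])

# The same model fitted to n = 5 hypothetical points with no clear trend.
r2_many_points = r_squared([1.0, 2.0, 3.0, 4.0, 5.0],
                           [3.7, 0.2, 2.9, 1.1, 3.0])

print(r2_two_points)   # essentially 1, whatever the two y-values are
print(r2_many_points)  # close to 0 for these values
```

With as many β parameters as data points, the fitted line reproduces the sample exactly, so the first R² value says nothing about how well the model would predict new observations.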
Question 2
As part of a study at a large university, data were collected on n = 224 freshman computer science (CS) majors in a particular year. The researchers were interested in modeling y, a student's grade point average (GPA) after three semesters, as a function of the following independent variables (recorded at the time the students enrolled in the university):
x1 = average high school grade in mathematics (HSM)
x2 = average high school grade in science (HSS)
x3 = average high school grade in English (HSE)
x4 = SAT mathematics score (SATM)
x5 = SAT verbal score (SATV)
A first-order model was fit to the data, giving R² = 0.211.
What is the correct interpretation of R², the coefficient of determination for the model?
◦ We expect to predict GPA to within approximately 0.21 of its true value.
◦ Approximately 79% of the sample variation in GPAs can be explained by the first-order model.
◦ We are 79% confident that the model is useful for predicting y.
◦ Approximately 21% of the sample variation in GPAs can be explained by the first-order model.
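For reference when weighing the options above, the usual definition of the coefficient of determination can be written out as follows. The symbols SSE (sum of squared prediction errors) and SS_yy (total sample variation of y about its mean) are assumed notation and do not appear in the question itself.

```latex
R^2 = 1 - \frac{SSE}{SS_{yy}}, \qquad
SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \qquad
SS_{yy} = \sum_{i=1}^{n} (y_i - \bar{y})^2

R^2 = 0.211 \;\Longrightarrow\; \frac{SSE}{SS_{yy}} = 1 - 0.211 = 0.789
```

Under this definition, R² describes the sample the model was fitted to. It is neither a confidence level nor a bound on the prediction error.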