Exam 01 review
Announcements
HW 02 due TODAY at 11:59pm
Exam 01: Tuesday, February 17 (in-class + take-home)
Friday’s lab: Exam 01 review - due Sunday at 11:59pm
Exam 01
In-class: 75 minutes during February 17 lecture
Take-home: due February 19 at 9am
- Will have lecture on February 19
If you miss any part of the exam due to an excused absence (with an academic dean’s note or other official documentation), your final exam grade will be imputed for your Exam 01 grade
Outline of in-class portion
- Closed-book, closed-note.
- Question types:
- Short answer (show work / explain response)
- Interpretations
- Derivations (show work)
- You will be provided all relevant R output and a page of math rules
- Can use any results from class or assignments without reproving them. State the results you’re using!
- Just need a pencil or pen. No calculator permitted on exam.
Outline of take-home portion
- Released: Tuesday, February 17 right after class
- Due: Thursday, February 19 at 9am
- Similar in format to lab / applied HW exercises
- You will receive the exam questions in the README of the GitHub repo
- Push work to GitHub and submit a PDF of responses to Gradescope
Tips for studying
- Rework derivations from assignments and lecture notes
- Review exercises in AEs and assignments, asking “why” as you review your process and reasoning
- Focus on understanding not memorization
- Explain concepts / process to others
- Ask questions in office hours
- Review lecture recordings as needed
Content: Weeks 1 - 6
Exploratory data analysis
Fitting and interpreting linear regression models
Model evaluation
Different types of predictors
Inference for regression
Matrix representation of regression
Hat matrix
Finding the least-squares estimator
Assumptions for least-squares regression
Population-level vs. sample-level models
Statistical model (population-level model)
\[ \mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}, \quad \boldsymbol{\epsilon} \sim N(\mathbf{0}, \sigma^2_{\epsilon}\mathbf{I}) \]
Estimated regression model (sample-level model)
\[ \hat{\mathbf{y}} = \mathbf{X}\hat{\boldsymbol{\beta}}\quad \quad \mathbf{e} = \mathbf{y} - \hat{\mathbf{y}} \]
Model in matrix form
\[ \mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon} \]
- What are the dimensions of \(\mathbf{y}, \mathbf{X}, \boldsymbol{\beta}, \boldsymbol{\epsilon}\) ?
- What assumption do we make about the columns of \(\mathbf{X}\)? Why is that important?
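A sketch of one possible answer, assuming \(n\) observations and \(p\) predictors plus an intercept (the symbols \(n\) and \(p\) are assumptions for illustration, not fixed by this slide):
\[ \mathbf{y}: n \times 1, \quad \mathbf{X}: n \times (p+1), \quad \boldsymbol{\beta}: (p+1) \times 1, \quad \boldsymbol{\epsilon}: n \times 1 \]
The columns of \(\mathbf{X}\) are assumed to be linearly independent, so \(\mathbf{X}\) has full column rank and \(\mathbf{X}^\mathsf{T}\mathbf{X}\) is invertible, which the least-squares estimator requires.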
Model in matrix form
\[ \mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon} \]
- What assumptions do we make about \(\boldsymbol{\epsilon}\) given this model?
- What does this model tell us about the distribution of \(\mathbf{y}\) ?
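A sketch of the reasoning, assuming \(\mathbf{X}\) is treated as fixed and \(\boldsymbol{\epsilon} \sim N(\mathbf{0}, \sigma^2_{\epsilon}\mathbf{I})\) as in the population-level model:
\[ \begin{aligned} E(\mathbf{y}) &= \mathbf{X}\boldsymbol{\beta} + E(\boldsymbol{\epsilon}) = \mathbf{X}\boldsymbol{\beta} \\ Var(\mathbf{y}) &= Var(\boldsymbol{\epsilon}) = \sigma^2_{\epsilon}\mathbf{I} \\ \mathbf{y} &\sim N(\mathbf{X}\boldsymbol{\beta}, \sigma^2_{\epsilon}\mathbf{I}) \end{aligned} \]
Adding the constant vector \(\mathbf{X}\boldsymbol{\beta}\) shifts the mean of a normal vector but leaves its variance unchanged.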
Least-squares estimator \(\hat{\boldsymbol{\beta}}\)
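One way to sketch the derivation, assuming \(\mathbf{X}^\mathsf{T}\mathbf{X}\) is invertible: minimize the sum of squared residuals with respect to \(\boldsymbol{\beta}\).
\[ \begin{aligned} \boldsymbol{\epsilon}^\mathsf{T}\boldsymbol{\epsilon} &= (\mathbf{y} - \mathbf{X}\boldsymbol{\beta})^\mathsf{T}(\mathbf{y} - \mathbf{X}\boldsymbol{\beta}) = \mathbf{y}^\mathsf{T}\mathbf{y} - 2\boldsymbol{\beta}^\mathsf{T}\mathbf{X}^\mathsf{T}\mathbf{y} + \boldsymbol{\beta}^\mathsf{T}\mathbf{X}^\mathsf{T}\mathbf{X}\boldsymbol{\beta} \\ \frac{\partial\, \boldsymbol{\epsilon}^\mathsf{T}\boldsymbol{\epsilon}}{\partial \boldsymbol{\beta}} &= -2\mathbf{X}^\mathsf{T}\mathbf{y} + 2\mathbf{X}^\mathsf{T}\mathbf{X}\boldsymbol{\beta} = \mathbf{0} \\ \hat{\boldsymbol{\beta}} &= (\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}\mathbf{y} \end{aligned} \]
The second derivative, \(2\mathbf{X}^\mathsf{T}\mathbf{X}\), is positive definite when \(\mathbf{X}\) has full column rank, so this solution is a minimum.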
Expected value of \(\hat{\boldsymbol{\beta}}\)
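A sketch of the derivation, assuming \(\hat{\boldsymbol{\beta}} = (\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}\mathbf{y}\), \(\mathbf{X}\) fixed, and \(E(\boldsymbol{\epsilon}) = \mathbf{0}\):
\[ \begin{aligned} E(\hat{\boldsymbol{\beta}}) &= (\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}E(\mathbf{y}) \\ &= (\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}\mathbf{X}\boldsymbol{\beta} \\ &= \boldsymbol{\beta} \end{aligned} \]
So \(\hat{\boldsymbol{\beta}}\) is an unbiased estimator of \(\boldsymbol{\beta}\) under these assumptions.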
Variance of \(\hat{\boldsymbol{\beta}}\)
Find \(Var(\hat{\boldsymbol{\beta}})\) under the usual assumptions.
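A sketch of the derivation, assuming \(\hat{\boldsymbol{\beta}} = (\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}\mathbf{y}\), \(Var(\mathbf{y}) = \sigma^2_{\epsilon}\mathbf{I}\), and the rule \(Var(\mathbf{A}\mathbf{y}) = \mathbf{A}Var(\mathbf{y})\mathbf{A}^\mathsf{T}\) for a constant matrix \(\mathbf{A}\):
\[ \begin{aligned} Var(\hat{\boldsymbol{\beta}}) &= (\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}\, Var(\mathbf{y})\, \mathbf{X}(\mathbf{X}^\mathsf{T}\mathbf{X})^{-1} \\ &= \sigma^2_{\epsilon}(\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}\mathbf{X}(\mathbf{X}^\mathsf{T}\mathbf{X})^{-1} \\ &= \sigma^2_{\epsilon}(\mathbf{X}^\mathsf{T}\mathbf{X})^{-1} \end{aligned} \]
The first line uses the fact that \((\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}\) is symmetric.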
Assume \(Var(\boldsymbol{\epsilon}) = \mathbf{XV}\), such that \(\mathbf{V}\) has the appropriate dimensions. All other assumptions hold.
What are the dimensions of \(\mathbf{XV}\)?
Derive \(Var(\hat{\boldsymbol{\beta}})\). What are the dimensions of \(\mathbf{V}\)?
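A sketch of one possible approach, assuming \(\mathbf{X}\) is \(n \times (p+1)\) (the symbols \(n\) and \(p\) are assumptions for illustration) and \(\hat{\boldsymbol{\beta}} = (\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}\mathbf{y}\). Since \(Var(\boldsymbol{\epsilon})\) must be \(n \times n\), \(\mathbf{XV}\) is \(n \times n\) and \(\mathbf{V}\) is \((p+1) \times n\).
\[ \begin{aligned} Var(\hat{\boldsymbol{\beta}}) &= (\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}\, Var(\boldsymbol{\epsilon})\, \mathbf{X}(\mathbf{X}^\mathsf{T}\mathbf{X})^{-1} \\ &= (\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}\mathbf{X}\mathbf{V}\mathbf{X}(\mathbf{X}^\mathsf{T}\mathbf{X})^{-1} \\ &= \mathbf{V}\mathbf{X}(\mathbf{X}^\mathsf{T}\mathbf{X})^{-1} \end{aligned} \]
As a dimension check, \(\mathbf{V}\mathbf{X}\) is \((p+1) \times (p+1)\), so the result is \((p+1) \times (p+1)\), matching the length of \(\hat{\boldsymbol{\beta}}\).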
SSR
Show
\[ SSR = \mathbf{y}^\mathsf{T}\mathbf{y} - \hat{\boldsymbol{\beta}}^\mathsf{T}\mathbf{X}^\mathsf{T}\mathbf{y} \]
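A sketch of the argument, assuming SSR denotes the sum of squared residuals \(\mathbf{e}^\mathsf{T}\mathbf{e}\) with \(\mathbf{e} = \mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}}\), and using the normal equations \(\mathbf{X}^\mathsf{T}\mathbf{X}\hat{\boldsymbol{\beta}} = \mathbf{X}^\mathsf{T}\mathbf{y}\):
\[ \begin{aligned} SSR &= (\mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}})^\mathsf{T}(\mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}}) \\ &= \mathbf{y}^\mathsf{T}\mathbf{y} - 2\hat{\boldsymbol{\beta}}^\mathsf{T}\mathbf{X}^\mathsf{T}\mathbf{y} + \hat{\boldsymbol{\beta}}^\mathsf{T}\mathbf{X}^\mathsf{T}\mathbf{X}\hat{\boldsymbol{\beta}} \\ &= \mathbf{y}^\mathsf{T}\mathbf{y} - 2\hat{\boldsymbol{\beta}}^\mathsf{T}\mathbf{X}^\mathsf{T}\mathbf{y} + \hat{\boldsymbol{\beta}}^\mathsf{T}\mathbf{X}^\mathsf{T}\mathbf{y} \\ &= \mathbf{y}^\mathsf{T}\mathbf{y} - \hat{\boldsymbol{\beta}}^\mathsf{T}\mathbf{X}^\mathsf{T}\mathbf{y} \end{aligned} \]
The second line combines the two cross terms because \(\mathbf{y}^\mathsf{T}\mathbf{X}\hat{\boldsymbol{\beta}}\) is a scalar and therefore equals its transpose \(\hat{\boldsymbol{\beta}}^\mathsf{T}\mathbf{X}^\mathsf{T}\mathbf{y}\).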