CSE 599K
Empirical Research Methods

Winter 2025

Linear regression and t test
(two sides of the same coin)

Packages used

Tidyverse package

library(tidyverse)

Example data set

Create a data set with two groups

Two groups (grp-1 and grp-2), each with 1000 normally distributed data points (means of -0.1 and 0.1, respectively).

Grp is stored as a categorical (factor) variable to ease interpretation of the lm output.
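A minimal sketch of how such a data set could be built. The group names, sizes, and means come from the description above; the standard deviation (1 for both groups), the seed, and the column names `Grp` and `Value` (taken from the model formula later in these slides) are assumptions.

```r
library(tidyverse)

set.seed(1)  # hypothetical seed, only to make the sketch reproducible
n <- 1000    # data points per group

# Assumption: sd = 1 in both groups (the slides do not state it)
df <- tibble(
  Grp   = factor(rep(c("grp-1", "grp-2"), each = n)),
  Value = c(rnorm(n, mean = -0.1, sd = 1),
            rnorm(n, mean =  0.1, sd = 1))
)

# "grp-1" sorts first, so it becomes the reference level of the factor
levels(df$Grp)
```

Because factor levels are sorted alphabetically by default, grp-1 ends up as the reference level, which matters for how lm later encodes Grp.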

Difference in means
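One way the group means and their difference might be computed with dplyr. The data are regenerated here so the block runs on its own; the seed and sd = 1 are assumptions, so the exact numbers will differ from the original slides.

```r
library(tidyverse)

set.seed(1)  # hypothetical seed
n <- 1000
df <- tibble(
  Grp   = factor(rep(c("grp-1", "grp-2"), each = n)),
  Value = c(rnorm(n, mean = -0.1, sd = 1),   # sd = 1 is an assumption
            rnorm(n, mean =  0.1, sd = 1))
)

# Mean of Value within each group (rows are ordered by factor level)
means <- df %>%
  group_by(Grp) %>%
  summarize(Mean = mean(Value))

# Difference in sample means: grp-2 minus grp-1
diff_means <- means$Mean[2] - means$Mean[1]
diff_means
```

The sample difference should land near the true difference of 0.2, but not exactly on it, because of sampling noise.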

Plot the two distributions
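A possible ggplot2 sketch for this plot. Overlaid density curves are an assumption; the original slide may have used a different geometry (for example, histograms). Seed and sd = 1 are also assumptions.

```r
library(tidyverse)

set.seed(1)  # hypothetical seed
n <- 1000
df <- tibble(
  Grp   = factor(rep(c("grp-1", "grp-2"), each = n)),
  Value = c(rnorm(n, mean = -0.1, sd = 1),   # sd = 1 is an assumption
            rnorm(n, mean =  0.1, sd = 1))
)

# Overlaid, semi-transparent density curves, one per group
p <- ggplot(df, aes(x = Value, fill = Grp)) +
  geom_density(alpha = 0.5) +
  labs(x = "Value", y = "Density")

print(p)
```

With a true difference of 0.2 and a standard deviation of 1, the two curves overlap heavily, which is why a formal test is needed to judge the difference.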

Testing for significance

T test
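A sketch of the two-sample t test using base R's `t.test`. Note that `t.test` defaults to the Welch test (unequal variances); `var.equal = TRUE` is set here as an assumption so that the result lines up exactly with the linear model, which assumes a common variance. The data are rebuilt with a base-R data frame so the block runs on its own (seed and sd = 1 are assumptions).

```r
set.seed(1)  # hypothetical seed
n <- 1000
df <- data.frame(
  Grp   = factor(rep(c("grp-1", "grp-2"), each = n)),
  Value = c(rnorm(n, mean = -0.1, sd = 1),   # sd = 1 is an assumption
            rnorm(n, mean =  0.1, sd = 1))
)

# Student's two-sample t test (pooled variance); the default would be Welch
tt <- t.test(Value ~ Grp, data = df, var.equal = TRUE)
tt

tt$estimate   # the two group means
tt$statistic  # t value, computed as grp-1 minus grp-2
tt$p.value
```

With the formula interface, the statistic is based on the first level minus the second (grp-1 minus grp-2), so its sign is the opposite of the lm coefficient for Grp.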

Linear regression
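A sketch of the corresponding linear model, fit with base R's `lm`. The data are rebuilt so the block runs on its own; seed and sd = 1 are assumptions.

```r
set.seed(1)  # hypothetical seed
n <- 1000
df <- data.frame(
  Grp   = factor(rep(c("grp-1", "grp-2"), each = n)),
  Value = c(rnorm(n, mean = -0.1, sd = 1),   # sd = 1 is an assumption
            rnorm(n, mean =  0.1, sd = 1))
)

# Regress Value on the two-level factor Grp
fit <- lm(Value ~ Grp, data = df)
summary(fit)

# Two coefficients: "(Intercept)" and "Grpgrp-2"
coef(fit)
```

The coefficient name `Grpgrp-2` is the variable name followed by the non-reference level, which signals that grp-1 is the baseline.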

Interpretation: lm vs. t.test

Work out the math to convince yourself that the following is true.

The linear model uses a single, categorical predictor (Grp)

  • lm uses dummy encoding for the two levels of Grp:
    • 0 = grp-1
    • 1 = grp-2
  • The model being fit is: Value ~ b1 * Grp + b0, with:
    • b0 (intercept): mean of grp-1, and
    • b1 (Grp coefficient): difference in means (grp-2 minus grp-1).
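The claims above can be checked numerically. With Grp coded 0/1, the fitted value is b0 for grp-1 and b0 + b1 for grp-2, and least squares sets these to the two group means. The sketch below compares the lm output with the group means and with an equal-variance t test; seed, sd = 1, and `var.equal = TRUE` are assumptions of this sketch.

```r
set.seed(1)  # hypothetical seed
n <- 1000
df <- data.frame(
  Grp   = factor(rep(c("grp-1", "grp-2"), each = n)),
  Value = c(rnorm(n, mean = -0.1, sd = 1),   # sd = 1 is an assumption
            rnorm(n, mean =  0.1, sd = 1))
)

fit <- lm(Value ~ Grp, data = df)
tt  <- t.test(Value ~ Grp, data = df, var.equal = TRUE)

m <- tapply(df$Value, df$Grp, mean)  # group means
b <- unname(coef(fit))               # b[1] = b0, b[2] = b1

# b0 equals the mean of grp-1; b1 equals mean(grp-2) - mean(grp-1)
b0_ok <- isTRUE(all.equal(b[1], unname(m["grp-1"])))
b1_ok <- isTRUE(all.equal(b[2], unname(m["grp-2"] - m["grp-1"])))

# Same test: t values agree up to sign, p values agree
lm_row <- summary(fit)$coefficients["Grpgrp-2", ]
t_ok <- isTRUE(all.equal(unname(lm_row["t value"]), -unname(tt$statistic)))
p_ok <- isTRUE(all.equal(unname(lm_row["Pr(>|t|)"]), tt$p.value))

c(b0_ok = b0_ok, b1_ok = b1_ok, t_ok = t_ok, p_ok = p_ok)
```

The sign flip on the t value arises because `t.test` reports grp-1 minus grp-2, whereas b1 is grp-2 minus grp-1. With the default Welch test the t value would still match here (the groups are equal in size), but the degrees of freedom and hence the p value could differ slightly.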