CSEP 590
Building Data Analysis Pipelines

Fall 2024

Statistical Significance and Effect Size

Packages used

tidyverse and effsize

library(tidyverse)
library(effsize)

Example data set

Create a data set with two groups

Create a tibble with two groups (Treat and Ctrl), each with 5 data points that represent, say, the duration of a coding task.
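A minimal sketch of such a tibble follows. The name `t` matches the later `t$Ctrl` and `t$Treat` references; the durations (in minutes) are made-up values for illustration only, not the data from the lecture.

```r
library(tidyverse)

# Hypothetical task durations in minutes (illustrative values only).
t <- tibble(
  Ctrl  = c(12.1, 11.4, 13.0, 12.7, 11.9),
  Treat = c(10.2, 11.0, 9.8, 10.9, 10.5)
)

t
```

This is the wide format: one column per group, one row per observation index.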

Tidy up the data
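One way to tidy the data is `pivot_longer`, sketched below. The column names `Grp` and `Duration` follow the `Duration~Grp` formula used later; the data values and the name `long` are assumptions for illustration.

```r
library(tidyverse)

# Hypothetical wide data (illustrative values only).
t <- tibble(
  Ctrl  = c(12.1, 11.4, 13.0, 12.7, 11.9),
  Treat = c(10.2, 11.0, 9.8, 10.9, 10.5)
)

# Long format: one row per observation, with the group stored in a column.
long <- t %>%
  pivot_longer(cols = c(Ctrl, Treat), names_to = "Grp", values_to = "Duration")

long
```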

Point plot of the data
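A point plot of the long data could look like the sketch below. The data values and the axis label are assumptions; the original slide may have used different aesthetics.

```r
library(tidyverse)

# Hypothetical long data (illustrative values only).
long <- tibble(
  Ctrl  = c(12.1, 11.4, 13.0, 12.7, 11.9),
  Treat = c(10.2, 11.0, 9.8, 10.9, 10.5)
) %>%
  pivot_longer(cols = c(Ctrl, Treat), names_to = "Grp", values_to = "Duration")

# One point per observation, grouped on the x axis.
p <- ggplot(long, aes(x = Grp, y = Duration)) +
  geom_point() +
  labs(x = "Group", y = "Duration (minutes)")

p
```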

Testing for significance

Parametric T test

Formula syntax vs. passing individual vectors

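The formula syntax (Duration ~ Grp) on long data is equivalent to passing the two groups as separate vectors when the data is in wide format: t.test(t$Ctrl, t$Treat). The formula orders the groups by factor level (alphabetically by default), so Ctrl is the first sample in both calls.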
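The equivalence can be checked directly, as in this sketch. The data values are made up; `by_formula` and `by_vectors` are hypothetical names.

```r
library(tidyverse)

# Hypothetical wide data (illustrative values only).
t <- tibble(
  Ctrl  = c(12.1, 11.4, 13.0, 12.7, 11.9),
  Treat = c(10.2, 11.0, 9.8, 10.9, 10.5)
)
long <- t %>%
  pivot_longer(cols = c(Ctrl, Treat), names_to = "Grp", values_to = "Duration")

# Formula interface on long data.
by_formula <- t.test(Duration ~ Grp, data = long)

# Vector interface on wide data.
by_vectors <- t.test(t$Ctrl, t$Treat)

# Both calls run the same Welch two-sample t-test.
all.equal(by_formula$p.value, by_vectors$p.value)
```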

One-sided vs. two-sided tests

A two-sided test is the default: the null hypothesis is that the groups do not differ, and the alternative is a difference in either direction.

Set the alternative argument ("less" or "greater") for a one-sided test.
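A short sketch of both variants, using made-up values. With `alternative = "greater"`, the alternative hypothesis is that the mean of the first sample exceeds the mean of the second.

```r
# Hypothetical task durations in minutes (illustrative values only).
ctrl  <- c(12.1, 11.4, 13.0, 12.7, 11.9)
treat <- c(10.2, 11.0, 9.8, 10.9, 10.5)

# Default: two-sided alternative.
two_sided <- t.test(ctrl, treat)

# One-sided: is the mean of ctrl greater than the mean of treat?
one_sided <- t.test(ctrl, treat, alternative = "greater")

two_sided$p.value
one_sided$p.value
```

When the observed difference points in the direction of the alternative, as it does here, the one-sided p-value is half the two-sided one.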

Non-parametric U test

Compute the U test result “by hand”

Create example data

Tidy up the data (for plotting)

Point plot of the data

Expected result (wilcox.test, one-sided)
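As a reference point for the by-hand computation, the sketch below runs `wilcox.test` on a small made-up data set with three values per group and no ties. The values and variable names are assumptions, not the lecture's data.

```r
# Hypothetical example data, small enough to check by hand (no ties).
ctrl  <- c(13, 16, 18)
treat <- c(10, 12, 15)

# One-sided U test: do ctrl values tend to be greater than treat values?
expected <- wilcox.test(ctrl, treat, alternative = "greater")

expected$statistic  # W = 8
expected$p.value    # 0.1
```

For samples this small and without ties, `wilcox.test` computes an exact p-value rather than a normal approximation.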

All possible pairs with `expand`
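A sketch of pairing every Ctrl value with every Treat value using `tidyr::expand`. The data values and the name `all_pairs` are hypothetical.

```r
library(tidyverse)

# Hypothetical example data in wide format (no ties, no repeated values).
d <- tibble(
  Ctrl  = c(13, 16, 18),
  Treat = c(10, 12, 15)
)

# Every combination of a Ctrl value with a Treat value: 3 x 3 = 9 rows.
all_pairs <- d %>% expand(Ctrl, Treat)

all_pairs
```

Note that `expand` works on the unique values of each column, so this shortcut assumes no value is repeated within a group.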

Compute “wins”
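One way to score each pair is sketched below: a pair counts as a win (1) if the Ctrl value is larger, as half a win (0.5) on a tie, and as 0 otherwise. The column name `Win` and the data values are assumptions for illustration.

```r
library(tidyverse)

# Hypothetical example data and all Ctrl/Treat pairs.
d <- tibble(
  Ctrl  = c(13, 16, 18),
  Treat = c(10, 12, 15)
)
all_pairs <- d %>% expand(Ctrl, Treat)

# Score each pair from the perspective of Ctrl.
wins <- all_pairs %>%
  mutate(Win = case_when(
    Ctrl > Treat  ~ 1,
    Ctrl == Treat ~ 0.5,
    TRUE          ~ 0
  ))

wins
```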

Sum all “wins”
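Summing the per-pair scores gives the W statistic, which can be compared against `wilcox.test`. This sketch reuses the same made-up data; `w` is a hypothetical name.

```r
library(tidyverse)

# Hypothetical example data.
d <- tibble(
  Ctrl  = c(13, 16, 18),
  Treat = c(10, 12, 15)
)

# All pairs, scored from the perspective of Ctrl, then summed.
w <- d %>%
  expand(Ctrl, Treat) %>%
  mutate(Win = case_when(Ctrl > Treat ~ 1, Ctrl == Treat ~ 0.5, TRUE ~ 0)) %>%
  summarize(W = sum(Win)) %>%
  pull(W)

w  # 8

# The same statistic as reported by wilcox.test.
unname(wilcox.test(d$Ctrl, d$Treat, alternative = "greater")$statistic)
```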

Look up or compute the p-value

p-value: assuming the null hypothesis holds, what is the probability of observing the given outcome (W score), or a more extreme outcome?
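For small samples without ties, the exact p-value can be computed from the distribution of W with `pwilcox`. The sketch assumes an observed W of 8 with three values per group, which are illustrative numbers rather than the lecture's result.

```r
# Assumed observed statistic and group sizes (illustrative).
w       <- 8
n_ctrl  <- 3
n_treat <- 3

# One-sided p-value: P(W >= w) = P(W > w - 1) under the null hypothesis.
p_one_sided <- pwilcox(w - 1, n_ctrl, n_treat, lower.tail = FALSE)

p_one_sided  # 0.1
```

Subtracting 1 matters because W is discrete: `lower.tail = FALSE` gives a strict inequality, and the observed value itself must be included.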

Exercise: work out the math

How many possible ranking permutations are there in total?
How many ranking permutations have the same W score (or more extreme W score) as the observed ranking?
In a two-tailed test, consider extremes on both ends of the distribution.
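One way to check an answer to this exercise is to enumerate the rankings, as sketched below for an assumed case of three values per group, no ties, and an observed W of 8. Only which ranks fall into the first group matters, so the sketch counts the choose(6, 3) = 20 rank assignments instead of all 6! orderings (each assignment covers 3! * 3! = 36 orderings).

```r
# Assumed group sizes and observed statistic (illustrative).
n1 <- 3
n2 <- 3
w_observed <- 8

# For every way to assign n1 of the ranks 1..(n1 + n2) to the first group,
# W is the rank sum minus its smallest possible value n1 * (n1 + 1) / 2.
w_all <- combn(n1 + n2, n1, FUN = function(r) sum(r) - n1 * (n1 + 1) / 2)

total <- length(w_all)              # 20 equally likely assignments
upper <- sum(w_all >= w_observed)   # as extreme or more, upper tail: 2

p_one_sided <- upper / total        # 0.1

# Two-tailed: add the mirror-image extreme at the lower end.
lower <- sum(w_all <= n1 * n2 - w_observed)
p_two_sided <- (upper + lower) / total  # 0.2
```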

A12 effect size

What is the A12 effect size?

How do we compute A12?
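A12 (the Vargha-Delaney A measure) estimates the probability that a randomly chosen value from the first group is larger than a randomly chosen value from the second, with ties counted as one half; a value of 0.5 means no effect. It can be obtained from the W statistic as W / (n1 * n2), or with `VD.A` from the effsize package. The sketch below uses made-up data and hypothetical variable names.

```r
library(effsize)

# Hypothetical example data (no ties).
ctrl  <- c(13, 16, 18)
treat <- c(10, 12, 15)

# By hand: W from the one-sided U test, divided by the number of pairs.
w <- unname(wilcox.test(ctrl, treat, alternative = "greater")$statistic)
a12_by_hand <- w / (length(ctrl) * length(treat))

# With effsize: VD.A takes the two samples as numeric vectors.
a12 <- VD.A(ctrl, treat)

a12_by_hand   # 8 / 9
a12$estimate  # same value
```

`VD.A` also reports a magnitude label, which gives a conventional reading of how far the estimate is from 0.5.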
