Degrees of Freedom

The number of values in a question that are free to vary independently.


Example: Choosing Hats

You have 4 hats (blue, gold, red and green) and want to wear a different one every day.

  • On the first day you can choose any hat
  • On the 2nd day you have 3 choices left
  • On the 3rd day you have 2 choices left
  • On the 4th day you have only 1 hat left, so no choice at all really

In fact, your "degrees of freedom" turned out to be only 3: by the 4th day you had no freedom to choose.

So, depending on the situation, the degrees of freedom can be less (but never more) than the number of items you are dealing with:

df = n − r

In the hats example, n is the number of hats (4) and r is the number of restrictions (1: each hat can be worn only once, so the last day's hat is forced), so df = 4 − 1 = 3.
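The hat count can be checked with a short Python sketch. The function name free_choices is made up for this illustration; it simply counts the days on which more than one hat is still available.

```python
def free_choices(n_items):
    """Count the days on which more than one option remains (hypothetical helper)."""
    remaining = n_items
    free = 0
    for _day in range(n_items):
        if remaining > 1:
            free += 1      # a real choice was made today
        remaining -= 1     # one hat is used up each day
    return free

hats = ["blue", "gold", "red", "green"]
df = free_choices(len(hats))
print(df)  # 3
```

With 4 hats there are real choices on 3 days, which matches df = 4 − 1.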

Why Do Degrees of Freedom Matter?

Degrees of freedom tell us how much independent information we really have.

Example: The Mean Uses One Degree of Freedom

Suppose we have 4 numbers with a mean of 10.

If we choose the first three numbers freely, the fourth number is already decided.

Why? Because the total must be 40:

Mean = 10, so the Total = 4 × 10 = 40

If the first three add up to 32, the last one must be 8.

So only 3 values were free to vary.

df = 4 − 1 = 3
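As a rough sketch in Python, the forced value can be computed from the mean and the freely chosen values. The function name last_value and the three sample numbers (12, 9 and 11, which add up to 32) are assumptions made for this illustration.

```python
def last_value(free_values, mean, n):
    """Return the one remaining value that makes n values average to `mean`."""
    total = mean * n                  # the total is fixed by the mean
    return total - sum(free_values)   # whatever is left over is forced

# Three values chosen freely (they add up to 32), with a mean of 10 over 4 values
forced = last_value([12, 9, 11], mean=10, n=4)
print(forced)  # 8
```

Changing any of the three free values changes the forced one, but there is never a second option for it.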

That's why sample variance uses n − 1 instead of n: once we calculate the mean, one value is no longer free.
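Here is a small Python sketch of that idea, using made-up data with a mean of 10. It shows that the deviations from the mean always add up to zero (the one restriction), and that dividing by n − 1 agrees with statistics.variance from Python's standard library, while dividing by n agrees with statistics.pvariance.

```python
import statistics

data = [12.0, 9.0, 11.0, 8.0]   # illustrative values, mean is 10
n = len(data)
mean = sum(data) / n

deviations = [x - mean for x in data]
# The restriction: deviations from the mean always sum to zero,
# so knowing any n - 1 of them fixes the last one.
sum_of_deviations = sum(deviations)

sum_of_squares = sum(d ** 2 for d in deviations)
sample_variance = sum_of_squares / (n - 1)   # divide by df = n - 1
population_variance = sum_of_squares / n     # divide by n

print(sample_variance, statistics.variance(data))
print(population_variance, statistics.pvariance(data))
```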

In general, every time we estimate something from the data (like a mean), we lose one degree of freedom.

Each estimated parameter adds a restriction.

Here are some dfs by topic:

Topic                          Formula                        Restrictions
Sample Variance                df = n − 1                     r = 1: the mean
Independent Student's t-test   df = n1 + n2 − 2               r = 2: two separate means
Paired Student's t-test        df = n − 1                     r = 1: one (overall) mean
Chi Square Test                df = (rows − 1) × (cols − 1)
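The formulas listed above can be written as small Python helper functions. The function names and the example sample sizes are assumptions chosen for this sketch, not part of any library.

```python
def df_sample_variance(n):
    """One restriction: the mean."""
    return n - 1

def df_independent_t(n1, n2):
    """Two restrictions: one mean for each group."""
    return n1 + n2 - 2

def df_paired_t(n):
    """One restriction: one (overall) mean; n is the number of pairs."""
    return n - 1

def df_chi_square(rows, cols):
    """Rows and columns each lose one degree of freedom."""
    return (rows - 1) * (cols - 1)

print(df_sample_variance(4))     # 3
print(df_independent_t(10, 12))  # 20
print(df_paired_t(15))           # 14
print(df_chi_square(3, 4))       # 6
```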