Degrees of Freedom

The number of values in a calculation that are free to vary independently.

You have 4 hats (blue, gold, red and green) and want to wear a different one every day.

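On day 1 you can choose any of the 4 hats, on day 2 any of the remaining 3, and on day 3 either of the remaining 2. On day 4 only one hat is left.

In fact your "degrees of freedom" turned out to be only 3: by the 4th day you had no freedom to choose.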

So, depending on the situation, the degrees of freedom can be less (but never more) than the number of items you are dealing with:

df = n − r

In the hats example, n is the number of hats (4) and r is the number of restrictions (1, because having one less choice each day leaves the last day with no choice at all), so df = 4 − 1 = 3.
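As a rough illustration, the hat example can be sketched in Python by counting the days on which there is still a real choice to make. The function name free_choices is made up for this sketch and is not part of any library.

```python
def free_choices(hats):
    """Count the days on which more than one hat is still available."""
    remaining = list(hats)
    free_days = 0
    while remaining:
        if len(remaining) > 1:
            free_days += 1  # there is a genuine choice today
        remaining.pop(0)  # wear one hat; which one does not change the count
    return free_days


hats = ["blue", "gold", "red", "green"]
df = free_choices(hats)
print(df)  # 3, the same as n - r = 4 - 1
```

The last day never adds to the count, because the only hat left is forced on you.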

Why Do Degrees of Freedom Matter?

Degrees of freedom tell us how much independent information we really have.

Suppose we have 4 numbers with a mean of 10.

If we choose the first three numbers freely, the fourth number is already decided.

Why? Because the total must be 40:

Mean = 10, so the Total = 4 × 10 = 40

If the first three add up to 32, the last one must be 8.

So only 3 values were free to vary.

df = 4 − 1 = 3
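The same reasoning can be checked with a small Python sketch. The helper last_value and the three chosen numbers are assumptions made up for illustration; any three numbers that add up to 32 would do.

```python
def last_value(free_values, n, mean):
    """Return the one value that is forced once n - 1 values and the mean are known."""
    if len(free_values) != n - 1:
        raise ValueError("exactly n - 1 values can be chosen freely")
    total = n * mean  # the total is fixed by the mean
    return total - sum(free_values)


chosen = [12, 9, 11]  # hypothetical free choices, adding up to 32
forced = last_value(chosen, n=4, mean=10)
print(forced)  # 8
```

Whatever the first three values are, the fourth is decided by the total, so it carries no new information.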

That's why sample variance uses n − 1 instead of n: once we calculate the mean, one value is no longer free.
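Here is a minimal sketch of that idea in Python, using a made-up sample of 4 numbers with a mean of 10. It computes the variance by hand with both divisors and compares the results with the standard library's statistics.variance (which divides by n − 1) and statistics.pvariance (which divides by n).

```python
import statistics

data = [12, 9, 11, 8]  # hypothetical sample with mean 10
n = len(data)
mean = sum(data) / n

# Deviations from the sample mean always add up to zero,
# so once n - 1 of them are known the last one is fixed.
deviations = [x - mean for x in data]
sum_of_squares = sum(d * d for d in deviations)

sample_variance = sum_of_squares / (n - 1)  # divides by df = n - 1
population_variance = sum_of_squares / n    # divides by n

print(sum(deviations))                 # 0.0
print(sample_variance)                 # about 3.33
print(statistics.variance(data))       # same value, also uses n - 1
print(population_variance)             # 2.5
print(statistics.pvariance(data))      # same value, uses n
```

Dividing by n − 1 gives a slightly larger result than dividing by n, which makes up for the degree of freedom spent on estimating the mean.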

In general, every time we estimate something from the data (like a mean), we lose one degree of freedom.

Each estimated parameter adds a restriction.

Here are some dfs by topic:

Topic	Formula	Restrictions
Sample Variance	df = n − 1	r = 1: the mean
Independent Student's t-test	df = n₁ + n₂ − 2	r = 2: two separate means
Paired Student's t-test	df = n − 1	r = 1: one mean (of the paired differences), where n is the number of pairs
Chi-Square Test	df = (rows − 1) × (cols − 1)	the row totals and column totals are fixed
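The formulas in the table can be written as small Python functions. This is only a sketch: the function names are invented here, the t-test formula is the one for the ordinary (pooled) independent test as given in the table, and the chi-square formula applies to a table of rows and columns.

```python
def df_sample_variance(n):
    """n values, 1 restriction (the mean)."""
    return n - 1


def df_independent_t(n1, n2):
    """Two groups of sizes n1 and n2, 2 restrictions (one mean per group)."""
    return n1 + n2 - 2


def df_paired_t(n_pairs):
    """n pairs, 1 restriction (the mean of the differences)."""
    return n_pairs - 1


def df_chi_square(rows, cols):
    """A table with the given number of rows and columns."""
    return (rows - 1) * (cols - 1)


print(df_sample_variance(4))     # 3
print(df_independent_t(12, 15))  # 25
print(df_paired_t(10))           # 9
print(df_chi_square(3, 4))       # 6
```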