Let’s use this section to learn a math concept.
We begin with a question:
You drive to the store and back. The store is 50 miles away. You drive 50 mph to the store and 100 mph coming back. What’s your average speed in MPH for the trip?
[Space to think about the problem]
[If you think the answer is 75 there are 2 problems worth pointing out. One of them is you have the wrong answer.]
[The other is that 75 is the obvious gut response, but since I’m asking this question, you should know that’s not the answer. If it’s not the answer that should clue you in to think harder about the question.]
[You’re trying harder, right?]
[Ok, let’s get on with this]
The answer is 66.67 MPH.
If you drive 50 MPH to a store 50 miles away, then it took 60 minutes to go one way.
If you drive 100 MPH on the way back you will return home in half the time or 30 minutes.
You drove 100 miles in 1.5 hours: 100 / 1.5 = 66.67 MPH.
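If you want to check the arithmetic yourself, here is a minimal Python sketch of the same reasoning: total distance divided by total time. The variable names are my own, chosen just for illustration.

```python
# Each leg of the trip as (distance in miles, speed in MPH).
legs = [(50, 50), (50, 100)]

total_miles = sum(distance for distance, _ in legs)
# Time for each leg is distance / speed: 1 hour out, 0.5 hours back.
total_hours = sum(distance / speed for distance, speed in legs)

average_mph = total_miles / total_hours
print(round(average_mph, 2))  # 66.67
```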
Congratulations, you are on the way to learning about another type of average or mean.
You likely already know about 2 of the other so-called Pythagorean means.
- Arithmetic mean
Simple average. Used when trying to find a measure of central tendency in a set of values that are added together.
- Geometric mean
The geometric mean or geometric average is a measure of central tendency for a set of values that are multiplied together. One of the most common examples is compounding. Returns and growth rates are just fractions multiplied together. So if you have 10% growth then 25% growth you compute:
1 x 1.10 x 1.25 = 1.375
If you computed the arithmetic mean of the growth rates you’d get 17.5% (the average of 10% and 25%).
The geometric mean however answers the question “what is the average growth rate I would need to multiply each period by to arrive at the final return of 1.375?”
In this case, there are 2 periods.
To solve, we undo the multiplication by taking the nth root of the total, where n is the number of periods: 1.375^(1/2) - 1 = 17.26%
We can check that 17.26% is in fact the CAGR, or compound annual growth rate (treating each period as a year):
1 x 1.1726 x 1.1726 = 1.375
Have a cigar.
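The same calculation as a short Python sketch, assuming Python 3.8 or later for `math.prod`. It multiplies the growth factors, takes the root, and compares the result against the arithmetic mean of the two rates. The names are illustrative.

```python
import math

# Growth of 10% then 25%, expressed as multipliers.
growth_factors = [1.10, 1.25]

total_return = math.prod(growth_factors)  # about 1.375
periods = len(growth_factors)

# Geometric mean: the single multiplier that, applied every period,
# lands on the same total return.
geometric_factor = total_return ** (1 / periods)
arithmetic_factor = sum(growth_factors) / periods

print(f"geometric:  {geometric_factor - 1:.2%}")   # 17.26%
print(f"arithmetic: {arithmetic_factor - 1:.2%}")  # 17.50%
```

The standard library's `statistics.geometric_mean` (also Python 3.8+) should give the same multiplier if you'd rather not take the root by hand.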
The question about speed at the beginning of the post actually calls for using a 3rd type of mean:
The harmonic mean
The harmonic mean is computed by taking the average of the reciprocals of the values, then taking the reciprocal of that number to return to the original units.
That’s wordy. Better to demonstrate the 2 steps:
- “Take the average of the reciprocals”
Instead of averaging MPH, let’s average hours per mile then convert back to MPH at the end:
50 MPH = “it takes 1/50 of an hour to go a mile” = 1/50 HPM
100 MPH = “it takes 1/100 of an hour to go a mile” = 1/100 HPM
The average of 1/50 HPM and 1/100 HPM = (2/100 + 1/100) / 2 = 1.5/100 HPM
- “Take the reciprocal of that number to return to the original units”
Flip 1.5/100 HPM to 100/1.5 MPH. Voila, 66.67 MPH
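Here are the 2 steps as a rough Python sketch, checked against the standard library's `statistics.harmonic_mean`. The variable names are just for illustration.

```python
from statistics import harmonic_mean

speeds_mph = [50, 100]

# Step 1: take the average of the reciprocals (hours per mile).
hours_per_mile = [1 / mph for mph in speeds_mph]
average_hpm = sum(hours_per_mile) / len(hours_per_mile)  # about 0.015

# Step 2: take the reciprocal to get back to MPH.
average_mph = 1 / average_hpm

print(round(average_mph, 2))                # 66.67
print(round(harmonic_mean(speeds_mph), 2))  # 66.67
```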
Ok, right now you are thinking “Wtf, why is there a mean that deals with reciprocals in the first place?”
If you think about it, all means are computed with numbers that are fractions. You just assume the denominator of the numbers you are averaging is 1. That is fine when each number’s weight in the final average is equal, but that’s not the case with an MPH problem. You are spending 2x as much time at the lower speed as at the higher speed! This pulls the average speed over the whole trip towards the lower speed. So you get a true average speed of 66.67, not the 75 that your gut gave you.
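One way to see the weighting at work is to take an ordinary average of the two speeds, but weight each speed by the hours spent at it (1 hour at 50 MPH, 0.5 hours at 100 MPH, from the answer worked out earlier). This is only a sketch with made-up variable names, not a general-purpose function.

```python
# Hours spent at each speed on the trip:
# 50 miles at 50 MPH takes 1 hour; 50 miles at 100 MPH takes 0.5 hours.
hours_at_speed = {50: 1.0, 100: 0.5}

speeds = list(hours_at_speed)
total_hours = sum(hours_at_speed.values())

# Gut answer: treat both speeds as equally important.
naive_mph = sum(speeds) / len(speeds)

# Weight each speed by the time actually spent driving at it.
time_weighted_mph = (
    sum(mph * hours for mph, hours in hours_at_speed.items()) / total_hours
)

print(naive_mph)                    # 75.0
print(round(time_weighted_mph, 2))  # 66.67
```

The time-weighted arithmetic mean and the harmonic mean agree here because both legs cover the same distance.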
I want to pause here because you are probably a bit annoyed about this discovery. Don’t be. You have already won half the battle by realizing there is this other type of mean with the weird name “harmonic”.
The other half of the battle is knowing when to apply it. This is trickier. It depends on whether you care about the numerator or the denominator of the number. And since every number has a numerator and a denominator, it feels like you might always want to ask if you should be using the harmonic mean.
I’ll give you a hint that will cover most practical cases. If you are presented with a whole number that is a multiple, but the thing you actually care about is a yield or rate, then you should use the harmonic mean. That means you convert to the yield or rate first, find the arithmetic average (which is muscle memory for you already), and then convert back to the original units.
- When you compute the average speed for an entire trip, you actually want to average hours per mile (a rate) and then convert back to mph, rather than averaging the mph figures (the multiple) directly. Again, this is because your periods of time at each speed are not equal.
- You can’t average P/E ratios when trying to get the average P/E for an entire portfolio. Why? Because the contribution of high P/E stocks to the average of the entire portfolio P/E is lower than for lower P/E stocks. If you average P/Es, you will systematically overestimate the portfolio’s total P/E! You need to do the math in earnings yield space (ie E/P). @econompic wrote a great post about this and it’s why I went down the harmonic mean rabbit hole in the first place:
The Case for the Harmonic Mean P/E Calculation (3 min read)
- Consider this example of when MPG is misleading and you actually want to think of GPM. From Percents Are Tricky:
Which saves more fuel?
1. Swapping a 25 mpg car for one that gets 60 mpg
2. Swapping a 10 mpg car for one that gets 20 mpg
You know it’s a trap, so the answer must be #2. Here’s why:
If you travel 1,000 miles:
1. A 25mpg car uses 40 gallons. The 60 mpg vehicle uses 16.7 gallons.
2. A 10 mpg car uses 100 gallons. The 20 mpg vehicle uses 50 gallons.
Even though you improved the MPG efficiency of car #1 by more than 100%, we save much more fuel by replacing less efficient cars. Go for the low-hanging fruit. The illusion suggests we should switch ratings from MPG to GPM or, to avoid decimals, Gallons Per 1,000 Miles.
- The Tom Brady “deflategate” controversy also created statistical illusions depending on which rate was used. You want to spot anomalies by looking at fumbles per play, not plays per fumble.
Why Those Statistics About The Patriots’ Fumbles Are Mostly Junk (14 min read)
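Going back to the fuel question in the list above, a quick Python sketch makes the MPG illusion concrete. It uses the same 1,000-mile trip and the same four MPG figures. The helper name `gallons_used` is my own.

```python
def gallons_used(miles, mpg):
    """Fuel burned over a trip: miles divided by miles per gallon."""
    return miles / mpg

trip_miles = 1_000

# Swap 1: 25 mpg -> 60 mpg. Swap 2: 10 mpg -> 20 mpg.
saved_swap_1 = gallons_used(trip_miles, 25) - gallons_used(trip_miles, 60)
saved_swap_2 = gallons_used(trip_miles, 10) - gallons_used(trip_miles, 20)

print(round(saved_swap_1, 1))  # 23.3 gallons saved
print(round(saved_swap_2, 1))  # 50.0 gallons saved
```

Swap 2 saves more than twice the fuel even though its MPG improvement looks smaller.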
The most important takeaway is that whenever you are trying to average a rate, yield, or multiple, consider both:
a) taking the average of the numbers you are presented with
b) doing the same computation with their reciprocals then flipping it back to the original units. That’s all it takes to compute both the arithmetic mean and the harmonic mean.
If you draw the same conclusions about the variable you care about, you’re in the clear.
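To try the a) and b) comparison on the P/E case, here is a small Python sketch. The two P/E ratios and the equal $100 positions are hypothetical numbers I made up for illustration; they are not from the posts linked above.

```python
from statistics import harmonic_mean, mean

# Hypothetical portfolio: equal dollar amounts in two stocks.
pe_ratios = [10, 40]      # made-up P/E multiples
dollars_per_stock = 100   # made-up position size

# a) Average the numbers as presented.
arithmetic_pe = mean(pe_ratios)

# b) Average the reciprocals (earnings yields), then flip back.
harmonic_pe = harmonic_mean(pe_ratios)

# Ground truth: total price paid divided by total earnings bought.
total_price = dollars_per_stock * len(pe_ratios)
total_earnings = sum(dollars_per_stock / pe for pe in pe_ratios)
portfolio_pe = total_price / total_earnings

print(arithmetic_pe)           # 25
print(round(harmonic_pe, 2))   # 16.0
print(round(portfolio_pe, 2))  # 16.0
```

Under these assumptions the two means disagree badly (25 vs. 16), and the harmonic mean is the one that matches the portfolio's actual price-to-earnings. With unequal position sizes you would need a weighted version, which this sketch does not cover.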
Just knowing about harmonic means will put you on guard against making poor inferences from data.
For a more comprehensive but still accessible discussion of harmonic means see:
On Average, You’re Using the Wrong Average: Geometric & Harmonic Means in Data Analysis: When the Mean Doesn’t Mean What You Think it Means (20 min read)
This post is so good that I’m not sure if I should have just linked to it and not bothered writing my own. You tell me if I was additive.