Notes on How Not to Be Wrong: The Power of Mathematical Thinking

How Not to Be Wrong: The Power of Mathematical Thinking
by Jordan Ellenberg


  • Math gives you tools that extend your common sense about reasoning and logic. Ellenberg uses the analogy of Iron Man’s suit.
  • Competitions, including war, are often decided by small edges. Being 5% better at x or y can decide the outcome over a long enough game. (Similar to my experience with board games: a more efficient engine in a game of 7 Wonders or Settlers saves you actions, like reducing the cost of capturing the winning victory points.)
  • Zoom in on a point of a curve and it looks like, and can be approximated by, a straight line. Trajectories are curves shaped by gravity, yet over short stretches objects appear to move in straight lines. Using lines to approximate curves is the basis of calculus and even of Archimedes’ derivation of the area of a circle (and hence pi). He did it by inscribing polygons in the circle: imagine a hexagon, then an octagon, then a polygon with 64 sides, and so on until it looks like a circle. Trigonometry then gives the areas of the triangles you keep creating, and their sum approximates the area of the circle. (A minimal coded sketch of the polygon idea follows this list.)
    • Skeptics like Zeno highlighted the uncomfortable paradox lurking in infinite subdivision: to cross a room you must first cross half of it, then half of the remainder, and so on, seemingly never arriving. (Modern analysis dissolves the worry: 1/2 + 1/4 + 1/8 + … sums to exactly 1.) Comically, the Cynic Diogenes countered Zeno by simply walking across the room to make the point that motion is indeed possible!
  • The law of large numbers explains why South Dakota can have the highest rate of brain cancer and North Dakota the lowest: both have small populations. Be careful when comparing quantities across two very different sample sizes; small samples are more volatile. If you flip 10 coins, getting 8 heads is unlikely but entirely plausible. Flip 1,000 coins and getting 800 heads is nearly impossible. (See the coin-flip calculation after this list.)
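
A minimal coded sketch of the polygon idea (my own construction, tracking perimeter rather than the book’s triangle areas, since the doubling step then needs nothing beyond the Pythagorean theorem): start from a hexagon inscribed in a unit circle and keep doubling the number of sides; half the perimeter creeps toward pi.

```python
import math

# A regular hexagon inscribed in a unit circle has 6 sides of length 1.
n, side = 6, 1.0
for _ in range(12):
    # Side length of the inscribed 2n-gon, from the n-gon's side length.
    # Derived with the Pythagorean theorem alone: no trig tables, no pi.
    side = math.sqrt(2 - math.sqrt(4 - side * side))
    n *= 2
    # The perimeter n * side approaches the circumference 2 * pi,
    # so n * side / 2 approaches pi itself.
    print(n, n * side / 2)
```

By the twelfth doubling (a 24,576-gon) the estimate reads 3.141592..., and the polygon is visually indistinguishable from the circle.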
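
And the coin-flip claim can be checked exactly with the binomial distribution rather than by simulation; a short sketch:

```python
from math import ceil, comb

def p_at_least(n, frac=0.8):
    # Exact probability of at least frac * n heads in n fair coin flips.
    k_min = ceil(frac * n)
    return sum(comb(n, k) for k in range(k_min, n + 1)) / 2 ** n

print(p_at_least(10))    # ~0.055: unlikely, but it will happen regularly
print(p_at_least(1000))  # ~1e-84: never in the lifetime of the universe
```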

Data mining 

“The more chances you give yourself to be surprised, the higher your threshold for surprise had better be.”

“A significance test is a scientific instrument, and like any other instrument, it has a certain degree of precision. If you make the test more sensitive, by increasing the size of the study’s population, for example, you enable yourself to see ever-smaller effects. That’s the power of the method, but also the danger.”

  • An underpowered study has the opposite problem: you dismiss an effect that your method was too weak to see. A good example is the original 1985 hot-hand study. It rejected the idea of a hot hand, but it turns out the methods used would have rejected a hot hand even on data sets generated by simulations that deliberately baked one in! In fact, those methods failed to detect even the effect of good vs. bad defenses, which we know influences offensive shooting percentages. (A toy simulation of the power problem follows below.)
    • The final verdict: there may be some hot-hand effect, but it is too difficult to detect because, if it exists, it is very small. In fact, players who think they are hot take harder shots and perform worse, so it is best for them not to believe in the effect, since any benefit will be more than offset by unjustifiably confident shot selection.
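
A toy simulation of the power problem (my own sketch, not the 1985 study’s actual procedure): bake a genuine +5-point hot hand into simulated shooting data, then measure the natural statistic P(make | previous make) - P(make | previous miss) on samples of different sizes.

```python
import random

random.seed(7)

def simulate_shots(n, p_base=0.5, hot_bonus=0.05):
    # A shooter whose make probability rises after a made shot:
    # a hot hand is deliberately baked into the data.
    shots, p = [], p_base
    for _ in range(n):
        make = random.random() < p
        shots.append(make)
        p = p_base + hot_bonus if make else p_base
    return shots

def hot_hand_gap(shots):
    # P(make | previous make) - P(make | previous miss); true value is 0.05.
    after_make = [b for a, b in zip(shots, shots[1:]) if a]
    after_miss = [b for a, b in zip(shots, shots[1:]) if not a]
    if not after_make or not after_miss:
        return 0.0
    return sum(after_make) / len(after_make) - sum(after_miss) / len(after_miss)

def gap_spread(n_shots, trials=1000):
    # Mean and standard deviation of the estimated gap across many seasons.
    gaps = [hot_hand_gap(simulate_shots(n_shots)) for _ in range(trials)]
    mean = sum(gaps) / trials
    sd = (sum((g - mean) ** 2 for g in gaps) / trials) ** 0.5
    return round(mean, 3), round(sd, 3)

print(gap_spread(100))    # roughly (0.04, 0.10): the signal drowns in noise
print(gap_spread(5000))   # roughly (0.05, 0.01): now the effect is plainly visible
```

At 100 shots the noise is about twice the size of the true effect, so a test at that scale will routinely “find” nothing; that is how a study can reject an effect its instrument was too weak to see.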

The Bayesian examples in the book are great.

  • In a Bayesian framework, how much you believe something after you see the evidence depends not just on what the evidence shows but on how much you believed it to begin with. Posterior probabilities still depend on the strength of your priors. (A minimal sketch of the updating rule follows this list.)
  • On conspiracy theories: “If you do happen to find yourself partially believing a crazy theory, don’t worry — probably the evidence you encounter will be inconsistent with it, driving down your degree of belief in the craziness until your beliefs come in line with everyone else’s. Unless, that is, the crazy theory is designed to survive the winnowing process. That’s how conspiracy theories work”.
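
A minimal sketch of the updating rule (the numbers are mine, purely illustrative): the same evidence, four times likelier if the hypothesis is true than if it is false, leaves a skeptic and an agnostic in very different places.

```python
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    # Bayes' rule: P(H | E) = P(E | H) * P(H) / P(E).
    numer = prior * p_evidence_if_true
    denom = numer + (1 - prior) * p_evidence_if_false
    return numer / denom

# Same evidence (0.8 vs. 0.2 likelihoods), three different priors.
for prior in (0.01, 0.25, 0.50):
    print(prior, round(posterior(prior, 0.8, 0.2), 3))
# 0.01 -> 0.039, 0.25 -> 0.571, 0.50 -> 0.800
```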

Tradeoffs and the cost of perfection

  • Stigler-type arguments that optimal decisions often leave a margin for error. Getting to the airport early enough to have a 100% chance of making the flight is so conservative it is wasteful (where the line falls depends on your utility curve, but being 100% certain rather than, say, 95% is almost certainly wasteful). When you read a story about Social Security overpaying people because they were actually dead, it turns out that mistake represents less than one basis point of payments. In other words, the agency already does a great job avoiding this mistake, and driving errors all the way to zero may simply not be cost-effective. (A toy version of the airport tradeoff is sketched below.)
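
A toy version of the airport tradeoff, with invented numbers (normally distributed travel time, a per-minute cost of waiting, a fixed cost for missing the flight): the cost-minimizing buffer deliberately accepts a small but nonzero chance of missing the plane.

```python
import math

def miss_prob(buffer_min, mean=60.0, sd=15.0):
    # P(travel time > buffer), with travel time ~ Normal(mean, sd).
    z = (buffer_min - mean) / sd
    return 1 - 0.5 * (1 + math.erf(z / math.sqrt(2)))

def expected_cost(buffer_min, wait_cost=0.5, miss_cost=300.0):
    # Rough model: minutes early beyond the mean travel time cost
    # wait_cost each, plus the expected cost of a missed flight.
    wait = max(buffer_min - 60.0, 0.0)
    return wait * wait_cost + miss_prob(buffer_min) * miss_cost

best = min(range(60, 181), key=expected_cost)
print(best, round(miss_prob(best), 4))  # ~95 minutes, ~1% chance of missing
```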

The St. Petersburg paradox and the role of expected utility

  • Fran Lebowitz’s utility curve for money: each month she would drive a cab until she had earned enough to eat and pay rent; after that, she would write. In other words, she had a linear utility curve that flattened abruptly. If you raise her taxes she works more, as opposed to someone on a logarithmic curve who is at the point of indifference between work and leisure. (A back-of-envelope version follows this list.)
  • The Ellsberg paradox highlights the limitations of utility theory. It points to the difference between what Rumsfeld called “known unknowns,” which mathematics treats as risk, and “unknown unknowns,” or genuine uncertainty. Utility theory handles risk well, but formal math is far less useful in the face of uncertainty.
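
A back-of-envelope version of the Lebowitz point, with invented figures: under hit-the-threshold utility, a higher tax rate means more hours behind the wheel, not fewer.

```python
# Threshold utility: drive a cab until fixed monthly needs are covered.
wage, monthly_needs = 25.0, 2000.0  # dollars/hour, dollars (made-up figures)

for tax in (0.0, 0.2, 0.4):
    hours = monthly_needs / (wage * (1 - tax))
    print(f"tax {tax:.0%}: {hours:.0f} hours of cab driving")
# tax 0%: 80 hours; tax 20%: 100 hours; tax 40%: 133 hours
```

Someone on a diminishing, logarithmic curve who is already indifferent at the margin may respond to the same tax hike by working less instead.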

Regression to the mean explains many phenomena that are usually attributed to other causes.

  • Examples: the decline of best-performing companies (usually attributed to competition), the musician’s or writer’s sophomore slump, running backs after signing a big contract, dietary fiber seeming to speed or slow digestion, the Scared Straight juvenile-detention program, and diet effects measured when people are at their peak weights. When something is at an extreme we should expect reversion simply because of the math, and therefore be very careful about attributing the change to an intervention. (See the simulation below.)
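
A quick simulation of the pure-math claim, assuming skill and luck contribute equally (an illustrative assumption, not a measured one): select this year’s top decile and their average falls roughly in half next year, with no intervention at all.

```python
import random

random.seed(0)
N = 10_000
skill = [random.gauss(0, 1) for _ in range(N)]
year1 = [s + random.gauss(0, 1) for s in skill]   # performance = skill + luck
year2 = [s + random.gauss(0, 1) for s in skill]   # same skill, fresh luck

cutoff = sorted(year1, reverse=True)[N // 10]     # top-decile threshold
top = [i for i in range(N) if year1[i] >= cutoff]

mean = lambda xs: sum(xs) / len(xs)
print(round(mean([year1[i] for i in top]), 2))  # ~2.5: extreme, half of it luck
print(round(mean([year2[i] for i in top]), 2))  # ~1.25: the luck half evaporates
```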

Correlation between variables reduces the information content of each variable.

  • If you try to identify criminals by foot size and hand size, you are choosing highly correlated variables; the second measurement adds little information the first didn’t already carry.
  • Strong correlations lie behind how we compress image and music files: a green pixel is probably next to another green pixel. (A sketch of the idea follows this list.)
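
A sketch of why correlation means compressibility, using zlib on a synthetic signal: a sequence whose neighboring values are similar squeezes down far more than uncorrelated noise of the same length.

```python
import random
import zlib

random.seed(1)
n = 10_000

# "Smooth" data: each value within +/-2 of its neighbor (highly correlated).
smooth, x = [], 128
for _ in range(n):
    x = min(255, max(0, x + random.randint(-2, 2)))
    smooth.append(x)

# Uncorrelated data: independent random bytes.
noise = [random.randint(0, 255) for _ in range(n)]

for name, data in (("smooth", smooth), ("noise", noise)):
    # Delta-encode: for correlated data the differences are mostly tiny,
    # so they repeat heavily and compress well.
    deltas = bytes((b - a) % 256 for a, b in zip([0] + data, data))
    print(name, len(zlib.compress(deltas)), "bytes compressed from", n)
```

Real image and audio codecs exploit much richer correlations than neighbor-to-neighbor deltas, but the principle is the same: what is predictable does not need to be stored.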
