I would tell people who smoked that their chances of getting lung cancer from smoking were very high. Later on I found out that the actual risk is somewhere in the neighborhood of 10-25%, depending on the study. I overestimated the risk based on one sample, a sample that was very immediate to me. This type of bias has a name: the law of small numbers. It’s not an actual law – it’s just a name for the type of mistake people make by using sample sizes that are too small when they make decisions.
Hi Ears,
You’ve been talking a lot about the law of small numbers. As a guy who was the captain of a championship math team, and who came very close to getting a PhD in Math instead of an MD, let me point out what I believe you are missing.
Even though a small sample usually can’t prove anything, it can give you a very strong indication.
Your sample of one, your brother, didn’t prove that everyone who smokes gets lung cancer, but it gave you a strong indication of a likelihood that smoking is dangerous. After all, a 10% to 25% chance of lung cancer is a big deal. And you were correct in what you were telling people: Men who smoke are 23 times more likely to develop lung cancer than never smokers. Women are 13 times more likely, compared to never smokers. And lung cancer is no fun, let me tell you.
You’re not getting 100% proof with your small sample, but you sure are getting likelihood that smoking is dangerous. You were right, not wrong.
If you have a barrel of black and white marbles, and you are testing the hypothesis that the barrel is all black marbles rather than a 50:50 split, and you draw out of it four black marbles in a row, that doesn’t prove anything, because it’s only a sample of 4 and four black marbles could happen randomly 1 in 16 times with an evenly divided barrel. However, even your small sample of four black marbles gives you a 15/16 probability that there are a lot more black marbles in the barrel than white. If you draw the first ten marbles and they’re all black, you’ll be down to less than 1 possibility in 1,000 that it’s happening by chance.
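The two chance figures in the marble example (1 in 16 for four draws, under 1 in 1,000 for ten) can be checked directly. The sketch below also shows one way to turn them into a probability about the barrel itself, which needs a prior. The 50/50 prior between "all black" and "evenly split" is an assumption made here for illustration, not something stated in the post, and the function names are hypothetical. Under that assumption the result after four draws is 16/17, close to but not the same as the 15/16 quoted above.

```python
from fractions import Fraction

def chance_all_black(draws, black_share=Fraction(1, 2)):
    """Chance of drawing only black marbles from a barrel with the given
    share of black marbles (draws with replacement, or a very large barrel)."""
    return black_share ** draws

def posterior_all_black(draws, prior_all_black=Fraction(1, 2)):
    """Posterior probability that the barrel is all black after `draws`
    black marbles in a row, comparing only two hypotheses:
    all black vs. an even 50:50 split. The prior is an assumption."""
    likelihood_all_black = Fraction(1)           # all-black barrel always yields black
    likelihood_even = chance_all_black(draws)    # 50:50 barrel yields (1/2)**draws
    numerator = prior_all_black * likelihood_all_black
    denominator = numerator + (1 - prior_all_black) * likelihood_even
    return numerator / denominator

for n in (4, 10):
    print(n, chance_all_black(n), posterior_all_black(n))
# 4 1/16 16/17
# 10 1/1024 1024/1025
```

Changing `prior_all_black` shows how much the conclusion depends on what you believed before drawing.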
In a medical drug study, or other scientific study, you look for the results to be statistically significant, meaning the chance of them happening by chance is less than 5%, or 1%, depending on the requirements of the study. If your sample is too small and you miss and come in with a 10% possibility that the results could be by chance, the decision is usually made to get more subjects for the study. BUT remember, even a 10% chance that the results are accidental does mean that there’s a 90% chance that they are correct! A small sample size may be too small for proof, but it sure gives you an indication!
In other words, data that indicates a certain result is correct, even if it’s only a small number of data points, indicates (but doesn’t prove decisively) that that is a correct interpretation. If you think about it, that’s even what some of TA in the stock market is about: it happened this way in the last four recessions so the chances are it will happen this way again.
The corollary to that being that when seven people who reported results using a particular method of investing achieved results of plus 75% to 97%, when the market was up well under 20% (and then Bear felt he had to apologize for only being up 60% YTD), that doesn’t “prove” anything because it’s a “small sample”, but the likelihood of it happening by chance is extremely tiny, and this “small sample” demonstrates that this fashion of investing is very likely to beat investing in an index, or any other large group of good, medium, and poor stocks.
For a different and sort of surrealistic take on probabilities read the play Rosencrantz and Guildenstern Are Dead by Tom Stoppard. A very entertaining tragicomedy written from the perspective of two minor characters of Shakespeare’s Hamlet.
Aren’t you missing the gist, namely, that smallish sample sizes can provide degrees of insight? …Not absolute proof.
The success of Saul’s approach is clear. The sample size is very large (25+ years). The results do vary quite a bit year to year (2016 was a modest return); however, the overall trend is overwhelmingly high performance. It’s fine if you don’t want to invest in primarily small cap growth tech companies. Actually, it’s foreign to me, too. But, this picking little points that miss the main idea is ridiculous.
Over what period Saul? That really matters. That investing mainly in a couple of sub-sectors in a time of astonishing part-genuine revolution, part artificially stimulated momentum in those sub-sectors during the bull market to end all bull markets should beat the index comes as no surprise to me at all.
But the question on the table is - are those investors flexible enough to respond when the story changes? You will be (it did not take you long to notice that a good market for Slater’s PEG had, amazingly, been superseded by something even better!). Will they notice, say (hypothetically), that CSCO offers better risk-reward than ANET? Or when INTC looks like a bargain again?
I think it will be difficult to keep up the trend for a realistic investment timescale, say 20 years, against a portfolio of 10-20 well-chosen domestic, sector and international ETFs re-allocated just once a year.
After all, so far the evidence indicates no mutual fund manager can do that, even against just one ETF!
HP: the sample size is not large but small. In fact it consists of just one single investor who we agree is exceptional - but almost certainly an outlier.
What does the draw of the first marble tell you – sample of one?
If it’s white, that there are white marbles in there, at least one. Extend to any color you wish.
BTW, the concept of “uncertainty” says that there might not be any more white marbles there because sampling the thing alters it. I believe you need to put the marble back in to continue your sampling accurately. A practical application of this principle in the real world is used by “counters” playing blackjack. Each card drawn changes the odds of the game. By recording the cards being drawn, the counters can change their tactics and beat the house. How does this apply to medical studies and the stock market?
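To see how much putting the marble back matters, here is a minimal sketch comparing the chance of four black draws with and without replacement. The barrel of 10 black and 10 white marbles is a made-up size chosen for illustration, and the function names are hypothetical.

```python
from fractions import Fraction

def all_black_with_replacement(black, white, draws):
    """Each marble goes back in the barrel, so the odds never change."""
    return Fraction(black, black + white) ** draws

def all_black_without_replacement(black, white, draws):
    """Each black marble drawn leaves one fewer black marble behind,
    so the odds shift with every draw (as they do for a card counter)."""
    if draws > black:
        return Fraction(0)  # not enough black marbles to draw that many
    prob = Fraction(1)
    for i in range(draws):
        prob *= Fraction(black - i, black + white - i)
    return prob

# Hypothetical barrel: 10 black and 10 white marbles, four draws
print(all_black_with_replacement(10, 10, 4))     # 1/16
print(all_black_without_replacement(10, 10, 4))  # 14/323
```

With a small barrel the two answers differ noticeably (about 6.3% versus 4.3%); with a very large barrel they converge, which is why the with-replacement shortcut is often good enough.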
Then there is the question of the distribution of the thing under study. In statistics they assume a normal distribution because that is what roulette, cards, and dice produce. Hurricanes, earthquakes, and stock market prices do not have a normal distribution but a power law distribution, which is one reason they get their results wrong. Read Nassim Nicholas Taleb of Black Swan fame and The (Mis)behavior of Markets by Benoit Mandelbrot, who cover this stuff.
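A rough way to see why the choice of distribution matters is to compare how quickly the two tails die off. The sketch below is only an illustration: the tail exponent `alpha=3.0` is an assumed value, not an estimate for any real market, and the two distributions are not scaled to match each other.

```python
import math

def normal_tail(k):
    """P(Z > k) for a standard normal variable."""
    return 0.5 * math.erfc(k / math.sqrt(2))

def power_law_tail(k, alpha=3.0):
    """P(X > k) for a Pareto variable with minimum value 1 and
    tail exponent alpha. alpha=3.0 is an assumption for illustration."""
    if k < 1:
        return 1.0
    return k ** (-alpha)

# Compare how likely "extreme" values are under each model
for k in (2, 5, 10):
    print(k, normal_tail(k), power_law_tail(k))
```

The normal tail shrinks faster than exponentially while the power law tail shrinks only polynomially, so events that a normal model treats as practically impossible remain plausible under a power law.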
Not being a mathematician myself, I understand that the reason they keep on using the normal distribution in financial markets is because the math is well understood but the maths of power law distributions is very complex. From what I understand, power law does not produce a single point of equilibrium but many, one of which is chosen at random. In the practical world, The Gorilla Game advised “buy the basket and sell the losers” because there is no way of telling which would be the winner. This is a real world confirmation of results found in academia (Santa Fe Institute).
I think one needs to accept that the future is uncertain, that the best one can do is to tilt the odds in one’s favor. This requires an understanding of markets and control of one’s instincts which were not evolved to navigate the stock market.
The two items I think Saul might reexamine are whether he is using the right distribution and whether he is assuming an unchanging market. Sector rotation throws many people off. From what I have seen so far, Saul was very successful in switching from the kinds of stocks he previously held to high tech.
1. Whether smoking is dangerous is a different question.
2. What does the draw of the first marble tell you – sample of one?
3. Are you familiar with any drug study or scientific study that uses a sample of one?
4. I’d advise your readers to do their own research on this topic.
Ears,
Actually you have stumbled onto a great debate within the statistics community that has raged for several centuries.
The idea of small samples is covered thoroughly in Khan Academy in the Pre-Calc playlist. The way small sample size is overcome is with fake data. Bayes’ rule uses “fake data,” or “priors,” to work. The statistical purists have raged against this abomination for centuries. At the same time, Bayes’ rule built the actuarial tables for workmen’s comp, predicted air raids in the Battle of Britain, and cracked the Enigma code. Lately, circa 2000, it has been used in Google search results and voice commands. The latter are probably now running more on AI than Bayes, but I suspect that there is a Bayesian formula in seed.
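One common way to picture a prior as “fake data” is the Beta-binomial update, where the prior acts like a number of imaginary successes and failures added to the real observations. This is a minimal sketch under that assumption; the function name and the example counts are hypothetical.

```python
from fractions import Fraction

def posterior_mean(successes, failures, prior_successes=1, prior_failures=1):
    """Mean of the Beta posterior for a success rate.
    The prior counts behave like imaginary ("fake") observations
    that are pooled with the real ones."""
    a = prior_successes + successes
    b = prior_failures + failures
    return Fraction(a, a + b)

# Four successes in four tries, with a weak uniform prior (1 fake success, 1 fake failure)
print(posterior_mean(4, 0))          # 5/6

# Same data, but a strong prior worth 10 fake successes and 10 fake failures
print(posterior_mean(4, 0, 10, 10))  # 7/12
```

With only four real observations, the prior dominates the answer; as real data accumulates, the fake data matters less and less.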
I recommend “The Theory That Would Not Die.”
Here is a quote from the last page:
“Talking of religion, I am reminded of a strip of cartoons about Bayesians that appeared some time ago. They showed a series of monks. One was looking lost, one was dressed as a soldier, one was holding a guide book and one had his tongue stuck out. They were, respectively, a vague prior, a uniform prior, an informative prior and, of course, an improper prior . . .”
But, this picking little points that miss the main idea is ridiculous.
HP,
My point to Bear was this, and ONLY this: Kahneman’s work challenges the assumption that investors
behave rationally. His findings – in collaboration with others and based on empirical research –
showed that decision making is subject to cognitive bias. He and Amos Tversky were the first to
theorize the bias of anchoring. Another area they explored was how our intuition is often mistaken
when presented with small sample sizes. Kahneman won the Nobel Prize in 2002 for his work. I was
encouraging Bear to explore Kahneman’s findings. Understanding your biases is not a little point.
It’s arguably the thing most critical to your success as an investor.
In Fooled by Randomness, Nassim Nicholas Taleb tells the story of the fellow with two PhDs who
asked him what he thought the stock market would do that day. Taleb tried to be polite, and offered
that he didn’t really know, but it might go down. The market went up significantly. The next day
the guy challenged Taleb’s credibility – how could his ‘prediction’ be so wrong? The guy was
reaching a conclusion about Taleb’s ability as a trader based on one observation.
Taleb goes on to say:
Clearly there are two problems. First, the quant did not use his statistical brain when making the
inference, but a different one. Second, he made the mistake of overstating the importance of small
samples (in this case just one single observation, the worst possible inferential mistake a person
can make). Mathematicians tend to make egregious mathematical mistakes outside of their theoretical
habitat. When Tversky and Kahneman sampled mathematical psychologists, some of whom were authors
of statistical textbooks, they were puzzled by their errors. "Respondents put too much confidence
in the result of small samples and their statistical judgement showed little sensitivity to sample
size". The puzzling aspect is that not only should they have known better, "they did know better".
There is a short paragraph in Taleb’s book (page 113) about the urn/ball problem that Saul brought
up. Taleb’s discussion may give you a different insight from Saul’s.
Kahneman is a psychologist, not a statistician. He has not been a participant in the debate you
are referencing. Much of his work has focused on identifying human errors that arise from rules
of thumb and biases (e.g., price anchoring).
Kahneman did do some work with Bayes’ Theorem. He showed how bias can lead to unwitting violations
of Bayes’ Theorem. His work helped make the application of Bayes’ Theorem more rigorous.
Your message to be skeptical about outlier performance in high flying stocks is great. Especially lately with SHOP, SQ,… of a portfolio. I’ve had trouble starting a position in these stocks for a while. You seem concerned that newbies will get burned by the leaders of this board. That’s fine and good.
I think you should keep in mind that a lot of experienced investors are following this board, and they’re aware of the negative consequences. Many of us are trying to learn about discovering new companies in the early phases of growth, sizing positions appropriately, and knowing when to sell.
Saul is not just a sample of 1, btw. His returns are in the top tier, yes, but others have improved their investing significantly because of this board. And not just this year of tremendous growth in a narrow set of tech companies.
…Daniel Kahneman’s Thinking Fast and Slow is wonderful, and a lot of us are familiar with his work.