Chapter 10 takes up inductive generalizations and analogical arguments. Both of these types of inductive arguments use the properties shared by members of a group to establish the properties of something else. That something else is a larger group when we generalize inductively, an individual in cases of analogy. In both cases we must begin by looking at a large enough initial group (sample) that accurately represents the cases of the property we are interested in.
This chapter introduces some technical vocabulary from statistics, specific warnings about public-opinion polls, and warnings about fallacies that can derail inductive thinking.
1. Inductive arguments are arguments that try to apply what is known about known objects or situations to unknown objects or situations.
Almost all inductive arguments fit the following pattern.
A known thing X has certain properties (a, b, c, etc.).
Another thing Y that is not known as well as X has the same properties.
X also has some additional property (p).
On the basis of these three premises the argument concludes that Y also has the additional property (p).
The general idea is that if Y is like X in some ways (relevant ways of course) then it could also be like X in other ways.
If the premises of an inductive argument are all true then the conclusion has some likelihood or probability of being true.
Conclusions of inductive arguments are more or less likely to be true; the arguments are correspondingly stronger or weaker.
These arguments are never valid or invalid, for they are not deductive arguments.
This general idea of an inductive argument can be stated more precisely with the help of certain key concepts:
The property under investigation – the "p" of the argument – is the property or feature in question.
The sample is a group of things that are already known or believed to possess the property in question.
The target, target class, or target population is an individual or a group about which one is asking: Does this target possess the property in question?
Whether the target is a single thing or a larger group of things, the goal of an inductive argument is to show that it possesses the property in question.
Every inductive argument's conclusion attributes the property in question to the target.
2. Inductive arguments may be either analogical arguments or generalizations.
The similarities between these two forms of argument are far more important than any differences between them.
Both forms of argument begin with a sample, identify a property of the members of that sample, and conclude that the property is also shared by one or more items outside the sample.
Because of these similarities, the two forms of argument are evaluated in exactly the same way.
The difference between the two consists in the nature of their targets.
In generalizations the target is a class, i.e. a group of things.
You might generalize from the fact that humans feel sleepy after big meals to the conclusion that all animals feel sleepy after big meals.
In every case, the sample is part of the target class. The target class contains the sample.
After all, you are generalizing to a larger group.
In analogical arguments the target is an item, a single thing.
Because Tom is a clown like Dick and Harry and they both wear big shoes, you conclude that Tom must wear big shoes too.
In every case, the target item is not part of the sample. The target lies outside the sample.
After all, by virtue of what an analogy is, you are stretching from what you know to what you don't know. The sample by definition is what you know; the target can't be part of it.
3. The strength of an inductive argument depends on the similarity between sample and target, and on the size of the sample.
The similarity is known as the representativeness of the sample.
The sample represents the target in so far as it possesses all the features of the target that are relevant to the property in question.
The less you can trust the sample's resemblance to the larger class it's in or to the target item, the less you can trust an argument based on that sample.
If half the cats in your town never go outdoors, then before claiming any conclusions about all your town's cats you should consider a group in which about half never go out.
A biased sample is one that fails to represent the target in some relevant respect.
The trick is to choose a sample that reflects all relevant features of the target, even though you don't always know which features will turn out to be relevant.
If you're generalizing about the methods that auto thieves use, do you make your sample represent the educational levels of auto thieves in general? That makes sense, because more educated ones might use more sophisticated tools. What about left- or right-handedness? Would that be relevant?
In the case of analogical arguments a related question comes up: Suppose you don't know whether the target does or does not possess some property?
In both cases bias is best avoided by the selection of a varied sample.
A large sample makes for a better inductive argument, as long as it is representative.
Typically, larger samples will also be more representative, all other things being equal.
When the larger sample is not more representative, the representativeness is more important than the sample's size.
Homogeneous classes are adequately represented even by tiny samples. One sip of the ocean anywhere is enough to show you the whole thing is salty, because ocean water is homogeneous. Heterogeneous classes on the other hand call for larger samples.
4. Inductive arguments can be either formal or informal.
The popularity of polls and large scientific studies might make you think that inductive arguments are only the domain of scientists. But such arguments turn up, informally, in daily life as well.
The general principles at work are the same for both formal and informal arguments.
Even when you don't realize that you are looking for a representative sample, for instance, you still trust generalizations about human behavior that hold for many different ethnic groups and ages more than generalizations based on your own ethnic group, or based only on people your age.
Two differences are worth noting.
Formal arguments tend to be generalizations and not arguments from analogy.
Formal arguments pay attention to details, and rely on mathematical analysis, in ways that informal arguments don't.
5. Formal inductive arguments tend to follow rigorous guidelines for sample selection and size, and their accuracy is evaluated by means of precisely defined mathematical concepts.
In a poll (for instance) the sample being polled must be representative of the target class it belongs to.
To represent a target class a sample must possess all the features of the target class that are relevant to the property in question, and must possess those features in the same way (meaning: to the same degree, in the same proportion) that the target class does.
Speaking as accurately and precisely as possible, we say that a feature or property P is relevant to another property Q if it is reasonable to suppose that the presence or absence of P could affect the presence or absence of Q.
It's not always clear which features will be relevant. On top of that, it can be exceedingly hard to select an exactly representative sample even when all the relevant features are known.
A random sample can still be trusted even when the sample is not deliberately chosen so as to be representative.
A randomly selected sample is the type most likely to represent its target class. This is the great virtue of random selection: It is not a process esteemed for its own sake, but because it helps to make a sample representative.
In a random sample, every individual is as likely to be selected as every other one.
Sample size, error margin, and confidence level help evaluate the accuracy of generalizations from random samples.
Error margins express the potential for error created by random variations.
Randomly chosen samples are bound to differ from their target class to some degree. One-tenth of all Americans may be left-handed, but a random assortment of 50 Americans might contain only one lefty, or none at all, or as many as 10 or 15.
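To get a feel for how far a random sample can wander from its target class, here is a minimal Python sketch. It assumes each of the 50 people is independently left-handed with probability 0.10, a simplification of drawing from a very large population. The helper name `prob_lefties` is made up for this illustration.

```python
from math import comb

def prob_lefties(k, n=50, p=0.10):
    """Binomial probability of exactly k left-handers among n people,
    assuming each person is independently left-handed with probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Chance of seeing no lefties, exactly one, or ten or more in a sample of 50.
p_none = prob_lefties(0)
p_one = prob_lefties(1)
p_ten_or_more = sum(prob_lefties(k) for k in range(10, 51))

print(f"none: {p_none:.3f}, one: {p_one:.3f}, ten or more: {p_ten_or_more:.3f}")
```

Under these assumptions each of these outcomes is uncommon, yet none is impossible, which is why a figure taken from a sample needs an error margin attached to it.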
The occurrence of some property in any random sample will be close to its occurrence in the target class; this "closeness" is indicated by a percentage plus or minus some percentage points.
Thus the error margin measures the range within which a generalization about the sample also describes the target – e.g., 5 ± 2.
The closely related concept of confidence level measures the likelihood that an inductive generalization has produced a true conclusion.
Confidence level indicates the percentage of all random samples in which the property in question occurs within the error margin.
Suppose a study reports that 28 percent of Estonians have curly hair, with an error margin of ±3 percent and a confidence level of 88 percent. Assuming this study looked at a random sample of some given size, this result means that 88 percent of all random samples of the same size will contain between 25 and 31 percent curly-haired people.
In most scientifically produced studies, confidence level is set at 95 percent.
The confidence level and error margin both depend on the size of the sample.
As the size increases, the error margin decreases, or the confidence level increases, or both.
A poll of 50 people (within a sufficiently large target class) has an error margin of ±14 with a 95 percent confidence level. If the sample size doubles to 100, the error margin falls to ±10 at the same confidence level.
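Figures like these can be approximated with a standard formula from introductory statistics: the error margin is roughly z times the square root of p(1 - p)/n. The sketch below assumes that formula, with z = 1.96 for a 95 percent confidence level and p = 0.5 as the worst case. It is a common approximation and not necessarily the method behind the numbers quoted above. The function name `error_margin` is chosen only for this illustration.

```python
from math import sqrt

def error_margin(n, z=1.96, p=0.5):
    """Approximate error margin, in percentage points, for a random sample
    of size n. z=1.96 corresponds to a 95 percent confidence level, and
    p=0.5 is the worst case (the proportion that gives the widest margin)."""
    return 100 * z * sqrt(p * (1 - p) / n)

for n in (50, 100, 500, 1000):
    print(n, round(error_margin(n)))
```

Under this approximation, doubling the sample does not halve the error margin. The margin shrinks with the square root of the sample size, so cutting it in half takes about four times as many people.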
6. Informal inductive arguments follow the same principles as formal ones, but they express error margin and confidence level more loosely.
The general principle remains the same. The smaller the sample of a heterogeneous population, the less reliable the conclusion of the argument will be.
Error margins find their informal analogues in cautious conclusions.
You make a conclusion cautious by expressing it in less precise terms.
Thus instead of saying that the temperature is 8. degrees lower in the middle of the lake, you might say it's somewhere around 10 degrees lower.
Confidence levels in turn get translated into expressions of greater or lesser likelihood that a conclusion is true.
Ordinary registers of likelihood, from "you can bet that …" through "it is pretty likely that …" on down to "it's possible that …" correlate to technical statements of confidence level.
Thus instead of saying that a generalization holds only at a confidence level of 65 percent, you might say the conclusion is more likely to be true than false.
7. A few fallacies most commonly arise with inductive arguments.
A biased generalization or biased sample uses a sample that may be large enough, yet does not represent the target class.
The fallacy of hasty conclusion occurs in arguments that use too small a sample. If the argument is an inductive generalization, this fallacy is also known as hasty generalization.
The fallacy of anecdotal evidence is one kind of hasty generalization. In such arguments, one tells a story about one or two examples that is meant to demonstrate a general truth.
8. Besides the general information about statistical arguments that has been covered in this chapter, you should keep a few specific warnings in mind when evaluating public-opinion polls.
Polls are based on self-selected samples when the people whose opinions are being measured have put themselves into the sample.
A television call-in poll overrepresents people with strong enough opinions to call and register their opinions.
Although such polls may measure strength of feeling, they do not indicate what the general population believes.
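A small simulation can show how self-selection distorts a poll. Everything in this sketch is invented for illustration: the population split (30 percent strongly in favor, 70 percent mildly opposed) and the call-in rates are assumptions, not data from any real poll.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 30% hold a strong "yes" opinion, 70% a mild "no".
population = ["yes"] * 3000 + ["no"] * 7000

# Assumed call-in rates: strong opinions call far more often than mild ones.
CALL_RATE = {"yes": 0.50, "no": 0.05}

# Each person decides independently whether to phone in.
callers = [view for view in population if random.random() < CALL_RATE[view]]

true_share = population.count("yes") / len(population)
call_in_share = callers.count("yes") / len(callers)

print(f"true support: {true_share:.0%}, call-in poll: {call_in_share:.0%}")
```

In this made-up setting the call-in figure comes out far above the true level of support, because the sample reflects who bothered to call and not what the population believes.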
Advocacy groups might commission polls that are worded in such a way as to produce a more favorable response. These are polls with slanted questions.
Slanted questions introduce a form of bias that is separate from the representativeness of the sample.
Loaded questions can pretend to solicit an opinion while suggesting a claim: "Do you favor reducing death and street crime by means of emergency housing?"
At other times it is the order that questions come in that skews the results.
9. The importance of large sample size in inductive arguments follows from a statistical principle, the law of large numbers.
When an event occurs at random with predictable ratios of results, it will get closer to those predictable ratios the more often the event is repeated.
Roll a die 12 times and you might get a particular number twice, but you might get it more times than that and you might not get it once.
Roll the same die 12,000 times and it becomes more and more likely that the number occurs about one-sixth of the time.
The law of large numbers assures us that the individuals in samples, when those samples are large enough, behave with the regularity that characterizes the target class.
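The die example can be tried out in a few lines of Python. This is a minimal sketch that assumes a fair six-sided die simulated with the standard `random` module. The choice of face 6 is arbitrary, and the function name `share_of_face` is made up for this illustration.

```python
import random

def share_of_face(rolls, face=6, seed=0):
    """Roll a fair six-sided die `rolls` times and return the fraction of
    rolls that show `face`. The seed only makes the run repeatable."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(rolls) if rng.randint(1, 6) == face)
    return hits / rolls

# One-sixth is about 0.167; compare each fraction against it.
for rolls in (12, 120, 1200, 12000):
    print(rolls, round(share_of_face(rolls), 3))
```

With only 12 rolls the fraction can sit well away from one-sixth. With 12,000 rolls it should sit very close to it, whichever seed is used.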
Sometimes the law of large numbers is misunderstood; then it becomes the gambler's fallacy.
The gambler's fallacy is the belief that past random occurrences influence the next occurrence, that in order to reach the predicted ratios the next events have to "catch up."
The gambler's fallacy occurs very often in nongambling contexts. You might think: "We've had four boys in a row. The next child is more likely to be a girl."
Although large samples do tend to produce the numbers one expects, those numbers do not influence any particular event. Each time is a new flip of the coin.
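The point can be tested directly by looking at what happens after a streak. The sketch below assumes a fair coin simulated with Python's `random` module. It collects every flip that follows four heads in a row and checks how often that next flip is heads. The variable names are chosen only for this illustration.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable
flips = [rng.choice("HT") for _ in range(200_000)]

# Collect the flip that follows every run of four heads in a row.
after_streak = [
    flips[i] for i in range(4, len(flips))
    if flips[i - 4:i] == ["H", "H", "H", "H"]
]

share_heads = after_streak.count("H") / len(after_streak)
print(f"{len(after_streak)} streaks, next flip heads {share_heads:.1%} of the time")
```

If past flips made tails "due," the share of heads after a streak would fall noticeably below one half. In a simulation of independent flips it stays close to 50 percent.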