## Posts Tagged ‘statistics’

How definitive are Olympic results? The answer is as varied as the scoring systems the events employ. For many Olympic competitions a team tallies wins and losses, which are nominal variables. Such variables are qualitative and sit at the lowest level of measurement. Each one asks a yes-or-no question (e.g., won? lost?), and the data answer it with either true, represented by one, or false, represented by zero. Nominal categories must be mutually exclusive: one cannot both win and lose a particular game. The categories must also be fully exhaustive: a team must win or lose any game played. This method of determining a winner often satisfies spectators because outcomes are usually indisputable. But because a team’s record places equal value on all wins, it cannot be objectively compared across time periods. Opponents and skill levels change. Whether a current team is better than a previous team therefore becomes primarily a matter of opinion.
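For readers who like to see the bookkeeping, here is a minimal sketch of how a win/loss record looks as nominal data. The five results are made up for illustration, and the names `results`, `won`, and `lost` are my own choices, not part of any official scoring system.

```python
# Hypothetical results for one team's five games (illustrative data only).
results = ["win", "loss", "win", "win", "loss"]
categories = {"win", "loss"}

# Exhaustive: every game falls into one of the categories.
exhaustive = all(r in categories for r in results)

# Encode each question (won? lost?) as 1 for true and 0 for false.
won = [1 if r == "win" else 0 for r in results]
lost = [1 if r == "loss" else 0 for r in results]

# Mutually exclusive: exactly one indicator is 1 for each game.
exclusive = all(w + l == 1 for w, l in zip(won, lost))

# Counting is the only arithmetic a nominal variable supports.
record = (sum(won), sum(lost))
print(record)  # (3, 2)
```

Notice that the record treats every win as identical. Nothing in the data says how strong the opponent was, which is why two records from different eras cannot be compared objectively.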

At the other end of the spectrum is the ratio scale. Two relevant examples are time and distance. Ratio variables must meet two simple criteria. First, zero has to represent nothing. The zero-minute mile is eternally elusive, and a zero-meter javelin toss denotes a complete lack of ability. Second, doubling the number must mean doubling the quantity measured. Four minutes is twice the duration of two minutes, and six yards is twice the length of three yards. These results can be compared across time periods. Whether examining the luge, shot put, or high jump, today’s Olympians can be directly compared to those from the past. Ratio-level measurement is necessary for world records.

Ordinal measurements lie somewhere between nominal and ratio scales. They can be put in order, but the distances between them contain no additional information. Judges give ordinal scores. Higher means better, but we cannot say how much better. Two additional points from judges today are not equivalent to two additional points from judges yesterday, last year, or in 1980. Fans have to accept that ordinal scores are subjective and cannot be compared across time (or even across judges).
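The difference between ordinal and ratio data can be sketched in a few lines. The judge scores and mile times below are invented, and cubing the scores simply stands in for any other scoring scale that keeps the same order. Treat it as an illustration under those assumptions, not a model of real judging.

```python
import math

# Hypothetical judge scores for three athletes (ordinal data).
scores = {"A": 9.5, "B": 9.0, "C": 7.0}

def ranking(s):
    """Return athlete names ordered from highest to lowest score."""
    return sorted(s, key=s.get, reverse=True)

# Any order-preserving rescaling is an equally valid ordinal scale.
rescaled = {name: value ** 3 for name, value in scores.items()}
same_order = ranking(scores) == ranking(rescaled)

# The relative size of the gaps changes, so the gaps carry no information.
gap_ratio = (scores["A"] - scores["B"]) / (scores["B"] - scores["C"])
gap_ratio_rescaled = (rescaled["A"] - rescaled["B"]) / (rescaled["B"] - rescaled["C"])

# Hypothetical mile times in seconds (ratio data).
fast, slow = 240.0, 480.0
ratio_seconds = slow / fast
ratio_minutes = (slow / 60) / (fast / 60)  # a unit change keeps the ratio

print(same_order)                                   # order survives
print(gap_ratio, gap_ratio_rescaled)                # gaps do not
print(math.isclose(ratio_seconds, ratio_minutes))   # ratios survive
```

The ranking is the only thing the two score scales agree on, while "twice as long" means the same thing whether the times are recorded in seconds or minutes.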

So, to start a discussion about an event’s greatest athlete, choose an event judged on a nominal or ordinal scale. An event with a world record holder won’t give you much to debate.

The number of bathrooms in a student’s house is highly correlated with his or her ACT score. Why? When a house has more bathrooms, its property value increases. When property values go up, so do property taxes. This leads to better-funded local schools. Such schools produce students who do relatively better on standardized tests. Without controlling for more relevant variables, one might argue that more toilets lead to better test scores. This illustrates the importance of theory in the practice of statistics.
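The bathroom story can be mimicked with a small simulation. This is a sketch on made-up data: I assume a single hidden driver, `property_value`, in standardized units, and I collapse the taxes-and-school-funding chain into one step. The helpers `pearson` and `residuals` are written out by hand so that nothing beyond the standard library is needed.

```python
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def residuals(y, x):
    """What is left of y after removing its straight-line fit on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return [b - (my + slope * (a - mx)) for a, b in zip(x, y)]

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# Assumption: property value drives both variables; bathrooms never
# influence test scores directly in this simulated world.
property_value = [rng.gauss(0, 1) for _ in range(n)]
bathrooms = [v + rng.gauss(0, 0.5) for v in property_value]
act_score = [v + rng.gauss(0, 0.5) for v in property_value]

raw = pearson(bathrooms, act_score)
controlled = pearson(residuals(bathrooms, property_value),
                     residuals(act_score, property_value))

print(round(raw, 2), round(controlled, 2))
```

Under these assumptions the raw correlation comes out strong, while the correlation that remains after property value is accounted for sits close to zero. The toilets were only ever a proxy.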

When discussing the difference between correlation and causation I am often asked, “But what if one event always occurs before the other one?” Often when I am flying, the pilot will flip on the “fasten your seatbelt” sign several minutes before the plane hits some turbulence. Based on the logic that causation can be shown if one event regularly occurs before another, I could conclude that pilots cause turbulence by turning on seatbelt lights.

A disproportionate number of DUIs are given to people driving older vehicles. People who drive older vehicles tend to be poorer. Are the police actively looking to punish poorer people by targeting older cars when deciding which cars to pull over late at night? Not necessarily. An officer I know told me that driving without headlights on after midnight is the most common signal of intoxication. Newer vehicles don’t let drivers make this mistake because of automatic lighting systems. Simple statistics might lead one to conclude that police are biased against poorer drivers. A little insight tells us that wealthier drivers just have an additional protection against signaling intoxication.
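Here is a rough simulation of the headlight explanation. Every number in it is an assumption chosen for illustration (the share of old cars, the intoxication rate, how often an intoxicated driver forgets the lights, and a small rate of unrelated stops); none comes from real traffic data. The point is only that the stop rule never looks at the age of the car.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

# All of these rates are hypothetical.
N = 20000
P_OLD = 0.4             # share of cars without automatic headlights
P_INTOXICATED = 0.1     # same for every driver, whatever the car
P_FORGET_LIGHTS = 0.5   # intoxicated driver leaves the headlights off
P_RANDOM_STOP = 0.05    # stops for reasons unrelated to headlights

drivers = {"old": 0, "new": 0}
intoxicated_count = {"old": 0, "new": 0}
duis = {"old": 0, "new": 0}

for _ in range(N):
    car = "old" if rng.random() < P_OLD else "new"
    intoxicated = rng.random() < P_INTOXICATED
    # Only old cars can end up dark; new cars switch lights on themselves.
    lights_off = (car == "old" and intoxicated
                  and rng.random() < P_FORGET_LIGHTS)
    # The stop decision uses the lights, never the age of the car.
    stopped = lights_off or rng.random() < P_RANDOM_STOP

    drivers[car] += 1
    intoxicated_count[car] += intoxicated
    if stopped and intoxicated:
        duis[car] += 1

old_share_of_drivers = drivers["old"] / N
old_share_of_duis = duis["old"] / (duis["old"] + duis["new"])
rate_old = intoxicated_count["old"] / drivers["old"]
rate_new = intoxicated_count["new"] / drivers["new"]

print(round(old_share_of_drivers, 2), round(old_share_of_duis, 2))
print(round(rate_old, 3), round(rate_new, 3))
```

In this made-up world both groups drive intoxicated at about the same rate, yet older cars account for far more than their share of DUIs, simply because their drivers can give themselves away.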