Monday, 11 November 2013

Why you don't ask about city populations

Back on October 16, the fourth and final round at Moose's Down Under consisted of a single question which provided players a list of 10 of the world's great megacities and asked each team to rank them in descending order of population.

I suspect that the question writer believed that the question, which could score anything from 0 to 10 points, would result in a point distribution resembling a normal bell curve. What it produced instead was a mushroom cloud. The approximately 13 teams playing combined for a collective two points.

Fortunately, even an accidental nuclear blast can be a teaching moment, and there are two lessons to be learned in this case: (1) don't ask questions about relative city populations; and (2) don't ask players to compare numbers that are really close to each other.

There are a host of reasons why. I'll explain them below and show how Moose's question managed to trip over just about all of them, a fact which virtually guaranteed disaster.

1. There are multiple ways to measure the population of a city, and none of them are authoritative or universally accepted

The predominant issue with relative city population ranking is one of definitions. When one talks about the population of a city, does one mean the population that lives only within the municipal city limits? Or should one ignore the imaginary line on Boundary Road and instead regard the city as an organic phenomenon, consisting of anyone who lives within a single socioeconomic aura? An even broader view defines a city as any blob or smear of contiguous urban area, even if it includes separate, non-overlapping auras.

Consider an everyday example that illustrates the problem: if your apartment is in Coquitlam, would you, when abroad, introduce yourself as a Vancouverite?

Unfortunately, which metric you choose makes a huge difference. According to the metropolitan agglomeration definition, Vancouver boasts over 3 million inhabitants and rounds out the great triumvirate of Canada's big three cities with Toronto and Montreal. If, however, you insist on formal city boundaries, our hometown scarcely clears 600,000 and finishes a mortifying eighth behind an ignoble crop of welterweight contenders including Calgary, Ottawa, and even Winnipeg.

This is by no means an unusual situation. Take a look at this handy table and notice that Tokyo may be the world's biggest city, or, perhaps, only 14th; Istanbul likewise is about 20th by metropolitan standards, but vaults to 2nd biggest by municipal boundaries.

Hence much of the difficulty in Moose's puzzle. Tokyo and Istanbul were both among the cities teams were supposed to consider. Though players had no way to know this at the time, the question writer was operating from the municipal definition, which put the Japanese capital way down near the bottom of the list and old Constantinople near the top.

The lesson here, question writers, is that definition matters. You might attempt to get around this by specifying the metric in play, but this runs the risk of a) confusing players and b) coming off as excessively arbitrary (that is, if a player asks you why only this-or-that definition, what else can you say but "that's the one I chose so the question works," which is not much of an answer).

Therefore I think the better move is to use cities that are so far apart in population that their relative size remains the same in any system or combination of systems. As an easy example, Victoria-Boston-Sao Paulo leaves no room for error. Crafting a less obvious list will be a challenge for the writer, but can probably be done.
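For question writers who like to double-check, here is a minimal Python sketch of that test: a list is safe only if it ranks the same way under every definition you can find figures for. The function names, the placeholder city names, and all the numbers are hypothetical, made up purely for illustration; they are not real population data.

```python
def ranking(populations):
    """Return city names ordered from largest to smallest population."""
    return sorted(populations, key=populations.get, reverse=True)


def order_is_definition_proof(figures_by_definition):
    """True if every definition produces the same ranking of the cities."""
    rankings = [ranking(pops) for pops in figures_by_definition.values()]
    return all(r == rankings[0] for r in rankings)


# Hypothetical figures, NOT real data: widely spaced cities keep their order.
stable = {
    "municipal": {"Smallville": 80_000, "Midtown": 600_000, "Bigburg": 11_000_000},
    "metropolitan": {"Smallville": 350_000, "Midtown": 4_500_000, "Bigburg": 20_000_000},
}

# Hypothetical figures, NOT real data: two similar cities swap places.
unstable = {
    "municipal": {"Midtown": 600_000, "Rivertown": 900_000},
    "metropolitan": {"Midtown": 4_500_000, "Rivertown": 1_200_000},
}

print(order_is_definition_proof(stable))
print(order_is_definition_proof(unstable))
```

If the check fails for any pair of definitions, the list still depends on a choice the players cannot see.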

2. The margins for error were stupefyingly small

A quantitative look at the cities Moose's included in their question reveals that the differences between the population figures were so razor-thin that even if everyone had known exactly which population measurement was in play, there would not have been much improvement in scores.

I crunched some numbers using this list of municipal populations and discovered that, among the ten cities Moose's players were asked to rank, the average difference between each city and the next largest was only 7 percent, and the difference exceeded 10 percent in only two instances. Mumbai and Moscow, which ranked 4th and 5th on the list, differ by only 4 percent; Moscow and Beijing, 5th and 6th, differ by only 2 percent. The difference between 8th and 9th place was smallest of all: only 1 percent--fewer than 100,000 people in a total of 9 million--separates Tokyo from Mexico City. This is a small relative difference indeed, and seems even less significant when one considers the error factor built into the measurement, as well as the fact that city populations are changing all the time and at different rates.

Probably no one will disagree that distinguishing two figures with only a four-parts-in-one-hundred difference between them is, in general, too much to ask of players. I'd instead suggest 10 percent as an absolute minimum difference between figures if players are expected to compare them. This is based partly on my intuition and my experience in trivia, but the main reason (and my apologies to the non-mathematically inclined) is that my 10-percent rule means that players need never consider more than two significant figures.
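The 10-percent rule is easy to check mechanically before a question goes out. What follows is a rough Python sketch, not a definitive tool: the function name is my own invention, the gap is measured relative to the larger of each adjacent pair (one reasonable convention among several), and the sample figures are hypothetical placeholders rather than real populations.

```python
def close_pairs(populations, min_gap=0.10):
    """Return adjacent pairs in the ranking whose relative gap is below min_gap.

    The gap is (larger - smaller) / larger, an assumed convention.
    """
    ranked = sorted(populations.items(), key=lambda item: item[1], reverse=True)
    flagged = []
    # Walk down the ranking, comparing each city with the next one below it.
    for (big_name, big_pop), (small_name, small_pop) in zip(ranked, ranked[1:]):
        gap = (big_pop - small_pop) / big_pop
        if gap < min_gap:
            flagged.append((big_name, small_name, gap))
    return flagged


# Hypothetical figures, NOT real data.
sample = {
    "City A": 12_000_000,
    "City B": 11_500_000,
    "City C": 9_000_000,
    "City D": 8_950_000,
}

for bigger, smaller, gap in close_pairs(sample):
    print(f"{bigger} vs {smaller}: only {gap:.1%} apart")
```

Any pair this flags is one that players would have to separate on something closer to a coin flip than knowledge, so either swap a city out or widen the pool you draw from.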

In conclusion, the which-city-is-bigger-than-which question is a risky venture. Players generally don't know which population definition is in effect, which can introduce a kind of unintentional and artificial difficulty to the question.

Furthermore, when players are expected to compare two or more numbers, those numbers can be close, but in my view should differ by no less than 10 percent.

Moose's quiz is a superb one, but their cities question was a definite misfire. Still, it might have been fixed: rather than selecting the 10 cities from the top 14, the writer could have drawn them from the top 30. This would both space out the margins between them and lessen the impact of using the wrong definition of city size.
