“A motorist is less likely to collide with a person walking and bicycling when there are more people walking or bicycling. Modeling this relationship as a power curve yields the result that at the population level, the number of motorists colliding with people walking or bicycling will increase at roughly 0.4 power of the number of people walking or bicycling. For example, a community doubling its walking can expect a 32% increase in injuries (2^0.4 = 1.32). Taking into account the amount of walking and bicycling, the probability that a motorist will strike an individual person walking or bicycling declines with the roughly -0.6 power of the number of persons walking or bicycling. An individual’s risk while walking in a community with twice as much walking will reduce to 66% ((2^0.4)/2 = 2^-0.6 = 0.66). Accordingly, policies that increase the numbers of people walking and bicycling appear to be an effective route to improving the safety of people walking and bicycling.”
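The arithmetic in the passage quoted above can be checked directly. The following is a minimal sketch, assuming only the 0.4 exponent given in the quotation; the constant and function names are illustrative and do not come from Jacobsen's paper.

```python
# Worked check of the figures in the quotation. The 0.4 exponent is the one
# quoted from Jacobsen; the names below are hypothetical, chosen for clarity.
INJURY_EXPONENT = 0.4


def total_injury_factor(walking_factor: float) -> float:
    """Factor by which total injuries change when walking changes by walking_factor."""
    return walking_factor ** INJURY_EXPONENT


def individual_risk_factor(walking_factor: float) -> float:
    """Factor by which one walker's risk changes: total injuries divided by walkers."""
    return walking_factor ** (INJURY_EXPONENT - 1.0)


if __name__ == "__main__":
    # Doubling the amount of walking, as in the quoted example.
    print(round(total_injury_factor(2.0), 2))
    print(round(individual_risk_factor(2.0), 2))
```

Dividing the total-injury factor by the walking factor gives the individual-risk factor, which is why the two exponents differ by exactly one.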
These names are incorrect, because “capita” refers to a different population in each case. When associated with casualties, it refers to the number or proportion of users of the mode, but when associated with amount of use, it refers to the total population. I will use “cyclist” and “cycling” for convenience, but the same applies to the data for walkers and walking, and I will use “distance” for distance traveled, regardless of the power-of-ten multiplier used.
This is shown by both Jacobsen’s discussion of data and the seemingly persuasive graphs that he presents. Each of his six graphs plots a rate of casualties per user against a measure of users per unit of population.
Note that for each of his sets of data, Jacobsen compares two ratios. For Fig 3, these two ratios are: Fatalities per distance cycled and distance cycled per person in the population. For Fig 2, these are: Injuries per distance cycled and distance cycled per person in the population. Note also that although two ratios would normally require four variables, Jacobsen uses only three variables. One variable occurs twice in these two ratios. Furthermore, that variable occurs once as the denominator of the first ratio and once as the numerator of the second ratio. Thus the two ratios have the form casualties/cycling and cycling/population.
When two such ratios are calculated, they inevitably produce what look like the declining power curves that Jacobsen presents. This can be demonstrated by using purely random data for the three variables, as is done in the attached spreadsheet.
This spreadsheet uses three sets of random numbers with appropriate ranges. The population random number ranges from 20,000 to 500,000. The cyclist random number ranges from 2,000 to 20,000. The accident random number ranges from 50 to 200. These ranges ensure that no unrealistic ratios are calculated, such as having more cyclists than the population size. Each time that the spreadsheet is opened, the random number generators generate new values, and the scatter plot is updated to match. A typical scatter plot appears below. All scatter plots so far generated show a curve that looks like a declining power curve, such as Jacobsen claims to represent a real safety-in-numbers process. The spreadsheet program can be downloaded in two formats, for Quattro-Pro and for Excel:
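For readers without either spreadsheet program, the same experiment can be approximated in a few lines of Python. This is a minimal sketch, not the spreadsheet itself: the three ranges are the ones stated above, but the uniform distribution, the number of points, the fixed seed, and the function names are all assumptions made for illustration.

```python
import math
import random


def simulate(n=2000, seed=0):
    """Return n (x, y) points built from three independent random variables.

    x = cyclists / population (the "amount of cycling" ratio)
    y = accidents / cyclists  (the "casualty rate" ratio)
    Ranges follow the text; uniform sampling and n are assumptions.
    """
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        population = rng.uniform(20_000, 500_000)
        cyclists = rng.uniform(2_000, 20_000)
        accidents = rng.uniform(50, 200)
        points.append((cyclists / population, accidents / cyclists))
    return points


def loglog_slope(points):
    """Least-squares slope of log(y) on log(x), i.e. the fitted power-law exponent."""
    lx = [math.log(x) for x, _ in points]
    ly = [math.log(y) for _, y in points]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    sxx = sum((a - mean_x) ** 2 for a in lx)
    return sxy / sxx


if __name__ == "__main__":
    # A negative exponent here arises from random inputs alone.
    print(loglog_slope(simulate()))
```

Changing the seed plays the role of reopening the spreadsheet: the points change, but the fitted exponent stays negative even though the three inputs are unrelated.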
This result appears whenever two ratios are formed from three variables, with one variable serving as the denominator of one ratio and the numerator of the other, and the two ratios are plotted against each other. I don’t have a mathematical proof, but here are two observations. The first is that whenever the common variable is small, it makes the ratio in which it is the denominator large and the ratio in which it is the numerator small, and vice versa. The second is this empirical test. However, Andreassen did comment: “Predictive equations with one variable forming both part of the independent and dependent variable will suffer from spurious correlation.” (3)
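The first observation can be made somewhat more precise by working in logarithms. What follows is a sketch rather than a full proof, and it rests on an added assumption: that the three variables are statistically independent, as they are in the random-number test. The symbols A (casualties), C (amount of cycling), and P (population) are introduced here for illustration only.

```latex
% Assumed notation: A = casualties, C = cycling, P = population,
% taken to be positive and mutually independent.
\begin{align*}
  y &= \frac{A}{C}, & x &= \frac{C}{P}, \\
  \ln y &= \ln A - \ln C, & \ln x &= \ln C - \ln P, \\
  \operatorname{Cov}(\ln x, \ln y) &= -\operatorname{Var}(\ln C) < 0, \\
  \beta = \frac{\operatorname{Cov}(\ln x, \ln y)}{\operatorname{Var}(\ln x)}
        &= -\frac{\operatorname{Var}(\ln C)}{\operatorname{Var}(\ln C) + \operatorname{Var}(\ln P)}.
\end{align*}
```

Under that assumption the fitted exponent lies strictly between -1 and 0 whenever C and P both vary, so a declining power curve is to be expected whether or not any real effect is present.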
Although Jacobsen has failed to demonstrate the existence of a safety in numbers effect, it is possible that it may exist in some form or another. From this standpoint, it may be useful to consider the claimed causal result. That is, increasing the proportion of cyclists in a population reduces the individual’s accident probability.
Jacobsen’s work is descended from that of Smeed, as incorporated in Smeed’s Law of 1949 (1). Adams copies a graph from Smeed that plots Deaths/Vehicle against Vehicles/Population (2). Therefore, Smeed’s formulation suffers from the same mathematical defect as does Jacobsen’s later formulation.
However, whatever may be the validity of the correlation, no single causal effect has been shown. Smeed has been quoted as ascribing the reduction in accident rate to an effect of group psychology. Adams states a similar reason: “The long-term decrease in death rates per measure of exposure is much more plausibly attributable to myriad behavioural adjustments in response to perceived increases in the threat of traffic.” In summary, the greater the familiarity and use of motor traffic, the lower the casualty rate caused by motor traffic.
The fact that the correlation exists for periods in which motoring has increased at its “natural” rate does not demonstrate a causal relationship. That would require that a sudden forced increase (or decrease) in motoring produce the predicted change in rate. The sharp reduction in motoring, followed by a sharp return, produced in the US by World War 2, shows such a decrease and increase (Adams, Figs 6, 7). However, there were so many simultaneous factors affecting motor transportation that the effect of volume cannot be separated. The causal relationship, if any, between volume of motoring and motoring-caused accident rate has not been determined.
Furthermore, Jacobsen makes a rather different prediction from those based on Smeed’s Law. “Whose behavior changes, the motorist’s or that of the people walking and bicycling? It seems unlikely that people walking or bicycling obey traffic laws more or defer to motorists more in societies or time periods with greater walking and bicycling. Indeed it seems less likely, and hence unable to explain the observed results. Adaptation in motorist behavior seems more plausible ... For example, a community doubling its walking can expect a 32% increase in injuries. ... Accordingly, policies that increase the number of people walking and bicycling appear to be an effective route to improving the safety of people walking and bicycling.”
Not only does Jacobsen rule out better learning by cyclists, but he posits a different causal route than might be indicated in Smeed’s analysis. Instead of positing that greater familiarity with motor traffic operations improves the process of motoring, Jacobsen posits that greater familiarity with bicycle traffic improves the process, not of cycling, but of motoring. Therefore, whatever credence might be given to a Smeed process should not be credited to a Jacobsen process.
Jacobsen had previously circulated a study of only the California data, the data to which he refers in this paper. To one who knows most of those cities, and all of their areas, it was easy to conclude that there was more cycling in those areas in which the social conditions for cycling were more favorable: university locations and social liberalism being the most obvious. Since those were also the cities with the lower casualty rates, it is reasonable to consider that the lower casualty rates were caused by the same social conditions that produced more cycling. Jacobsen’s hypothesis would require that artificially doubling the number of cyclists in a city with poor social conditions for cycling, a low cycling rate, and a high casualty rate would automatically reduce the casualty rate per cyclist by about 30%. That can only be a thought experiment, but I suspect that the casualty rate reflects social conditions (both of the local general public and of the local cyclists themselves) rather than mere numbers.
This criticism extends only to claims made on the basis of two ratios calculated from only three variables. Studies which compare two ratios calculated from four independent variables do not suffer from the problem stated. For example, a study of crashes per highway mile against vehicles per hour would not suffer from this defect. It is commonly accepted that many accident/volume relationships are non-linear; for example, for freeway travel the accident frequency is low for both very high, slow flow and very low, fast flow, with a maximum somewhere in between. There is a growing field of work that attempts to work out mathematical normalizing functions that can be used to discover those locations, or conditions, with significantly higher than predicted accidents. That is entirely different work than is criticized in this paper.
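The contrast with the four-variable case can be illustrated in the same way. The sketch below builds two ratios from four independent random variables, loosely modeled on the crashes-per-mile against vehicles-per-hour example; every range, the uniform distribution, the sample size, and the function names are hypothetical choices made for illustration, not data from any study.

```python
import math
import random


def simulate_four_variables(n=5000, seed=0):
    """Return n (x, y) points where the two ratios share no variable.

    x = vehicles / hours  (a flow ratio)
    y = crashes / miles   (a crash-rate ratio)
    All ranges below are hypothetical.
    """
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        crashes = rng.uniform(50, 200)
        miles = rng.uniform(10, 100)
        vehicles = rng.uniform(2_000, 20_000)
        hours = rng.uniform(1, 24)
        points.append((vehicles / hours, crashes / miles))
    return points


def log_correlation(points):
    """Pearson correlation between log(x) and log(y)."""
    lx = [math.log(x) for x, _ in points]
    ly = [math.log(y) for _, y in points]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    sxx = sum((a - mean_x) ** 2 for a in lx)
    syy = sum((b - mean_y) ** 2 for b in ly)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # With no shared variable, the correlation should sit close to zero.
    print(log_correlation(simulate_four_variables()))
```

With no variable common to both ratios, random inputs produce no systematic trend, so any curve found in real data of this kind cannot be attributed to the arithmetic alone.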