We statistically analysed the five core concepts of 51 Sprints (sex, body, race, nation and class) and tried to quantify how these factors may influence the running times of the sprinters.
Using the toggle in the Equaliser, you can adjust a runner’s performance by switching the influence of these factors on or off. For example, statistically speaking, a finalist’s running time is influenced by the runner’s sex: women on average run slower than the average finishing time of all sprinters, and men on average run faster. With the Equaliser, you add 0.431 seconds to the running times of males and subtract 0.529 seconds from the running times of females. This equalises the statistical difference between men and women.
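The arithmetic behind this example can be sketched in a few lines of Python. This is only an illustration, not the Equaliser's actual code: the two offsets come from the paragraph above, while the function name, the dictionary layout and the toggle parameter are assumptions made for the sketch.

```python
# Offsets in seconds, as quoted in the text: added to male running times,
# subtracted from female running times.
# The names below are hypothetical; they are not taken from the project.
SEX_OFFSETS = {"male": 0.431, "female": -0.529}


def equalise(time_s, sex, sex_toggle=True):
    """Return the running time with the sex offset applied when the toggle is on.

    Categories without a known offset are returned unchanged.
    """
    if not sex_toggle:
        return time_s
    return round(time_s + SEX_OFFSETS.get(sex, 0.0), 3)


# Example: a 10.000 s male time and an 11.000 s female time.
male_adjusted = equalise(10.0, "male")
female_adjusted = equalise(11.0, "female")
untouched = equalise(10.0, "male", sex_toggle=False)
```

With the toggle off, the recorded time is returned as it is; with it on, each time moves towards the overall average by the offset for that group.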
This experiment has some limitations and ambiguities that immediately catch attention. The chosen concepts are highly contested in themselves and often politically charged. Difficult to define, their meaning changes over time. In addition, other important factors are not accounted for, and the available information on some factors is very sparse. However, bearing these and other limitations in mind, we worked in accordance with rigorous standards of statistical analysis.
We believe our analysis to be insightful precisely because it uses politically problematic simplifications that play a big role in public perception. Although our statistical analysis cannot claim to make any statement on the absolute influence of these categories, it still reveals relative inequalities between groups of athletes that cannot be denied.
How did we proceed?
Initially, we investigated which available data might provide quantifiable measurements (so-called proxy variables) for our five fuzzy categories of interest. This led to a set of about 35 variables. After checks on data availability, statistical assumptions and content validity, we were left with 20 variables for which we had information on most of the 385 sprint finalists in the modern Olympics since 1896.
For the sex factor, athletes are categorised as male, female or intersex. Since there was only one confirmed intersex participant (Stella Walsh, gold in Los Angeles 1932), we could not model this category statistically. Other complexities of current and past gender concepts escaped our quantification.
The concept of race presents certain dilemmas. It is hugely charged in political and social terms and a deeply layered issue. For some it is a question of identity, for others it is a social category. Yet others state it should be ignored altogether. Although the notion cannot be delineated, it also cannot be denied.
It is a prominent factor in the politics and perception of sports, and it appears to play a role in the running times of sprinters. With this in mind, we decided to radically oversimplify the notion of race to a distinction between three types of skin colour: black, yellow and white. Focusing on these few visual positions in an otherwise complex spectrum of racial identity allows us to highlight some relative tendencies.
The body factor consists of the following variables: standard or non-standard body plan (the latter specified as lower-leg amputation or blindness), body height, weight, age at the time of performance, and use of doping if this was ever officially recorded during the runner’s career.
The nation factor consists of the following variables: nationality, gross domestic product of the country that the runner represents, population level at the time of performance, type of government (democratic, socialist, authoritarian), and the confirmed presence of state-sponsored doping programmes.
The class factor consists of the following variables: profession, college education (yes/no), education funding (private, public or none), privileged upbringing, and sponsorship (present/not present).
Towards the first overall conclusions
Once the data table was complete, we were interested in identifying the variables that contribute the most to the finishing time of the athlete. We used multiple linear regression (MLR). In this case, MLR has enough explanatory potential for a thought experiment: it can foreground relative differences among groups of athletes, but it does not give any explanation of the causes of these differences.
During the analysis, it became apparent that the year in which the respective Olympic Games took place accounted for most of the variance in finishing times. Runners have become faster over the last 100-plus years. Because we are interested in the story of the individual athlete embedded in a specific context of our five factors of interest (and taking into account a number of statistical limitations), we excluded chronology as an explicit factor. In other words: we accept, without explicit statistical correction, the fact that later runners are faster than earlier ones.
Running times were used as the dependent variable and a full model of the factors was fitted. This model was then used to calculate, per runner and per variable, the running time offset (gain or loss) in seconds.
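To make this step more concrete, here is a minimal Python sketch of a multiple linear regression that yields one offset per runner and per variable. It is not the model that was actually fitted. The function name, the four-row toy dataset and its two variables are invented for illustration, and measuring each offset against the average runner (by centring the variables) is an assumption about how such offsets could be defined.

```python
import numpy as np


def fit_offsets(X, y):
    """Fit y = intercept + X_centred @ beta by ordinary least squares.

    Returns (intercept, beta, offsets), where offsets[i, j] is the number of
    seconds that variable j adds to (or removes from) runner i's predicted
    time, relative to a runner with average values on every variable.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    X_centred = X - X.mean(axis=0)                    # centre each variable
    design = np.column_stack([np.ones(len(y)), X_centred])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    intercept, beta = coef[0], coef[1:]
    offsets = X_centred * beta                        # per runner, per variable
    return intercept, beta, offsets


# Synthetic toy data (NOT from the 51 Sprints table):
# columns are [is_female (0/1), body height in cm]; y is running time in s.
X_toy = [[0, 180], [0, 190], [1, 170], [1, 160]]
y_toy = [10.0, 9.9, 11.1, 11.2]

intercept, beta, offsets = fit_offsets(X_toy, y_toy)
predicted = intercept + offsets.sum(axis=1)
```

Because the variables are centred, the intercept equals the average running time, and each runner's predicted time is that average plus the sum of the runner's offsets. Subtracting a variable's offset from a runner's time corresponds to switching off that variable's influence.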
The data collection and analysis methods applied here have been tailored to the demands of credibly enabling a thought experiment. The differences that emerge do not explain causes. They are no more, but also no less, than relative differences among groups of athletes that can be made visible through statistical analysis of a dataset. Having said that, we did our utmost to conduct the most rigorous statistical investigation of the data possible. In case of questions, please contact either Random Studio or, for analysis-related inquiries, Mikhail S. Spektor and Dr. Gilles Dutilh from the University of Basel.
Bearing in mind the limitations of our analysis in scope and ambition, we invite you to analyse the data yourself. It can be downloaded here. Numerous interesting stories are hidden in this table, and we sincerely hope that more of them will be uncovered and shared.
Not all athletes are visible in the selection overview; some runners are represented by a computer-generated image. This is because some runners were disqualified, because some Olympics have no video footage, or because we did not have access to the original footage.
International Paralympic Committee (IPC)
Olympic Television Archive Bureau (OTAB)
UCLA Film & Television Archive
All Olympic footage available in this production is copyright of and reproduced only with the consent of the International Olympic Committee.