4070183

Conditional probability and the length of a championship series in baseball, basketball, and hockey

(German title: Bedingte Wahrscheinlichkeit und die Länge einer Meisterschaftsserie im Baseball, Basketball und Eishockey)

This paper re-examines the assumption that the probability of winning a game in the World Series, the NBA Finals, and the Stanley Cup is constant across the series. This assumption is the primary basis for models that endeavor to explain the length of a series, but we demonstrate that such a model is inconsistent with historical data in all three sports. We adjust the model to incorporate conditional probabilities and fit it with historical data. While one can always backfit historical frequencies to conditional probabilities, doing so shows that the variation in conditional frequencies within and across sports is too wide to support the constant probability model. We also define a new notion of two teams being evenly matched.
Sports championships draw considerable interest from the general public, the media, and statisticians. Major League Baseball's World Series, the National Basketball Association's Finals, and the National Hockey League's Stanley Cup are all best-of-seven series that are watched and followed by millions of people worldwide. Fans wait with great anticipation for the outcomes of these events, not knowing if a series will go four, five, six, or seven games. Broadcasters pay millions of dollars for the rights to televise these games, thereby making substantial investments in an event of uncertain length. Moreover, a considerable amount of money is wagered on these games, some of it legally and a great deal of it illegally. An obvious question that arises from these best-of-seven events is the number of games required to determine the winner. Only a modest amount of research, amounting to just four scientific studies, has been done on this subject, however. All of these studies have been based on the assumption that the games represent a series of independent events with each team having a constant probability of winning a given game.
Naturally, the naïve assumption is that this probability is 0.5, a condition often referred to as the teams being "evenly matched." Perhaps this assumption is simply convenient, though perhaps it is motivated by the fact that the series is between the two best-playing teams at the time, and thus they are likely to be similar in quality. As we discuss later, an argument can be made that an evenly matched series is not the same notion as an evenly matched game. Nonetheless, this issue can be set aside for the moment. The primary question of this paper is whether the probability of victory is constant throughout the series and, if not, how we can incorporate a non-constant probability of a given team winning. In previous analyses, a model based on constant probability and independence has always been assumed, and some researchers have attempted to extract that probability from the data. In this paper, we present evidence that these assumptions are incorrect and lead to results that are inconsistent with the history of these sports championships. We develop an improved model for the distribution of games by incorporating conditional probability, the notion that a team is more or less likely to win based on certain conditions that exist at a given time. While one can always fit conditional probabilities to historical data, in doing so we show that the variation across and within sports is too great to support the constant probability model. In particular, a constant probability of 0.5 is a poor fit.
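The constant-probability baseline that the abstract argues against can be made concrete with a short sketch. It assumes independent games and a fixed per-game win probability p for one team, so a best-of-seven series ends in exactly n games when the eventual winner takes game n and exactly three of the previous n - 1 games. The function name and parameters below are illustrative and do not come from the paper.

```python
from math import comb


def series_length_distribution(p, wins_needed=4):
    """Probability that a series lasts n games under the baseline model.

    Assumes (hypothetically, as in the constant-probability model) that
    games are independent and one team wins each game with probability p.
    Returns a dict mapping series length n to its probability.
    """
    q = 1.0 - p
    dist = {}
    # A best-of-(2k-1) series lasts between k and 2k-1 games.
    for n in range(wins_needed, 2 * wins_needed):
        losses = n - wins_needed  # games won by the series loser
        # The winner takes the last game and wins_needed - 1 of the first n - 1.
        ways = comb(n - 1, wins_needed - 1)
        dist[n] = ways * (p**wins_needed * q**losses + q**wins_needed * p**losses)
    return dist
```

With p = 0.5 this gives probabilities of 0.125, 0.25, 0.3125, and 0.3125 for series of four, five, six, and seven games. These are the baseline values against which historical frequencies would be compared; the sketch does not implement the paper's conditional-probability model.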
© Copyright 2020 Journal of Sports Analytics. IOS Press. All rights reserved.

Keywords: basketball, baseball, ice hockey, competition period, forecast, mathematical-logical model, modelling
Classifications: game sports; organizations and events
DOI: 10.3233/JSA-200422
Published in: Journal of Sports Analytics
Published: 2020
Volume: 6
Issue: 2
Pages: 111-127
Document types: article
Language: English
Level: high