Date: Tue, 02 Mar 1999 03:39:21 -0500
From: Marco Daniele Paserman

Newsgroups: rec.sport.soccer
Subject: Serie A value-weighted goals 1997-98: a new approach (LONG)

1. INTRODUCTION

A while ago, I proposed a method to measure the value of goals more accurately, and to use this alternative measure to rank a league's goalscorers. The basic idea was to weight decisive goals more heavily, and to downgrade meaningless 90th-minute goals that do nothing more than put the last nail in the coffin of a 4-0 win.

My initial simple proposal weighted goals by their ex-post value: so, for example, a goal that put one team up 1-0 would be weighted very heavily if the scoring team was able to preserve the lead, but would be worthless if the other team made a comeback. This led to suggestions by Ken Overton and Paul Mettewie to create an alternative, ex-ante measure that would better take into account the impact that each goal has on that undefinable aspect of a game, "momentum".

Indeed, Ken Overton took this criticism seriously, and worked very hard to construct a data set with all Serie A goals for the 1997/98 season, and to compute such an ex-ante measure of the value of a goal. Ken's very interesting results are now posted on Benny's European soccer web-site (www.soccer-europe.com).

Ken's method weighted each goal based only on the score margin at the time the goal was scored. This resulted in a couple of flaws:

1) The weighting mechanism was somewhat arbitrary.
2) No extra value was given to crunch-time goals: a goal that put one team up 1-0 in the 10th minute was valued exactly the same as a goal that put a team up 1-0 in the 85th minute.

Lev Polinsky suggested that one could follow baseball's example and weight the goals by the change in the expected points gained by the scoring team as a result of the goal. Having a little spare time on my hands, I decided to embark on the task of constructing this new measure. I use Ken's data, and I thank him again for going to the trouble of putting it together.
Some interesting results come out.

2. THE NEW WEIGHTING MECHANISM

[Skip directly to Part 4 if you're allergic to numbers]

As I mentioned above, a goal should be weighted by the additional points that it is expected to give to the scoring team. For example, assume that the score is 0-0 in the 89th minute. There's a very high probability that this match will end in a draw. However, if one team scores in the 89th minute, the probability that it wins shoots up, and the probability that the match ends in a draw drops to nearly zero. Such a goal is worth approximately 2 net league points to the scoring team (a near-certain win, worth 3 points, instead of a near-certain draw, worth 1).

Formally, we can define the value of a goal scored in minute t that makes the difference in score equal to d goals as:

    Value = 3*{P[win  | min=t, diff=d] - P[win  | min=t, diff=d-1]}
          + 1*{P[draw | min=t, diff=d] - P[draw | min=t, diff=d-1]}

The first line tells us the change in the probability of winning when one scores the goal that brings the difference to d in the t-th minute; the change is multiplied by 3, given that a win is worth 3 points. The second line tells us the change in the probability of a draw, and is multiplied by one.

This looks like a very neat measure. A nice feature is that it has some real, not just nominal, meaning. In fact, when we add up the weighted goal values for each player, we get a measure of the expected number of points that a player gave to his team as a result of his goals.

Now, all we need is to construct the probability measure for every possible minute and every possible difference in scores. This can be done using Ken's data on last year's 306 Serie A matches; however, one must be a bit cautious, since the sample used to estimate these probabilities becomes quite small for large margins. On the other hand, one hopes that the resulting errors are relatively small, and that the overall measure will still be meaningful.

3. IMPLEMENTATION

In practice, I used two alternative methods to estimate the matrix of probabilities.
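As a rough illustration of the formula from Part 2, here is a minimal Python sketch. It is not the code used for this post: the function names, the toy match list and the way matches are counted (every match in which the home side held the given margin after minute t) are all assumptions made for the example. The first call plugs in the 15th-minute home-team probabilities quoted in the examples that follow; the second part estimates probabilities from invented data.

```python
def value_of_goal(p_win_after, p_draw_after, p_win_before, p_draw_before):
    """Expected league points added by a goal: 3 per win, 1 per draw."""
    return 3 * (p_win_after - p_win_before) + 1 * (p_draw_after - p_draw_before)


def outcome_probs(matches, minute, diff):
    """Estimate (P[win], P[draw]) for the home side, given that it leads by
    `diff` goals after `minute` minutes. Returns None if no match qualifies.

    Each match is a list of goals (minute, side): side is +1 for a home
    goal and -1 for an away goal. This data layout is hypothetical.
    """
    wins = draws = n = 0
    for goals in matches:
        margin_at_t = sum(side for m, side in goals if m <= minute)
        if margin_at_t != diff:
            continue  # this match was not in the state we condition on
        n += 1
        final_margin = sum(side for _, side in goals)
        if final_margin > 0:
            wins += 1
        elif final_margin == 0:
            draws += 1
    if n == 0:
        return None
    return wins / n, draws / n


# Invented matches, for illustration only (not Serie A data).
toy_matches = [
    [(10, +1), (80, -1)],            # home led 1-0 early, ended in a draw
    [(5, +1)],                       # home led 1-0 early, home win
    [(12, +1), (50, +1), (70, -1)],  # home led 1-0 early, home win
    [(30, -1)],                      # level after 15 minutes, away win
    [],                              # level after 15 minutes, 0-0 draw
    [(40, +1)],                      # level after 15 minutes, home win
    [(20, +1), (60, -1), (85, -1)],  # level after 15 minutes, away win
]

# Probabilities quoted in the post for a home goal in the 15th minute.
home_15 = value_of_goal(0.627, 0.302, 0.455, 0.295)
print(round(home_15, 3))  # close to the quoted 0.525; inputs are rounded

# Same calculation on probabilities estimated from the toy data.
up_by_one = outcome_probs(toy_matches, 15, 1)
level = outcome_probs(toy_matches, 15, 0)
toy_value = value_of_goal(up_by_one[0], up_by_one[1], level[0], level[1])
print(round(toy_value, 3))
```

With only a handful of matches per state, estimates like these are very noisy, which is the small-sample caution raised at the end of Part 2.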
The first method estimates different probabilities for home and away teams: this is actually a refinement of the approach described above, as it also gives extra weight to away goals. On the other hand, more data is needed to estimate the probabilities accurately: some of the goals scored in the very first minutes may receive an inaccurate weight. The second method ignores the distinction between home and away goals, which leaves more data for estimating each probability.

Here are some examples for the first method. In the 15th minute:

    P[home win | home team up by one]  = 62.7%
    P[draw     | home team up by one]  = 30.2%
    P[home win | score level]          = 45.5%
    P[draw     | score level]          = 29.5%

A quick application of the above formula reveals that the value of a goal that puts the home team up by one goal in the 15th minute is 0.525.
[0.525 = 3*(0.627-0.455) + 1*(0.302-0.295), up to rounding of the probabilities shown]

What about a goal by the away team in the 15th minute? The probability of an away win given that the away team is up by one goal is 57.1%, and that of a draw is 22.9%. The probability of an away win given that the score is level in the 15th minute is 25%, and that of a draw is 29.5%. Hence, the value of a goal that puts the away team up by one in the 15th minute is 0.898.

The second method (partially) ignores the distinction between home and away teams. The probability that a team up by 1 in the 15th minute eventually wins the game is 60.3%, and the probability that it draws is 26.9%. The probability that the home team wins given that the score is level in the 15th minute is 45.5%, and the probability that it draws is 29.5%. The probability that the away team wins given that the score is level in the 15th minute is 25%, and the probability that it draws is 29.5%.
Hence, the value of a goal that puts a team up by one in the 15th minute is 0.416 if the goal is scored by the home team, and is 1.032 if the goal is scored by the away team.

[Wait a minute: didn't I just say that in this method the distinction between home and away teams is immaterial? Well, it is in the "general" case, but one needs to make a distinction between home and away teams when calculating goals that either break a tie, or create a tie. The reason for this is that the probability that a team wins (or loses) given that the score is level is not well defined].

Without going into all the details, let's look at some more goal values for other types of goals:

                                        Method 1         Method 2
                                       Home   Away      Home   Away
Puts team 1 goal up in 15th minute    0.525  0.898     0.416  1.032
Puts team 1 goal up in 75th minute    1.266  1.480     1.236  1.530
Puts team 2 goals up in 62nd minute   0.397  0.694     0.520  0.520
Equalizes in 88th minute              1.003  0.879     1.011  0.874

These are just some examples, but they give the main idea: goals scored later in the game are valued more; goals that put the team 1 goal up are more valuable than equalizers and goals that put the team 2 goals up. Note also that the home equalizer in the 88th minute is more valuable than the away equalizer in the 88th minute: this is because home teams are more likely to score a winning goal in the final minutes (including injury time).

4. RESULTS

So what does all this give us? Here is a table that ranks Serie A's leading scorers in 1997/98 according to different methods:

                  (1)    (2)    (3)      (4)      (5)    (6)
                  Raw   PAS1   PAS2  Expost1  Expost2    KOV
Bierhoff,O.        27  19.98  20.09    30.50    45.00  19.17
Ronaldo,L.         25  16.95  16.25    27.00    32.05  16.45
Baggio,R.          22  14.44  14.32    16.30    26.25  14.25
Batistuta,G.       21  15.81  15.23    14.35    20.05  14.23
Del Piero,A.       21  11.78  10.71    22.40    27.85  14.50
Montella,V.        20  15.05  15.44    18.77    21.00  14.83
Inzaghi,F.         18  15.28  15.20    19.10    25.25  12.67
Hubner,D.          16  11.41  10.64     8.75     8.25  11.50
Oliveira,L.        15  11.27  12.19    14.35    10.55  10.42
Balbo,A.           14   9.11   8.09    10.25    18.75   8.12
Esposito,C.        14   8.49   6.43    10.05    10.60   7.92
Totti,F.           13  10.64  11.44    10.68    11.60   9.58
Andersson,K.       12   7.67   7.47     8.70    15.50   7.42
Crespo,H.          12   9.77   9.11    11.75    11.25   9.08
Paulo Sergio,S.    12   7.89   7.71     8.88     8.30   7.17
Nedved,P.          11   8.24   8.01    12.25    16.75   8.33
Bellucci,C.        10   7.61   7.03     3.00     3.00   6.53
Boksic,A.          10   3.79   4.35    12.25    13.25   5.67
Chiesa,E.          10   7.30   7.49     9.50    10.50   7.33
Palmieri,F.        10   6.25   6.48     7.25     9.00   5.42
Poggi,P.           10   8.05   7.74     8.25    13.75   6.95
Weah,G.            10   7.40   8.14     8.50     6.50   6.83

The first column shows the raw goal total. Columns (2) and (3) show the two new measures described above. Columns (4) and (5) show my initial ex-post measures (you can look at the detailed description in my article dated May 27 1998), and column (6) presents Ken Overton's measure.

[Formulas:

    (4) weight = 3*(1/finalA)            if finalA>finalB
                 1*(1/finalA)            if finalA=finalB
    (5) weight = 3*(1/(finalA-finalB))   if finalA>finalB
                 1*(1/finalA)            if finalA=finalB
    (6) weight = 1/diff                  if diff>=1
                 1/abs(diff-2)           if diff<=0

where finalA is the final number of goals of the scoring team, finalB is the final number of goals of the conceding team, and diff is the current difference in score generated by a goal. The formulas for columns (2) and (3) are given above.]

This table presents some interesting results. The first two places in the ranking are unchanged, no matter what measure we use. However, as we go down the ranking, some interesting things happen. For example, Batistuta, who ranked very low based on the ex-post measures, redeems himself when we use the new measures. This is because Bati tended to score often in big Fiorentina wins last year. These goals received little value in the ex-post measures, but the ex-post measure ignored the fact that Bati was often the one who opened the flood gates for Fiorentina's success.

On the other hand, Del Piero does pretty badly in the new ranking. My guess is that many Del Piero goals were at home and several came late in the game.
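One quick way to compare how "heavy" each player's goals were is to divide a value-weighted total by the raw goal count, giving expected points per goal. The sketch below does this for a handful of rows copied from the table; the choice of players and the names `rows` and `points_per_goal` are only for illustration.

```python
# (raw goals, PAS1, PAS2) for a few players, copied from the table above.
rows = {
    "Inzaghi": (18, 15.28, 15.20),
    "Totti": (13, 10.64, 11.44),
    "Crespo": (12, 9.77, 9.11),
    "Del Piero": (21, 11.78, 10.71),
    "Boksic": (10, 3.79, 4.35),
}


def points_per_goal(raw_goals, weighted_total):
    """Average expected points contributed by each goal."""
    return weighted_total / raw_goals


for name, (raw, pas1, pas2) in rows.items():
    print(f"{name:10s} PAS1 {points_per_goal(raw, pas1):.2f}"
          f"  PAS2 {points_per_goal(raw, pas2):.2f}")

# Players in this subset with the heaviest and lightest goals under PAS1.
heaviest = max(rows, key=lambda n: points_per_goal(rows[n][0], rows[n][1]))
lightest = min(rows, key=lambda n: points_per_goal(rows[n][0], rows[n][1]))
print(heaviest, lightest)
```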
Inzaghi, by contrast, scores high with the new measure: under PAS1 his goals are the heaviest in the table, in the sense that they have the highest ratio of value-weighted to actual goals (15.28/18, or about 0.85 points per goal). Boksic stands out for his very low value-weighted total. Four of his ten goals were scored in the final minutes, and received a weight of zero. Compare that to his ex-post total, which places him high in the ranking, mostly because his goals coincided with Lazio's period of grace.

Finally, Totti and Crespo scored relatively high-powered goals, and Hubner's and Bellucci's goals, which received very little value in the ex-post measures, are revalued using the new method, even if in the end these goals didn't save their teams from relegation.

5. CONCLUSION

If you've read this far, you must be a very brave person, or a total football and statistics junkie like myself. Well, I hope you found these stats interesting, and I'm looking forward to hearing comments and suggestions.

Daniele