Relative Importance Value (RIV) (June 1 2017)
I’m a City fan and a maths analyst, so as it's the off season and I was bored at the weekend (and inspired by the book Soccermatics by David Sumpter), I thought I'd take a quick look over our results in the hope of finding something reasonably interesting. The result is what I've called a player's “Relative Importance Value” (RIV), i.e. a measure of whether we have performed better when starting a player or when not starting him. To do this I used the ratio between the points gained per start of a player and the points per missed game*.
You could compare two players by their points per game (PPG) values, but that alone tells you nothing about how the team performed when those players weren't playing. If Player A achieved a PPG of 1.2, which seems better than Player B's 1.1 PPG, then we would say that the team did better when Player A played. However, if the team achieved 1.4 PPG when Player A did not play and 0.8 PPG when Player B did not play, we could assume that Player B was a more integral part of the side than Player A.
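As a rough illustration, the Player A and Player B comparison can be worked through in a few lines of Python. The variable names are made up for this sketch, and the figures are the hypothetical ones from the paragraph above, not real player data.

```python
# Hypothetical figures from the Player A / Player B comparison
ppg_a_started, ppg_a_missed = 1.2, 1.4
ppg_b_started, ppg_b_missed = 1.1, 0.8

# Ratio of points per start to points per missed game
ratio_a = ppg_a_started / ppg_a_missed
ratio_b = ppg_b_started / ppg_b_missed

print(f"Player A: {ratio_a:.3f}")  # below 1: team did better without him
print(f"Player B: {ratio_b:.3f}")  # above 1: team did better with him
```

Player A has the higher PPG, but his ratio comes out below 1 while Player B's comes out above 1, which is the reversal the comparison is pointing at.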
The most basic interpretation of the RIV score is that if a player's score = 1, the team performs equally well whether or not the player starts. A value greater than 1 indicates that the team achieves more points when this player starts, and a value less than 1 indicates the opposite, so the team performs worse when the player starts. My interpretation is that when the score is high, a player fits into the system that the manager picks and the team benefits. A low score does not necessarily indicate that the player is bad (though in some cases it may), but that they do not fit or work well in the system that has been picked.
Some caveats though. This formula is very basic and I really do appreciate that there are multiple levels of complexity that can be added to give a better overall picture of the squad. An assumption I've made is that substitutions do not affect the outcome of the match and essentially we live and die by the team that kicks off. Of course, in reality this isn't the case and substitutions do occasionally have an effect (Rotherham at home?) and it's something that I'd like to develop. Something else you'll notice from the graphs is the lack of any bar for Aden Flint. This is simply because he played every game, which means there isn't anything to divide his points per game by.
Results
On to the results. I've uploaded a couple of graphs to the link below. The first ranks the players by their RIV score; in the second I've tried to group them by position and then rank them.
Note: Engvall and McCoulsky are there because their squad numbers were between some of the other players' numbers. Djuric's bar is blank because we lost the three games that he started in.
Are the results what you might have thought? Generally most of the graph is as expected, especially with Giefer and Adam Matthews down the bottom! Interestingly, our most effective performers are Matty Taylor and Jamie Paterson. If you remember, in the short amount of time that Tammy didn't play, we didn't fare too badly, which is reflected by his score of 1.2. A few other things to note:
· Pack and Smith score highly. Smith started 14 of 21 games in the same side as Pack, so I believe there is a relationship between their scores. Smith's score could have been brought down by the 7 games he started without Pack.
· Fielding may be shaky, but he's no Giefer!
· The general improvement with Wright in defence over Magnusson.
· Bobby Reid, Callum O'Dowda and Zak Vyner are not starters at the moment, and Wilbs might be more suited to appearances off the bench!
· The performance of the side is very slightly better when Tomlin does not play.
· Hopefully we'll find a place for Hegeler to fit next year.
· Joe Bryan with a score less than 0.8??
The Curious Case of Joe Bryan
On the back of that bullet point, I tried to work out why Joe Bryan's score was so low, since this is a player who has started the majority of games this season (39) and allegedly been scouted by Premier League clubs. Surely he must be more important to us than that? It occurred to me that he is probably the only player in the side whose position is split between two: left back and left midfield. I've gone back through all of the starting lineups to find the games where I've assumed him to have started at left back or left midfield. The result is the following graph:
The difference between the performance of the team in Joe's two positions is pretty clear. We seem to struggle when he plays in midfield but his importance at left back puts him third in the overall squad ranking (ignoring Flint!).
Summary
The RIV is a basic measure of the effectiveness of starting particular players. It doesn't account for the strength of the opposition, the subs made or the overall context of the match, and in that sense it is very basic. It is no coincidence, though, that the top players in the squad are those that finished the season so strongly, and it is interesting to see the positive effect that Matty Taylor and Bailey Wright had on the club and the performances of the team.
Hopefully this little bit of analysis has been interesting. I look forward to reading, replying and learning from your thoughts. I'm happy to send the data file I've created to anyone who wants it, since it's all openly sourced from the BBC website; and if there's anything else of interest that I could post, just ask and I'll let you know if I can do it!
*The Method
I've been through the team selection (from the BBC match reports) for every league match this season and noted the number of points the team earned in the games each player started, then divided this by the number of games they started to find their points gained per start. Next I took the difference between the total number of points City achieved and the points earned in that player's starts, and divided that by the number of games they didn't start to find the points per missed game. The ratio of these two numbers is the RIV.
For example: The team earned 35 points in the 30 games that Lee Tomlin started. 35/30 = ~1.17 points per start. This then means that we got 19 points in the 16 games he didn't start. 19/16 = ~1.19 points per missed game. RIV = 1.17/1.19 = 0.98
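For anyone who wants to reproduce the sums, here is a minimal Python sketch of the calculation from season totals. The function name `riv_from_totals` and its parameters are invented for this illustration and are not taken from the original data file. The team totals of 54 points from 46 games are just the sums 35 + 19 and 30 + 16 from the Tomlin example. Returning `None` when the ratio can't be calculated is an assumption about how to treat an ever-present player such as Flint.

```python
def riv_from_totals(points_in_starts, starts, team_points, team_games):
    """Return the RIV from season totals, or None if it can't be calculated."""
    missed_games = team_games - starts
    missed_points = team_points - points_in_starts
    # No starts, no missed games, or no points without the player:
    # one of the divisions would be undefined.
    if starts == 0 or missed_games == 0 or missed_points == 0:
        return None
    points_per_start = points_in_starts / starts
    points_per_missed_game = missed_points / missed_games
    return points_per_start / points_per_missed_game


# Lee Tomlin: 35 points from 30 starts, out of 54 points in 46 games
tomlin = riv_from_totals(35, 30, 54, 46)
print(round(tomlin, 2))  # 0.98

# A player who started all 46 games has no missed games to compare against
print(riv_from_totals(54, 46, 54, 46))  # None

# Three starts, all lost: the score is zero, hence a blank bar
print(riv_from_totals(0, 3, 54, 46))  # 0.0
```

Working from unrounded values gives about 0.982 for Tomlin, which agrees with the 0.98 above to two decimal places.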