Trying to Find Midfielders

With midfielders often being the most involved in games, looking for midfielders can be tough. Midfielders are often expected to be more well-rounded than other positions, meaning it’s even more important to holistically assess their skill set as opposed to just looking at metrics in isolation.

With this in mind I thought it’d be fun to try and find some midfielders and look into how to assess players using multiple metrics. The basis for this piece is pretty much just me finding out what z-scores are and going crazy applying them.

I’m not going to try and look at a player’s entire skill set/output. To be more specific, I want to find a certain type of midfielder. The word I had in mind was ‘progressive’, but having looked up its definition I’m not so sure it’s what I was looking for.

I want to find midfielders who progress the ball into dangerous areas and resist pressure, basically midfielders who can dribble and get the ball into the final third – as opposed to just taking the easy pass.

Before starting, a quick note on the xP stuff. Since updating my data I’ve had problems with my xP method. I overwrote my last method thinking I’d remember what I did (I didn’t), so I had to start from scratch. I tried to recreate it but it didn’t go to plan, as everyone with over 500 passes was completing a lot more than ‘expected’.

In the end I settled on a logistic regression with a load of features, which gives okay results and which I’ll stick with for the time being. It does mean I won’t be able to compare these xP values with older values from previous pieces.
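For anyone unfamiliar with the idea, here is a minimal sketch of what a logistic regression xP model does, written in plain Python. It is not the model used in this piece: the single feature (a pass length scaled to 0-1), the toy data and the gradient descent fitting are all assumptions made purely for illustration.

```python
import math

def predict(weights, row):
    """Completion probability for one pass; weights[0] is the intercept."""
    z = weights[0] + sum(w * v for w, v in zip(weights[1:], row))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, lr=1.0, epochs=5000):
    """Fit a logistic regression by batch gradient descent on the log-loss."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grads = [0.0] * len(weights)
        for row, label in zip(rows, labels):
            error = predict(weights, row) - label
            grads[0] += error
            for j, value in enumerate(row):
                grads[j + 1] += error * value
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grads)]
    return weights

# Hypothetical training data: one feature per pass (scaled pass length),
# label 1 if the pass was completed and 0 if not.
rows = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.7], [0.8]]
labels = [1, 1, 1, 0, 1, 0, 0, 0]

weights = fit_logistic(rows, labels)
short_pass = predict(weights, [0.1])  # expected completion chance, short pass
long_pass = predict(weights, [0.8])   # expected completion chance, long pass
```

A real model would use many more features and a proper library, but the output is the same kind of thing: a completion probability for every pass, which can then be summed into ‘expected’ completions.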

The new method also seems to be harsher than the older method, with the max xP Rating going down from around 1.2 to around 1.1. With that in mind, while I’ll use them, I won’t put too much weight into these xP Ratings, as I feel like I need to have a thorough look at the model.

Another bit of information: I’ve added Ligue 1, Liga NOS and Eredivisie data to my collection, though there are 6 missing games in the Ligue 1 data. I have my suspicions about which ones they might be but I haven’t properly dug through yet, so keep in mind some of the p90 values in Ligue 1 might be skewed.

Finally, in my data ‘midfielder’ includes some wingers, so expect to see some wide players dominating the dribbling metrics. It’s not ideal, but I’m too lazy to go through and add whether each one is a wide player or not.

What are we looking for?

Before delving into the data I’ve got to outline what I want. To make it easier I’ve decided to separate the types of midfielder I’m looking for into 2 simple groups – deeper lying midfielders and more advanced.

The deeper lying midfielders should be comfortable collecting the ball from defence, resisting pressure from the opposition and then distributing – preferably into dangerous areas. This means things like xP Rating and Vertical Passing are more important, as we want players who can efficiently move the ball and find players in dangerous areas.

For the sake of this piece I’m only going to look at them on the ball, as opposed to also looking at things like defensive contribution.

Bringing this together the kind of things we’ll be looking at are: xP Rating, vertical passing and take-ons in deeper areas.

Then, for the more advanced midfielder we want players who can take players on and move the ball into threatening areas. This means we’ll be looking at things such as: take-ons in advanced areas, passes into the box, vertical passes in advanced areas.

All this is fairly obvious: we want the deeper lying player to connect the defence and attack, while the more advanced players should be those using the ball in the final third.

In terms of the methods for using z-scores, I calculated each player’s z-score for each metric, then added together these z-scores for different groups. I’m not sure if this is a good idea or not (it really feels like it isn’t), but it’s an attempt to gauge whose overall output is best when it comes to different areas.

What I’m using to convince myself it isn’t terrible is that if you sum the z-scores you’re summing how far above average they are with more relative/normalised numbers, so it’ll give you an overall look of how far above average they’re performing across multiple numbers.
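As a rough sketch of the summing described above, in plain Python: the metric names and numbers are made up, and using the population standard deviation is an assumption (the sample standard deviation would shift every score slightly but not the ordering).

```python
from statistics import mean, pstdev

def z_scores(values):
    """z-score of each value: (value - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def group_z_score(players, metrics):
    """Sum each player's z-scores across all the metrics in one group."""
    names = list(players)
    totals = {name: 0.0 for name in names}
    for metric in metrics:
        column = [players[name][metric] for name in names]
        for name, z in zip(names, z_scores(column)):
            totals[name] += z
    return totals

# Hypothetical players and values, purely for illustration.
players = {
    "Player A": {"xp_rating": 1.05, "vertical_p90": 6.0},
    "Player B": {"xp_rating": 1.00, "vertical_p90": 4.0},
    "Player C": {"xp_rating": 0.95, "vertical_p90": 2.0},
}
passing = group_z_score(players, ["xp_rating", "vertical_p90"])
```

Each metric is put on the same ‘standard deviations from average’ scale before adding, so a metric measured in single digits per 90 doesn’t drown out one that sits around 1.0. Note that a group with more metrics will naturally produce bigger totals than a group with fewer.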

Deeper Lying Midfielders

Starting with deeper lying midfielders, as mentioned above, we want someone who can efficiently distribute the ball and also resist pressure, with bonuses for those who use their dribbling to advance the ball as well.

With this in mind, the first group is deeper passing. It looks at xP Rating, vertical passes, vertical passes into the final third and, after clustering vertical passes, the ones attempted in a player’s own half and their completion of them compared to average. The full list of metrics used is:

  • xP Rating
  • Vertical passes from own half attempted and completed p90
  • Vertical passes from own half ‘expected’
  • Vertical passes from own half rating (actual completions / expected completions)
  • Vertical passes attempted and completed p90
  • Vertical passes to final third attempted and completed p90

It’s mostly looking at vertical passes as I wasn’t sure on the best way to look at how ‘progressive’ other passes are, whereas vertical passes help what we’re looking for as they both move the ball forward and exclude passes that go into wider areas.
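To make the ‘rating’ metrics in the list above concrete, here is a small sketch. The ratio itself (actual completions / expected completions) is as described; treating expected completions as the sum of each pass’s modelled completion probability is an assumption, and the passes are invented.

```python
def pass_rating(passes):
    """passes: list of (completed, completion_probability) pairs.

    Returns actual completions divided by expected completions, where
    expected completions is assumed to be the sum of the probabilities.
    """
    actual = sum(1 for completed, _ in passes if completed)
    expected = sum(probability for _, probability in passes)
    if expected == 0:
        return None  # no passes, so no rating
    return actual / expected

# Four hypothetical passes: three completed, 2.5 'expected' to be completed.
passes = [(True, 0.9), (True, 0.6), (False, 0.5), (True, 0.5)]
rating = pass_rating(passes)
```

A rating above 1 means the player completed more passes than the model expected, and below 1 means fewer.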

The next step was to look at their deeper dribbling, which is just looking at take-ons that happen in the first 60% of the length of the pitch and within the width of the 18-yard-box. The metrics here were just their attempted take-ons, successful take-ons and take-on success percentage in that deeper area.
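A minimal sketch of that zone filter and the three dribbling metrics follows. The coordinate system is an assumption: x and y both run from 0 to 100 with x increasing towards the opposition goal, and the 18-yard box is taken to span roughly 21.1 to 78.9 across the pitch. Whether the 60% line itself counts as ‘deeper’ is also a guess.

```python
BOX_Y_MIN, BOX_Y_MAX = 21.1, 78.9  # assumed width of the 18-yard box (0-100 scale)
DEEP_X_MAX = 60.0                  # first 60% of the length of the pitch

def in_deeper_zone(x, y):
    """True if a take-on location is in the deeper, central zone."""
    return x <= DEEP_X_MAX and BOX_Y_MIN <= y <= BOX_Y_MAX

def deeper_take_on_metrics(take_ons, minutes):
    """Attempted p90, successful p90 and success % for deeper take-ons."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    deep = [t for t in take_ons if in_deeper_zone(t["x"], t["y"])]
    attempted = len(deep)
    successful = sum(1 for t in deep if t["success"])
    per_90 = 90.0 / minutes
    return {
        "attempted_p90": attempted * per_90,
        "successful_p90": successful * per_90,
        "success_pct": 100.0 * successful / attempted if attempted else None,
    }

# Hypothetical take-ons for one player over 180 minutes.
take_ons = [
    {"x": 35.0, "y": 50.0, "success": True},
    {"x": 55.0, "y": 30.0, "success": False},
    {"x": 72.0, "y": 50.0, "success": True},  # too far up the pitch
    {"x": 40.0, "y": 10.0, "success": True},  # too wide
]
metrics = deeper_take_on_metrics(take_ons, minutes=180)
```

Swapping the length condition to `x > DEEP_X_MAX` would give the matching filter for the last 40% of the pitch.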

The z-scores for these two groups should hopefully give an overall picture of a player’s output in comparison with others, as opposed to comparing each metric individually.

Jumping right in and plotting this gives us a few highlights. The graph below shows their combined deeper dribbling and passing z-scores.

One of the first names to jump out is Jorginho. Unsurprisingly when looking at a passing metric, the Napoli midfielder is in a league of his own. Dribbling isn’t a big part of his game, attempting just 0.552 take-ons p90, but that shouldn’t detract from his extraordinary passing numbers.

Another name to stand out is 20-year-old Frenkie de Jong. He’s appeared in both defence and midfield, more often in defence, but that makes his dribbling numbers look even better.

Obviously these numbers come playing for a dominant side in a weaker league, so they shouldn’t be directly compared with those putting up similar numbers in stronger leagues, but De Jong is still a player with huge potential.

A big part of De Jong’s game is his dribbling. He’s attempted the second most take-ons in deeper areas this season, behind only Eden Hazard. Equally as impressive is his success rate of 96.2%, the highest of all players who attempt more than 1 take-on in a deeper area p90.

@finalthrd recently made a great video looking at De Jong, which can be seen below:

A lot of other names that perform well tend to be the usual suspects when looking at midfielders.

If we define the top corner as a passing z-score of 45 and a dribbling z-score of 15, then take the straight-line distance (the hypotenuse) from each player’s point to that corner, the ten closest are:

Player – Team
Jorginho – Napoli
Frenkie de Jong – Ajax
David Silva – Manchester City
Marco Verratti – PSG
Andres Iniesta – Barcelona
Cesc Fabregas – Chelsea
Marek Hamsik – Napoli
Mousa Dembele – Tottenham
Nuri Sahin – Dortmund
Mesut Ozil – Arsenal
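The ‘closest to the top corner’ ranking used for this list can be sketched as below. The corner of (45, 15) comes from the text; the players and their scores are hypothetical.

```python
import math

def closest_to_corner(scores, corner, n=10):
    """Rank players by straight-line distance to a target corner.

    scores: {name: (passing_z, dribbling_z)}
    corner: (passing_z, dribbling_z) of the 'ideal' top corner
    """
    def distance(item):
        _, (passing, dribbling) = item
        return math.hypot(corner[0] - passing, corner[1] - dribbling)

    ranked = sorted(scores.items(), key=distance)
    return [name for name, _ in ranked[:n]]

# Hypothetical combined z-scores, purely for illustration.
scores = {
    "Player A": (42.0, 11.0),  # distance 5 from the corner
    "Player B": (10.0, 14.0),  # great dribbling score, far off on passing
    "Player C": (30.0, 15.0),  # distance 15 from the corner
}
top = closest_to_corner(scores, corner=(45.0, 15.0), n=2)
```

Because the passing axis covers a much wider range than the dribbling axis, the distance is mostly driven by passing, which is worth keeping in mind when reading the list.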

Repeating the above process for those under 24 (born after 1/6/1993) gives us the following 10 players:

Player – Team
Frenkie de Jong – Ajax
Giovani Lo Celso – PSG
Philippe Sandler* – PEC Zwolle
Filip Krovinovic – Benfica
Renato Tapia – Feyenoord
Naby Keita – RB Leipzig
Adrien Rabiot – PSG
Adrien Tameze – Nice
Lucas Torreira – Sampdoria
Fredrik Midtsjo – AZ

* Sandler has played centre-back this season.
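The under-24 cut is just a date-of-birth filter. A small sketch, assuming 1/6/1993 is read day first (1 June 1993) and using made-up birth dates:

```python
from datetime import date

CUTOFF = date(1993, 6, 1)  # assumption: 1/6/1993 means 1 June 1993

def under_24(players, cutoff=CUTOFF):
    """Keep only players born strictly after the cutoff date."""
    return [p for p in players if p["born"] > cutoff]

# Hypothetical players, purely for illustration.
players = [
    {"name": "Player A", "born": date(1997, 5, 12)},
    {"name": "Player B", "born": date(1991, 2, 14)},
    {"name": "Player C", "born": date(1993, 6, 1)},  # born on the cutoff, excluded
]
young = [p["name"] for p in under_24(players)]
```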

From the above, Filip Krovinovic is an interesting name. Although he’s currently out with a cruciate ligament injury, he’s had some impressive numbers when he has played this season. @TiagoEstv posted a great thread including a couple of videos about Krovinovic, which can be seen below:

Other young players who impress, but don’t quite make the top 10, include Celta’s Stanislav Lobotka, Villarreal’s Rodri, Bournemouth’s Lewis Cook and Lyon pair Houssem Aouar and Tanguy Ndombele.

This can be seen more when looking at a graph of the under-24s:

One of the problems with the above is that it looks at activity as opposed to performance. You’d expect a player who makes more passes to make more vertical passes.

It could be interesting to look purely at performance numbers, such as the percentage of passes which are vertical or how successful they are with take-ons.

Obviously, the problem with this is that those with a small output could be overrated. However, combining the above, or just filtering out those with certain numbers, with a more performance based approach could produce some interesting results.

Looking firstly at deeper passing, it shakes things up a bit. Using a combined z-score of their xP Rating, the percentage of their passes which are vertical and the rating of vertical passes originating in their own half gives the following top 10. To weed out those with minimal attempted passes, this only includes players with a z-score greater than -0.5 for attempted numbers, which is half a standard deviation below average:

Player – Team
Cesc Fabregas – Chelsea
Frenkie de Jong – Ajax
Alex Oxlade-Chamberlain – Liverpool
Philippe Coutinho – Liverpool
David Silva – Manchester City
Mousa Dembele – Tottenham
Philippe Sandler* – PEC Zwolle
Lasse Schone – Ajax
Eduardo – Estoril
William Carvalho – Sporting

* Sandler has played centre-back this season.

While far from perfect, this starts to reward players for how efficiently they use the ball as opposed to how often.
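Here is a rough sketch of that efficiency ranking with the volume filter. The -0.5 threshold is from the text; the metric names, the numbers and the choice to compute z-scores across all players before filtering are assumptions.

```python
from statistics import mean, pstdev

def z_scores(values):
    """z-score of each value, using the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]

def efficiency_ranking(players, efficiency_metrics, volume_metric,
                       min_volume_z=-0.5):
    """Rank by summed efficiency z-scores, dropping low-volume players."""
    names = list(players)
    volume = [players[name][volume_metric] for name in names]
    volume_z = dict(zip(names, z_scores(volume)))
    totals = {name: 0.0 for name in names}
    for metric in efficiency_metrics:
        column = [players[name][metric] for name in names]
        for name, z in zip(names, z_scores(column)):
            totals[name] += z
    # Keep only players above the volume threshold, best total first.
    kept = [name for name in names if volume_z[name] > min_volume_z]
    return sorted(kept, key=lambda name: totals[name], reverse=True)

# Hypothetical players: C has the best efficiency numbers but barely passes.
players = {
    "A": {"passes_p90": 60, "xp_rating": 1.02, "vertical_pct": 12.0},
    "B": {"passes_p90": 50, "xp_rating": 1.05, "vertical_pct": 15.0},
    "C": {"passes_p90": 10, "xp_rating": 1.10, "vertical_pct": 30.0},
    "D": {"passes_p90": 40, "xp_rating": 0.95, "vertical_pct": 8.0},
}
ranking = efficiency_ranking(players, ["xp_rating", "vertical_pct"], "passes_p90")
```

Player C would top the list on efficiency alone, but is removed for sitting well over half a standard deviation below average on volume.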

The graph below shows a comparison of the two methods, with the original on the y-axis and performance on the x-axis. You can see it’s now a lot closer among the top few, rather than having Jorginho all on his own.

Given the only performance metric for take-ons is the success rate, it’s not really worth running through a similar process. Plotting the success rate (or successful take-ons) vs the number of attempts will tell you the same information.

Plotting the dribbling z-score against the new passing efficiency z-score gives the following:

Frenkie de Jong does ridiculously well again, while among the top 5 leagues Mousa Dembele seems to be the exact type of midfielder we’re looking for. He excels with his passing efficiency and dribbling, attempting the 5th most take-ons in the 7 leagues used with the 2nd highest success rate (for those attempting at least 1 p90) at 95.2%.

Young English pair Lewis Cook (who is 3rd in the Premier League for percentage of passes vertical) and Ruben Loftus-Cheek are also in a good position, being significantly above average in both areas, while again Ligue 1 youngsters Giovani Lo Celso and Tanguy Ndombele have some great numbers.

Advanced Midfielders

Moving further up the pitch now, we want people who can collect the ball from the above players and be a serious threat to the opposition.

Similarly to above we have two groups, advanced passing and advanced dribbling. Advanced passing contains:

  • Passes into the box attempted and successful p90
  • Passes into the box success rate
  • Advanced vertical passes attempted p90, successful p90, expected p90 and rating (successful completions / expected completions)

Then the dribbling just looks at take-ons in the last 40% of the length of the pitch and again just the width of the box.

Doing the same as above and jumping in to plot the two groups gives the following graph:

Straight away you can see the trio of Kevin de Bruyne, Mesut Ozil and David Silva excel in passing, and Eden Hazard and Yacine Brahimi do the same for dribbling. Philippe Coutinho (using his Liverpool data) and Hakim Ziyech do well at both.

Ziyech obviously has the same situation as Frenkie de Jong where he’s playing for a strong team in a weaker league, so those numbers shouldn’t be directly compared with the top 5 leagues. So, suggesting Ziyech as a Coutinho replacement may not be the best of ideas despite their placement on this graph.

Another player with impressive numbers in the Eredivisie is Martin Odegaard. Despite only just turning 19 years old, Odegaard has been referred to as forgotten after amassing a huge amount of hype in his (even) younger years.

While I haven’t seen him play for Heerenveen, his numbers are impressive. His advanced passing z-score is the highest in the 7 leagues used for those under 24, as are his numbers for attempted passes into the box p90 and advanced vertical passes p90.

Defining the top right corner as 31 for Advanced Passing Z-Score and 14 for Advanced Dribbling Z-Score, the top 10 closest to that point are:

Player – Team
Hakim Ziyech – Ajax
Philippe Coutinho – Liverpool
David Silva – Manchester City
Eden Hazard – Chelsea
Mesut Ozil – Arsenal
Kevin de Bruyne – Manchester City
Cesc Fabregas – Chelsea
Martin Odegaard – Heerenveen
Isco – Real Madrid
Andres Iniesta – Barcelona

Then repeating this for those under 24 gives the following:

Player – Team
Martin Odegaard – Heerenveen
Naby Keita – RB Leipzig
Alex Oxlade-Chamberlain – Liverpool
Giovani Lo Celso – PSG
Goncalo Guedes – Valencia
Alex Iwobi – Arsenal
Bernardo Silva – Manchester City
Sofiane Boufal – Southampton
Frenkie de Jong – Ajax
Yassin Ayoub – FC Utrecht

Odegaard shines again, while it also gives Liverpool even more reason to be excited for Naby Keita, particularly when both he and Alex Oxlade-Chamberlain have great numbers.

It’s also incredible Frenkie de Jong does so well considering he’s spent a large portion of his time in defence, yet has some of the best advanced numbers in the top 7 leagues this season.

Following the same process as above, the next step would be to look at performance as opposed to activity.

So, instead of using number of passes we’ll use xP Rating, pass into the box success rate, advanced vertical passes rating and percentage of passes which are advanced vertical passes.

Looking just at passes, and again trying to weed out those with a tiny sample size, gives the following top 10:

Player – Team
Arjen Robben – Bayern
Karim Bellarabi – Leverkusen
Eden Hazard – Chelsea
Goncalo Guedes – Valencia
Bart Ramselaar – PSV
Philippe Coutinho – Liverpool
Alex Oxlade-Chamberlain – Liverpool
Bernardo Silva – Manchester City
Filip Krovinovic – Benfica
Michael Vlap – Heerenveen

From this list, it’s interesting to see both Oxlade-Chamberlain and Krovinovic pop up again having also impressed with the deeper numbers.

Plotting this shows Robben is some way in front, with the rest a lot closer together.

Bringing this and the dribbling together to plot dribbling and passing efficiency gives the following:

Eden Hazard, Philippe Coutinho and Yacine Brahimi really shine in this category, and stylistically are pretty much what we’re looking for: players who can pick the ball up and use it in tight spaces, creating dangerous situations in the final third.

Some other names worth mentioning include, again, Ruben Loftus-Cheek, who is putting forward some great numbers on loan at Crystal Palace this season. Goncalo Guedes has also done great on loan at Valencia (albeit from out wide), and it’ll be interesting to see what happens with him at his parent club PSG in the summer.

Bringing it all together

Finally, I thought it’d be fun to try and normalise the numbers to give an attacking (advanced) and defensive (deeper) ‘score’ to see how they compare, and possibly find someone who does well at both. Then I’ll suggest some players who have some interesting numbers.

I used the STANDARDIZE function in Excel to normalise the numbers, which, having looked at the formula, is just creating another z-score. I then added the standardized numbers together as a kind of attacking score and defensive score. I originally just added the passing and dribbling numbers, but given the difference in the scales on the graphs above you can see this wasn’t the best of ideas.
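To show why standardising before adding matters, here is a small sketch with invented group totals. The `standardize` helper mirrors what Excel’s STANDARDIZE(x, mean, standard_dev) returns, namely (x - mean) / standard_dev; feeding it the population standard deviation is an assumption.

```python
from statistics import mean, pstdev

def standardize(values):
    """(x - mean) / standard deviation for each value in the list."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]

# Hypothetical group totals for four players. Passing totals cover a much
# wider range than dribbling totals, as on the graphs above.
passing_totals = [40.0, 10.0, -20.0, -30.0]
dribbling_totals = [-2.0, 6.0, 1.0, -5.0]

# Adding the raw totals lets the passing scale dominate.
raw = [p + d for p, d in zip(passing_totals, dribbling_totals)]

# Standardising each group first gives both an equal say.
combined = [
    p + d
    for p, d in zip(standardize(passing_totals), standardize(dribbling_totals))
]

best_raw = raw.index(max(raw))
best_combined = combined.index(max(combined))
```

With these made-up numbers the first player comes out on top when the raw totals are added, but the second player (good at both rather than exceptional at passing alone) comes out on top once each group is standardised.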

Going ahead and plotting these can give some interesting results.

Also, it’s worth pointing out we’re only measuring dribbling and vertical passing here, so don’t look at this and think x is better than y because they’re in a better position on the graph.

From here there’d also need to be a look into the defensive contribution of the deeper players, goal/xG contribution of the advanced players, whether or not they’d be a fit stylistically and then actually jumping into lots of video. Having already rambled enough though I’ll leave it where we are now.

The first graph is for deeper and attacking ‘scores’ using the original passing numbers that take into account the number of passes.

Then the same but for under-24s:

Then using passing efficiency instead of totals:

* y-axis should also be labelled as passing efficiency *

Then the same again for under-24s:

* y-axis should also be labelled as passing efficiency *

It’s interesting seeing Eden Hazard do well in both the deeper and attacking ‘scores’, aided mostly by some extraordinary dribbling numbers (even in deeper areas). The Belgian international even plays an above average number of vertical passes. Whether or not he should be included here is debatable (should he be classed as a midfielder, or more of a forward/wide player?) but from the numbers above you can make the argument he’s the most ‘progressive’ player in the sample.

Sofiane Boufal is a player that shouldn’t be included, given he’s a wide player, and he also gives a good example of why this is by no means definitive. Despite having such good numbers, thanks to lots of take-ons and an above average number of passes into the box, his end product is lacking. A quick look at his xG numbers with StrataBet data shows an xG + xA of 0.30 p90, which doesn’t seem great for an attacking player who’s getting into dangerous areas.

Anyway, looking at the graphs above, two players who look interesting and haven’t been mentioned yet are:

Fabian Ruiz – 21 – Betis

Despite playing only 154 minutes in 2016/17, Fabian Ruiz seems to have quickly become an important part of this Betis side, playing over 60% of the available minutes so far.

While he may not jump right out on the plots above, he is above average in every single one of them. His xP Rating of 1.021 is impressive, given how harsh these new xP values are, while he completes 81.1% of his 2.969 attempted take-ons p90. He attempts more take-ons in advanced areas than deeper areas (1.239 p90 vs 0.583 p90) but importantly hasn’t had an unsuccessful take-on in a deeper area.

Ruiz first came to my attention when clustering vertical passes as a player who didn’t attempt a huge amount, but was efficient in those he attempted. A couple of months on and the same is still true. His 3.352 attempted vertical passes p90 don’t really stand out, but he completes both vertical passes from his own half and those in advanced areas at rates above what is expected.

Looking at his goal contribution using StrataBet data sees him contribute (through chances, chances created and chances where he had the secondary assist) 0.325 xG p90, which isn’t bad for a slightly deeper midfielder. As a quick comparison Mousa Dembele contributes 0.142 xG p90.

Ruiz seems to like a long shot though, with an xG per shot of 0.052 this season. This isn’t disastrous, it’s not as though he’s taking lots of shots a game, but depending on the team may not be ideal.

With him still only being young and this being his first season of regular football, it’s hard to pin down just how good he can be and what teams should be in for him. A quick search on Google News shows a lot of sides rumoured to be linked with him, though given this was just before he signed a new deal with an alleged 30m euro release clause I’m skeptical about how much of it is actual interest.

Roma is a club he’s been linked to, and given the bulk of their midfield is in their late 20’s or early 30’s it could be a good move, though I think most clubs should definitely be keeping an eye on him.

Below you can see a few clips of him, just to try and put a face to the name:

Yves Bissouma – 21 – Lille

Similarly to Fabian Ruiz, Bissouma doesn’t jump right out in the above plots, but consistently performs above average while still only being 21 years old.

With that being said, in the above graphs looking at the ‘scores’ involving passing efficiency you can see he has one of the best deeper scores, while also being above average for advanced areas. A big reason for this is his take-ons from deeper areas. He attempts the 14th most take-ons in deeper areas p90 in the 7 leagues used, with the 9th highest success rate at 86.7% (for those with more than 1 attempted p90).

Again like Ruiz, Bissouma’s vertical pass numbers aren’t huge at 3.783 attempted p90, but he does complete the clustered vertical passes at a rate higher than expected.

A quick peek at his defensive numbers has him completing the 4th most interceptions and tackles in Ligue 1 this season. Another 21-year-old, Giovani Lo Celso, who seems to be having a great season at PSG, is 1st. For those that attempt more than 3 tackles, Bissouma also has the 4th highest tackle success rate in Ligue 1 this season at 45%.

Having not watched much of Bissouma it’s hard to say just how good he is and who should be interested, but his numbers are definitely encouraging – particularly for someone so young. He’s definitely another player worth keeping an eye on.


After going through this I feel the same as I did when trying to find similar players: the methodology doesn’t seem very good at all, but I quite like the results. Rewarding players such as Eden Hazard, Mousa Dembele, Naby Keita and Frenkie de Jong as being progressive certainly ‘feels right’, though I’m not sure if the maths behind it does.

Given this just involved me throwing a bunch of things into z-scores, with some tweaking and more careful consideration over which metrics are used it could have some use. Plotting an overall graph of deeper ‘progressive score’ and advanced ‘progressive score’ is easier than trying to compare 12 different things at once; it’s just a matter of making sure that what goes into these numbers is right.

This article was written with the aid of StrataData, which is property of Stratagem Technologies. StrataData powers the StrataBet Sports Trading Platform, in addition to StrataBet Premium Recommendations.*

*StrataData used for calculating xG values


