Creating our first MMA ranking system
Hello all! Just wanted to quickly announce that MMA-DS is starting a newsletter! Also, if you are new here, make sure to read the following three posts to understand the full context of how I generated the results in this article: Strategic Classes, Tactical Classes and Judging Model 1.0.
While working on my Derrick Lewis vs Curtis Blaydes article, I came to a realization about the tools I have been using. All of my data analysis centers around taking the strategy and tactics of fighters, quantifying them and then seeing how effective they are based on my judging model. However, this does not tell the complete story of a fighter's offensive output. Finish-heavy fighters like Lewis were downgraded because they don't win rounds. To go about fixing this, I took a few steps back and asked myself: what are all the aspects of winning a fight at a high level?
A fighter can win by points, submission or knockout, but my current tools only focused on the first of those three. So how could I incorporate rarer events into my analysis? Points are easy to quantify because every round is a zero-sum game, meaning someone has to win it. A fighter does not have to get knocked out or submitted, which makes those rarer events from a probability standpoint. What I figured out was that I could compare those probabilities to the actual results and assign a score to each fighter on a round-by-round basis that cumulatively takes all three win conditions into account. Shoutout to Justin Filteau, who identified that this logic is the basis for the Elo rating system used in chess.
On a round-by-round level, I created a data set with 7 simple columns:
Weight class
Round number
Offensive Strategic Class (OSC)
Defensive Strategic Class (DSC)
Whether the fighter won the round (1/0)
Whether the fighter KO’d the opponent (1/0)
Whether the fighter submitted the opponent (1/0)
From there, I pivoted the data on weight class, round number, OSC, and DSC. In words, the resulting data set says:
At a given round number in a given weight class, when OSC x and DSC y face off against each other, the probability of round victory is a%, the probability of KO is b% and the probability of submission is c%.
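Here is a rough sketch of that pivot, assuming the round-level data lives in a pandas DataFrame. The column names (`weight_class`, `round_number`, `osc`, `dsc`, `won_round`, `ko`, `sub`), the class labels and every row are made up for illustration; this is not the actual MMA-DS data or code. The idea is that the mean of a 1/0 column within each group is the observed frequency of that outcome, which is one plausible way to arrive at the probabilities described above.

```python
import pandas as pd

# Hypothetical round-level rows (illustrative values only).
rounds = pd.DataFrame({
    "weight_class": ["HW"] * 6,
    "round_number": [1] * 6,
    "osc": ["A"] * 6,
    "dsc": ["B", "B", "B", "B", "C", "C"],
    "won_round": [1, 0, 1, 0, 1, 1],
    "ko": [1, 0, 0, 0, 0, 0],
    "sub": [0, 0, 0, 0, 1, 0],
})

# The four features the data is pivoted on.
keys = ["weight_class", "round_number", "osc", "dsc"]

# Averaging a 1/0 column inside each group gives the empirical probability.
probs = (
    rounds.groupby(keys)[["won_round", "ko", "sub"]]
    .mean()
    .rename(columns={"won_round": "p_pts", "ko": "p_ko", "sub": "p_sub"})
    .reset_index()
)

print(probs)
```

Each row of `probs` corresponds to one sentence of the form above: one weight class, one round number, one OSC vs DSC matchup, and three probabilities.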
So now we have a pretty large data set of probabilities, each anchored to a weight class, round number, OSC and DSC. Next, we take that data set and merge it back into the actual results. So in addition to the 7 columns above we add in 3 more:
Probability of winning a round on points (PTS)
Probability of KO (KO)
Probability of SUB (SUB)
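The merge step might look something like the sketch below, again assuming pandas. The `fighter` column, the other column names and all numbers are hypothetical. A left merge on the four pivot keys attaches the three probabilities to every actual round, and `validate="many_to_one"` is a safety check that each key combination appears only once in the probability table.

```python
import pandas as pd

keys = ["weight_class", "round_number", "osc", "dsc"]

# Hypothetical actual results: one row per fighter per round.
results = pd.DataFrame({
    "fighter": ["Fighter X", "Fighter Y"],
    "weight_class": ["HW", "HW"],
    "round_number": [1, 1],
    "osc": ["A", "B"],
    "dsc": ["B", "A"],
    "won_round": [1, 0],
    "ko": [0, 0],
    "sub": [0, 0],
})

# Hypothetical probability table, one row per key combination.
probs = pd.DataFrame({
    "weight_class": ["HW", "HW"],
    "round_number": [1, 1],
    "osc": ["A", "B"],
    "dsc": ["B", "A"],
    "p_pts": [0.4, 0.6],
    "p_ko": [0.05, 0.08],
    "p_sub": [0.02, 0.01],
})

# Attach PTS, KO and SUB probabilities to each actual round.
merged = results.merge(probs, on=keys, how="left", validate="many_to_one")

print(merged[["fighter", "won_round", "p_pts", "p_ko", "p_sub"]])
```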
Since every round is zero-sum, someone has to win it. For each of the three outcomes, I subtracted its probability from the actual result (1 or 0). The math looks like this:
If a fighter won a round in which they had a .4 or 40% probability of winning, then 1 - .4 = .6 is their PTS score for that round.
If a fighter lost with a .4 or 40% chance of winning, then 0 - .4 = -.4 is their PTS score for that round.
The above concept applies to SUB and KO as well, but with smaller probabilities since they are rarer events. I then added up the three probability gains in each row (PTS, KO, and SUB) to create a placeholder metric that I called oSTeloR, for total offensive STeloR score. Let's take a quick look at an example so we can see how this will look on a round-by-round basis. I will be using Israel Adesanya's last fight against Paulo Costa.
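To make the arithmetic concrete, here is a minimal pandas sketch of the per-round scoring. The numbers are made up (they are not taken from the Adesanya vs Costa fight) and the column names are hypothetical. The first row is a won round at a 40% win probability and the second is a lost round at the same probability, matching the two cases worked through above.

```python
import pandas as pd

# Two hypothetical rounds: one won and one lost, both at a 0.4 win probability.
rounds = pd.DataFrame({
    "won_round": [1, 0],
    "ko": [0, 0],
    "sub": [0, 0],
    "p_pts": [0.4, 0.4],
    "p_ko": [0.05, 0.05],
    "p_sub": [0.02, 0.02],
})

# Score = actual outcome (1/0) minus the probability of that outcome.
rounds["pts_score"] = rounds["won_round"] - rounds["p_pts"]
rounds["ko_score"] = rounds["ko"] - rounds["p_ko"]
rounds["sub_score"] = rounds["sub"] - rounds["p_sub"]

# oSTeloR for the round is the sum of the three scores in that row.
rounds["o_stelor"] = rounds[["pts_score", "ko_score", "sub_score"]].sum(axis=1)

print(rounds[["pts_score", "ko_score", "sub_score", "o_stelor"]])
```

With these inputs the won round scores 0.6 on PTS and roughly 0.53 overall, while the lost round scores -0.4 on PTS and roughly -0.47 overall, because a round with no finish always gives back a little on KO and SUB.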
Once we have these results on a round-by-round basis, we can sum a fighter's rounds together and get an overall look at their total probabilistic contribution on offense. What jumped out at me immediately is that fighters in their prime were heavily weighted towards the top of the rankings. To fix this, I figured there were two options: compare fighters over the same number of fights or over the same number of rounds. I have leaned towards the latter through my research because I feel it gives a better picture of overall contribution. Since Ciryl Gane fights this coming weekend, I am going to use him as my example. Gane has fought 9 rounds in the UFC so far. In order to compare him to other heavyweight fighters, I have to cut each fighter's data down to their first 9 rounds as well. So, what are the results?
| Rank | Fighter | PTS | KO | SUB | oSTeloR |
|---|---|---|---|---|---|
| 1 | Junior Dos Santos | 4.764286 | 2.329831 | -0.274564 | 6.819553 |
In terms of combined probability gain, Gane ranks number 2 all time through 9 rounds. Wow. Based on this ranking system, Gane is essentially the second greatest heavyweight prospect, trailing only a young Junior Dos Santos.
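The "first 9 rounds" comparison can be sketched as below. This is only an illustration: the `career_round` counter, the fighter labels and the scores are hypothetical, and dropping fighters who have not yet reached `n` rounds is my assumption rather than something the method above spells out.

```python
import pandas as pd

# Hypothetical per-round oSTeloR values; career_round counts a fighter's
# UFC rounds in chronological order (1 = first round ever fought).
scores = pd.DataFrame({
    "fighter": ["A", "A", "A", "B", "B"],
    "career_round": [1, 2, 3, 1, 2],
    "o_stelor": [0.5, -0.3, 0.6, 0.4, 0.2],
})


def rank_first_n(scores: pd.DataFrame, n: int) -> pd.Series:
    """Sum oSTeloR over each fighter's first n rounds, highest total first."""
    first_n = scores[scores["career_round"] <= n]
    # Assumption: only fighters with at least n rounds are comparable.
    counts = first_n.groupby("fighter")["career_round"].count()
    eligible = counts[counts >= n].index
    return (
        first_n[first_n["fighter"].isin(eligible)]
        .groupby("fighter")["o_stelor"]
        .sum()
        .sort_values(ascending=False)
    )


print(rank_first_n(scores, 2))
```

Calling `rank_first_n(scores, 9)` on real data would produce a table like the one above, with every fighter measured over the same number of rounds.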
What really stands out is that Gane ranks second all time through 9 rounds as a submission artist. Why is that? Gane has two submissions on his record: a head and arm choke as well as a heel hook. We will dive into this later, but the type of submission and the round and weight class it happened in all vary and are extremely important in determining a fighter’s KO and SUB scores. I am going to be doing a further write-up on Gane tomorrow, so make sure to check that out for a deeper look.
One thing that also stood out to me is that the greatest heavyweight of all time, Stipe Miocic, was not in the top 5 nor the top 10 through his first 9 rounds. We had to go all the way down to #23 to find him. That stood out to me as a major red flag in my rankings and deserved a deeper dive.
Looking at Stipe’s record, he has fought 43 UFC rounds and is a multi-time defending heavyweight champion. But what if we split his time in the UFC in half? Let’s compare Stipe's first 21 rounds with his last 22 and see which heavyweights rank at the top through the first 21 rounds of their careers and through the last 22, separately.
| Rank | Fighter | PTS | KO | SUB | oSTeloR |
|---|---|---|---|---|---|
| 1 | Junior Dos Santos | 7.990990 | 3.207555 | -0.404267 | 10.794278 |
Through the first 21 rounds of each heavyweight fighter's career, Stipe ranks a whopping 11th and doesn't even show up in our table.
| Rank | Fighter | PTS | KO | SUB | oSTeloR |
|---|---|---|---|---|---|
| 3 | Junior Dos Santos | 4.361429 | -0.850221 | -0.201509 | 3.309699 |
Wow! Not only is Stipe #1, his cumulative oSTeloR is 37% higher than second-place Andrei Arlovski. This opens up tons of potential: prospect rankings, a series on greatest peaks, best championship runs, all kinds of awesome content that I will be putting out!
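The first-half versus last-half split used for Stipe could be sketched like this, under the same hypothetical column names as before and with made-up scores. `groupby(...).head(n)` keeps each fighter's first `n` rounds and `tail(n)` keeps the last `n`. Note that for a fighter with fewer rounds than `first_n + last_n`, the two windows overlap, and how to handle that is left open here.

```python
import pandas as pd


def first_and_last(scores: pd.DataFrame, first_n: int, last_n: int):
    """Return (first-window totals, last-window totals) of oSTeloR per fighter."""
    ordered = scores.sort_values(["fighter", "career_round"])
    first = ordered.groupby("fighter").head(first_n)
    last = ordered.groupby("fighter").tail(last_n)
    return (
        first.groupby("fighter")["o_stelor"].sum(),
        last.groupby("fighter")["o_stelor"].sum(),
    )


# Hypothetical five-round career, split into the first 2 and last 3 rounds.
scores = pd.DataFrame({
    "fighter": ["A"] * 5,
    "career_round": [1, 2, 3, 4, 5],
    "o_stelor": [1.0, 0.5, -0.5, 0.25, 0.25],
})

early, late = first_and_last(scores, first_n=2, last_n=3)
print(early["A"], late["A"])
```

For the comparison in this section the call would be `first_and_last(scores, 21, 22)`, followed by sorting each result to get the two rankings.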
The last point I want to hit on before I close this out is something I discovered while researching all time middleweight rankings.
This comparison covers fighters at middleweight through their first 21 rounds. As we can see, Thiago Santos' incredible power puts him a bit above Israel Adesanya in total oSTeloR, but more interestingly, Thiago has a much more negative SUB score than Izzy. Why is that?
I broke down the round-by-round data and realized this was a perfect demonstration of the value of these probabilities. Israel fights predominantly as a stand-up fighter and his opponents choose to fight him the same way. Because of this, he naturally has a low probability of submitting his opponents, so a round without a submission costs him very little. Thiago, on the other hand, mixes in more grappling and takedowns. His probability of submission is thus higher, but he has not capitalized on it often enough to gain score, so he lost points instead. This too opens up tons of possibilities for further discovery.
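A toy calculation shows why this happens. The per-round submission probabilities below (1% for a stand-up style, 4% for a style that mixes in grappling) are invented for illustration and are not the real values for Adesanya or Santos.

```python
def sub_score(sub_probs, subs_landed):
    """Cumulative SUB score: sum of (actual 1/0 minus probability) per round."""
    return sum(actual - p for actual, p in zip(subs_landed, sub_probs))


n_rounds = 21

# Hypothetical per-round submission probabilities, no submissions landed.
striker = sub_score([0.01] * n_rounds, [0] * n_rounds)
grappler = sub_score([0.04] * n_rounds, [0] * n_rounds)

# Same grappling-heavy profile, but with one submission in the first round.
grappler_with_sub = sub_score([0.04] * n_rounds, [1] + [0] * (n_rounds - 1))

print(striker, grappler, grappler_with_sub)
```

With zero submissions, the higher-probability profile ends up around -0.84 against roughly -0.21 for the striker. A single submission would swing the same profile to about +0.16, which is why unconverted grappling opportunities show up as a negative SUB score.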
While these statistics are not yet predictive, they represent a first-of-its-kind mathematical ranking algorithm for UFC fighters. STeloR is going to be used in all upcoming fight breakdowns in new and unique ways so that I can learn more about its potential while putting out great content for everyone to read. Thank you for following along, let me know if I can answer any questions and make sure to sign up for the MMA-DS newsletter.