# Introducing STeloR and the MMA-DS data-based ranking system

Dec. 23, 2021

After 100+ hours of programming and checking/rechecking, STeloR’s full rollout is done!

Let's start off with the math. STeloR is built on the concept of a traditional Elo algorithm. With the basic format set up, I got to work adjusting and developing an MMA-based system that would accurately rate fighter contributions.

A quick overview of how an Elo system works. Elo algorithms are traditionally used in zero-sum situations, i.e. where one side wins and the other loses. A starting score is first assigned to player A and player B. In this example, we'll use 800. If the two players have equal scores coming into the matchup, each has a 50% probability of victory. If player A wins, we subtract that probability (.5) from the outcome (1), which gives us 1 - .5 = .5. We then multiply that number by a weighting factor called the K factor. For this example we'll use K = 20, which gives us 20 * .5 = 10. Player A’s score would then increase from 800 to 810 after that victory, and Player B, who lost, would see their score change by (0 - .5) * 20 = -10, leaving them at 790 at the end of the game. At a high level, Elo algorithms estimate the probability of an event occurring and then reward or punish players based on what actually happened compared to how likely it was to happen.
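The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not STeloR's actual code: the function names are made up for this example, and the expected-score curve with a scale of 400 is the conventional chess-style Elo formula, which this post does not confirm STeloR uses. Only the 800 starting score and K = 20 come from the example above.

```python
def expected_score(rating_a, rating_b, scale=400.0):
    """Probability that A beats B under the conventional Elo curve.

    The scale of 400 is the usual chess convention (an assumption here).
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def elo_update(rating_a, rating_b, outcome_a, k=20.0):
    """Return the new (rating_a, rating_b) after one game.

    outcome_a is 1 if A won and 0 if A lost.
    """
    p_a = expected_score(rating_a, rating_b)
    delta = k * (outcome_a - p_a)  # reward or punish relative to expectation
    # Zero sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


new_a, new_b = elo_update(800, 800, outcome_a=1)
print(new_a, new_b)  # 810.0 790.0
```

With equal ratings the expected score is exactly .5, so the winner gains K * .5 = 10 points and the loser drops by the same amount, matching the walkthrough.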

So let’s convert this to MMA. At the highest level, someone has to win a fight. We could therefore create an Elo algorithm since it passes the zero-sum requirement, but it would lack context, and few sports are as context dependent as MMA. The solution is to break a fight apart into its smallest possible components. With our current data, a round is the smallest piece we can look at.

So now that we have that set, we need to figure out how we are rating fighters. I believe there are six true outcomes for a round.

1. Winning a round

2. Losing a round

3. Knocking out your opponent

4. Getting knocked out

5. Submitting your opponent

6. Getting submitted

Next, we must convert that into an Elo format. Winning a round is zero-sum, but outcomes 3 to 6 are not. There doesn’t have to be a knockout or submission in a round, so how do we account for that? My argument: set the odds to the historical likelihood. What does that mean? We can count the number of rounds in heavyweight history that ended with a knockout and divide it by the total number of heavyweight rounds, which gives us the baseline probability that any given heavyweight round ends in a knockout. When I adjust the starting scores to reflect this average rate, our rankings mimic reality to a much greater extent.
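Here is a rough sketch of that idea in Python. The round counts are invented placeholders, not real UFC data, and the helper names are hypothetical. The second function shows one possible way to "adjust the starting scores": pick the rating gap between the attacking score (KO) and the defending score (KOd) so that a conventional Elo curve with a scale of 400 reproduces the base rate. That mapping is an assumption for illustration, not necessarily how STeloR does it.

```python
import math


def base_rate(finish_rounds, total_rounds):
    """Share of rounds in a weight class that ended with a given finish."""
    if total_rounds <= 0:
        raise ValueError("total_rounds must be positive")
    return finish_rounds / total_rounds


def rating_gap_for_probability(p, scale=400.0):
    """Rating gap (attacker minus defender) whose Elo expectation equals p.

    Inverts the conventional curve p = 1 / (1 + 10 ** (-gap / scale)).
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return scale * math.log10(p / (1.0 - p))


# Placeholder counts for illustration only (not real data).
ko_rate = base_rate(finish_rounds=150, total_rounds=1000)
gap = rating_gap_for_probability(ko_rate)

# Round trip: feeding the gap back through the Elo curve recovers the rate.
recovered = 1.0 / (1.0 + 10 ** (-gap / 400.0))
print(ko_rate, round(recovered, 6))
```

Because knockouts happen in well under half of all rounds, the gap comes out negative: the average fighter's starting KO score would sit below the average KOd score.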

That leaves us with five STeloR context scores:

1. PTS (winning rounds)

2. KO (knockout likelihood)

3. KOd (knockout defense)

4. SUB (submission likelihood)

5. SUBd (submission defense)

So now we have our starting place, but there is more context that needs to be added in. A striker fighting another striker should have a much lower submission probability than a heavy grappler fighting a striker. Fighters should be rewarded and punished in fair measure based on the context-dependent probability of each outcome. This introduces 16 context-specific situations and their corresponding probabilities, giving each fighter 16 context-specific STeloR scores. Those 16 situations are then split by weight class to account for the competition level at the fighter's specific weight class.

Now we have our starting point. For every round in UFC history, I matched the two fighters up using the above logic, but I ran into a problem. Fighters change weight classes, so we can’t just set them at their starting weight class and let them go. We need to add and subtract context as we go.

My solution to that is what I called the Lego method (that was how I imagined it in my head). Instead of setting each fighter's starting scores to the average of their first weight class, we set them to 0 for all 5 metrics. When a fighter has their first round, we add their starting score (0) to the average score for the weight class in the fight-specific context (e.g. striker vs striker) for all five of the scores. This gives us the formula PTS = starting PTS + average PTS.

Let's go back to the previous example where fighter A and B both faced off with a starting score of 800 and ended with scores of 810 and 790 respectively. What I did instead was set each one's STeloR PTS score to the difference between starting and ending, i.e. +10 and -10. In the next round, we would add those new context-specific scores to the average and come out with the correct starting score while accounting for changes in weight class. This also has the added benefit of allowing us to compare the knockout power of Francis Ngannou with Conor McGregor's. When we subtract the context out of their actual scores, we are able to see how these fighters stack up on the 5 traits on a pound-for-pound basis. Going back and forth between contexts was a huge step in the development of this ranking system. From there, we rank fighters by their peak and current STeloR totals after summing all 16 context gains.
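A small Python sketch of the Lego method for the PTS score follows. Everything named here is hypothetical: the context averages are placeholder numbers, the weight classes and context label are just dictionary keys for the example, and the conventional Elo curve (scale 400, K = 20) stands in for whatever probabilities STeloR actually uses. The point is only to show offsets being stored per fighter and the context average being snapped on and off around each round.

```python
# Placeholder averages keyed by (weight_class, matchup_context); not real data.
CONTEXT_AVERAGE_PTS = {
    ("lightweight", "striker_vs_striker"): 800.0,
    ("welterweight", "striker_vs_striker"): 820.0,
}


def effective_rating(offset, weight_class, context):
    """PTS = fighter's stored offset + average PTS for the context."""
    return offset + CONTEXT_AVERAGE_PTS[(weight_class, context)]


def score_round(offset_a, offset_b, weight_class, context, outcome_a,
                k=20.0, scale=400.0):
    """Add the context, run a standard Elo update, then store offsets again."""
    rating_a = effective_rating(offset_a, weight_class, context)
    rating_b = effective_rating(offset_b, weight_class, context)
    p_a = 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))
    delta = k * (outcome_a - p_a)
    # Only the gain or loss is kept, so the context is subtracted back out.
    return offset_a + delta, offset_b - delta


# Two debuting fighters (offset 0) meet at lightweight; A wins the round.
a, b = score_round(0.0, 0.0, "lightweight", "striker_vs_striker", outcome_a=1)
print(a, b)  # 10.0 -10.0

# If A later moves up, the same offset is added to the new class average.
print(effective_rating(a, "welterweight", "striker_vs_striker"))  # 830.0
```

Since only the offsets are stored, two fighters from different weight classes can be compared directly on those offsets, which is what makes the pound-for-pound comparison possible.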

So now what? We have our PERFECT MATHEMATICAL RANKING SYSTEM, right? No. I have identified multiple issues that I am working to resolve, and the fixes will make up an updated STeloR ranking system. I want to go point by point to explain each problem and why I decided not to change anything for the time being.

## The Jose Aldo Problem

If you look at the all-time featherweight rankings, Jose Aldo is incredibly low on the list. This is because the ranking system only uses UFC fights for the results, meaning Jose’s first fight in the UFC counts as his first fight ever in STeloR. The same applies to Strikeforce and all the other major, older promotions that modern fighters draw their lineage from. I am going to create context-specific systems to rate WEC, Strikeforce, Pride, etc. contributions so we can get a more accurate picture of MMA history. This will strengthen the system, since competition levels will be more accurate over time.

## The Khabib Problem

Khabib presents an interesting case because he, like many other fighters, went on an undefeated streak to win the belt and then retired on top. Elo algorithms don’t especially like this because they want as many rounds as possible to be able to home in on the true value of the fighter over time. Khabib ranks artificially low at #3 all time behind Dustin Poirier and Donald Cerrone due to his lack of rounds. I don’t think anyone would dispute that peak Khabib beats peak Cerrone, but Elo algorithms require more context than Khabib has provided.

## The Beneil Dariush Problem

I knew right away looking at the lightweight rankings that I was going to get a flood of messages on this one. Beneil ranks #5 all time at lightweight behind Khabib but ahead of Tony Ferguson among others. The reason is that Beneil fights consistently good competition and fights a lot. His long career of consistently beating good competition helps his case in the algorithm. I have more thoughts on this concept and I make an argument later that maybe we should adjust our expectations as MMA fans away from momentum and more towards consistency. I believe this was the big problem with all of those who overlooked the Dustin Poirier vs Conor McGregor fight.

## Women’s Featherweight

One of the problems I've brought up in earlier articles is how to address the lack of fights in the women’s featherweight division. I have decided to just roll them into the women’s bantamweight context but still rank them in the correct weight class. Thus, a women’s featherweight fight would use the average performance of the bantamweight division fighters to determine Elo probabilities.
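In code, this kind of rule could be as simple as a lookup that separates the division a fighter is ranked in from the division whose averages supply the context. The names below are hypothetical and only illustrate the idea:

```python
# Divisions whose context averages are borrowed from another division.
# Hypothetical keys; any division not listed uses its own averages.
CONTEXT_DIVISION = {
    "womens_featherweight": "womens_bantamweight",
}


def context_division(ranked_division):
    """Division whose averages are used to compute Elo probabilities."""
    return CONTEXT_DIVISION.get(ranked_division, ranked_division)


print(context_division("womens_featherweight"))  # womens_bantamweight
print(context_division("womens_bantamweight"))   # womens_bantamweight
```

The fighter's scores would still be filed under the division she actually fought in; only the averages come from the borrowed context.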