Introducing the Judging Model 1.0

After rewatching the Jon Jones vs. Dominick Reyes fight from back in February in preparation for Dom's fight a few weeks ago, I had an idea: build a machine learning model, trained on past judging results, that classifies who won each individual round based on that round's statistics. To accomplish this, I pulled round-by-round statistics from every fight where two of the three judges scored the fight either 30-27 or 50-45. I chose those two scores because of the way MMA fights are scored in the UFC. The fighter who wins a round is scored either a 10-9 or a 10-8: 10-9 means the fighter simply won the round, and 10-8 means the fighter dominated the opponent in that round. 10-8s are used inconsistently, so I trained the model on a binary won-or-lost label rather than a three-category win, loss, dominated scale. As for 30-27 and 50-45, a 30-27 score means that the winner of a three-round fight won every round, and the same holds for 50-45 in a five-round fight, so every round in those fights comes with a clear winner label.
From there I needed a way to compare round statistics between winners and losers. I could have used raw round statistics with no transformation, but I quickly realized that winning a round is not simply about how many takedowns were attempted or how many strikes to the head were thrown; it is about how the winning fighter's stats compare with the losing fighter's. For every category available to me on a round-by-round basis, I subtracted the losing fighter's stats from the winning fighter's to get the winner's comparative round score, and did the opposite for the loser (losing fighter's stats minus winning fighter's). For example, if the winning fighter landed 25 head shots in the first round and the losing fighter landed 20, the winning fighter's round score for that category would be +5 and the losing fighter's would be -5. I applied this across all statistical categories in my data set, then labeled the winning fighter's row with a 1 and the losing fighter's row with a 0.
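The differencing step can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's actual code: the category names and the helper functions `differential` and `build_rows` are assumptions made up for the example, and the numbers reuse the 25 vs. 20 head-shot example from the text.

```python
# Hypothetical per-round stat lines; category names are illustrative only.
winner = {"head_strikes_landed": 25, "takedowns": 1, "knockdowns": 0}
loser = {"head_strikes_landed": 20, "takedowns": 0, "knockdowns": 0}


def differential(fighter, opponent):
    """Return fighter-minus-opponent differences for every stat category."""
    return {stat: fighter[stat] - opponent[stat] for stat in fighter}


def build_rows(winner_stats, loser_stats):
    """Produce two labeled rows per round: winner (label 1), loser (label 0)."""
    return [
        (differential(winner_stats, loser_stats), 1),
        (differential(loser_stats, winner_stats), 0),
    ]


rows = build_rows(winner, loser)
for features, label in rows:
    print(label, features)
```

Each round contributes two mirror-image rows, so the data set is balanced between the two labels by construction.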
One potential pitfall I noted early on when transforming the data was feeding the data set to the model without scaling it first. Models like logistic regression (the model I eventually chose) are prone to issues when features sit on very different scales. The logic is as follows: using the same example as above, where the winning fighter received a +5 for head strikes, that fighter could also have landed a knockdown and attempted a submission (two potentially fight-ending maneuvers), which to anyone watching mean more than landing 5 more jabs than your opponent. However, a model like logistic regression fit on raw counts can let the larger number, the 5, outweigh the other two statistics, leading to inaccurate results and model degradation. I chose to use the built-in scaling functionality in scikit-learn (the machine learning framework I used in this project). It takes each column, assigns the highest value a 1 and the lowest a -1, and places every value in between proportionally on that range (this kind of rescaling is usually called min-max scaling).
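A small sketch of the rescaling described above, written in plain Python so the arithmetic is visible. The function name `scale_column` and the sample differentials are assumptions for illustration. In scikit-learn, `MinMaxScaler(feature_range=(-1, 1))` performs this kind of column-wise rescaling, though whether that is the exact class used in this project is not stated.

```python
def scale_column(values, low=-1.0, high=1.0):
    """Linearly rescale values so the minimum maps to low and the maximum to high."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; map everything to the midpoint.
        return [(low + high) / 2 for _ in values]
    span = hi - lo
    return [low + (v - lo) * (high - low) / span for v in values]


# Made-up differentials for two categories with very different ranges.
head_strike_diffs = [5, -5, 12, -12, 0]
knockdown_diffs = [1, -1, 0, 0, 0]

scaled_head = scale_column(head_strike_diffs)
scaled_kd = scale_column(knockdown_diffs)

print(scaled_head)
print(scaled_kd)
```

After scaling, a one-knockdown advantage sits at the top of its column's range, while a five-strike advantage lands partway up its own, so neither category dominates just because its raw counts are larger.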
Now for the fun part: results. Using 980 rounds of data, the model correctly classified the winner in 92% of rounds. These are fantastic results considering the data only covers fights where two of the three judges said the winner won every round, which means some rounds may be imperfectly labeled. Very exciting results, to say the least.
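To make the fit-and-score step concrete, here is a dependency-free sketch. It swaps in a tiny hand-rolled logistic regression trained by gradient descent in place of scikit-learn's `LogisticRegression`, and the eight rows of scaled differentials are invented toy data, so the accuracy it prints says nothing about the 92% figure above. The function names, learning rate, and epoch count are all assumptions for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict(w, b, xi):
    """Return 1 (won the round) if the predicted probability is at least 0.5."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0


def accuracy(w, b, X, y):
    return sum(predict(w, b, xi) == yi for xi, yi in zip(X, y)) / len(y)


# Toy rows of [head_strike_diff, takedown_diff], already scaled to [-1, 1].
# Rows come in winner/loser mirror pairs, as in the differencing step.
X = [[0.4, 0.5], [-0.4, -0.5], [0.8, -0.5], [-0.8, 0.5],
     [-0.2, 1.0], [0.2, -1.0], [0.6, 0.0], [-0.6, 0.0]]
y = [1, 0, 1, 0, 1, 0, 1, 0]

w, b = fit_logistic(X, y)
acc = accuracy(w, b, X, y)
print("weights:", w, "accuracy:", acc)
```

This sketch scores the model on the same rows it was trained on. With real data, holding out a test set or using cross-validation would give a more honest accuracy estimate.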
From this, I decided to look at the weights the model applied to each statistic to see which stats it treats as most important when judges decide a round. The five most important were:
1. Ground strikes attempted
2. Guard passes
3. Standing strikes landed
4. Takedowns
5. Head shots landed
These are very interesting results in my opinion. A common assumption among fighters is that judging is biased towards standing strikes, but the model's weights suggest that 3 of the 5 most important statistics (ground strikes attempted, guard passes, and takedowns) relate to ground work and grappling. This is useful for fighters preparing for a fight, since it helps them understand how to approach an opponent if they expect the fight to go to a decision.
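One way to pull such a ranking out of a fitted linear model is to sort the features by the absolute value of their weights. The sketch below uses placeholder weights that are NOT the model's actual coefficients, and the helper name `top_features` is an assumption for illustration.

```python
def top_features(weights, k=5):
    """Return the k feature names with the largest absolute weight."""
    ranked = sorted(weights.items(), key=lambda item: abs(item[1]), reverse=True)
    return [name for name, _ in ranked[:k]]


# Placeholder values for illustration only, not the real coefficients.
placeholder_weights = {
    "head_strikes_landed": 0.3,
    "takedowns": -0.7,
    "knockdowns": 1.2,
}

top_two = top_features(placeholder_weights, k=2)
print(top_two)
```

Comparing weight magnitudes this way is only meaningful when every feature is on the same scale, which is one more reason the scaling step matters. A large negative weight counts as important too, which is why the sort uses the absolute value.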
As for potential uses of this model, I am working on a report covering all unanimous and split decision fights to quantify which judges deviate the most from the model's predictions, as well as which states, countries, and athletic commissions have the worst judging performance according to the data. From there, one could model the different subgroups separately to see which statistics each one treats as most important. What does that look like practically? Let's say a fighter has an upcoming fight in Texas, a state whose commission has a reputation for poor judging. Using this model and others like it, the fighter could learn the biases of that commission, game plan around them, and gain a leg up on the competition in the eyes of the judges.
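The judge comparison could start from something as simple as a disagreement rate per judge: the share of rounds where the judge's pick differs from the model's. The sketch below is only an assumption about how that tally might look. The record layout, the judge names, and the "red"/"blue" corner labels are all made up for illustration.

```python
from collections import defaultdict


def disagreement_rates(records):
    """Map each judge to the fraction of rounds where they differ from the model."""
    totals = defaultdict(int)
    misses = defaultdict(int)
    for judge, judge_pick, model_pick in records:
        totals[judge] += 1
        if judge_pick != model_pick:
            misses[judge] += 1
    return {judge: misses[judge] / totals[judge] for judge in totals}


# Hypothetical records: (judge, judge's round winner, model's round winner).
records = [
    ("Judge A", "red", "red"),
    ("Judge A", "blue", "red"),
    ("Judge B", "red", "red"),
    ("Judge B", "blue", "blue"),
]

rates = disagreement_rates(records)
print(rates)
```

Grouping the same records by state or commission instead of by judge would give the regional comparison described above.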
Another use would be scoring rounds live, right after they happen. Unfortunately, the UFC does not update its data in real time but only after the fighter's hand has been raised at the end of a fight, so I cannot do live round grading until that data availability changes.
Let me know your thoughts!