Chasing NFL Playoff Perfection (Part 1)

The NFL playoffs are pure chaos wrapped in high stakes. In a one-game showdown, anything can happen. A fumble here, a blown coverage there, and a team’s entire season can vanish in an instant. That unpredictability is exactly what makes the wild card round so thrilling, and also so difficult to forecast. With only four years of data since the league transitioned to 17-game seasons and seven-team playoffs, the sample size is still relatively small. Every number, every stat, every matchup is a tiny hint in a much bigger, wildly unpredictable story.

Even with a limited history, patterns start to emerge. Some teams hold consistent edges in key areas, while some years are dominated by upsets, showing how a single game can swing wildly in either direction. The challenge is not just predicting winners, but understanding which factors matter most and how they combine in a sport where variance reigns supreme. This is the story of chasing perfection, one calculated step at a time, in the thrilling chaos of NFL wild-card weekend.




Wild Card Weekend

Data

Every good prediction starts with good data, and this project was no different. I pulled numbers from two of the most trusted corners of football analytics: pro-football-reference.com (PFR) and nfelo.com. Together, they paint a full picture of who these teams really were across the season. PFR supplied the raw production: points, yards, penalties, turnovers, the stuff everyone sees on Sundays. From nfelo came the context: efficiency, EPA (expected points added), pace, and opponent-adjusted strength, the numbers that explain whether those box-score results actually meant something. The goal was balance.

Counting stats can lie over a single season. A team might score a ton of points thanks to short fields or lucky bounces, while another quietly dominates play without the headlines. By combining basic stats with deeper metrics, I could test whether performance was sustainable or just a one-year mirage. With only four seasons of data to work with, every variable mattered. Each number became a small piece of evidence, helping separate real contenders from teams riding temporary momentum into the chaos of wild-card weekend.

Model

At first, I went big. HistGradientBoostingClassifier seemed like the perfect weapon for capturing every nuance buried in the stats. In theory, it could extract subtle patterns and complex interactions that even sharp analysts might miss. In practice? The model basically laughed at me. With only 48 rows of data (four years of playoffs since the 17-game era began, or 12 rows per season), there simply wasn't enough fuel to feed its appetite. It couldn't learn anything meaningful.
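One plausible contributor to the boosted model's struggles, assuming scikit-learn's default settings were used (the post does not say): HistGradientBoostingClassifier defaults to min_samples_leaf=20, so every leaf of every tree must hold at least 20 rows. The sketch below is just that arithmetic. The 36-row figure assumes one 12-row season is held out for testing, which is also a guess.

```python
# scikit-learn's documented default for HistGradientBoostingClassifier.
MIN_SAMPLES_LEAF = 20


def max_leaves(n_rows: int, min_samples_leaf: int = MIN_SAMPLES_LEAF) -> int:
    """Upper bound on leaves per tree when every leaf needs min_samples_leaf rows."""
    return max(1, n_rows // min_samples_leaf)


all_rows = 48          # figure from the post
train_rows = 48 - 12   # assumption: one 12-row season held out for testing

print(max_leaves(all_rows))    # 2: at most a single split per tree
print(max_leaves(train_rows))  # 1: no split possible, so every prediction is the same
```

Lowering min_samples_leaf would let the trees split, but with this few rows overfitting becomes the next problem, which is part of the appeal of a simpler model.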

So I pivoted hard. Enter logistic regression: simple, elegant, and surprisingly powerful. It weighs every stat and outputs the probability of a team winning its wild-card game. No overthinking, no pretending it knows more than the data allows. Just clean, interpretable numbers I can actually trust.
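A minimal sketch of what a setup like this could look like in scikit-learn. Everything here is an assumption for illustration: the data is synthetic, the five feature columns are placeholders, and the leave-one-season-out evaluation is a guess at how per-season scores might be produced, not the exact pipeline behind this post.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for the real table: 4 seasons x 12 team rows = 48 rows.
seasons = np.repeat([2021, 2022, 2023, 2024], 12)
y = np.tile([1] * 6 + [0] * 6, 4)   # 6 winners and 6 losers per season
X = rng.normal(size=(48, 5))        # 5 placeholder features
X[:, 0] += 0.8 * y                  # plant a weak signal in the first column

# Standardizing first puts every feature on the same scale,
# which makes the fitted weights comparable to each other.
model = make_pipeline(StandardScaler(), LogisticRegression())

# Train on three seasons, score the held-out one, repeat for each season.
auc_by_season = {}
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=seasons):
    model.fit(X[train_idx], y[train_idx])
    win_prob = model.predict_proba(X[test_idx])[:, 1]
    auc_by_season[int(seasons[test_idx][0])] = roc_auc_score(y[test_idx], win_prob)

for season, auc in sorted(auc_by_season.items()):
    print(season, round(auc, 4))
```

Holding out a whole season at a time keeps games from the same year out of both the training and test sets, which is a sensible default when the rows within a season are not independent.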

How did it perform? Shockingly well, actually. The model posted an AUC of 0.7778 in 2021, absolutely crushed it with 0.9444 in 2022, dipped to 0.7500 in 2023, then bounced back to 0.7778 in 2024. Overall, it averaged an AUC of 0.8125 across the four seasons, with a standard deviation of just 0.0770.
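The summary figures can be reproduced from the four per-season scores. The 0.0770 figure matches the population standard deviation; the sample version (dividing by n - 1) comes out a bit higher.

```python
from statistics import mean, pstdev, stdev

# Per-season AUC values as reported in the post.
auc_by_season = {2021: 0.7778, 2022: 0.9444, 2023: 0.7500, 2024: 0.7778}
scores = list(auc_by_season.values())

print(round(mean(scores), 4))    # 0.8125
print(round(pstdev(scores), 4))  # 0.077  (population standard deviation)
print(round(stdev(scores), 4))   # 0.0889 (sample standard deviation)
```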

Quick translation: AUC (area under the ROC curve) measures how well the model separates winners from losers: it is the chance that a randomly chosen winner was given a higher win probability than a randomly chosen loser. A score of 0.5 means the model is no better than a coin flip, a perfect score is 1.0, and by a common rule of thumb anything above 0.8 is considered excellent. Hitting 0.8125 in a one-game, high-variance playoff setting where literally anything can happen? That's legitimately impressive. The model found a signal in the chaos, and the numbers back it up, even if four seasons is a small sample to lean on.
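To make the definition concrete, here is a small sketch that computes AUC by brute force: compare every winner with every loser and count how often the winner got the higher probability. The probabilities are made up for illustration, and the 6-winner, 6-loser split assumes one row per team in a six-game round.

```python
def pairwise_auc(probs, labels):
    """Share of (winner, loser) pairs where the winner got the higher probability.

    Ties count as half a pair.
    """
    winners = [p for p, y in zip(probs, labels) if y == 1]
    losers = [p for p, y in zip(probs, labels) if y == 0]
    score = 0.0
    for w in winners:
        for l in losers:
            if w > l:
                score += 1.0
            elif w == l:
                score += 0.5
    return score / (len(winners) * len(losers))


# Hypothetical predicted win probabilities for one 12-team round.
probs = [0.81, 0.74, 0.66, 0.58, 0.47, 0.33,   # the six winners
         0.70, 0.52, 0.44, 0.35, 0.28, 0.20]   # the six losers
labels = [1] * 6 + [0] * 6

print(round(pairwise_auc(probs, labels), 4))  # 0.7778, i.e. 28 of 36 pairs ranked correctly
```

With 6 winners and 6 losers there are only 36 pairs, so a single season's AUC can only move in steps of 1/36. Under that assumption, 0.7778 corresponds to 28 pairs ranked correctly, 0.9444 to 34, and 0.7500 to 27.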


Results

Once the model was finished, the picture got a lot clearer. Not every stat matters when everything's on the line. Some numbers consistently moved the needle in wild-card games, while others barely registered once teams stepped into the playoff pressure cooker.

On offense, the passing game reigned supreme. Offensive EPA per pass came out as the heaviest hitter in the entire model, carrying a massive positive weight of +0.84. Translation: teams that consistently created value through the air had a serious edge in win-or-go-home scenarios. Overall offensive efficiency wasn't far behind either. Offensive EPA per play clocked in at +0.56, suggesting that balanced effectiveness across all plays kept teams alive. Quick note on how this works: a positive weight means a stat increases win probability, a negative weight means it lowers it, and the further a weight sits from zero, the bigger its impact.

Defense told a different story, one written in disruption and red zone carnage. Tackles for loss per game posted a hefty +0.69, showing just how devastating backfield chaos becomes when margins shrink to nothing. On the flip side, teams that couldn't get red zone stops paid dearly. Red Zone Percentage Allowed came in at -0.71, a brutal reminder that giving up touchdowns instead of field goals in tight playoff battles was basically a death sentence.

Then there's Strength of Schedule, sitting at -0.34. Read as a penalty for easy schedules (which holds if the metric rises as the slate gets softer), it says teams that cruised through softer regular-season slates were often exposed when the competition leveled up. In the wild-card round, raw talent alone wasn't enough. You needed efficiency, you needed to create chaos on defense, and you needed to have been battle-tested against real competition. Everything else was just noise.
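For a feel of what these weights mean in practice, here is a rough sketch that converts them into odds multipliers and a win probability. It assumes the weights are logistic regression coefficients on standardized features (so one unit is one standard deviation), and it uses a zero intercept and a made-up team profile purely for illustration. The dictionary keys are hypothetical names, not the model's actual column names.

```python
import math

# Weights reported in the post; key names are hypothetical labels.
weights = {
    "off_epa_per_pass": 0.84,
    "off_epa_per_play": 0.56,
    "tackles_for_loss_per_game": 0.69,
    "red_zone_pct_allowed": -0.71,
    "strength_of_schedule": -0.34,
}

# exp(weight) is how much the odds of winning are multiplied by
# when that stat rises by one standard deviation, all else equal.
odds_multiplier = {name: math.exp(w) for name, w in weights.items()}


def win_probability(z_scores, intercept=0.0):
    """Logistic win probability from z-scored stats (intercept is an assumption)."""
    log_odds = intercept + sum(weights[name] * z for name, z in z_scores.items())
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical team: one SD above average passing, half an SD better in the red zone.
team = {"off_epa_per_pass": 1.0, "red_zone_pct_allowed": -0.5}

for name, mult in odds_multiplier.items():
    print(f"{name}: odds x{mult:.2f}")
print(f"win probability: {win_probability(team):.3f}")
```

Under those assumptions, a weight of +0.84 means a one-standard-deviation edge in passing EPA multiplies a team's odds of winning by roughly 2.3, while -0.71 cuts them roughly in half.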






Wild Card Predictions





And there they are: my first-round predictions for the NFL playoffs. A decent mix of upsets and favorites coming through. Who do you think makes it out of this round? What's your prediction for the best game in this round?


