Thursday, January 15, 2015

Using Statistical Benchmarks to Find Successful Quarterbacks



The most important position in football is undoubtedly the quarterback. He is the leader, the field general, and the most impactful player on an offense. As a result, it is the most scrutinized position leading up to the NFL draft. Until recently, many scouts and decision makers held that the only way to truly evaluate a QB is to grind the film, watch every snap, and analyze all of his tools. While this is certainly a huge part of the process, more statistical and outside-the-box methods of evaluation have started to spring up. From Greg Peshak, who charts every snap to get more in-depth statistics than the box score provides, to rotoviz.com, which projects college players’ NFL success from already available stats, the evaluation of college players has had some new wrinkles thrown into the process.


What I plan to do with this blog is combine both sides of evaluation as best I can. I will essentially try to become a “Data Scout”. I will watch film (mostly from draftbreakdown.com), take notes, and put together as good a scouting report as an amateur scout like me can. Then I will get into the analytics: look for red flags in the numbers, see where each prospect has success, and draw a conclusion based on the numbers. To finish, I will come back and see where the film and the numbers agree, disagree, and everything in between.


Before I start making write-ups for this year’s prospects, I thought it would be a good idea to outline the metrics I will use to evaluate them. I’ve come up with a basic filter that has had great success not only at finding successful quarterbacks in the first round, but also at finding diamonds in the rough in later rounds. The filter applies 6 benchmarks, one for each of 6 key metrics. If a prospect misses even one benchmark, he fails the filter, and a QB prospect who fails has a much higher chance of not becoming an NFL starter. How much higher? We’ll get into that later.


To better explain the filter, here is an example entry in my master spreadsheet:


Name         Age   Comp% Index   Y/A Index   Att/TD Index   Att/INT Index   WR DR
Cam Newton   21    1.1287        2.8058      1.4688         0.2247          0.547


Looking at this entry, we see Cam Newton’s 2010 season at Auburn, his last before the 2011 draft. He was 21 years old during that season, and age is the first variable in the filter. As Greg Peshak, one of /r/NFL_Draft’s own and now a contributor at RotoWorld, has noted, very bad red flags are generally more predictive for a prospect than very good metrics. In the case of age, for example, it is a huge flag to be over 23 years old, but you don’t get any ‘bonus points’ for being younger. So the first benchmark is being under 24. If you are 24 or older, you are out.


For the next 4 numbers, I built era-adjusted figures so that prospects can be compared across many years. As you would expect, passing numbers now are a lot different than passing numbers in 1999. To get these indexed numbers, I simply found how many standard deviations each prospect was from the mean for each metric, in each year. For example, Cam’s yards per attempt was 2.8058 standard deviations above the mean. If you aren't familiar with statistics, this is very, very good.

Completion percentage is the only metric where the filter really requires a prospect to be ‘great’. In order to pass the filter, a prospect must have a completion percentage at least 0.75 standard deviations above the mean. For Y/A and Att/TD, prospects only need to be above 0, meaning above average. It matters even less for Att/INT, where a prospect only needs to be above -0.25 standard deviations.
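
The indexing step described above amounts to a per-year z-score. Here is a minimal sketch in Python, assuming the population standard deviation (the post does not say which kind was used) and using made-up quarterbacks and yards-per-attempt values purely for illustration. The function name `index_by_year` and the row layout are hypothetical, not the actual spreadsheet layout.

```python
from statistics import mean, pstdev


def index_by_year(rows, metric):
    """Return {(name, year): z-score}, computed within each year."""
    # Group the raw metric values by year so each era gets its own baseline.
    by_year = {}
    for row in rows:
        by_year.setdefault(row["year"], []).append(row[metric])

    indexed = {}
    for row in rows:
        values = by_year[row["year"]]
        mu = mean(values)
        sigma = pstdev(values)  # assumption: population standard deviation
        # Guard against a year where every value is identical.
        z = (row[metric] - mu) / sigma if sigma else 0.0
        indexed[(row["name"], row["year"])] = z
    return indexed


# Made-up rows, for illustration only.
rows = [
    {"name": "QB A", "year": 2010, "ypa": 7.0},
    {"name": "QB B", "year": 2010, "ypa": 8.0},
    {"name": "QB C", "year": 2010, "ypa": 9.0},
    {"name": "QB D", "year": 1999, "ypa": 6.0},
    {"name": "QB E", "year": 1999, "ypa": 8.0},
]

ypa_index = index_by_year(rows, "ypa")
for key, z in sorted(ypa_index.items()):
    print(key, round(z, 4))
```

For a metric like Att/TD, where a lower raw value is better, the sign would presumably have to be flipped so that a higher index still means better. How that was handled is not spelled out here, so the sketch leaves it out.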


The last number is wide receiver dominator rating (WR DR). Popularized by rotoviz.com, dominator rating attempts to quantify a wide receiver’s impact on a passing offense. It is calculated by this formula: (WR Yards / Team Yards) + (WR TDs / Team TDs). This contextualizes the wide receiver’s production by normalizing for passing volume, giving a much more accurate, predictive statistic. A higher number means a wide receiver is having a larger impact on the passing game. For this study, the wide receiver whose dominator rating was assigned to each QB was the one with the most catches on the team. This might not be the ideal way to pick the right wide receiver, but I've had success using it. The benchmark here works in the opposite direction from the others: a QB whose top receiver has a dominator rating above 0.8 fails. Looking at the QBs drafted since 2000, here is the list of those whose wide receiver had a dominator rating above 0.8:

  • Jim Sorgi
  • Jake Locker
  • Chad Henne
  • Jimmy Clausen
  • Rohan Davey
  • John Navarre
  • Josh Booty
  • Colt McCoy
  • Todd Husak
  • Cody Pickett
  • Chris Simms
  • Bradlee Van Pelt
  • Matt Leinart


Not exactly a who’s who of QB talent. Best guy on the list is probably Colt or Jake, and either guy is probably no better than a backup.
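
Putting the six benchmarks together, the filter can be sketched as a handful of comparisons. This is a minimal Python sketch, not the actual spreadsheet logic: the `Prospect` class and the function names are hypothetical, whether each cutoff is inclusive or exclusive is an assumption, and the 0.8 dominator rating ceiling is inferred from the list above.

```python
from dataclasses import dataclass


@dataclass
class Prospect:
    name: str
    age: int
    comp_index: float
    ypa_index: float
    att_td_index: float
    att_int_index: float
    wr_dr: float


def dominator_rating(wr_yards, team_yards, wr_tds, team_tds):
    """(WR Yards / Team Yards) + (WR TDs / Team TDs), as given in the text."""
    return wr_yards / team_yards + wr_tds / team_tds


def failed_benchmarks(p):
    """Return the names of the benchmarks this prospect misses."""
    checks = {
        "age": p.age < 24,                      # 24 or older is out
        "comp_index": p.comp_index >= 0.75,     # assumption: inclusive cutoff
        "ypa_index": p.ypa_index > 0,
        "att_td_index": p.att_td_index > 0,
        "att_int_index": p.att_int_index > -0.25,
        "wr_dr": p.wr_dr <= 0.8,                # assumption: above 0.8 fails
    }
    return [name for name, ok in checks.items() if not ok]


def passes_filter(p):
    """Missing even one benchmark means failing the filter."""
    return not failed_benchmarks(p)


# The Cam Newton entry from the example row.
cam = Prospect("Cam Newton", 21, 1.1287, 2.8058, 1.4688, 0.2247, 0.547)
print(passes_filter(cam))  # True
```

Returning the list of missed benchmarks, rather than only a yes or no, makes it easy to spot near misses that fail on a single metric.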


Now that I've outlined the benchmarks, let’s take a look at the filter’s success rate. I sorted each quarterback into one of three groups: starter quality, backup, or bust. The breakdown for guys that passed and failed is as follows:


Filter   Starter   Backup   Bust   Totals
Pass     11        13       15     39
Fail     6         14       68     88


The filter had a ‘success rate’ of 0.2821, finding a starter quality quarterback about 28.21 percent of the time. This may not sound great, but the guys that fell outside of the filter only became starter quality 6.82 percent of the time.
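
The arithmetic behind those two rates can be checked directly from the counts in the table. This is a small Python sketch. The `lift` figure at the end is an extra illustration of how much more often a passing QB became a starter, not a number quoted in the post.

```python
# Counts from the pass/fail breakdown above.
counts = {
    "Pass": {"Starter": 11, "Backup": 13, "Bust": 15},
    "Fail": {"Starter": 6, "Backup": 14, "Bust": 68},
}


def starter_rate(row):
    """Share of a row's quarterbacks who became starter quality."""
    return row["Starter"] / sum(row.values())


pass_rate = starter_rate(counts["Pass"])  # 11 / 39
fail_rate = starter_rate(counts["Fail"])  # 6 / 88
lift = pass_rate / fail_rate              # how many times more likely

print(round(pass_rate, 4), round(fail_rate, 4), round(lift, 2))
```

By these counts, a QB who passed the filter became starter quality roughly four times as often as one who failed.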


Now, you might be thinking that maybe this model just explains the noise and not the signal. Maybe the filter is just selecting highly drafted quarterbacks and isn't really finding good prospects in and of itself. Do not worry, I also thought of that. The first of these tables is the breakdown for first-round quarterbacks, and the second is for rounds 2 through 7.



Round 1   Starter   Backup   Bust   Totals
Pass      9         6        2      17
Fail      4         6        5      15



Rounds 2-7   Starter   Backup   Bust   Totals
Pass         2         7        13     22
Fail         2         8        63     73


And here are the same results as proportions of each row:



Round 1   Starter   Backup   Bust
Pass      0.5294    0.3529   0.1176
Fail      0.2667    0.4000   0.3333



Rounds 2-7   Starter   Backup   Bust
Pass         0.0909    0.3182   0.5909
Fail         0.0274    0.1096   0.8630
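
The by-round proportions follow from dividing each count by its row total. Here is a short Python sketch of that step. The helper name `shares` is hypothetical, and the counts are copied from the two by-round tables above.

```python
# Counts from the by-round tables, in (starter, backup, bust) order.
first_round = {"Pass": (9, 6, 2), "Fail": (4, 6, 5)}
later_rounds = {"Pass": (2, 7, 13), "Fail": (2, 8, 63)}


def shares(counts):
    """Convert a row of counts into proportions of the row total."""
    total = sum(counts)
    return tuple(round(c / total, 4) for c in counts)


for label, table in (("Round 1", first_round), ("Rounds 2-7", later_rounds)):
    for group, counts in table.items():
        print(label, group, shares(counts))
```

Comparing Pass against Fail inside each round group, rather than overall, is what separates the filter's effect from the effect of draft position.
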

As you can see, the filter consistently finds starters at a better rate and avoids busts at a better rate, even when stratified by round. If you are curious about the 39 QBs that passed the filter, here is the list:

  • Cam Newton
  • Alex Smith
  • Eli Manning
  • Carson Palmer
  • Sam Bradford
  • David Carr
  • JaMarcus Russell
  • Vince Young
  • Philip Rivers
  • Mark Sanchez
  • Byron Leftwich
  • Ben Roethlisberger
  • Chad Pennington
  • Aaron Rodgers
  • Jason Campbell
  • Tim Tebow
  • Drew Brees
  • Andy Dalton
  • Colin Kaepernick
  • Kevin Kolb
  • Kellen Clemens
  • Brian Brohm
  • Charlie Frye
  • Ryan Mallett
  • Chris Redman
  • Matt Schaub
  • Stefan LeFors
  • Jeff Rowe
  • Dennis Dixon
  • Nate Davis
  • Troy Smith
  • Dan LeFevour
  • Josh Harris
  • Tom Brady
  • Greg McElroy
  • Levi Brown
  • Tim Rattay
  • Sean Canfield
  • BJ Symons


Now that we have a filter that does a good job of explaining the past, let’s introduce some new data to test it. When I was building the filter, I only used the draft classes from 2000-2011. Looking at the 2 years before 2000, here are the QBs who passed the filter from 1998 and 1999:


  • Tim Couch
  • Donovan McNabb
  • Daunte Culpepper
  • Shaun King
  • Peyton Manning
  • Brian Griese
  • John Dutton

Pretty solid list of guys. I am confident in saying that McNabb, Culpepper (pre-injury), and Manning were starter quality. Griese was a borderline starter, and definitely a solid backup. King had his moments in the NFL, and was probably backup worthy. Couch played decently at points, but he was definitely not worthy of the number 1 pick and is probably fairly called a bust. Dutton was a flat-out bust in the NFL, but actually played for a long time in the AFL.

The filter avoided some of the biggest busts of all time in Ryan Leaf, Cade McNown, and Akili Smith. It really, really helped here, since drafting any of those guys could have set a franchise back years. The only really successful guy that the filter missed on was Matt Hasselbeck, who went on to have a great career with the Seahawks.


Here is the list of passing QBs from 2012 and 2013:

  • Andrew Luck
  • Robert Griffin III
  • Russell Wilson
  • Nick Foles
  • Landry Jones

Luck and Wilson are studs; the filter hit big on them. While Griffin and Foles have had rough patches so far, they have also produced like legit franchise guys at times. I am a big believer in the rest of their careers. Jones hasn’t gotten a chance to play yet, so no real evaluation can be made about him. Based on his draft position, I would expect him to end up as a reliable backup.

Of the guys that did not pass, the name that sticks out is Ryan Tannehill. I’m still not fully convinced that he is the guy in Miami, but he very well might be. The filter would have totally avoided the land mines at the top of the 2013 draft, Geno Smith and EJ Manuel.


While the jury is still way out on these guys, here is the list of passing QBs from last year’s class:

  • Blake Bortles
  • Teddy Bridgewater
  • Aaron Murray
  • AJ McCarron
  • Tajh Boyd


While they technically failed, both Zach Mettenberger and Derek Carr were very close to passing, failing only the dominator rating test, by 0.001 and 0.012 respectively.

This filter will be the most important piece of my evaluations. Guys that pass will be taken much more seriously than guys that fall outside of it. My evaluations will still include, however, a film breakdown of each guy through my own eyes, and additional metrics that are not included in the filter, such as similarity scores for comparing prospects on an individual level. I’m working on building a nice database of quarterbacks, so that the comparisons will have a large range of possibilities.