JP Hochbaum
I believe I have settled on a name for this correlation stat for an at-bat instance. It comes from finding how the league's batting average, on-base percentage, and slugging percentage correlate with average earned runs per nine innings (ERA). In the previous post I showed how little batting average correlates with ERA; here I am going to cover how on-base and slugging percentages correlate with ERA.
View attachment 2105
In the above graph it is hard to tell how strongly on-base percentage correlates with earned runs, but that is why Excel has the handy correlation coefficient. For on-base percentage I found the correlation to be .83, which is much higher than what batting average showed and not surprising in the least.
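For anyone who wants to reproduce that kind of number outside of Excel, here is a minimal sketch of the Pearson correlation coefficient, which is the quantity Excel's CORREL function returns. The `pearson` helper and the sample figures are assumptions made up for illustration; they are not the league data behind the .83.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (covariance numerator)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each series
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up season-by-season values, for illustration only (not real league data)
league_obp = [0.320, 0.325, 0.331, 0.336, 0.340]
league_era = [3.90, 4.05, 4.20, 4.28, 4.45]

print(round(pearson(league_obp, league_era), 2))
```

Swapping in the real league OBP and ERA columns should give the same value the spreadsheet shows.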
Now let’s take a look at slugging percentage:
View attachment 2104
Graphed against ERA, slugging percentage looks like an almost exact correlation, and it nearly is, with a correlation coefficient of .94. If I were to graph OPS it would show a correlation of .97, which is statistically significant. So what to do with all these correlations?
I ended up taking a hitter's batting average, on-base percentage, and slugging percentage and multiplying each by its correlation coefficient, thus reducing each to its true effect on creating a run. Then I added those weighted percentages up to get what I call the correlated run contribution (CRC).
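As a rough sketch of that arithmetic: the .83 and .94 weights are the on-base and slugging correlations given above, and .62 is assumed to be the batting average correlation from the previous post (it is the figure quoted further down). The function name `crc` and the sample slash line are hypothetical, chosen only to show the calculation.

```python
# Correlation coefficients used as weights.
# W_AVG = 0.62 is an assumption taken from the ".62" mentioned later in the post.
W_AVG = 0.62
W_OBP = 0.83
W_SLG = 0.94

def crc(avg, obp, slg):
    """Correlated run contribution: each rate stat times its correlation, summed."""
    return W_AVG * avg + W_OBP * obp + W_SLG * slg

# Hypothetical hitter with a .300/.400/.500 slash line (illustration only)
print(round(crc(0.300, 0.400, 0.500), 3))
```

For that made-up hitter the three pieces are .186, .332, and .470, which add up to a CRC of about .988.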
So in the National League, the 2014 league leaders looked like this:
Andrew McCutchen 1.04
Giancarlo Stanton 1.03
Anthony Rizzo* .99
Justin Morneau* .97
Buster Posey .96
Yasiel Puig .95
Matt Kemp .94
Josh Harrison .943
Jayson Werth .935
Jonathan Lucroy .933
If you were to rank the top ten hitters by OPS, a few guys would shift around: Puig would be ahead of Posey and Morneau, and Freddie Freeman would knock Lucroy out of the top ten. So what is the difference? The slight advantage a hitter has in batting average. If a hitter had a higher batting average but a lower OPS, there were times where batting average's .62 weighting made a large enough difference to be more valuable than getting on base.
This is the kind of result I intended to see when creating this stat. Although OPS has an incredibly high correlation to runs being created, I thought it left out the anomalous hitters who hit for high contact and thus have higher batting averages. So in some cases a hitter who hits for a higher average but draws fewer walks can indeed contribute more to a run scored than a guy who hits for a lower average but walks more. Of course, they would have to be very close to each other in OPS for the contact hitter to jump ahead. Thus if you are a GM with two similar-OPS hitters in free agency and you need a 3-5 hitter, you would probably want the guy with the higher CRC; if you are looking for a 1-2 hitter, the guy with the higher OPS.
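To make the tiebreak concrete, here is a small sketch comparing two hypothetical free agents with identical OPS. Both slash lines are invented for illustration, and the .62 batting average weight is again assumed from the figure quoted above; the point is only to show how the average term separates hitters that OPS cannot.

```python
# Weights: correlations of AVG, OBP, SLG with ERA (0.62 for AVG is an assumption)
W_AVG, W_OBP, W_SLG = 0.62, 0.83, 0.94

def ops(obp, slg):
    """On-base plus slugging."""
    return obp + slg

def crc(avg, obp, slg):
    """Correlated run contribution as described in this post."""
    return W_AVG * avg + W_OBP * obp + W_SLG * slg

# Hypothetical hitters (avg, obp, slg): same OBP and SLG, different batting average
contact_hitter = (0.320, 0.370, 0.470)  # gets on base mostly via hits
patient_hitter = (0.270, 0.370, 0.470)  # gets on base more via walks

for name, (avg, obp, slg) in [("contact", contact_hitter), ("patient", patient_hitter)]:
    print(name, "OPS", round(ops(obp, slg), 3), "CRC", round(crc(avg, obp, slg), 3))
```

OPS cannot split these two, while CRC favors the contact hitter by .62 times the 50-point gap in average, or about .031.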
https://sportsstatsandscience.wordpress.com/2015/04/24/correlated-run-contribution/