The use of Earned Run Average (ERA) has increased with the realization that pitchers have little control over their win/loss record. But it hasn’t taken long for sabermetrics-savvy fans to realize the issues with ERA: it depends too heavily on fielding, park factors and luck. Voros McCracken proposed the first alternative to ERA, the DIPS theory (Defense Independent Pitching Statistics), built on the idea that pitchers control only strikeouts, walks and homers.
Since DIPS, the number of pitching stats we have at the ready has blown up, but as we increase our research, we also increase our objective knowledge. FIP and xFIP, as laid out by JJ Keller previously, work very closely with McCracken’s original thesis. But as we learn more about what pitchers can control, we have been able to create even better stats.
Enter SIERA, or Skills Interactive ERA. Eric Seidman and Matt Swartz created SIERA to build a better ERA estimator by taking more stats and research into account and properly applying them. The limitation of FIP-based metrics is that they do not take balls in play into account (home runs are not considered “in play” because the defense cannot make a play on them, generally speaking). Since we know pitchers have some control over their ground-ball and fly-ball percentages*, SIERA takes this into account.
This matters because different batted-ball types carry different expected run values. One example of how this works is that pitchers with high walk rates can reduce the damage by inducing more ground balls, which will lead to more double plays (and pitchers and double plays are like, totally BFFs).
SIERA also looks into the limited and increasing returns on strikeouts and walks. Baseball common sense and new research suggest that high-strikeout pitchers, and to some extent low-walk pitchers, can get away with more mistakes. Using regression analysis, Swartz and Seidman estimated the coefficients and chose the variables to properly account for this.
Here is the formula, and if you get lost, do not worry; it is probably the most complex baseball formula you can find:
SIERA = 6.145 – 16.986*(SO/PA) + 11.434*(BB/PA) – 1.858*((GB-FB-PU)/PA) + 7.653*((SO/PA)^2) +/- 6.664*(((GB-FB-PU)/PA)^2) + 10.130*(SO/PA)*((GB-FB-PU)/PA) – 5.195*(BB/PA)*((GB-FB-PU)/PA)
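For readers who would rather plug in numbers than parse the formula, here is a minimal Python sketch of it. The function name `siera`, its parameter names, and the season line in the usage example are hypothetical and chosen only for illustration. One assumption is labeled in the code: the `+/-` on the squared batted-ball term is treated as a minus sign when (GB-FB-PU)/PA is positive and a plus sign otherwise. Check that against the creators' own write-up before relying on it.

```python
def siera(so, bb, gb, fb, pu, pa):
    """Estimate SIERA from raw counting stats (illustrative sketch).

    so, bb: strikeouts and walks
    gb, fb, pu: ground balls, fly balls, pop-ups
    pa: plate appearances (batters faced)
    """
    if pa <= 0:
        raise ValueError("pa must be positive")

    k_rate = so / pa               # SO/PA
    bb_rate = bb / pa              # BB/PA
    net_gb = (gb - fb - pu) / pa   # (GB-FB-PU)/PA

    # ASSUMPTION: the "+/-" in the formula is read as "-" when the net
    # ground-ball rate is positive and "+" otherwise.
    squared_sign = -1.0 if net_gb > 0 else 1.0

    return (
        6.145
        - 16.986 * k_rate
        + 11.434 * bb_rate
        - 1.858 * net_gb
        + 7.653 * k_rate ** 2
        + squared_sign * 6.664 * net_gb ** 2
        + 10.130 * k_rate * net_gb
        - 5.195 * bb_rate * net_gb
    )


# Hypothetical season line: 800 batters faced, 200 strikeouts, 50 walks,
# and a perfectly neutral batted-ball mix (250 GB, 200 FB, 50 PU).
print(round(siera(so=200, bb=50, gb=250, fb=200, pu=50, pa=800), 2))  # 3.09
```

Because every input is divided by plate appearances, the function works on rates under the hood; the raw counts are only there so you can feed it a stat line directly.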
One common qualm with the formula is whether multicollinearity undermines its perceived effectiveness. Multicollinearity (or just collinearity) is a problem that formulas can run into when two variables are highly correlated: because the variables are not truly independent, the results can come out strange, skewed or invalid. Fortunately, Swartz views this statistical phenomenon as a non-factor in SIERA. He explains in detail here, expanding on the use of squares with strikeouts.
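To see why the question comes up at all, consider a small sketch that measures how closely a strikeout rate tracks its own square, since both appear in the formula. The strikeout rates below are made-up, evenly spaced values rather than real pitcher data, and the `pearson` helper is a hand-rolled illustration. It shows only where the worry comes from, not whether it actually harms SIERA.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# HYPOTHETICAL strikeout rates (SO/PA) from 10% to 35% in 1-point steps.
k_rates = [0.10 + 0.01 * i for i in range(26)]
k_squared = [k * k for k in k_rates]

r = pearson(k_rates, k_squared)
print(f"correlation between SO/PA and (SO/PA)^2: {r:.3f}")
```

A rate and its square move almost in lockstep over this range, which is exactly the kind of overlap that prompts the multicollinearity question Swartz addresses.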
A couple more observations about SIERA:
- Strikeouts are worth their weight in gold, platinum and extraterrestrial minerals. More Ks correlate with weaker contact (according to research by Swartz and Seidman), which means a low BABIP is more sustainable. HR/FB rates can live in the low single digits as well, something xFIP does not take into account. Also, more baserunners will get stranded (LOB% can live high).
- Walks are bad but not quite kryptonite in small doses. SIERA accepts that some walks are bound to happen, and they don’t start to really hurt a pitcher until they become a problem.
SIERA is the most in-depth ERA estimator you are going to find, and it is also the most accurate. It performs better than FIP and xFIP in predicting next year’s ERA and in year-to-year consistency. Not even projection systems have outsmarted SIERA yet.
Read more about the development of SIERA in the five-part series by the duo, located here. The formula did not simply appear to them in a dream, or hit them on the head in the form of an apple. Much research and time were dedicated to crafting and honing the formula. At the very least, it’s a fascinating read, and it will most likely convert any FIP/xFIP followers into SIERA believers.
If you still haven’t got your fix on the SIERA diet, check out a Q&A session co-creator Matt Swartz did on the stat, addressing a range of topics from multicollinearity to Michael Cuddyer’s SIERA. For the more math-savvy readers, Swartz goes deeper into the inner workings of SIERA on sabermetrics pioneer Tom Tango’s personal blog, where he also documents any updates.
*Note: no research has yet conclusively found that line drive percentage reflects anything besides luck. The year-to-year correlation for LD% is much weaker than for GB% and FB%.