Tuesday, April 03, 2007

Protection Exists

The sabermetric community has generally been successful at challenging baseball’s conventional wisdom. However, Bill James has noted a troubling historical pattern: sabermetric research often labels a phenomenon (e.g., clutch hitting) as non-existent when it cannot be measured or detected. James suggests that previous methods may be flawed and that we may be “underestimating the fog” (note: that's a PDF link). Similarly, baseball statisticians may simply have too few observations for what they are trying to measure. Millions of pitches and at bats may not be sufficient given the statistical methods we use and the subtle trends we are trying to find. Perhaps if baseball games lasted 100 innings and we had data sets containing 100 million at bats, we would be able to detect more of the game’s mysteries.

My suggestion?
A production function – and its inputs – can provide clearer methods for measurement.

Take protection, as an example. Is a batter’s success at the plate affected by having Barry Bonds rather than Neifi Perez in the on-deck circle? Previous research has answered this question by looking at a batter’s outcome with varying hitters on-deck. Such analysis consistently arrives at the same conclusion – protection does not exist:
  • J.C. Bradbury, in his recent The Baseball Economist, states, “Protection is a myth.”
  • In Baseball Prospectus’ Baseball Between the Numbers, James Click concludes, “Batting performance does not change significantly with the quality of the following batter..."

Considering James’ fog argument, such prior research methods may be flawed. Bradbury’s regression analysis attempts to measure the effect of the on-deck hitter's quality on the current batter's outcome (his regression model has the on-deck hitter’s OPS on the right-hand side and the current batter’s outcome on the left-hand side). This approach is intuitive; in fact, my initial instinct might be to perform similar research. However, at bat outcomes involve many moving parts (where the ball lands, the reaction of the defense, and luck, to name a few), and Bradbury is trying to measure the effect of an outcome-based rate (OPS) on another outcome. Thus, any noise or randomness in the data is compounded in the findings. Discussing an earlier attempt to develop a new statistic (platoon differential), James summarizes the challenge:

“… the result embodies not just all of the randomness in two original statistics, but all of the randomness in four original statistics. Unless you have extremely stable ‘original elements’ – original statistics stabilized by hundreds of thousands of trials – then the result is, for all practical purposes, just random numbers.

We ran astray because we have been assuming that random data is proof of nothingness, when in reality random data proves nothing.”
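
James's point about compounded randomness can be illustrated with a small simulation. This is only a minimal sketch, not James's own calculation: the batter, his true .270 hit rate, and the at bat counts (400 and 150) are all hypothetical, and the batter is deliberately given no true platoon split at all.

```python
import math
import random

def sd_of_difference(p, n1, n2):
    """Analytic standard deviation of the difference between two
    independent observed rates that share the same true rate p."""
    return math.sqrt(p * (1 - p) / n1 + p * (1 - p) / n2)

def simulate_differentials(p, n1, n2, seasons, seed=0):
    """Simulate observed 'platoon differentials' for a hypothetical batter
    whose true rate is p against both kinds of pitcher (no real split)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(seasons):
        hits1 = sum(rng.random() < p for _ in range(n1))
        hits2 = sum(rng.random() < p for _ in range(n2))
        diffs.append(hits1 / n1 - hits2 / n2)
    return diffs

def sample_sd(values):
    """Sample standard deviation (n - 1 in the denominator)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))

if __name__ == "__main__":
    # Hypothetical inputs, chosen only for illustration.
    p, n1, n2 = 0.270, 400, 150
    diffs = simulate_differentials(p, n1, n2, seasons=2000)
    print("analytic SD of differential: ", round(sd_of_difference(p, n1, n2), 4))
    print("simulated SD of differential:", round(sample_sd(diffs), 4))
```

Under these made-up inputs the analytic standard deviation of the differential is about .043, so a batter with no split at all would routinely show observed splits of 40 points or more in either direction.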

How can we get around these issues?
Through the lens of a production function, we can analyze this same protection problem as a process involving inputs as well as outputs. For the “protection process,” inputs can describe what is going on within a pitch before the batter reacts to it (my data set tells me the location coordinates, velocity, pitch type, etc. of every pitch thrown in MLB since 2002). If protection exists, a batter’s experience at the plate would be distinctly different – in several advantageous ways – when a great hitter is standing in the on-deck circle. For example, we would expect a batter in front of Barry Bonds to see more pitches within the strike zone (the pitcher will nibble less) and more fastballs (a pitch the pitcher controls better). Both MLB players and empirical research would agree that these two inputs provide a significant advantage to the batter.

My regression analysis attempts to measure the effect of a better on-deck hitter in this way, i.e., the effect on the current batter's experience in seeing both more pitches in the strike-zone and more fastballs. Specifically, either “strike-zone location: yes or no” or “fastball: yes or no” is my dependent variable (left-hand side). On the right-hand side, I include the OPS of the next hitter and then control for everything within the situation: the pitcher, the batter, pitch type, count, outs, runners, ballpark, and year.
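
The shape of such a regression can be sketched on synthetic data. Everything below is an assumption made for illustration: the post does not say whether the model is a linear probability model, logit, or probit (the sketch assumes a linear probability model), the data are simulated rather than real pitches, a single hypothetical control (count state) stands in for the full set of controls, and the "true" effect of 0.2 is exaggerated so that a small sample can recover it.

```python
import random
from collections import defaultdict

def simulate_pitches(n, true_effect, seed=0):
    """Generate synthetic (count_state, next_batter_ops, in_zone) rows.
    All probabilities here are invented for the example."""
    rng = random.Random(seed)
    base_rate = {"ahead": 0.45, "even": 0.50, "behind": 0.58}
    rows = []
    for _ in range(n):
        count_state = rng.choice(["ahead", "even", "behind"])
        next_ops = rng.uniform(0.600, 1.000)
        p_zone = base_rate[count_state] + true_effect * (next_ops - 0.800)
        in_zone = 1 if rng.random() < p_zone else 0
        rows.append((count_state, next_ops, in_zone))
    return rows

def within_group_slope(rows):
    """OLS slope of in_zone on next_batter_ops with a dummy for each
    count state, computed by demeaning both variables within each group."""
    totals = defaultdict(lambda: [0.0, 0.0, 0])  # sum_x, sum_y, count
    for group, x, y in rows:
        t = totals[group]
        t[0] += x
        t[1] += y
        t[2] += 1
    sxy = sxx = 0.0
    for group, x, y in rows:
        sum_x, sum_y, count = totals[group]
        x_dev = x - sum_x / count
        y_dev = y - sum_y / count
        sxy += x_dev * y_dev
        sxx += x_dev * x_dev
    return sxy / sxx

if __name__ == "__main__":
    rows = simulate_pitches(200_000, true_effect=0.2)
    print("estimated effect of next batter's OPS:", round(within_group_slope(rows), 3))
```

Demeaning within each count state is equivalent to including a dummy variable for it, which is one simple way to "control for" a categorical part of the situation; a real specification with pitcher, batter, ballpark, and year effects would need a proper regression package.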

Using per-pitch data from 2002 through 2006, the results show that better on-deck hitters have a positive and significant effect on both the strike location and fastball inputs. Hence, protection does exist insofar as a pitcher adjusts his approach and a batter enjoys multiple advantages when a good hitter is on-deck.

Effect on a pitch being located within the strike-zone:

                     Coefficient (Std Error)
OPS of next batter   .0169 (.0029)

Effect on a pitch being a fastball:

                     Coefficient (Std Error)
OPS of next batter   .0107 (.0029)
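
As a rough check on "positive and significant," the reported coefficients can be compared with their standard errors. The sketch below assumes the usual large-sample normal approximation and takes the standard errors exactly as reported; the function names are mine, not part of the original analysis.

```python
def z_ratio(coef, std_error):
    """Coefficient divided by its standard error."""
    return coef / std_error

def confidence_interval(coef, std_error, z=1.96):
    """Approximate 95% interval under a normal approximation (an assumption)."""
    return (coef - z * std_error, coef + z * std_error)

# Coefficients and standard errors as reported in the two tables.
REPORTED = {
    "pitch in strike zone": (0.0169, 0.0029),
    "pitch is a fastball": (0.0107, 0.0029),
}

if __name__ == "__main__":
    for label, (coef, se) in REPORTED.items():
        low, high = confidence_interval(coef, se)
        print(f"{label}: z = {z_ratio(coef, se):.2f}, "
              f"95% CI = ({low:.4f}, {high:.4f})")
```

Both ratios come out well above 2 (roughly 5.8 and 3.7), and neither approximate interval includes zero.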



The protection production function seems to tell us conflicting stories. The "input" findings show that protection exists, but the "output" evidence suggests that protection does not exist. So, which answer is correct? In addition to the potential randomness issue discussed earlier, outputs suffer from one other relative disadvantage: the sheer volume of data being studied is different. Analysis at the per-pitch level (inputs) employs about four times the number of instances as analysis at the per-at-bat level (outputs). Thus, while prior research may (or may not) point us in the right direction, I would argue that the production function's inputs push us much closer to the truth.
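
To put a number on the volume argument: for a simple rate, the standard error shrinks with the square root of the sample size, so four times as many observations cuts it roughly in half. The sketch below assumes independent observations, which pitches within the same at bat are not, so the real gain is probably smaller; the at bat total is a made-up round number.

```python
import math

def standard_error(p, n):
    """Standard error of an observed rate p over n independent trials."""
    return math.sqrt(p * (1 - p) / n)

def se_ratio(multiplier):
    """Factor by which the standard error changes when the sample
    size is multiplied by `multiplier`."""
    return 1 / math.sqrt(multiplier)

if __name__ == "__main__":
    at_bats = 900_000        # hypothetical total, for illustration only
    pitches = 4 * at_bats    # "about four times the number of instances"
    print("SE at the at bat level:", standard_error(0.5, at_bats))
    print("SE at the pitch level: ", standard_error(0.5, pitches))
    print("ratio:", se_ratio(4))
```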

Lastly, moving beyond this discussion on protection, I want to be clear about my broader argument. The sabermetric community will benefit as it moves away from its relatively strict reliance on outcomes and outputs. Events on the field of any sport involve a great many processes. While outcome data (e.g., much of what you find online at great sites such as Retrosheet and Baseball-Reference) have generally been more widely available, a full picture of economic analysis in the future will rely much more heavily on whole processes and their inputs.

If you would like to contact me directly, please email me at kkovash at gmail.com.

39 Comments:

At 11:02 AM, Blogger Tom said...

We looked at protection in The Book, and we did find a significant, component-wise, effect, but not on an overall impact. As it turns out, we excerpted the relevant part of the chapter here (written by Andy Dolphin):

http://www.hardballtimes.com/main/article/pitching-around-batters

 
At 11:05 AM, Blogger Tom said...

And generally speaking, I agree that the pot of gold does lie in pitch-by-pitch data. That doesn't mean though that it would necessarily be better in all situations.

 
At 11:11 AM, Blogger Keith Law said...

Interesting article. I'd be concerned about the accuracy of the raw data you're using - the people who compile the pitch-by-pitch data are just stringers, paid $50 per game or so, and I'm not sure how much I trust in their judgment of pitch types. Also, not all pitches of the same type are created equal - an 88 mph fastball and a 96 mph fastball both show up as "fastball" in the data, as do a two-seamer and a four-seamer, to give two hypothetical examples. Anyway, always good to see people continuing to question basic assumptions, especially as more data become available.

 
At 8:15 PM, Blogger Neil I said...

If you are saying the good hitter on deck is the reason that the batter sees more strikes, and thus a positive input, then I suggest there is also a philosophy where pitchers WANT to get ahead of a hitter by throwing early strikes, to increase the odds of getting the batter out. This could explain why the "input" in theory, does not match the "output"....? Sometimes, throwing early strikes puts a hitter at a disadvantage when they start out "down in the count".

 
