Why we should ignore hitter vs pitcher history



Jeffy25
10-05-2011, 07:09 PM
I thought this was a fantastic article, so it would be worth sharing.

I had read the chapter "Mano a Mano" in The Book before, but it has been a while. So I thought this would be a great refresher for everyone.

http://www.fangraphs.com/blogs/index.php/when-you-should-ignore-the-data/


When Jim Leyland was setting his lineup for Game 3 of the ALDS, he looked to data for guidance. What he found was that Ramon Santiago was 7-for-24 in his career against CC Sabathia, giving him a .292 average against the Yankees ace. How much that played into his decision to hit Santiago second, we can’t say for sure, but he did mention this fact to reporters before the game and he did hit Santiago second last night. It’s probably safe to assume that Santiago’s history against Sabathia played some role in his placement in the lineup.

When Ken Rosenthal reported this on Twitter, I threw out a response about batter/pitcher match-up data in general, saying “Specific batter vs pitcher data is probably the worst use of statistics in the entire sport.”

A lot of people took umbrage at this comment, and when Ramon Santiago proceeded to go 2-for-3 off Sabathia — including a double that momentarily gave the Tigers the lead — many were happy to point out that Leyland’s move to insert Santiago worked, and thus, his decision to look to batter/pitcher match-up data was justified. There are quite a few problems with this scenario, however.

1. Santiago’s “success” against Sabathia relies on one viewing offensive capability through the lens of batting average. Santiago did enter the game hitting .292 against Sabathia, but he had never drawn a walk against him and had just one extra base hit, so his overall line against Sabathia was .292/.292/.333, good for a .625 OPS. Unless we’re still evaluating hitters like it’s 1884, Santiago’s previous performances against Sabathia should not have convinced anyone that he was likely to do well against him last night.

2. Batter/Pitcher match-up data has been shown to have no predictive value. In The Book, Tango/Lichtman/Dolphin devote an entire chapter — Ch 6, “Mano a Mano” — to looking for evidence that previous results of specific batter/pitcher match-ups would predict future results in those same match-ups. It wasn’t there. Despite looking at the 30 most extreme examples of matched-pairs where the batter had dominated the pitcher over a three-year period, the group was barely better than average in the fourth season against those same pitchers. When looking at the flip side, where pitchers had dominated the hitters, the results were the same. Most interesting is that there was little difference in actual future performance by the 30 hitters who had dominated their rivals versus those who had been dominated by opposing pitchers. Even at the extremes, specific batter/pitcher data showed no real usefulness in projecting future results.
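The slash line quoted in point 1 can be checked with a few lines of Python. This is only a rough sketch: the function name `slash_line` is made up for illustration, and it assumes the one extra-base hit was a double (which is what a .333 slugging percentage on 24 at-bats implies) and that there were no hit-by-pitches or sacrifice flies.

```python
def slash_line(at_bats, hits, doubles=0, triples=0, home_runs=0,
               walks=0, hbp=0, sac_flies=0):
    """Return (AVG, OBP, SLG, OPS) from basic counting stats."""
    singles = hits - doubles - triples - home_runs
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    avg = hits / at_bats
    obp = (hits + walks + hbp) / (at_bats + walks + hbp + sac_flies)
    slg = total_bases / at_bats
    return avg, obp, slg, obp + slg

# Santiago vs. Sabathia before Game 3: 7-for-24, no walks, one extra-base hit
# (assumed here to be a double).
avg, obp, slg, ops = slash_line(at_bats=24, hits=7, doubles=1)
print(f"{avg:.3f}/{obp:.3f}/{slg:.3f}, OPS {ops:.3f}")
```

With no walks, the on-base percentage can't be any higher than the batting average, which is why a decent-looking .292 turns into a weak overall line.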

In reality, we shouldn’t be overly surprised that this data doesn’t really tell us anything. Even when looking at multiple years, you’re generally ending up with something in the 20-30 plate appearance realm, a ridiculously small number of confrontations from which to be drawing conclusions. But, the problems with batter/pitcher data go even deeper — in order to get a larger sample, you generally have to find players who have been matching up against each other for many years.
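To get a feel for how little a sample that size pins down, here is a rough sketch using Santiago's 7-for-24 as the example. It is not anything from the article or The Book: it assumes every at-bat is an independent trial with the same hit probability and uses the simple normal-approximation (Wald) interval, which is crude at this sample size. The function name `batting_average_interval` is made up for illustration.

```python
import math

def batting_average_interval(hits, at_bats, z=1.96):
    """Approximate 95% interval for a hitter's 'true' average.

    Assumes independent at-bats with a fixed hit probability and uses
    the normal approximation to the binomial (a simplification).
    """
    p = hits / at_bats
    std_err = math.sqrt(p * (1 - p) / at_bats)
    low = max(0.0, p - z * std_err)
    high = min(1.0, p + z * std_err)
    return low, high

low, high = batting_average_interval(hits=7, at_bats=24)
print(f"Plausible range for the true average: {low:.3f} to {high:.3f}")
```

Under those assumptions the range runs from well below the worst regular hitters to far above the best, so a .292 in 24 at-bats tells you almost nothing about the match-up.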

For instance, 16 of the 26 plate appearances Santiago had against Sabathia before last night came in 2002/2003, back when Sabathia was an inexperienced thrower trying to establish himself with the Indians. He’s a massively better pitcher now than he was then, and it’s hard to believe that anyone should care about what happened between those two 10 years ago. In fact, in the last four years, the two had faced off just three times, and Santiago had gone 0-for-3 and hit into a double play. Not only was the batter/pitcher match-up data of questionable use, it was almost entirely from a time when the two players were at very different points in their careers.

This is the kind of data that just isn’t useful, which is why I decried its usage on Twitter. However, I want to make clear that I’m not saying that there are no scenarios where I believe a specific batter could have an advantage over a specific pitcher, or vice versa. We know certain hitters do better against certain pitch-types, and that platoon splits are very real, so we’d expect a left-handed masher to do very well against a right-handed side-armer. I’d even be open to hearing good arguments about why a specific player could have success against a specific pitcher beyond generalities like handedness and velocity.

However, I’d suggest that this is an area where the evidence would need to be based on something other than the data. Like high school statistics, the numbers are essentially useless, which is why no one spends any time quoting the results of a player’s high school performance in the draft room. That doesn’t mean that we can’t differentiate between amateur players, but that we’ve recognized that we need other tools beyond their performance to help us understand who is likely to succeed and who is not.

The same is true here. If you want to make a case that a specific batter has an advantage over a specific pitcher, go ahead and make that case. We’re not saying that there are no situations where that reality exists — we’re just saying that relying on the past results of batter/pitcher confrontations is not going to help you find those specific situations. The data tells you what happened in the past, but it shines no light on what will happen in the future, and for the purpose of deciding who should play and who should not, it should just be ignored.

hoggin88
10-05-2011, 08:30 PM
Yeah, that is a good article. I thought it was interesting how Santiago's .292 avg against Sabathia, never mind the small sample size, was actually a terrible slash line. So many things misled that decision to bat him 2nd.

I think you have a typo in your title though. Should say something like batter vs. pitcher, not batter vs. hitter.

Jeffy25
10-05-2011, 08:33 PM
thanks

Bos_Sports4Life
10-05-2011, 10:58 PM
Well, I think any manager has to use common sense when using hitter vs. pitcher stats and obviously take things into consideration:

- How recent were the at-bats?

- How big is the sample size?

Also, if there were say 20 at-bats and he has 8 prior hits, and 4-5 of those were infield hits/bloopers etc., you have to ignore the stats.

RTL
10-06-2011, 12:21 AM
This should be common sense.

RevisIsland
10-06-2011, 12:42 AM
This should be common sense.

What he said.

SJearthquakes21
10-06-2011, 12:47 AM
One thing I have heard from players a lot is that certain players see pitchers better than others,

so I think this is just one part of the equation, not the answer.

"Ace"ves
10-06-2011, 01:31 AM
anything can happen in baseball. Guys like Boone can get the big hit :)

stretchedmonkey
10-06-2011, 02:02 AM
My opinion, baseball has become far too dependent on stats. Batting orders, bench players, whole teams get daily makeovers because there's a pitcher on the mound who 1 person in the lineup hits ****** against in the past. It's just stupid.

Yankee Clipper
10-06-2011, 04:38 AM
Good article. I'm glad to see this sort of useless stat getting addressed finally.

TheRuckus
10-06-2011, 06:58 PM
My opinion, baseball has become far too dependent on stats. Batting orders, bench players, whole teams get daily makeovers because there's a pitcher on the mound who 1 person in the lineup hits ****** against in the past. It's just stupid.

Yeah, strategy is dumb.

mavwar53
10-06-2011, 08:05 PM
It would be stupid to ignore the BA of a batter vs. a pitcher. You have to take other things into account as mentioned above: how recent the ABs were, and whether the pitcher was a reliever and is now a starter or the other way around. Just because Santiago hadn't drawn a BB in 24 ABs against Sabathia, what does that really say? It is such a small sample size that I'd expect him to have 0-2 BB tops.

It was a very smart move to hit Santiago 2nd the other day and it worked.

c0rbz
10-06-2011, 08:21 PM
Because it was the past, and they are batting in the present.

Pinstripe pride
10-07-2011, 08:41 AM
because stone cold said so

WoodandNails
10-07-2011, 09:36 AM
The worst is when they use Pitcher-versus-Team statistics, including starts that were against a completely different lineup.

Jeffy25
10-07-2011, 11:47 AM
It would be stupid to ignore the BA of a batter vs. a pitcher. You have to take other things into account as mentioned above: how recent the ABs were, and whether the pitcher was a reliever and is now a starter or the other way around. Just because Santiago hadn't drawn a BB in 24 ABs against Sabathia, what does that really say? It is such a small sample size that I'd expect him to have 0-2 BB tops.

It was a very smart move to hit Santiago 2nd the other day and it worked.

Not really, it just worked out.


And it was ridiculous to try to bunt with him.

Leyland is lucky that Santiago doesn't know how to get a bunt down.

flips333
10-07-2011, 08:02 PM
I'll buy it when I get the data and make a dyadic analysis myself. Just because one analysis didn't find it doesn't mean it's not there...