One of the great pleasures of baseball is that it generates a vast amount of data for the analytically minded to use and abuse to their heart's content.
This purchase is presumably related to MLB's recent announcement of a new system that will constantly track and measure the movement of the ball and every player on the field. Supposedly this is going to generate several terabytes of information each game, and some team has decided to buy a Cray as a way of processing all that data. Whether that's a better idea than the proverbial Beowulf cluster I don't know, but that seems to be this team's thinking.
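To get a feel for the scale, here is a back-of-the-envelope sketch in Python. The 5 TB per game figure is purely an assumption standing in for "several terabytes", and the function names are made up for illustration. The only hard number is the schedule: 30 teams playing 162 games each.

```python
# Rough estimate of how much tracking data one regular season might produce.
TEAMS = 30
GAMES_PER_TEAM = 162
TB_PER_GAME = 5.0  # assumption: a stand-in for "several terabytes" per game


def regular_season_games(teams=TEAMS, games_per_team=GAMES_PER_TEAM):
    """Each game involves two teams, so halve the total of team-games."""
    return teams * games_per_team // 2


def season_terabytes(tb_per_game=TB_PER_GAME):
    """Total terabytes across every regular-season game at the assumed rate."""
    return regular_season_games() * tb_per_game


if __name__ == "__main__":
    games = regular_season_games()
    total_tb = season_terabytes()
    print(f"{games} games at {TB_PER_GAME} TB each is about {total_tb:,.0f} TB")
```

Under that assumption a full season lands in the petabyte range, which at least makes the idea of buying serious hardware less strange. Change `TB_PER_GAME` to whatever the real figure turns out to be.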
Most, maybe all, baseball teams have been doing some variant of advanced analytics for quite some time now. Most of this work is proprietary and secret, but there's been a lot of "open source" (or at least publicly available) work that's probably along the same lines. Sabermetricians (baseball stat people -- from "SABR", the Society for American Baseball Research) have gotten very good at measuring offense, and reasonably good at predicting hitters' future numbers. Nate Silver's PECOTA system is the most famous, but there are others that work about as well (ZiPS and Cairo being the ones I've spent time with, plus the "dumb as the monkey on Friends" system called Marcel). Pitching numbers are understood pretty well, at least as they relate to the Three True Outcomes, which are the results of a batter v. pitcher matchup that don't involve any defensive players (i.e., walks, strikeouts, and home runs).
The next great frontier of analytics is defense. There's been a lot of work in this field over the last decade, but the problem has always been in getting good data. If a ball is hit towards the shortstop and the shortstop doesn't get to it, why is that? Is it because the ball was hit too hard? Is it because the shortstop was badly positioned by his coaches? Is it because the shortstop isn't very good? Data that's not much more than "groundball to shortstop" can't really answer that question, but the new tracking system promises to answer that sort of question in full by precisely measuring reaction times, routes to the ball, and so forth. This in turn might lead to greater and greater changes in defensive positioning, different emphases in player acquisition, maybe even in-game changes based on small changes in wind patterns or whatever.
Some of what we're already learning about defense is very surprising. For example, there has been a lot of work done recently on catchers' ability to "frame" pitches, that is, to make a borderline pitch look good. The most recent results suggest that the pitch-framing difference between the best and worst catchers might be worth something on the order of 5 wins. That's roughly the difference between having a random scrub and an All-Star as your right fielder, and all from a catcher's ability (or inability) to fool the umpire. It's shocking.
As for what team this is, when the news first broke it was claimed that the purchasing team "would surprise most people". That rules out the teams that are well-known to be friendly to advanced analytics -- starting with the Red Sox, Yankees, Cubs, and A's. The best guess I've seen is that it's the Phillies -- they have tons of cash, seem to be far behind on analytics, and seem likely to just go out and buy a supercomputer rather than have the MIT grads in their analytics department jury-rig a bunch of Debian boxes into something cooler and weirder.