Last night I came in to work to help with an Oracle tablespace migration. The benefit was that we would free up ~100 GB. The DBA said it would be about a 20-minute outage while he rebuilt the indexes. Well, 20 minutes turned into 2 hours, because he neglected to account for the load on the production database. So I ended up having to shut down one of our heavier-hitting apps, and that fixed the problem. Still, I'm sleepy because instead of coming in, doing nothing, sending out an email, and then grabbing a cocktail, I came in, watched it not work, conferenced people in, made a decision that I hoped wouldn't fuck anything up, and left with just enough time to slam a shot and a beer. Ugh, nothing ever runs smoothly.
Except my deploys when no one else is involved. Those almost always go smoothly. It's relying on other people that makes things difficult.
Today I start grad school. I've been getting amped about this Data Science thing. I remember waaayyyy back to when I was a C Programming student, and I applied to teach the C++ companion class. During my interview, the instructor asked me, "Do you prefer to work with People, Programs, or Data?" and assured me there was no right or wrong answer (though in hindsight, "People" was probably the better choice of the three). I said Data. I was being honest, and I kind of said it without thinking. Coming back around now and thinking about it, yes. Yes, I do prefer to work with data. I want to munge it, wrangle it, jiu-jitsu it, analyze it, model it, and present it. I haven't been this excited about something since I started my Bayesian text classification project, which, come to think of it, was really a data science project. My calling has been sitting in front of me this whole time.
But I'm glad I went this route. I've become a good developer, which I think is essential to being a good data scientist. I suspect that a lot of data scientists are not proficient in software development, primarily because it isn't the central focus of the field. So I think my development background will bring a lot to the table.
Looking at the data science curriculum, I'm a little concerned that it doesn't really bring in any of the major data science tools - R, MapReduce/Hadoop, NoSQL, etc. It seems like it's going to be a purely mathematical ride and I'll be left to my own devices for the tools. Which is... I mean, it's fine, the math is the hard part anyway, but I'd like a little instruction on some of these tools. I've been taking Coursera courses to try to get a high-level understanding of all of them. I just hope I get a little practical knowledge thrown in with all the theory and foundations.
R is looking like a really cool fucking language. So far, from the little bit I've played with it, it looks like it will cook you breakfast while giving you a beej. I absolutely love read.csv and read.table. They make working with data so much easier than in, say, Java or ColdFusion.
Well, anyway. I'm rambling at this point. But I am excited. Let the games begin!