karniv0re's Journal: Dear Gournal

"Oh, you mean a journal?"
"Yeah, whatever. I guess I'm not all smart like you."
Really though, I'm kinda dumb. Yesterday I rolled my foot while out running. There was a car trying to turn right. I was on his passenger side trying to cross the street and would have been directly in his path. I could see that he was looking left, had no situational awareness, and was going to run me over, so I put the brakes on, only to have my foot find the edge of a sidewalk. I didn't go down, but my foot rotated 90 degrees and I heard a loud pop. That was that. Had to walk a mile home on a bum foot. I guess that doesn't make me dumb; I avoided getting hit by a car. But I wish I had just slowed my roll before getting close to the intersection. I may have also been distracted by some nice yoga pants-clad asses. Man, I hope I can get back to running before summer ends.
I got through my first homework in my Probability Models class. That was a bitch. I spent probably around 48 hours of work on 20 problems. 2 of them went answerless. Sheesh. I need to get better at this though. I really want to be a Data Scientist. I mean, there are certainly times when I look at the math and think, Jesus, what am I doing?! I can make a fine living as a software engineer. But then when I hear about some of the awesome problems being tackled by Data Scientists, I fill up with excitement. That's shit I want to do!
The problem is, where do you start?
Obviously you need a solid understanding of math. So I'm starting in the right place by getting my master's and learning foundational mathematics. Then there's the data wrangling/munging/ju jitsu. I think I'm pretty good there. I have a solid Linux background and am excellent at processing text. But then there are the technologies. They are a huge part of it and I know none of them. This is where I need to focus in my spare time. This is where I'm struggling with how to proceed. Do I do a brief overview of each technology, or do I deep dive one by one? Are there some I can skip?
This is my list of technologies to learn/understand:
* memcached
* MapReduce
* Apache Hadoop
* Apache CouchDB
* Google BigTable (proprietary)
* HBase
* MongoDB
* Amazon Dynamo (proprietary)
* Pig
* Hive
* Apache Cassandra
* Voldemort
* Basho Riak
* Aerospike
* Google Dremel
* Google MegaStore
* Google BigQuery
* Google Tenzing
* Redis
* Apache Spark
* Apache Spark SQL
* Apache Giraph
* Spanner
* Apache Accumulo
* Impala
* Apache BigTop
Ugghhhh, that's not even scratching the surface. There's a whole section on Big Data and another on Databases at Apache. Sheesh. Where do I even start? I guess that's my initial struggle. Do I do an overview of each one, then deep dive, or do I just start deep diving from memcached on? Or do I just figure these out as I need them? Right now I'm playing with Pig. I exported a bunch of data and I'm trying to work with it. It's slow going due to the lack of documentation (at least from what I've found so far).
I just want so badly to get going on this stuff. To make it work for me and show my company that I'm a step ahead of everyone else. Well, anyway. I'll get there. Just gonna take some time and effort.


  • No matter how you look at it, you're doing this to get a job. Look at companies you'd consider working for and scan their job postings to figure out which of those they're actually using (i.e., if they've got openings for CouchDB developers then they probably aren't using MongoDB). Break this list down into sets of competitors and complements, and see how the technologies that complement each other fit together, then see if you can work out the entire stack being used at the company that you want to work for, an

    • I actually started doing that. I grabbed a bunch of job postings on Glassdoor and extracted the "requirements". Turned it into its own Data Science project. heh.

      It allows me to eliminate a few, but they really are all over the place technology-wise. Most postings put more emphasis on the statistical analytics background than on technologies, which is a fair point, hence why I'm getting my master's. Nearly all require at least a master's.
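For anyone wanting to try the same thing, here is a rough sketch of what that kind of tally might look like in Python. It is not the actual script used above: the `TECHNOLOGIES` subset, the `count_mentions` helper, and the sample requirement snippets are all made up for illustration, and the matching is deliberately crude (plain keyword search, so something like "R&D" would count as a mention of R).

```python
import re
from collections import Counter

# Hypothetical subset of the technology list from the journal entry.
TECHNOLOGIES = ["Hadoop", "MongoDB", "Cassandra", "Spark", "Pig", "Hive", "Redis", "R"]


def count_mentions(postings, technologies=TECHNOLOGIES):
    """Count how many postings mention each technology at least once."""
    counts = Counter()
    for text in postings:
        for tech in technologies:
            # Lookarounds keep "R" from matching inside "Redis" or "Spark".
            pattern = r"(?<!\w)" + re.escape(tech) + r"(?!\w)"
            if re.search(pattern, text, re.IGNORECASE):
                counts[tech] += 1
    return counts


# Made-up requirement snippets, for illustration only.
sample_postings = [
    "Master's degree in statistics; experience with R and Hadoop.",
    "Strong SQL, Spark, and Hadoop skills; MongoDB a plus.",
    "PhD or master's preferred. R or Python, Hive, Pig.",
]

if __name__ == "__main__":
    # Print technologies from most to least frequently mentioned.
    for tech, n in count_mentions(sample_postings).most_common():
        print(f"{tech}: {n}")
```

Counting postings rather than raw keyword hits keeps one buzzword-heavy listing from skewing the totals; technologies that never show up simply have a count of zero and are candidates to drop from the list.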

      For now I'm just narrowing my focus to R. I gave myself a real wo
