
Comment Re:Depends who you ask... (Score 2) 219

Thumbs up on HDFS. The next question to ask your groups is how they will be analyzing the data. HDFS (and Hadoop/Spark/whatever) will hopefully fit in nicely there. Not only will your data be redundantly copied across multiple systems, but as your data needs (and cluster) grow, so does your computational power.

Getting data in and out can be done via the Java API, the REST API, FUSE, or NFS mounts. The only issue is that HDFS doesn't play well with small files, but hopefully your groups will be working with large files instead.
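
As a minimal sketch of the Java API route, something like the following writes a file into HDFS and reads it back. The NameNode URI (hdfs://namenode.example.com:8020) and the /data/incoming path are placeholders for your own cluster:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteExample {
    public static void main(String[] args) throws Exception {
        // Point the client at the NameNode; hostname and port are placeholders.
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://namenode.example.com:8020");
        FileSystem fs = FileSystem.get(conf);

        // Write a file into HDFS.
        Path dest = new Path("/data/incoming/example.txt");
        try (FSDataOutputStream out = fs.create(dest)) {
            out.writeBytes("hello from the Java API\n");
        }

        // Read it back.
        try (FSDataInputStream in = fs.open(dest)) {
            byte[] buf = new byte[64];
            int n = in.read(buf);
            System.out.println(new String(buf, 0, n));
        }
        fs.close();
    }
}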

Now administration is another story, but Cloudera Manager is supposed to simplify management considerably. I'm currently using HDFS to store about 0.25 PB for ad hoc analysis, and growing its capacity is a straightforward task.

As far as backups go, HDFS provides snapshots and 3x (or higher) replication across nodes in the cluster. Of course, there's always the big hammer of just getting a second cluster. As an old hardware sage once told me, "If you can't afford to buy two, don't buy one."
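
To illustrate the snapshot and replication angle, here's a rough sketch using the same Java FileSystem API. The paths, the replication factor of 5, and the snapshot name are made-up examples, and an admin has to make the directory snapshottable first (hdfs dfsadmin -allowSnapshot):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsProtectionExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://namenode.example.com:8020");
        FileSystem fs = FileSystem.get(conf);

        Path dir = new Path("/data/important");

        // Raise the replication factor on an existing file above the default 3.
        fs.setReplication(new Path(dir, "results.parquet"), (short) 5);

        // Take a named, read-only point-in-time snapshot of the directory.
        // (Requires the directory to already be snapshottable.)
        Path snap = fs.createSnapshot(dir, "before-reload");
        System.out.println("snapshot created at " + snap);

        fs.close();
    }
}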

Comment Using a workstation for server/dev work (Score 1) 831

Never, ever do development on your workstation unless it's close enough to your production environment. If his company had enough $$ for a MacBook, I'm sure it can shell out another $400 or whatever for a Dell workstation that runs the same distro as the server.

No, I didn't RTFA, but give me a break. It sounds almost as bad as developing an application in MS Access.

(Yes, I have an MBP, and I develop on Linux servers all day.)
