Comment Re:Depends who you ask... (Score 2) 219

Thumbs up on HDFS. The next question to ask your groups is how they will be analyzing the data. HDFS (and Hadoop/Spark/whatever) will hopefully fit in nicely there. Not only is your data redundantly copied across multiple systems, but as your data needs (and your cluster) grow, so does your computational power.

Getting data in and out can be done via the Java API, the REST API, FUSE, or NFS mounts. The main catch is that HDFS doesn't play well with lots of small files, since every file and block takes up NameNode memory, but hopefully your groups will be using large files instead.
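To put a rough number on the small-files issue, here is a minimal sketch. It assumes a 128 MiB block size and the commonly quoted rule of thumb of about 150 bytes of NameNode heap per namespace object (one per file plus one per block). Both numbers and all function names are assumptions for illustration, not measurements from my cluster.

```python
import math

BLOCK_SIZE = 128 * 1024**2   # assumed HDFS block size (128 MiB)
BYTES_PER_OBJECT = 150       # assumed rule-of-thumb NameNode heap cost per object


def namenode_objects(file_size, file_count, block_size=BLOCK_SIZE):
    """Count namespace objects: one per file plus one per block of that file."""
    blocks_per_file = max(1, math.ceil(file_size / block_size))
    return file_count * (1 + blocks_per_file)


def namenode_heap_bytes(file_size, file_count, block_size=BLOCK_SIZE):
    """Very rough NameNode heap estimate for a set of equally sized files."""
    return namenode_objects(file_size, file_count, block_size) * BYTES_PER_OBJECT


MIB, GIB, TIB = 1024**2, 1024**3, 1024**4

# The same 1 TiB of data, stored as 1 MiB files versus 1 GiB files.
small_files = namenode_objects(MIB, TIB // MIB)
large_files = namenode_objects(GIB, TIB // GIB)

print(small_files, large_files)
```

Under those assumptions the 1 MiB layout needs a couple of hundred times more NameNode objects than the 1 GiB layout for the same amount of data, which is why batching small files into larger ones pays off.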

Now, administration is another story, but there's Cloudera Manager, which is supposed to greatly simplify management. I'm currently using it to store about 0.25 PB for random analysis, and growing its capacity is a straightforward task.

As far as backing up goes, HDFS provides snapshots and 3x (or more) replication across nodes in the cluster. Of course, there's always the big hammer of just getting a second cluster. As an old HW sage once told me, "If you can't afford to buy two, don't buy one."
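The replication factor feeds straight into how much raw disk you have to buy. A back-of-the-envelope sketch, using the roughly 0.25 PB figure from above and decimal units; the function names are made up for illustration and overhead such as temp space or snapshots is ignored:

```python
TB = 10**12  # decimal terabyte


def raw_capacity_needed(logical_bytes, replication=3):
    """Raw disk consumed when every block is stored `replication` times."""
    return logical_bytes * replication


def replica_losses_tolerated(replication=3):
    """How many copies of a block can be lost before the data is gone."""
    return replication - 1


logical = 250 * TB                            # about 0.25 PB of actual data
one_cluster = raw_capacity_needed(logical)    # raw disk for a single cluster
two_clusters = 2 * one_cluster                # the "buy two" option

print(one_cluster // TB, two_clusters // TB)  # raw terabytes for each option
```

So "buy two" at 3x replication means budgeting for six copies of everything, which is worth knowing before the purchase order goes out.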

Comment Using a workstation for server/dev work (Score 1) 831

Never, ever, ever do development on your workstation unless it closely matches your production environment. If his company had enough $$ for a MacBook, I'm sure they can shell out another $400 or whatever for a Dell workstation that runs the same distro as the server.

No, I didn't RTFA, but give me a break. It sounds almost as bad as developing an application using MS Access.

(Yes, I have a MBP, and I develop on Linux servers all day.)
