I wish I had mod points. I really really do.
Here's one area I have to give Apple some props in: their OSX interface puts some damn pretty and friendly makeup on the pig that was the old FreeBSD interface.
What Apple's acquisition did was give the NeXT team money to update OpenStep for a next generation of hardware and throw marketing dollars at it to put it in front of people. Don't get me wrong: Apple's work post-acquisition on updating the interface was fantastic, but let's give credit where credit is due.
I really hate the reporting around Hadoop. Most of these people have absolutely no clue what they are talking about, and this article is just another example of that. Any bit of simple research would have revealed that the actual open source community of developers around Hadoop, Hive, Solr, etc, can be found at ApacheCon. Of course Strata is amazingly commercial: O'Reilly, being a corporate entity, is trying to make cash around the latest craze. If they weren't, they'd make sure the ASF and the other OSS organizations that help make the software had some space and would actually attend.
The easiest way to find a company that hires for open source work is to look at who is actually submitting patches back, participating on mailing lists, filing bugs, etc. In my experience, almost every Bay Area startup or former startup from the past 10 years (but clearly not all of them) is doing work in open source, either out in the open or behind closed doors. Many positions don't have open source in big bright letters, so you might need to just flat out ask. If you are outside of the Bay Area, those companies exist but will require more legwork.
... except 100 gigabytes is not 1 terabyte.
I don't see why anyone would not want to use the GPL if they want their software to be free and open. Why create something, give it out for free, and then allow businesses to take your work, profit from it, and give nothing back? Maybe these developers are hoping to get bought out by a large company someday?
There are many businesses that want to profit from their own open source projects by including them or parts of them in other, proprietary works. The GPL essentially makes that impossible.
but I see no reason that it couldn't serve you well as a large personal file service.
HDFS is not POSIX or mountable. So actually using the data from something that is expecting POSIX is going to be painful. "But there is a FUSE plug-in!" Yes, there is, but you'll take a 60% perf hit using it, assuming that it still works in newer versions of Hadoop. See, none of the hardcore devs actually use it, so there is a very good chance it is completely busted.
In any case, there are still other problems: you can lose the fsimage and there is no real HA for the NN; the NN needs quite a bit of RAM for any significant number of files; and don't forget that 8TB now turns into at least 24TB once you count the 3x replication factor. Etc, etc, etc.
So no, this really isn't a solution for this particular problem.
The FUSE support has likely gotten worse since no one on the core dev team really spends any time with it. I'd be surprised if it still compiles.
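To put rough numbers on the replication and NameNode RAM points above, here is a back-of-the-envelope sketch. The function names are made up for illustration, and the 150 bytes per namespace object is only a commonly quoted rule of thumb that I am assuming here, not a figure from this thread or a measured value.

```python
# Back-of-the-envelope HDFS sizing. All names here are hypothetical helpers,
# not part of any Hadoop API.

def raw_capacity_tb(logical_tb, replication=3):
    """Raw disk needed to store logical_tb of data at a given replication factor."""
    return logical_tb * replication


def namenode_heap_bytes(num_files, blocks_per_file=1, bytes_per_object=150):
    """Very rough NameNode heap estimate.

    Assumes one namespace object per file plus one per block, and
    bytes_per_object bytes of heap for each (an assumed rule of thumb).
    """
    objects = num_files * (1 + blocks_per_file)
    return objects * bytes_per_object


if __name__ == "__main__":
    # 8TB of data at the default 3x replication
    print(raw_capacity_tb(8), "TB of raw disk")
    # 10 million small files, one block each
    print(namenode_heap_bytes(10_000_000) / 1e9, "GB of NameNode heap (rough)")
```

Lots of small files is the bad case for the second number, since every file costs heap no matter how little data is in it.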
+1 on this one.
We've been using the Buffalo modified version of DD-WRT for a few months now. It replaced a Linksys E3k that was continually dropping connections. Overall, we're pretty happy with it (QoS, DHCP, etc). I'll definitely check on the link speed, although it is connected to a DSL modem that can't do gigabit anyway.
It isn't. There is an incredible overuse of glibc/Linux-isms to the point that even porting it to another UNIX is difficult.
This isn't about Microsoft getting involved with open source. This is about Microsoft not getting left out. Beyond the countless startups, Apache Hadoop already has major players like Amazon, Dell, EMC, HP, IBM, NetApp, Oracle, VMware,
Actually, there is an ever increasing amount of JNI (read: C) code in Hadoop that is in the critical path for security and performance features. Most of that code is not very portable. So either MS is going to pay for a major overhaul of that code, or for a completely new code branch that replicates that functionality, or MS Hadoop is going to be severely lacking in features/performance.
In the case of a Hadoop task failure, the errant monkey was genetically cloned but put into a different environment. So it also served as a nature vs. nurture experiment.
Yup, I realize that going to Atom or ARM for a CPU bound process is suicide, but so is trying to solve the problem with only that tiny amount of money.
This is absolutely correct and if I had mod points, I'd spend them here.
If your budget is only £4000, you don't have the funding to build a real, actual grid for something that is CPU bound. If you are lucky, you have enough to get one or two boxes and some network gear to put on the top of someone's desk, at least if you are buying higher end AMD or Intel procs.
Here are two ideas worth exploring...
1) Look at boxes like SeaMicro and other Atom-based mini-grids-in-a-box.
2) Look at building your own with Atom- and ARM-based machines