
Comment Re:Yes and the funniest thing about all this is (Score 3, Informative) 235

Indeed, and there are edge cases, like Facebook, or Google, or whatever. The edge cases are gigantic databases that are accessed in certain specific ways.

It's true that many people attempt to prematurely optimize by using Cassandra first instead of something they are already familiar with. However, when faced with some of the pains of growing an RDBMS beyond what a single box can handle, it's worth considering your other options. Keep in mind that if it's easy to store and make use of a huge pile of data, you're more tempted to gather that data in the first place, where 10 years ago it might have been prohibitively expensive or difficult.

There are probably fewer edge cases than actual NoSQL codebases, which is pretty surreal. There are more actual products than there are people who need the products. And 99.99% of the people playing with them don't need them at all.

I can assure you that you're incorrect, but since you don't have any data to back this up, I won't bother either.

The real joke is people using them in ways that are actually slower than any RDBMS, but they think it's 'easier', usually because they never bothered to learn how JOINs work, and don't understand that it's perfectly fine to make a dozen SQL queries on a web page...that's what indexes are for.

Yes, only knuckle-dragging imbeciles are interested in new systems... *sigh*. This is an often-touted piece of flamebait that has little basis in reality. Some of the largest Cassandra users are companies that already have extensive experience scaling MySQL and other RDBMSs.

While some might find that document stores like MongoDB are "easier" and use them for that reason, Cassandra has a reputation for being difficult to get started with; it gets used nevertheless because the benefits outweigh the steep learning curve.

Comment Re:And Oracle supports EXABYTE sized databases (Score 1) 235

2B columns in a row isn't why you use Cassandra.

Here are some of the actual reasons why you use Cassandra:
- No single point of failure. Every node in the cluster has the same role. This is also nice for maintenance purposes.
- Linearly scalable. Increasing the size of your cluster by N times increases the total ops/second N times.
- Tunable consistency per operation. For every read and write, you can specify how many replicas for a piece of data must respond for the operation to be considered a success. A typical strategy is requiring a quorum of replicas to respond for both reads and writes, ensuring a strongly consistent view of your data (by the pigeonhole principle, any read quorum overlaps any write quorum) while still tolerating the failure of a minority of your replicas. If you're familiar with the CAP theorem, this lets you trade off some amount of C and A for every operation.
- It's fast (while still durable). Sub-millisecond write latencies, read latencies typically 1ms to 10ms, depending on caching, write patterns, and other factors. Pretty standard hardware with cheap rotating media drives works very well, so you don't have to buy any super-boxes.
- Multi-datacenter replication. Cassandra really does a good job of making this transparent while still giving good latencies and tunable availability/consistency.
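
The quorum claim in the consistency bullet can be checked by brute force. This is a minimal Python sketch of the counting argument only, not Cassandra code; the helper names `quorum` and `always_overlap` are made up for illustration.

```python
from itertools import combinations

def quorum(n: int) -> int:
    """Smallest majority of n replicas."""
    return n // 2 + 1

def always_overlap(n: int, w: int, r: int) -> bool:
    """True if every set of w written replicas shares a node with every set of r read replicas."""
    replicas = range(n)
    return all(
        set(ws) & set(rs)
        for ws in combinations(replicas, w)
        for rs in combinations(replicas, r)
    )

n = 3
q = quorum(n)
print(q, always_overlap(n, q, q))  # quorum size, and whether quorum reads always meet quorum writes
print(always_overlap(n, 1, 1))     # single-replica reads and writes can miss each other
print(n - q)                       # replicas that can be down while a quorum is still reachable
```

Whenever the read count plus the write count exceeds the replica count, at least one replica that answers the read has also seen the latest write, which is where the strong consistency comes from.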

Basing the clustering aspects of Cassandra off of Amazon Dynamo is what really brings a lot to the table here. The BigTable data model just happens to work really well with this.

Comment Re:Typical applications? (Score 3, Informative) 235

Columns in Cassandra aren't analogous to columns in an RDBMS. Every row is basically a list of (key, value) pairs. Each pair is referred to as a column, with the key being the column name. There's no requirement that rows have the same set of column names.

Typically large rows are used for indexes or timelines. In a timeline example, you might use a timestamp for every column name and store the entry as the column value. Cassandra keeps the row sorted by column name, so all of the entries in the row (timeline) will be in chronological order.

In the case of indexes, you may use one row for every indexed value (say, one row for all users from Utah, one for all from Texas, etc). Here, each column would store the row key (primary key) of a row in another column family (table) that matches that indexed value; in this case, every column might hold a userId.
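
The timeline and index layouts above can be sketched with plain dictionaries. This is only a toy model in Python, not the Cassandra API: `new_column_family`, `insert`, and `get_slice` are hypothetical helpers, the timestamps and user ids are invented, and using the user id as the column name with an empty value is one possible layout assumed here.

```python
from collections import defaultdict

def new_column_family():
    """Column family modeled as: row key -> {column name -> column value}."""
    return defaultdict(dict)

def insert(cf, row_key, column_name, value):
    cf[row_key][column_name] = value

def get_slice(cf, row_key, start=None, finish=None):
    """Return (name, value) pairs for one row, ordered by column name."""
    row = cf.get(row_key, {})
    return [
        (name, row[name])
        for name in sorted(row)
        if (start is None or name >= start) and (finish is None or name <= finish)
    ]

# Timeline: one row per user, timestamps as column names.
timeline = new_column_family()
insert(timeline, "alice", 1300000300, "third post")
insert(timeline, "alice", 1300000100, "first post")
insert(timeline, "alice", 1300000200, "second post")

# Index: one row per indexed value, one column per matching user id.
users_by_state = new_column_family()
insert(users_by_state, "Utah", "user17", "")
insert(users_by_state, "Texas", "user42", "")
insert(users_by_state, "Utah", "user03", "")

print(get_slice(timeline, "alice"))                             # entries in chronological order
print([name for name, _ in get_slice(users_by_state, "Utah")])  # user ids indexed under Utah
```

Rows in the two column families have completely different column names, which is the point: the row is a sorted list of pairs, not a fixed schema.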


Cassandra 0.7 Can Pack 2 Billion Columns Into a Row 235

angry tapir writes "The cadre of volunteer developers behind the Cassandra distributed database have released the latest version of their open source software, able to hold up to 2 billion columns per row. The newly installed Large Row Support feature of Cassandra version 0.7 allows the database to hold up to 2 billion columns per row. Previous versions had no set upper limit, though the maximum amount of material that could be held in a single row was approximately 2GB. This upper limit has been eliminated."

Remote Control Worms With Laser Light, Using FOSS 78

Kramer747 writes "to share a new tool I've developed for neuroscience that uses optogenetics to remotely control the neurons of a worm as it swims or crawls. It's called CoLBeRT, Controlling Locomotion and Behavior in Real Time. With the instrument I can induce the worm to stop, accelerate, lay eggs or experience the illusion of touch. All source code to run the instrument is GPLd and available. Science News and Scientific American both have stories. The project homepage is at" I hope that name also constitutes a successful bid to get on the actual Colbert show!

Comment Re:Freeeeee Markeeeeeeeeeet! (Score 1) 234

I ended up going with the MSG-free variety (the sodium content was roughly 1/8 that of the standard beef stock from any other brand, and 1/4 the sodium in the "low sodium" varieties), but the free market wouldn't let me avoid corn syrup as well.

It could be argued that the free-market was working correctly as despite the product range not being ideal you were still willing to buy one of them.

The free market also prioritizes making special "varieties" of products about the same way that the general population does. Consider that you might look for some of the following: organic, low sodium, MSG-free, HFCS-free, sugar-free, low-fat, made from free-range cows, gluten-free, BPA-free, locally made, and vegetarian-friendly (vegetable-based).

It's not generally cost-effective to make, ship, store, and sell all of these varieties; mandating that they all exist would be hugely expensive. So, the free market instead serves as much of the population as it profitably can. It would be too much to ask them to operate at a loss, so this is the most you could hope for. The only way to change this is to increase demand by raising awareness of issues like BPA.

Comment Re:What about memory? (Score 1) 240

No, no. Cache is orders of magnitude more expensive than RAM, and that is why we get very little.

Partially correct.

The size of a cache dictates how quickly it can be accessed. This is why L1 caches are always tiny. In fact, smaller == faster is why you see the idea of a cache repeated endlessly in both hardware and software. Eventually, you end up with registers -> L1 -> L2 -> [L3] -> RAM -> [SSD] -> HDD.

The cache is typically puny because that allows it to be faster. You see cache increases for the same price when it becomes affordable to have a bigger cache without latency issues.

Comment Re:So what? (Score 1) 215

This. The fine should be their profits from the affected products from the time they started price fixing to the time they stopped.

Even that is not enough. Even if they are caught 50% of the time, that's still a good deal for them. You have to make the expected profit (in probability terms) negative for price-fixing. In other words, (fine * probability_of_being_caught) > profits.
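
To put numbers on that inequality, here is a small Python sketch. It assumes a simple model in which the firm keeps its profit and pays the fine only when caught; the function names and the figures (a profit of 100, a 50% chance of being caught) are made up for illustration.

```python
def expected_gain(profit: float, fine: float, p_caught: float) -> float:
    """Expected net gain from price fixing: profit kept minus the expected fine."""
    return profit - p_caught * fine

def minimum_deterrent_fine(profit: float, p_caught: float) -> float:
    """Fine at which the expected gain is exactly zero; a deterrent must exceed it."""
    return profit / p_caught

profit = 100.0
p_caught = 0.5

print(expected_gain(profit, fine=profit, p_caught=p_caught))  # fine equal to the profits
print(minimum_deterrent_fine(profit, p_caught))               # break-even fine
print(expected_gain(profit, fine=250.0, p_caught=p_caught))   # fine above break-even
```

Under this model a fine equal to the profits still leaves half the profit as expected gain, and only a fine above profit / probability_of_being_caught makes price fixing a losing bet.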


First Full Science Results From Herschel 22

davecl writes "Today the first full science results from the Herschel Space Observatory were released, including results ranging from the formation and evolution of galaxies to the detailed physics of star formation. Details can be found from The European Space Agency, the BBC, and the Herschel mission blog that I help maintain. Briefer reports, covering rather more of the science, can also be found under the #eslab2010 hashtag on Twitter."

Open Source Developer Knighted 101

unixfan writes "Georg Greve, developer of Open Document Format and active FOSS developer, has received a knighthood in Germany for his work. From the article: 'Some weeks ago I received news that the embassy in Berne had unsuccessfully been trying to contact me under FSFE's old office address in Zurich. This was a bit odd and unexpected. So you can probably understand my surprise to be told by the embassy upon contacting them that on 18 December 2009 I had been awarded the Cross of Merit on ribbon (Verdienstkreuz am Bande) by the Federal Republic of Germany. As you might expect, my first reaction was one of disbelief. I was, in fact, rather shaken. You could also say shocked. Quick Wikipedia research revealed this to be part of the orders of knighthood, making this a Knight's Cross.'"

Comment Re:The future (Score 1) 134

I do think the Prisoner's Dilemma is a very good model for the global warming situation. If you think of it strictly in those terms, the only way to change the Nash equilibrium is to alter the benefits of the different strategies for all players. This could either mean some type of guaranteed punishment for defectors or some type of guaranteed benefit for cooperators. I think most people instinctively realize this -- but enforcing the benefits/penalties on a global scale is the difficult part.
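
As a rough illustration of how a guaranteed penalty moves the equilibrium, here is a Python sketch of a two-player Prisoner's Dilemma. The payoff numbers (3, 0, 5, 1) are the usual textbook values, not anything from climate data, and `payoffs` and `nash_equilibria` are hypothetical helper names.

```python
from itertools import product

C, D = "cooperate", "defect"

def payoffs(penalty: float = 0.0):
    """Two-player payoff table; every defection costs an extra `penalty`."""
    base = {
        (C, C): (3, 3),
        (C, D): (0, 5),
        (D, C): (5, 0),
        (D, D): (1, 1),
    }
    table = {}
    for (a, b), (pa, pb) in base.items():
        table[(a, b)] = (pa - penalty * (a == D), pb - penalty * (b == D))
    return table

def nash_equilibria(table):
    """Pure-strategy profiles where neither player gains by switching alone."""
    result = []
    for a, b in product((C, D), repeat=2):
        pa, pb = table[(a, b)]
        a_ok = all(table[(alt, b)][0] <= pa for alt in (C, D))
        b_ok = all(table[(a, alt)][1] <= pb for alt in (C, D))
        if a_ok and b_ok:
            result.append((a, b))
    return result

print(nash_equilibria(payoffs()))           # no enforcement
print(nash_equilibria(payoffs(penalty=3)))  # guaranteed punishment for defectors
```

With no penalty the only equilibrium is mutual defection; once the penalty outweighs the gain from defecting, mutual cooperation becomes the only equilibrium. The hard part, as noted above, is making the penalty guaranteed.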
