Submission + - DDOS-in-a-box: VM swarm in a dozen lines of shell (gridcentriclabs.com)

Laxitive writes: We (GridCentric) just posted a couple of interesting videos demoing a load-testing use case on top of our freely available Xen-based virtualization platform, Copper. In both videos, we use live-cloning of VMs to instantly create a swarm of worker VMs that act as clients to a webapp. The ability to clone is exposed as an API call to the VM that wants to clone itself, meaning that in a dozen lines of shell we can script the automatic creation and control of dozens of VMs across multiple physical computers.

Creating a clone VM in Copper is similar in function and complexity to forking a process in Unix, and carries the same assurances: the new VMs are near-exact copies of the original, start running within seconds of the clone command being invoked, and are "live" — meaning that all programs running on the original VM are also running on the clone.
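
To make the pattern concrete, here is a rough Python sketch of the swarm logic. This is an illustration only: the copper module, its clone() call, and the target URL are hypothetical stand-ins, not the actual Copper API.

    import urllib.request

    from copper import clone  # hypothetical stand-in, NOT a real module

    NUM_WORKERS = 24
    TARGET = "http://webapp.internal/"  # hypothetical app under test

    # Assume fork()-like semantics: the parent gets a nonzero clone id,
    # while the freshly cloned VM sees a return value of 0.
    worker_id = 0
    for i in range(1, NUM_WORKERS + 1):
        if clone() == 0:
            worker_id = i
            break

    if worker_id == 0:
        print(f"original VM: launched {NUM_WORKERS} workers")
    else:
        # Each clone resumes right here, mid-script, and starts issuing
        # requests against the webapp under test.
        while True:
            urllib.request.urlopen(TARGET).read()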

The more we play with it, the more live-cloning feels like one of those core capabilities that is at once powerful and easy to leverage when designing distributed applications and services. And today, with the cloud at the top of everyone's mind, seems like the right time to discuss what the APIs, architecture, and features of this new class of distributed operating systems should be.

We hope this demo spurs some of that discussion...

Submission + - Mathematics: The Most Misunderstood Subject (fordham.edu) 1

Lilith's Heart-shape writes: Dr. Robert H. Lewis, professor of mathematics at Fordham University in New York, offers in this essay a defense of mathematics as a liberal arts discipline, not merely part of a STEM (science, technology, engineering, mathematics) curriculum. Along the way, he discusses what's wrong with the manner in which mathematics is currently taught in K-12 schooling.

Comment Re:LiveSQL (Score 1) 78

I should have thought the stddev example through a bit more - it does indeed have a reasonable closed form for incremental update. Good catch.
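
To make the closed form explicit, here's a Python sketch: maintaining the count, sum, and sum of squares lets you answer stddev queries and apply point mutations in O(1), modulo the known numerical-stability caveats of this formulation on large or skewed data.

    import math

    class RunningStddev:
        # Population stddev maintained incrementally from count, sum, and
        # sum of squares: var = E[x^2] - E[x]^2.
        def __init__(self):
            self.n = 0
            self.s = 0.0   # running sum of values
            self.s2 = 0.0  # running sum of squared values

        def insert(self, x):
            self.n += 1
            self.s += x
            self.s2 += x * x

        def update(self, old, new):
            # A point mutation adjusts the aggregates directly; no rescan.
            self.s += new - old
            self.s2 += new * new - old * old

        def stddev(self):
            mean = self.s / self.n
            return math.sqrt(max(self.s2 / self.n - mean * mean, 0.0))

    agg = RunningStddev()
    for v in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
        agg.insert(v)
    print(agg.stddev())   # 2.0
    agg.update(9.0, 3.0)  # mutate one value in O(1)
    print(agg.stddev())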

Complex data mining is hard everywhere, that's true. The problem is that even straightforward data mining gets hard once datasets reach hundreds of millions, billions, or trillions of records (implying absolute dataset sizes of terabytes or more). For Google it's webpages; for biology labs it's sequences.

The big killer is the cost of transferring data, and transferring data is exactly what traditional data systems are built around: a remote host has some software set up, you send it your data, and it processes the data and returns the results. The difference with Hadoop is that you keep the data on the distributed hosts and send them the code (which is typically a lot smaller).

The point stands that incremental update of queries under mutation is not a generally solvable problem: it would still require adding new constructs to SQL and limiting existing ones (e.g. ordering). Hadoop approaches the issue from the other end of the spectrum, focusing on a framework that models distributable algorithms directly using a small set of primitive operators (specifically, "map" and "reduce").
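
For anyone who hasn't seen it, the entire programming model fits in a few lines of Python. This single-process toy is obviously not Hadoop itself, but it shows the two primitives and why the per-record map work distributes so naturally.

    from collections import defaultdict
    from itertools import chain

    def map_fn(line):
        # Map: emit (key, value) pairs independently per input record,
        # which is what lets the work spread across machines.
        for word in line.split():
            yield word, 1

    def reduce_fn(key, values):
        # Reduce: combine all values that share a key.
        return key, sum(values)

    def mapreduce(records, map_fn, reduce_fn):
        groups = defaultdict(list)
        for key, value in chain.from_iterable(map(map_fn, records)):
            groups[key].append(value)
        return [reduce_fn(k, vs) for k, vs in groups.items()]

    lines = ["the quick brown fox", "the lazy dog", "the fox"]
    print(sorted(mapreduce(lines, map_fn, reduce_fn)))
    # [('brown', 1), ('dog', 1), ('fox', 2), ('lazy', 1), ('quick', 1), ('the', 3)]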

-Laxitive

Comment Re:Am I the only one who finds Hadoop unusable? (Score 1) 78

In situations where you are using Hadoop, your "primary" data store should BE the HDFS store you use for analysis. That's a big part of Hadoop's actual efficiency proposition.

The big trick with the "big data" approaches is to recognize that you keep _everything_ distributed, _all the time_. Your input dataset is not "copied into the system" for some particular analysis task, it _exists in the system_ from the time you acquire it, and the analysis results from it are kept distributed. It's only at specific points in time (exporting data to send to someone external, importing data into your infrastructure) that you should be messing around with copying stuff in and out of HDFS.

-Laxitive

Comment Re:LiveSQL (Score 4, Informative) 78

There are some serious technical challenges to overcome when you think about actually implementing something like this.

Take something like "select stddev(column) from table" - there's no way to get an incremental update on that expression given the original data state and a point mutation to one of the entries for the column. Any change cascades globally, and is hard to recompute on the fly without scanning all the values again.

This issue is also present in queries using ordered results (as changes to a single value participating in the ordering would affect the global ordering of results for that query).

The issue that "Big Data" presents is really the need to run -global- data analysis on extremely large datasets, utilizing data parallelism to extract performance from a cluster of machines.

What you're suggesting (basically a functional reactive framework for querying volatile persistent data) would still involve a number of limitations relative to the SQL model: basically disallowing any truly global algorithm across large datasets. Tools like Hadoop get around these limitations by taking the focus away from the data model (which SQL excels at) and putting it on providing an expressive framework for describing distributable computations (which SQL is not so great at).

-Laxitive

Comment Re:Over commit is great (Score 2, Informative) 4

Well, not really. It's the same as operating systems 'overcommitting' memory by giving each process a full virtual address space and filling it in as the process goes. Operating systems solve this problem by... well... using paging.

The paging approach works well for systems where you expect the in-memory working set to be tight. Mainly you'll see a graceful degradation in performance as you actually start hitting real memory limits and paging comes into effect.

Eventually, I think this can be resolved with a hybrid approach: wait until memory pressure builds and paging hurts performance more than you'd like, then auto-migrate VMs off the host as necessary. You get the best of both worlds: oversubscription when resource usage is low and performance is unaffected, and on-demand resource allocation when resources are known to be needed.
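
Something like this toy policy loop is what I have in mind. Every name here is hypothetical; a real version would sample swap-in/swap-out counters from /proc/vmstat and call the hypervisor's live-migration API instead of these stand-ins.

    import random
    import time

    SWAP_RATE_THRESHOLD = 1000.0  # pages/sec before the host counts as stressed

    def swap_rate():
        # Stand-in for sampling swap activity deltas from /proc/vmstat.
        return random.uniform(0.0, 2000.0)

    def pick_victim(vms):
        # Stand-in for a real policy, e.g. the VM with the largest working set.
        return random.choice(vms)

    vms_on_host = ["vm-a", "vm-b", "vm-c"]
    for _ in range(10):  # one monitoring tick per second
        if vms_on_host and swap_rate() > SWAP_RATE_THRESHOLD:
            victim = pick_victim(vms_on_host)
            vms_on_host.remove(victim)
            print("paging pressure high: live-migrating", victim, "elsewhere")
        time.sleep(1)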

-Laxitive

Submission + - Extreme Memory Oversubscription for VMs (gridcentriclabs.com) 4

Laxitive writes: Virtualization systems currently have a pretty easy time oversubscribing CPUs (running lots of VMs on a few CPUs), but have had a very hard time oversubscribing memory. GridCentric, a virtualization startup, just posted on their blog a video demoing the creation of 16 one-gigabyte desktop VMs (running X) on a computer with just 5 GB of RAM. The blog post includes a good explanation of how this is accomplished, along with a description of how it differs from the major approaches in use today (memory ballooning, VMware's page sharing, etc.). Their method is based on a combination of lightweight VM cloning (sort of like fork() for VMs) and on-demand paging. It seems like the 'other half' of resource oversubscription for VMs might finally be here.
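
The fork() analogy can be tried out at the process level with a few lines of Python. This is just the Unix primitive the blog post compares against, not GridCentric's mechanism, but it shows the copy-on-write property that makes cloning cheap:

    import os
    import time

    state = list(range(1_000_000))  # memory built up before the "clone"

    t0 = time.time()
    pid = os.fork()  # Unix-only; the child shares pages copy-on-write
    if pid == 0:
        # The child resumes here almost instantly, mid-program, with the
        # parent's state already in place -- the same property being
        # claimed for cloned VMs.
        print("clone ready in %.4fs, state intact: %d items"
              % (time.time() - t0, len(state)))
        os._exit(0)
    os.waitpid(pid, 0)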

Comment Re:new? (Score 2, Insightful) 278

It's not that 3d user interfaces have been fully explored, but that simulated 3d interfaces on 2d desktops have some fundamental limitations. We already have some amount of simulated pseudo-depth: windows can lie on top of other windows, etc.

The problem is that by the time you get around to interacting with something, you're interacting with a 2d Euclidean plane that presents a projection of some 3d model. That doesn't make the plane 3d. You can't reach around and touch the "middle" of a 3d object projected onto a 2d plane. That's a problem. It might be somewhat ameliorated by true 3d interfaces (where the display itself is 3d), but that tech has yet to mature.

If you think about it, even the way we work on our typical desk is mostly 2d, from a topological perspective. I have a pile of papers and some random crap lying around my desk. When I go to grab a document to work on, I don't just reach into the middle of a stack and pull out the right one. I don't have that capability. I need to go and start flipping pages, basically morphing my 2d topology to reveal some object hidden in 3d, and only then interact with it.

That's not to say that all 3d effects are useless. Simulated 3d is a great way of providing the visual cues we have been training ourselves on since we first opened our eyes, and that can be a very important aspect of an intuitive interface... but fundamentally it acts as a visual highlight. The goodness or badness of any particular 3d interface depends entirely on how effective the _2d_ projection is.

Thirdly, "true" 3d is actually too limiting. We are forced to live in a 3d world, but our computers give us access to many more, and weirder, dimensions than that. We can provide 2d projections of abstract, non-fixed-dimensional objects like n-ary trees (e.g. filesystems). One projection of that abstract object onto a 2d interface is Spotlight: a 2d textbox that behaves in strange and wonderful ways, projecting 2d manipulations (typing some characters) into an arbitrary traversal of the tree. Compare the utility of that to the utility of a "true" 3d-rendered filesystem. What value would that add? Sure, it would look neat, but what extra thing would you gain from it?
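
As a toy illustration of that kind of projection (not how Spotlight is actually implemented, since it queries a prebuilt metadata index, but just the shape of the interaction), a one-line query string can drive an arbitrary traversal of the filesystem tree:

    import os

    def spotlight_query(root, query, limit=10):
        # Project an n-ary tree (the filesystem) through a 1d string:
        # each keystroke re-runs a traversal and returns matching leaves.
        matches = []
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                if query.lower() in name.lower():
                    matches.append(os.path.join(dirpath, name))
                    if len(matches) >= limit:
                        return matches
        return matches

    print(spotlight_query(os.path.expanduser("~"), "report"))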

There's nothing magic about 3d. Computers operate above and beyond the limitations of 3 dimensions, and are currently constrained to expose their behaviour through primarily 2d interfaces. Simulating 3d on top of 2d user interfaces, aside from the "visual cue" aspect, is kind of an arbitrary choice... not necessarily the best one.

-Laxitive

Comment Re:India is sooo into equality (Score 1) 204

Ah, I see, you must be referring to the Delhi High Court's support of caste discrimination. I'm having a bit of trouble finding examples of that, though. Could you point me to some?

I tried searching for '...' on Google, but it doesn't yield much in the way of results relating to the Delhi High Court.

Thanks in advance!

-Laxitive

Comment Re:India is sooo into equality (Score 2, Insightful) 204

I'm sure that if you lend the Indians your time machine, they can go back and fix that issue. Until then, I guess they'll have to live with outlawing caste discrimination in the constitution and slowly working to change public attitudes.

Or perhaps you've discovered a way to fix the issue with smug off-topic one-liners?

Do tell. I eagerly await your insight into the issue.

-Laxitive

Comment Re:re Increase or decline? (Score 4, Insightful) 746

In experimental science, this is not uncommon. Analyzing the same subject with different methods means analyzing it with (relatively) independent methods, and combining the results of multiple independent methods is a good thing, because it mitigates the experimental error and systemic biases that exist in every observational setup.

That said, I don't want to get into a discussion of the actual paper in question, because I have not read the hacked personal e-mails in their full context or weighed their significance (and I likely won't have time in the near future, given the pressures of day-to-day life). I am not inclined to start implying conclusions and accusations based on an incomplete and shoddy reading of a few out-of-context paragraphs, and I am not willing to vouch for, defend, or attack a particular piece of research until I am reasonably well informed about how that research was conducted.

There seem to be many people, however, who are willing to do exactly that.

-Laxitive
