Comment Re:Is it really only a matter of scheduling? (Score 3, Informative) 472


While certainly the whole file may end up cached, the source for cp does a simple read/write loop with a small buffer -- it does not read in the whole file and then write it all out.
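
Roughly, the core of that loop looks like this (a simplified sketch in C - not the actual coreutils source, which also handles sparse files, permissions and error reporting):

  /* simplified cp-style loop: a small buffer is reused over and over,
     but every page of both files still flows through the pagecache */
  #include <fcntl.h>
  #include <unistd.h>

  static int copy_file(const char *src, const char *dst)
  {
      char buf[64 * 1024];            /* small, fixed-size buffer */
      int in = open(src, O_RDONLY);
      int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
      ssize_t n;

      if (in < 0 || out < 0)
          return -1;
      while ((n = read(in, buf, sizeof(buf))) > 0)
          if (write(out, buf, n) != n)
              return -1;
      close(in);
      close(out);
      return n < 0 ? -1 : 0;
  }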

Many apps and DB engines have a similar pattern: they read/write through a relatively small buffer, but then expect the exact opposite of what /bin/cp wants: they expect the file to stay cached (because they will read it again in the future).

So the kernel cannot know why a file is being read and written: will it be needed again in the future (a Firefox sqlite DB) or not (a cp of a big file)?

(Unfortunately, the planned mind reading extension to the kernel is still a few years out.)

Even in the specific case of /bin/cp often the files might be needed shortly after they have been copied. If you have 4 GB of RAM and you are copying a 750 MB ISO, you'd expect that ISO to stay fully cached so that the CD-writer tool can access it faster (and without wasting laptop power), right?

So in 99% of cases the best kernel policy is to keep cached data around as long as possible.

What makes caching wrong in the "copy huge ISO around" case is that both files are too large to fit into the cache and that cp reads and writes both files in their entirety. Since /bin/cp does not declare this in advance, the kernel has no way of knowing it for sure while the operation progresses - and by the time we hit the limits it's too late.

It would all be easier for the kernel if cp and dd used fadvise/madvise to declare the read-once/write-once nature of big files. It would all just work out of the box. The question is, how can cp figure out whether it's truly use-once ...
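
For illustration, here is a hypothetical sketch of what such a declaration could look like - cp does not actually do this today; posix_fadvise() is merely the syscall such hints would go through, and the hint on the destination file only helps once its dirty pages have been written back:

  #define _XOPEN_SOURCE 600   /* for posix_fadvise() */
  #include <fcntl.h>

  /* hypothetical helper: declare the use-once nature of a big copy */
  static void declare_use_once(int in_fd, int out_fd, off_t copied)
  {
      /* the source is being read linearly, once */
      posix_fadvise(in_fd, 0, 0, POSIX_FADV_SEQUENTIAL);
      /* the already-copied region will not be reused: let the kernel
         drop those pages instead of evicting more useful ones */
      posix_fadvise(in_fd, 0, copied, POSIX_FADV_DONTNEED);
      posix_fadvise(out_fd, 0, copied, POSIX_FADV_DONTNEED);
  }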

The other thing that can go wrong is that other apps arguably should not be negatively affected by this - and this was the point of the article as well. I.e. cp may fill up the pagecache, but those new pages should not throw out well-used pages on the LRU, and write activity by other apps should not be slowed down just because there's a giant copy going on.

Those kinds of big file operations certainly work fine on my desktop boxes - so if you see such symptoms you should report them to linux-kernel@vger.kernel.org, where you will be pointed to the right tools to figure out where the bug is. (latencytop and powertop are both good starting points.)

Note that i definitely could see similar problems two years ago, with older kernels - and a lot of work went into improving the kernel in this area. v2.6.35 or v2.6.36 based systems with ext3 or some other modern filesystem should work pretty well. (The interactivity break-through came somewhere around v2.6.32 - although a lot of incremental work went upstream after that, so you should try as new a kernel as you can.)

Also, i certainly think that the Linux kernel was not desktop-centric enough for quite some time. We didn't ever ignore the desktop (it was always the primary focus for a simple reason: almost every kernel developer uses Linux as their desktop) - but the kernel community certainly under-estimated the desktop and somehow thought that the real technological challenge was on the server side. IMHO the exact opposite is true.

Fortunately, things have changed in the past few years, mostly because there's a lot of desktop Linux users now, either via some Linux distro or via Android or some of the other mobile platforms, and their voice is being heard.

Thanks,

Ingo

Comment Re:IO scheduler != CPU scheduler (Score 4, Informative) 472

I know some of the patches have made it back into the mainline kernel, any idea when they all will be merged?

The -tip tree contains development patches for the next kernel version for a number of kernel subsystems (scheduler, irqs, x86, tracing, perf, timers, etc.) - and i'm glad that you like it :-)

We typically send all patches from -tip upstream in the merge window - except for a few select fixlets and utility patches that help our automated testing. We merge back Linus's tree on a daily basis and stabilize it on our x86 test-bed - so if you want a truly bleeding edge kernel, but want proof that someone has at least built and booted it on a few boxes without crashing, then you can certainly try -tip ;-)

Otherwise we try to avoid -tip specials. I.e. there are no significant out-of-tree patches that stay in -tip forever - there are only in-progress patches which we try to push to Linus ASAP. If we cannot get something upstream we drop it. This happens every now and then - not every new idea is a good idea. If we cannot convince upstream to pick up a particular change then we drop it or rework it - but we do not perpetuate out-of-tree patches.

So the number of extra commits/changes in -tip fluctuates: it typically ranges from a few dozen up to about a thousand, depending on where we are in the development cycle.

Right now we are in the first few days of the v2.6.37 merge window and Linus pulled most of our pending trees already in the past two days, so -tip contains small fixes only. While v2.6.37 is being releasified in the next ~2.5 months, -tip will fill up again with development commits geared towards v2.6.38 - and we will also keep merging back Linus's latest tree - and so the cycle continues.

Thanks,

Ingo

Comment Re:IO scheduler != CPU scheduler (Score 3, Informative) 472

(1) As soon as RAM is exhausted and the kernel starts swapping out to disk, the desktop experience is severely impacted (and immediately so). [...]

Right. If a desktop starts swapping seriously then it's usually game over, interactivity wise. Typical desktop apps produce so much new dirty data that it's not funny if even a small portion of it has to hit disk (and has to be read back from disk) periodically.

But please note that truly heavy swapping is actually a pretty rare event. The typical cause of desktop slowdowns isn't deadly swap-thrashing per se, but one of two scenarios:

1) dirty threshold throttling: when an app fills up enough RAM with dirty data (data which has to be written to disk sooner or later), the kernel first starts a 'gentle' (background, async) writeback, and then, when a second limit is exceeded, starts a less gentle (throttling, synchronous) writeback. The defaults are 10% and 20% of RAM, and you can set them via /proc/sys/vm/dirty_background_ratio and /proc/sys/vm/dirty_ratio. To see whether you are affected by this phenomenon you can try much more aggressive values like:

  echo 1 > /proc/sys/vm/dirty_background_ratio
  echo 90 > /proc/sys/vm/dirty_ratio

These settings make async writeback kick in ASAP (the disk can write back in the background just fine) but set the 'aggressive throttling' limit really high. This tuning might make your desktop magically faster. It may also cause really long delays if you do hit the 90% limit via some excessively dirtying app (but that's rare).

2) fsync delays. A handful of key apps such as Firefox use periodic fsync() syscalls to ensure that data has been saved to disk - and rightfully so. Linux fsync() performance used to be pretty dismal (the fsync had to wait for a really long time behind random writers to the disk, delaying Firefox all the while) and went through a number of improvements. If you have v2.6.36 and ext3 then it should all be pretty good.
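
The pattern these apps rely on is essentially this (a minimal sketch):

  #include <unistd.h>

  /* write() only dirties the pagecache; fsync() then blocks until the
     data is really on disk - if fsync is slow, the app visibly stalls */
  static int durable_write(int fd, const void *buf, size_t len)
  {
      if (write(fd, buf, len) != (ssize_t)len)
          return -1;
      return fsync(fd);           /* the potentially long wait */
  }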

I think a fair chunk of the "/bin/cp /from/large.iso /to/large.iso" problem could be eliminated if cp (and dd) helped the kernel and dropped the page-cache on large copies via fadvise/madvise. Linux really defaults to the most optimistic assumption: that apps are good citizens and will dirty only as much RAM as they need. Thus the kernel will generally allow apps to dirty a fair amount of RAM, before it starts throttling them.

VM and caching heuristics are tricky here - an app or DB startup sequence can produce very similar patterns of file access and IO when it warms up its cache. In that case it would be absolutely lethal to performance to drop pagecache contents and to sync them out aggressively.

If the cp app did something as simple as explicitly dropping the page-cache via the fadvise/madvise system calls, then a lot of user-side grief could be avoided, i suspect. DVD and CD burning apps are already rather careful about their pagecache footprint.

But if you have a good testcase you should contact the VM and IO developers on linux-kernel@vger.kernel.org - we all want Linux desktops to perform well. (Server workloads are much easier to handle in general, and are secondary in this respect.) We have various good tools that allow more than enough data to be captured to figure out where delays come from (blktrace, ftrace, perf, etc.) - we need more reporters and more testers.

Thanks,

Ingo

Comment Re:what about servers? (Score 5, Informative) 472

I think the Phoronix article you linked to is confusing the IO scheduler and the VM (both of which can cause many seconds of unwanted delays during GUI operations) with the CPU scheduler.

The CPU scheduler patch referenced in the Phoronix article deals with delays experienced during high CPU loads - a dozen or more tasks running at once and all burning CPU time actively. Delays of up to 45 milliseconds were reported and they were fixed to be as low as 29 milliseconds.

Also, that scheduler fix is not a v2.6.37 item: i have merged a slightly different version and sent it to Linus, so it is included in v2.6.36 already - you can see the commit here.

If you are seeing human-perceptible delays - especially on the 'several seconds' time scale - then they are quite likely not related to the CPU scheduler (unless you are running some extreme workload) but rather to the CFQ IO scheduler or to the VM cache management policies.

In the CPU scheduler we usually deal with milliseconds-level delays and unfairnesses - which rarely rise to the level of human perception.

Sometimes, if you are really sensitive to smooth scheduling, you can see those kinds of effects visually, via 'game smoothness' or perhaps 'Firefox scrolling smoothness' - but anything on the 'several seconds' timescale on a typical Linux desktop has to have some connection with IO.

Thanks,

Ingo

Comment Re:Is it really only a matter of scheduling? (Score 5, Informative) 472

Yes. There is another problem at play here: cp reads in the whole (big) file and then writes it out. This brings the whole file into the Linux pagecache (file cache).

If the VM does not correctly detect that linear copy, it can blow a lot of useful app data (all cached) out of the pagecache. That data in turn has to be read back in once you click within Firefox, etc. - which generates IO and is a few orders of magnitude slower than reading a cached copy. The fact that such data tends to be fragmented (scattered around the disk in various small files), and that a large copy is going on at the same time, does not help either.

Catastrophic slowdowns on the desktop are typically such combined 'perfect storms' between multiple kernel subsystems. (For that reason they also tend to be the hardest ones to fix.)

It would be useful if /bin/cp explicitly dropped use-once data that it reads into the pagecache - there are syscalls for that.
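
(The main one is posix_fadvise() - a minimal sketch, assuming cp issued it once it is done with a file:)

  /* hypothetical: offset 0, length 0 means 'to the end of the file';
     the kernel may then evict these pages from the pagecache first */
  posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);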

And yes, we'd very much like to fix such slowdowns via heuristics as well (detecting large sequential IO and not letting it poison the existing cache), so good bugreports and reproducing testcases sent to linux-kernel@vger.kernel.org and people willing to try out experimental kernel patches would definitely be welcome.

Thanks,

Ingo

Comment IO scheduler != CPU scheduler (Score 5, Insightful) 472

FYI, the IO scheduler and the CPU scheduler are two completely different beasts.

The IO scheduler lives in block/cfq-iosched.c and is maintained by Jens Axboe, while the CPU scheduler lives in kernel/sched*.c and is maintained by Peter Zijlstra and myself.

The CPU scheduler decides the order in which application code is executed on CPUs (and because a CPU can run only one app at a time, the scheduler switches between apps back and forth quickly, giving the grand illusion of all apps running at once) - while the IO scheduler decides how IO requests (issued by apps) reading from (or writing to) disks are ordered.

The two schedulers are very different in nature, but both can indeed cause similar-looking bad symptoms on the desktop - which is one of the reasons people keep mixing them up.

If you see problems while copying big files then there's a fair chance that it's an IO scheduler problem (ionice might help you there, or block cgroups).

I'd like to note for the sake of completeness that the two kinds of symptoms are not always totally separate: sometimes problems during IO workloads were caused by the CPU scheduler. It's relatively rare though.

Analysing (and fixing ;-) such problems is generally a difficult task. You should mail your bug description to linux-kernel@vger.kernel.org and you will probably be asked there to perform a trace so that we can see where the delays are coming from.

On a related note i think one could make a fairly strong argument that there should be more coupling between the IO scheduler and the CPU scheduler, to help common desktop usecases.

Incidentally, there is a fairly recent feature submission by Mike Galbraith that extends the (CPU) scheduler with the ability to group tasks more intelligently: see Mike's auto-group scheduler patch.

This feature uses cgroups for block IO requests as well.

You might want to give it a try, it might improve your large-copy workload latencies significantly. Please mail bug (or success) reports to Mike, Peter or me.

You need to apply the above patch on top of Linus's very latest tree, or on top of the scheduler development tree (which includes Linus's latest), which can be found in the -tip tree.

(Continuing this discussion over email is probably more efficient.)

Thanks,

Ingo

Censorship

Submission + - Montana town cancels Peace Prize winners' speech (missoulian.com)

jkgraham writes: "In order to avoid "controversy," Choteau, MT superintendent Kevin St. John canceled the speech of Steve Running, an ecology professor and global climate scientist at the University of Montana who shared the Nobel Peace Prize for his work on global warming. It looks like, in trying to appease a few local wing-nuts, St. John has elevated the "controversy" to a regional level and possibly higher."
The Internet

Submission + - Wikia Search Engine to be Launched on January 7th 1

cagnol writes: The Washington Post reports that Jimmy Wales, the founder of online encyclopaedia Wikipedia, has announced the launch of a new open-source search engine, Wikia Search, on January 7th, 2008. The project will allow the community to help rank search results, in a model close to Wikipedia's. However, the company is a for-profit organization. The new search engine is supposed to challenge Google and Yahoo.
Upgrades

Best Motherboards With Large RAM Capacity? 161

cortex writes "I routinely need to analyze large datasets (principally using Matlab). I recently 'upgraded' to 64-bit Vista so that I can access larger amounts of RAM. I know that various Linux distros have had 64-bit support for years. I also typically use Intel motherboards for their reliability, but currently Intel's desktop motherboards only support 8GB of RAM and their server motherboards are too expensive. Can anyone relate their experiences with working with Vista or Linux machines running with large RAM (>8GB)? What is the best motherboard (Intel or AMD) and OS combination for workstation applications in terms of cost and reliability?"
Encryption

Submission + - The NSA vs rigged Crypto AG encryption machines? (ohmynews.com)

AHuxley writes: Did the NSA really use its supercomputers to read ultra-sensitive messages intercepted from around the world?
Or did they get inside Crypto AG, a Swiss company that sold encryption machines to more than 100 countries?
Were algorithms swapped out for the NSA in the 1970s, during the transition from mechanical to electronic machines?
How did Iran react when they found encrypted diplomatic messages in the press?

Censorship

Submission + - Wikipedia COO was Convicted Felon (theregister.co.uk) 4

An anonymous reader writes: From the Register:

"For more than six months, beginning in January of this year, Wikipedia's million-dollar check book was balanced by a convicted felon. When Carolyn Bothwell Doran was hired as the Chief Operating Officer (COO) of the Florida-based Wikimedia Foundation, she had a criminal record in three other states — Virginia, Maryland, and Texas — and she was still on parole for a DUI (driving under the influence of alcohol) hit and run that resulted in a fatality. Her record also included convictions for passing bad checks, theft, petty larceny, additional DUIs, and unlawfully wounding her boyfriend with a gun shot to the chest."

Businesses

Submission + - The Epic Battle between Microsoft and Google 1

Hugh Pickens writes: "There is a long article in the NYTimes, well worth reading, called "Google Gets Ready to Rumble With Microsoft," about the business strategies both companies are pursuing and about the future of applications and where they will reside — on the web or on the desktop. Google CEO Eric Schmidt thinks that 90 percent of computing will eventually reside in the Web-based cloud, and about 2,000 companies are signing up every day for Google Apps, simpler versions of the pricey programs that make up Microsoft's lucrative Office business. Microsoft faces a business quandary as it tries to link the Web to its desktop business — "software plus Internet services," in its formulation. Microsoft will embrace the Web, while striving to maintain the revenue and profits from its desktop software businesses, the corporate gold mine — a smart strategy for now that may not be sustainable. Google faces competition from Microsoft and from other Web-based productivity software being offered by start-ups, but it is "unclear at this point whether Google will be able to capitalize on the trends that it's accelerating." David B. Yoffie, a professor at the Harvard Business School, says the Google model is to try to change all the rules. If Google succeeds, "a lot of the value that Microsoft provides today is potentially obsolete.""
PlayStation (Games)

Submission + - PS3 Sales Jump in US on Heels of Price Cut

tighr writes: The PS3 has finally started to gain ground in the latest incarnation of the console wars, though it is still selling fewer units than the aging PlayStation 2.

The higher sales are good news for Sony, which has been running in third place in the console battle in the U.S. In October, 121,000 PlayStation 3 consoles were shipped, according to estimates from NPD Group. That ranks it lower than the seven-year-old PlayStation 2, which shipped 184,000 units in the month. The market-leading Nintendo Wii shipped 519,000 units and Microsoft's Xbox 360 shipped 366,000 units, said NPD.
Meanwhile, the Nintendo Wii has pulled even with the Xbox 360 in worldwide sales:

The Wii, which sells for $250 and features a motion-sensitive controller, sold 13.2 million units worldwide as of September, Nintendo said. Microsoft reported that the Xbox 360 — in models priced from $280 to $450 — had sold 13.4 million units at the time. Then, in October, U.S. sales of the Wii exceeded Xbox 360 sales, according to the NPD Group. Combined with the Nintendo console's strength in the Japanese market, that effectively would bring the two into a dead heat in cumulative sales.
Portables

Submission + - N810 Internet Tablet Now Available

Roostersauce writes: The Nokia N810 Internet Tablet is now available online. According to Tom Keating at tmcnet.com, the tablet can be bought at these retailers: Best Buy Mobile, CompUSA, Micro Center, and Nokia stores in New York and Chicago. A quick check of the Best Buy Mobile, CompUSA and Micro Center web sites does not yet show a listing for the device. Nokia's web site shows the device as on sale but out of stock. Link: http://blog.tmcnet.com/blog/tom-keating/gadgets/nokia-n810-internet-tablet-woohoo.asp
Real Time Strategy (Games)

Submission + - Other Uncommon Treatments For Tinnitus (announced.us)

cattall writes: "There are numerous practical methods of treating this condition. Still, it is interesting to discover some of the uncommon tinnitus remedies. Ringing in the ears is a fairly common affliction. For every conventional, normal therapy applicable to patients who experience ear ringing and other symptoms of tinnitus, there are also many weird and peculiar treatments. The interesting thing is that many of them even seem to work."
