irinotecan - Slashdot User

Comment Re:easy solution: (Score 2, Informative) 472

by Ingo Molnar on Sunday October 24, 2010 @09:32AM (#34003482) Attached to: The State of Linux IO Scheduling For the Desktop?

The trouble is that in server workloads you generally don't see ONE LARGE I/O operation - lots of small ones instead. There are very very few server workloads that involve transferring >100MB data at a time (even when it comes to DB snapshoting).

There's lots of server workloads that involve large IO requests:

- backups
- DB startup/shutdown
- DB traffic that generates or reads a lot of new data (say report generation)
- HPC workloads that work with huge data sets
- animation farms that work with huge images/movies
- web servers streaming out big files
- fsck
- virtual desktop servers where the desktops are virtual instances running on the server. There any IO load within that 'desktop' runs on the server.

etc. As there is a fair number of server workloads that are IO heavy but which use small IO requests.

On the desktop this is common (all your AVI files).

If you have those big files in networked storage or if you are backing them up to some network host then you've already transformed those kinds of IO requests into big IO requests on the server side as well: the big file you read or write on the desktop the network file/backup server will read/write from its own disks, etc.

Really, "interactivity sucks during big IO" kind of bugs can hurt servers just as much as they can hurt desktops. The boundary between desktops and servers is very fluid.

Comment Re:what about servers? (Score 2, Informative) 472

by Ingo Molnar on Sunday October 24, 2010 @09:17AM (#34003420) Attached to: The State of Linux IO Scheduling For the Desktop?

Yeah, that's what the discussion was about - we improved that particular case, see this commit (which can be found in v2.6.36), and Phoronix reported about that upstream fix.

Thanks,

Ingo

Comment Re:It sucks I agree (Score 4, Interesting) 472

by Ingo Molnar on Sunday October 24, 2010 @06:50AM (#34002790) Attached to: The State of Linux IO Scheduling For the Desktop?

There's also the VM fix from Wu Fengguang, included in v2.6.36, which addresses similar "slowdown while copying large amounts of data" bugs.

There were about a dozen kernel bugs causing similar symptoms, which we fixed over the course of several kernel releases. They were almost evenly spread out between filesystem code, the VM and the IO scheduler. And yes, i agree that it took too long to acknowledge and address them - these problems have been going on for several years. It's a serious kernel development process failure.

If anyone here still experiences bad desktop stalls while handling big files with v2.6.36 too then we'd appreciate a quick bug report sent to linux-kernel@vger.kernel.org.

Thanks,

Ingo

Comment Re:Is it really only a matter of scheduling? (Score 2, Interesting) 472

by Ingo Molnar on Sunday October 24, 2010 @05:24AM (#34002478) Attached to: The State of Linux IO Scheduling For the Desktop?

So I know some people may read this and think "haha, funny joke" but given that most users are extremely predictable regarding what programs they use and when and how they use them (same with web browsing), shouldnt it be possible to gather user activity over time and analyze it to help improve scheduling.

Yeah, that's certainly a possibility.

This is also the goal of most heuristics in the kernel: to figure out a hidden piece of information that the application (and user) has not passed to the kernel explicitly.

The problem comes when the kernel gets it wrong - the kernel and applications can easily get into a feedback loop / arms race of who knows how to trick the other one into doing what the app writer (or kernel writer) thinks is best. In such cases we get the worst of both worlds: we get the bad case and we get the cost of heuristics.

(Heuristic and predictive systems also tend to be complex and hard to analyze: you can rarely reproduce bugs without having the exact same filesystem layout and usage pattern as the user experienced, etc.)

What we found is that in terms of default behavior it's a bit better to keep things simple and predictable/deterministic and then give apps the way to inject extra information into the kernel. We have the fadvise/madvise calls which can be used with the POSIX_FADV_DONTNEED flag to drop cached content from the page cache.

Heuristics and predictive techniques are done when we can be reasonably sure that we get the decisions right: for example there's a piece of fairly advanced code in the Linux page cache trying to figure out whether to pre-fetch data or not.

The large file copy interactivity problems some have mentioned here were most likely real kernel bugs (in the filesystem, IO scheduling and VM subsystems) and were hopefully fixed in the v2.6.33 - v2.6.36 timeframe.

If you can still reproduce any such problems then please report them to linux-kernel@vger.kernel.org so we can fix it ASAP.

In any case, we could all be wrong about it, so if you have a good implementation of more aggressive predictive algorithms i'm sure a lot of people would try them out - me included. We kernel developers want a better desktop just as much as you want it.

Comment Re:It sucks I agree (Score 4, Informative) 472

by Ingo Molnar on Sunday October 24, 2010 @05:02AM (#34002440) Attached to: The State of Linux IO Scheduling For the Desktop?

Such drastic change! I have seen this happen on numerous systems and I just change the elevator to "deadline" and poof! The problem is gone. See this discussion for some details. The CFQ scheduler is great for a Linux server running a database, but it completely sucks for desktop or any server used to write large files to.

I see that the bug entry you referred to contains measurements from early 2010, at which point Ubuntu was using v2.6.31-ish kernels IIRC. (and that's the kernel version that is being referred to in the bug entry as well.)

A lot of work has gone into this area in the past 1-2 years, and v2.6.33 is the first kernel version where you should see the improvements. Slashdot reported on that controversy as well.

If you can still reproduce interactivity problems during large file copies with CFQ on v2.6.36 (and it goes away when you switch the IO scheduler to deadline), please report it to linux-kernel@vger.kernel.org so that it can be fixed ASAP. (You can also mail me directly, i'll forward it to the right list and people.)

Thanks,

Ingo

Comment Re:what about servers? (Score 3, Informative) 472

by Ingo Molnar on Sunday October 24, 2010 @04:52AM (#34002394) Attached to: The State of Linux IO Scheduling For the Desktop?

Ingo, I find delays of 29-45ms to be pretty noticeable. To put it another way, if you had a delay of 10ms before, and you're now getting a delay of 50ms due to some background copy, all of your applications went from running at 100fps to 20fps, which I think even non-sensitive people can pick up on, even outside of games and smooth scrolling. VIM feels different over a 10ms LAN connection vs. a 45ms connection from my home.

Yes i agree with you that if a 45 msecs latency happens on every frame then that will snowball and will thoroughly ruin game interactivity - but note the specific context here:

you can see the commit referenced by Phoronix here

(hm, my first link above was broken, sorry about that.)

Those 45 msec delays were statistical-max outliners - with the average latency at 7.3 msecs. This got cut down to 25 msecs / 6.6 msecs respectively via the patch. Note that it's also a specific, CPU overloaded workload that was measured here, so not typical of the desktop unless you are a developer running make -j build jobs.

We care about optimizing maximum latencies because those are what can cause occasional hickups on the desktop - a lagging mouse pointer - or some other non-smooth visual artifact.

Thanks,

Ingo

Comment Re:what about servers? (Score 4, Informative) 472

by Ingo Molnar on Sunday October 24, 2010 @04:38AM (#34002338) Attached to: The State of Linux IO Scheduling For the Desktop?

Sorry dude, it looks like it's a hardware specific problem. I did that on nearly 700G of large files and then fired up the flight sim while it was still going. The only slow down was on file related activity, which is totally what you'd expect. I had it running full screen across two monitors without any drop in frame rate. AND I'm using economy hardware.

It may also be kernel version dependent - with older kernels still showing this bug.

A lot of work has gone into the Linux kernel in the past 2 years to improve this area - and yes, i think much of the criticism from those who have met this bug and were annoyed by it was fundamentally justified - this bug was real and it should have been fixed sooner.

Kernels post v2.6.33 ought to be much better - with v2.6.36 bringing another set of improvements in this area. The fixes were all over the place: IO scheduler, VM and filesystem code and few of them were simple.

This Slashdot article from 1.5 years ago shows when more attention was raised to this category of Linux interactivity bugs.

Thanks,

Ingo

Comment Re:easy solution: (Score 2, Insightful) 472

by Ingo Molnar on Saturday October 23, 2010 @09:34PM (#34000816) Attached to: The State of Linux IO Scheduling For the Desktop?

No, massive unfairness is just as bad on the server as it is on the desktop - in all but a few select batch processing situations.

Replace 'desktop' with 'database', 'Apache', 'Samba' or 'number crunching job' and you get the same kind of badness.

There's not much difference really. If it sucks on the desktop then it sucks on the server too: why would it be a good server if it slows down a DB/Apache/Samba/number-crunching-job while prioritizing some large copy operation?

Comment Re:AS I/O scheduler was removed in 2.6.33 (Score 2, Informative) 472

by Ingo Molnar on Saturday October 23, 2010 @08:41PM (#34000552) Attached to: The State of Linux IO Scheduling For the Desktop?

You are right, deadline is the other (much smaller/simpler) one - CFQ is the main IO scheduler remaining.

You can still test AS by going back to an older kernel - and as long as it's a performance regression that is reported (relative to that old kernel, running AS), it should not be ignored on lkml.

Thanks,

Ingo

Comment Re:Is it really only a matter of scheduling? (Score 1) 472

by Ingo Molnar on Saturday October 23, 2010 @06:57PM (#33999834) Attached to: The State of Linux IO Scheduling For the Desktop?

It would be useful if /bin/cp explicitly dropped use-once data that it reads into the pagecache - there are syscalls for that.

Other than opening the files O_DIRECT, how would you do that?

No need for O_DIRECT (which might not even work everywhere), /bin/cp could use fadvise/madvise with some size heuristics. (say if a file is larger than RAM and will be copied fully then it cannot be reasonably cached)

POSIX_FADV_DONTNEED should do the trick in terms of pagecache footprint - it will invalidate the page-cache.

Thanks,

Ingo

Comment Re:Perhaps if Con Kolivas named his scheduler .. (Score 4, Informative) 472

by Ingo Molnar on Saturday October 23, 2010 @06:52PM (#33999806) Attached to: The State of Linux IO Scheduling For the Desktop?

He tried that before. I think he's given up on getting his scheduler (though perhaps not a suspiciously similar one written by Inigo) in the kernel after what happened with CFQ.

One reason for why the principle of CFS may seem to you so suspiciously similar to Con's SD scheduler is that i used Con's fair scheduling principle when writing the initial version of CFS. This is credited at the very top of today's kernel/sched.c [the scheduler code]:
* 2007-04-15 Work begun on replacing all interactivity tuning with a * fair scheduling design by Con Kolivas.

It was added in this commit.

The scheduler implementations (and even the user visible behavior) of the schedulers was and is very different - and there is where much of the disagreement and later flaming came from.

Note that this particular Slashdot article is about IO scheduling though - which is unrelated to CPU schedulers. Neither Con nor i wrote IO schedulers.

There are two main IO schedulers in Linux right now: CFQ and AS, written by Jens Axboe, Nick Piggin, et al.

What adds fuel to the confusion is that it is relatively easy to mix up 'CFQ' with 'CFS'.

Thanks,

Ingo

Comment Re:IO scheduler != CPU scheduler (Score 1) 472

by Ingo Molnar on Saturday October 23, 2010 @06:27PM (#33999646) Attached to: The State of Linux IO Scheduling For the Desktop?

The problem is in the IO scheduler. Switching from CFQ to AS minimizes the problem. It takes less than 5 seconds with google to see how wide spread the problem is. Lots of people squawking about it. CFQ is pure crap in my experience. Copying a couple gigabytes of data should not render all my applications unusable for five minutes.

Probably uninformed, but someone actually claimed that CFQ is designed to be tweaked with ionice. Yeah, a modern OS should require people to manually ionice every time they do a file copy!

It would be ideal for the schedulers and window manager to communicate and give priority to the foreground application.

Please help us resolve this issue: please post your experiences to linux-kernel@vger.kernel.org and Cc: the following gents:
jaxboe@fusionio.com, torvalds@linux-foundation.org, akpm@linux-foundation.org, a.p.zijlstra@chello.nl, mingo@elte.hu.

It would be nice if you could attach latencytop numbers for CFQ and for AS, using the same workload. Latencytop will measure the worst-case delays you suffered - so you can demonstrate the CFQ versus AS effect numerically.

Thanks,

Ingo

Comment Re:IO scheduler != CPU scheduler (Score 2, Informative) 472

by Ingo Molnar on Saturday October 23, 2010 @06:20PM (#33999602) Attached to: The State of Linux IO Scheduling For the Desktop?

Ingo,

I believe most desktop users run into this problem when they complain about IO schedulers. Is there any immediate plan to address it?

Thanks,

Jason

Regarding plans you need to ask the VM and IO folks (Andrew Morton, Jens Axboe, Linus, et al).

Regarding that bugzilla entry, there's this suggestion in one of the comments:

echo 10 > /proc/sys/vm/vfs_cache_pressure
echo 4096 > /sys/block/sda/queue/nr_requests
echo 4096 > /sys/block/sda/queue/read_ahead_kb
echo 100 > /proc/sys/vm/swappiness
echo 0 > /proc/sys/vm/dirty_ratio
echo 0 > /proc/sys/vm/dirty_background_ratio

or use "sync" fs-mount option.

If you can reproduce that problem with a new kernel (v2.6.36 would be ideal) then please try to describe the symptoms in a mail to linux-kernel@vger.kernel.org, and also point out whether the tunings above improved things. Please Cc: Jens, Andrew, me and Linus as well.

To turn interactivity woes on the desktop into actual hard numbers you can use Arjan van de Ven's latencytop tool. It will measure your worst-case delays with and without big copies being done in the background, which numbers you can cite in your email.

Thanks,

Ingo

Comment Re:easy solution: (Score 3, Interesting) 472

by Ingo Molnar on Saturday October 23, 2010 @06:10PM (#33999548) Attached to: The State of Linux IO Scheduling For the Desktop?

That's great that you post your experiences with server scheduling in a topic about desktop scheduling. It's so relevant. No wait, it's not.

The boundary between the desktop space and the server space is rather fluid, and many of the problems visible on servers are also visible on desktops - and vice versa.

For example 'copying a large amount of data' on a server is similar to 'copying a big ISO on the desktop'. If the kernel sucks doing one then it will likely suck when doing the other as well.

So both cases should be handled by the kernel in an excellent fashion - with an optimization/tuning focus on desktop workloads, because they are almost always the more diverse ones, and hence are generally the technically more challenging cases as well.

Thanks,

Ingo

Comment Re:IO scheduler != CPU scheduler (Score 1) 472

by Ingo Molnar on Saturday October 23, 2010 @06:01PM (#33999482) Attached to: The State of Linux IO Scheduling For the Desktop?

Why does fsync not synchronously flush out *only* the data dirtied by the current process, rather than all buffers on all file systems dirtied by any process?

That's the intention and that's even how it's coded - but for example ext3 had a bug/misfeature that caused this operation to serialize on all pending writes to the same filesystem's journal area in some pretty common scenarios - with disastrous results to interactivity.

There was a big flamewar about it on lkml a year or two ago, and in that discussion Linus declared that it is a very high priority item to fix this, and that desktop interactivity is our main optimization focus. (IIRC the flamewar was big enough to make it to Slashdot - cannot find the link right now.)

The fsync/fdatasync performance problem was fixed/resolved shortly after that.

Kernels after v2.6.32 (and certainly the latest v2.6.36 kernel) should be much better in that area.

It seems like a bad idea for a non-root thread to have so much power over how smoothly the rest of the system runs.

Yes, indeed.

In terms of isolation guarantees, block cgroups (control groups) should be a feature for more formal isolation of one user from another. AFAIK Android puts each app into a separate user and into separate cgroups. So it's not impossible to design the user-space side properly. It was written a server feature originally, but i think it's very useful on the desktop as well.

Thanks,

Ingo

Slashdot Top Deals