So I know some people may read this and think "haha, funny joke" but given that most users are extremely predictable regarding what programs they use and when and how they use them (same with web browsing), shouldnt it be possible to gather user activity over time and analyze it to help improve scheduling.
Yeah, that's certainly a possibility.
This is also the goal of most heuristics in the kernel: to figure out a hidden piece of information that the application (and user) has not passed to the kernel explicitly.
The problem comes when the kernel gets it wrong - the kernel and applications can easily get into a feedback loop / arms race of who knows how to trick the other one into doing what the app writer (or kernel writer) thinks is best. In such cases we get the worst of both worlds: we get the bad case and we get the cost of heuristics.
(Heuristic and predictive systems also tend to be complex and hard to analyze: you can rarely reproduce bugs without having the exact same filesystem layout and usage pattern as the user experienced, etc.)
What we found is that in terms of default behavior it's a bit better to keep things simple and predictable/deterministic and then give apps the way to inject extra information into the kernel. We have the fadvise/madvise calls which can be used with the POSIX_FADV_DONTNEED flag to drop cached content from the page cache.
Heuristics and predictive techniques are done when we can be reasonably sure that we get the decisions right: for example there's a piece of fairly advanced code in the Linux page cache trying to figure out whether to pre-fetch data or not.
The large file copy interactivity problems some have mentioned here were most likely real kernel bugs (in the filesystem, IO scheduling and VM subsystems) and were hopefully fixed in the v2.6.33 - v2.6.36 timeframe.
If you can still reproduce any such problems then please report them to linux-kernel@vger.kernel.org so we can fix it ASAP.
In any case, we could all be wrong about it, so if you have a good implementation of more aggressive predictive algorithms i'm sure a lot of people would try them out - me included. We kernel developers want a better desktop just as much as you want it.