Slashdot
RH7 Crashes In Three Weeks (But Fixed)
Posted by
timothy
on Wed Oct 11, 2000 12:01 PM
from the gulp! dept.
Herz writes: "I got this email today from Red Hat. RH7 will crash out of the box in 3 weeks! The new Update Agent provided with Red Hat Linux 7.0 contains a daemon, rhnsd, which periodically polls Red Hat Network for updates. This daemon leaks file descriptors. On a default installation, all available file descriptors will be used by rhnsd in approximately three weeks, making the system unusable." The Red Hat folks have also provided a fix, though -- updated packages for those who want to use their update network, and the two-line method of disabling per machine for those who don't. After all, everyone wants uptime > 3 weeks, eh? And you don't need to wait for a "service pack," either.
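For readers who have not run into a descriptor leak before, here is a minimal sketch of the general failure mode described above. It is not rhnsd's code (the source of the daemon is not quoted in the story); `poll_leaky` and `poll_fixed` are hypothetical names, and `/dev/null` stands in for whatever the daemon actually opens on each poll.

```python
import os


def poll_leaky(path, cycles):
    """Simulate a polling loop that opens a descriptor each cycle and never closes it."""
    leaked = []
    for _ in range(cycles):
        fd = os.open(path, os.O_RDONLY)  # acquired
        os.read(fd, 1)                   # used
        leaked.append(fd)                # never closed; kept only so the demo can clean up
    return leaked


def poll_fixed(path, cycles):
    """Same loop, but each descriptor is released before the next cycle."""
    seen = []
    for _ in range(cycles):
        fd = os.open(path, os.O_RDONLY)
        try:
            os.read(fd, 1)
            seen.append(fd)
        finally:
            os.close(fd)                 # always released, even if the read fails
    return seen


if __name__ == "__main__":
    leaked = poll_leaky(os.devnull, 5)
    print("leaky loop holds", len(set(leaked)), "distinct descriptors")
    for fd in leaked:
        os.close(fd)
    reused = poll_fixed(os.devnull, 5)
    print("fixed loop used", len(set(reused)), "distinct descriptor number(s)")
```

In the leaky version every cycle consumes one more descriptor, so a long-running daemon eventually hits whatever limit applies; in the fixed version the same descriptor number is simply reused each time.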
This discussion has been archived.
No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
Kind of like... (Score:5)
They never introduced a fix... the sheer idea of running win95 for 43 days was silly, even to MS.
Linux is targeting Windows (Score:4)
Re:Kind of like... (Score:3)
That was supposed to be funny, laugh dammit.
--
/. ate my comment! (Score:5)
Anyway, my whole "-1, Flamebait" comment was:
Are you installing RH7 on production machines the day it comes out? Are you INSANE? Look, it's a bug. They have a fix. So patch the TEST MACHINES you're running RH7 on, so you can work out the bugs, migration path, and errata, and get on with your life! You ARE running this on test machines, right? You are planning a migration to RH7, not just popping the CD into your mission-critical servers, right? You are following good sysadmin practices, right?
Just because they rushed the release doesn't mean you have to take it. Take your time and be smart.
Re:Politics (Score:5)
Now with regard to the bug, I think the obvious fix is to simply kill -9 rhnsd. There ya go, bug fixed. Yes, it's a serious bug, but it's hardly a service that any production server needs, so it's a non-issue in my mind. If you are running a serious server you are probably not going to let the software update itself. You are going to get it up, apply any security patches that come out, and lock it in a closet somewhere. The "idea" that you must be running the most current version of software is a marketing ploy (which MS does very well) and is hogwash. If you have software that meets your needs and is stable and secure, you certainly don't want to screw it up by randomly updating it.
I think it was poor of RH not to actually test this properly, but I also understand that this is partly just the nature of the beast. They feel that they must move forward at a fast pace and this is the result.
Their "quick fix" also has a bug :) (Score:3)
But of course it should be
This doesn't exactly help improving the impression of their
Re:Politics (Score:5)
Redhat dominates the Linux market. This affects a LOT of
As well, I think politically it's probably a good idea to be public about this kind of bug. Linux has a rep of being extremely reliable. I, for one, would like to keep it that way, and bugs that affect reliability thus NEED TO BE very embarrassing events. Trying to suppress this kind of news may make Linux APPEAR more reliable but actually BE less reliable -- a lose-lose situation for sure.
After all, if Sendmail suddenly started crashing every two weeks, the community would be justifiably furious about it. I don't think it's unreasonable to hold Redhat to a similar standard. They have an enormous advantage over Microsoft by packaging all the Open Source stuff instead of writing it themselves. Seems to me that expecting really good QA on their internally-written software is quite reasonable.
You can bet that if Microsoft had released Win2K with a bug that took it down after two weeks it would have made national news. And Slashdot.
Re:Serious teething pains (Score:5)
Most of the posters who say they actually use RH 7 seem quite happy with it, noting that it is even more stable than RH 5.0 or 6.0 ever were. Most of the bad press on
So, chances are that you should trust /. a little less and learn from your own experience by trying it... In my experience, it is better than all previous RH releases; the way it should be.
Childish (Score:4)
You can't do that standing on such shaky ground. One could argue that it _is_ a service pack, or point out that MS does usually release patches to serious problems within a week as well as rolling them up into a service pack.
Re:Kind of like... (Score:4)
And it was 49.7 days (the time it takes for a millisecond timer to overflow a 32-bit unsigned integer).
It was fixed in one of the service packs.
See this MS KB entry for details [microsoft.com].
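The 49.7-day figure above can be checked with a few lines of arithmetic. This is only a sanity check on the number; `days_until_wrap` is a made-up helper name, and it assumes a counter that starts at zero and ticks once per millisecond.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def days_until_wrap(bits=32, ticks_per_second=1000):
    """Days until an unsigned counter of the given width wraps back to zero."""
    total_ticks = 2 ** bits            # 4,294,967,296 for 32 bits
    return total_ticks / ticks_per_second / SECONDS_PER_DAY


print(round(days_until_wrap(), 2))  # 49.71
```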
Re:Why is that? (Score:4)
They never introduced a fix... the sheer idea of running win95 for 43 days was silly, even to MS.
Why was that? I personally like to leave my computer on; it's better for the electrical connections within the machine and parts, due to thermal expansion/contraction.
Better for the "electrical connections within the machine"... Uhhh, okay.
Actually, it's just an expansion-contraction issue within the ICs, in particular. And the hard disk drive, landing the heads every time you shut down (but this is the same as if you leave the power management on). Cheap power supplies can sometimes cause voltage spikes as they turn on; if you buy a good one, the voltages all come up to their regulated levels and then the Power_Good line is pulled high and the motherboard is reset.
So, if you have a good quality system, you probably won't have any problems with the wear of turning your machine on and off in reasonable usage until after the machine is obsolete.
Compare this to the higher power bills, risks of fans dying and overheating that conservatively overclocked processor, as well as more potential uptime for a thunderstorm to kill it, and I feel it's probably wise to shut off the computer when you're not using it. Of course, that's discretion. Do you turn off the computer when you leave the office for lunch? Nah. For the weekend? For sure. Overnight? I do.
I do speak with some authority here; while I'm not an electrical engineer, I have several years of experience design engineering critical radar systems for Litton [litton.com]. I also used to write electronics design and construction columns for Popular Electronics magazine.
As for Windows 9x/ME, it's only under controlled laboratory conditions that you can make a Windows box run long enough to see that bug. I've managed to see the 49.7 day bug once; and with the M$ fix, I've seen a record uptime of 103 days with Windows 95B OSR2. Windows 3.1/DOS, I've managed to keep running for months at a time.
ouch, this has to hurt.... (Score:3)
In an ideal situation, every programmer will look at the source code and contribute to the effort of the open project. Most people (like myself) are free-riders who have no ability to program. So as idealistically sound as open source may seem, there are certain issues to worry about.
In RH's case, at least they pay their workers, which means that they are more willing to do the dirty work of fixing bugs in others' code (in theory). Still, cases like this cast more doubt on the "Linux for the business" credibility, since more non-techies seem to equate Linux with RedHat. Almost everyone seems to understand that any RH x.0 distro is pretty much in an experimental state and must not be used on production servers. This, however, makes the operating system appear "buggy" and "not production-quality" to the uninformed, hence I wish they would take more pride in their distribution instead of "hey, we had that packaged into ours first!" I honestly hope the comments about RH's similarity to MS in their tactics are true only on the surface. Unlike MS (whose operating system is proprietary), RH simply has their own distribution of an open-sourced OS. If you choose not to use their distro, you have enough other choices: e.g. Debian, Mandrake, Slackware, etc.
Re:Biased pinhead... (Score:3)
RHL 7 has been out for two weeks. It's not even in _stores_ around here yet, but the bug has been found. It's been fixed.
That's why it's not a big deal.
Main page comings and goings (Score:3)
Disappearing article? (Score:4)
Wait, a revolutionary moment!!! Slashdot confirms an article before posting it!!!
A little perspective (Score:5)
The leak is in The Update Manager. If you're not running the update manager, you don't have a problem and the system won't go down. If you ARE running the Update Manager - well, it'll just automatically get the update from RedHat, won't it? Assuming that part works, anyway...
Completely unusable in 3 weeks? (Score:3)
Have you ever run out of file descriptors? (Score:3)
It's not a pretty sight. It's not too far off from running out of memory. And, the 4096 number is a system wide number:
Now, it's not that the leaking process dies when that number runs out; it's the *NEXT* process to request a file that dies. This happens on officially penguin-peed kernels as well. You need to set resource limits to keep an individual process from getting too trigger-happy with files.
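As a rough illustration of the resource-limit point, the sketch below lowers the per-process descriptor limit (RLIMIT_NOFILE) and then leaks on purpose, so the leaking process itself is refused with EMFILE instead of draining the system-wide table. It uses Python's standard `resource` module, which is Unix-only; `count_opens_until_emfile` is a hypothetical helper name, and the limit of 64 is an arbitrary value chosen for the demo, not anything Red Hat ships.

```python
import errno
import os
import resource


def count_opens_until_emfile(soft_limit=64):
    """Leak descriptors on purpose and return how many opens succeeded
    before the kernel refused with EMFILE (per-process limit reached)."""
    old_soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Lower only the soft limit; the hard limit is left as it was.
    resource.setrlimit(resource.RLIMIT_NOFILE, (soft_limit, hard))
    leaked = []
    try:
        while True:
            try:
                leaked.append(os.open(os.devnull, os.O_RDONLY))
            except OSError as exc:
                if exc.errno != errno.EMFILE:
                    raise
                break  # this process hit its own cap; others are unaffected
    finally:
        for fd in leaked:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, hard))
    return len(leaked)


if __name__ == "__main__":
    print("opens before EMFILE:", count_opens_until_emfile(64))
```

The count comes out a little below the limit because descriptors that are already open (stdin, stdout, stderr and so on) count against it too.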
And by the way, take stock 2.2 and make a program which either A) fork bombs or B) chews memory. Watch the system go down in flames. In the case of (B) you (once? Is it fixed?) had the chance of watching the kernel give init the boot, which is very ugly.
--
Ben Kosse
one word: cron (Score:5)
You lack slack, Jack! (Score:3)
Re:Proof positive of the benefits of Open Source (Score:3)
---
Re:Politics (Score:3)
And I'm very glad to know about the bug and the fix; it's something of a showstopper, and I didn't know the update manager was active by default, so this is valuable information -- not RedHat bashing.
-Erf C.
Re:Proof positive of the benefits of Open Source (Score:4)
Got Them Dot Zero Blues (Score:5)
Crawled out of bed
Couldn't wait to get that Red Hat distro you said
Told you to worry
Told you to wait
But no you want to mirror it from outside the state
Refrain
I got the blues
Got them old dot zero blues
Cause I done installed that distro
And it blew up on my shoes
Wish I had DSL
Wish I had fat pipes
But on a 56K modem
The download's such a fright
It's all installed now
Servers up and cool
But I come back three weeks later
And look just like a fool
Refrain
Got burned by Compaq
Got burned by Dell
Got burned by Microsoft
Now I'm in Red Hat dot zero hell
Refrain
Now don't you worry
This one's ok
It won't drop under loads now
Cause if it does we'll make you pay!
Refrain
Re:Disappearing article? (Score:5)