Communications The Internet

Tier One ISPs Dying 394

xbmodder writes "Two tier one ISPs are down today. At about 23:30 PST both Verio and Level 3 started having problems with routes. According to Level 3 this is a software upgrade gone awry. Is this the end for Level 3?" Many, many reports about this are coming in, and if you're wondering why the stories were rather sparse overnight, it's because it's difficult to post them without internet access. Hope everyone else is back online too.
This discussion has been archived. No new comments can be posted.

  • by discord5 ( 798235 ) on Friday October 21, 2005 @07:56AM (#13843400)
    Noticed this this morning when a customer called, upset about his hosting services being unreachable. A quick traceroute showed one of Level3's IPs to be down. A few minutes later more customers had problems with different routers from Level3. As soon as I saw Level3 I knew enough, shrugged it off and told the customer that it was a routing problem we couldn't fix but those responsible were most likely already trying to fix it.

    It seems fixed now though, so no, this isn't the death of the Internet just yet.
  • Re:Flicker (Score:2, Interesting)

    by EvanED ( 569694 ) <{evaned} {at} {gmail.com}> on Friday October 21, 2005 @08:02AM (#13843427)
    Outage of plentiful service?
  • Re:Call me silly? (Score:1, Interesting)

    by Anonymous Coward on Friday October 21, 2005 @08:14AM (#13843475)
    1. (Lower case "i"nternet) A large network made up of a number of smaller networks.

    2. (Upper case "I"nternet) The largest network in the world. It is made up of more than 100 million computers in more than 100 countries covering commercial, academic and government endeavors. Originally developed for the U.S. military, the Internet became widely used for academic and commercial research. Users had access to unpublished data and journals on a variety of subjects. Today, the "Net" has become commercialized into a worldwide information highway, providing data and commentary on every subject and product on earth.
  • by PhraudulentOne ( 217867 ) on Friday October 21, 2005 @08:21AM (#13843505) Homepage Journal
    Is this even an issue? I mean, this was probably scheduled maintenance that went a little longer than expected. I have been through this before. It just sounds like Level 3 dropped some core routers for a few minutes to do a code upgrade - it didn't work so hot, so they were down for a few more minutes, OSPF/BGP decided to tell all the clients that they have no routes, Level 3 gets the routers back up, OSPF/BGP tells everyone that they're fine again. Was this like 6 hours, or 45 min?
  • by Seraphnote ( 655201 ) on Friday October 21, 2005 @08:21AM (#13843507)
    Yes and no.

    Yes, the Internet enables/permits/allows redundant routes, but...

    No, it doesn't require/demand/"enforce with any government or legal authority" redundancy at all levels.

    So any smaller ISPs connected to Level3, and all their customers would have had problems reaching the rest and being reached by the rest.

    (sarcasm mode)Obviously this wouldn't have happened if the EU had been in control!(/sarcasm mode)

    Actually, how many of these corporations are US companies, and how many are NOT?
  • by Anonymous Coward on Friday October 21, 2005 @08:38AM (#13843584)
    This should be a wakeup call to keep moving ahead with creating a more distributed and resilient Internet infrastructure. The first step in that direction is wireless neighborhood mesh networks.
  • Re:Call me silly? (Score:5, Interesting)

    by Anonymous Coward on Friday October 21, 2005 @08:50AM (#13843638)
    In 1994, the National Science Foundation (NSF) awarded contracts to replace the National Science Foundation Net (NSFNet) Internet backbone. These contracts were for backbone transport, routing arbiter and traffic exchange points (NAPs).
    These contracts were awarded for the original 15 NSF-sponsored NAPs, and to become a Tier 1 ISP, you had to have at least DS3 connectivity to all 15 NAPs.
    It's a very old and crappy definition, and I wish people would stop using it, because it is very easy to meet nowadays, and most of those original NAPs are now insignificant, compared to the power of the force.
  • flapping (Score:5, Interesting)

    by SpectralDesign ( 921309 ) on Friday October 21, 2005 @09:00AM (#13843689)
    Way back in the day when I was a Network Controller at BBN Planet, if we began to have cascading routing outages we'd call it "Flapping"... Visualize a wounded bird squirming around on the ground flapping...

    Takes me back... My first night on the job a rat in Berkeley chewed through the wrong cable and got himself fried -- he also happened to take the entire west-coast off the internet for the better part of a day.

    Then there was the time an electrical worker got vaporized in a hole near MIT which caused quite a problem too as it overloaded the MIT power station, but the fallout wasn't nearly as bad as the day of the rat...
  • by Anonymous Coward on Friday October 21, 2005 @09:10AM (#13843739)
    dude - have you been behind a rock for a decade? Mae-East is hardly the only peering point carrying US international traffic any more, and alter.net/uu.net/worldcom is not carrying nearly all of the net's traffic anymore.
  • Verio? (Score:2, Interesting)

    by VikingThunder ( 924574 ) on Friday October 21, 2005 @09:10AM (#13843742)
    Well we seem to know why Level3 went down, but why did Verio go down at the same time?
  • by dantheman82 ( 765429 ) on Friday October 21, 2005 @09:18AM (#13843776) Homepage
    happened in Detroit in the last 24 hours. Apparently all ingoing/outgoing traffic to other Tier One ISPs had problems in that city. Also, Philadelphia had really slow traffic within Level3 (and slower to all the others), and had major problems connecting to Verio. San Diego also had some problems, especially within the Level3 network. St. Louis was the only area without major problems...

    For a breakdown, check out this view of the data [keynote.com].
  • by senor_jt ( 623124 ) <senor_jt@@@yahoo...com> on Friday October 21, 2005 @09:19AM (#13843787)
    Finally -- somebody that gets it. No offense to others who didn't feel like posting... As disappointed as I am by the level of discussion about this topic on Slashdot, I'm thankful that it's here! I tried a search for DNS and DNS problems on Google News this morning and didn't come up with any stories, then tried Slashdot. And voilà!! I wasn't going crazy at 11:30p PDT last night, the Net routing was... having problems. DNS service to wide swaths of the net was down/unreliable. I had to try a mix of different nameservers to get to sites (work, personal mail, etc...). Thankfully, I was too tired to worry about my clients and hoped it would all be solved by morning. Yay. But this event does underline a topic seen on Slashdot (and other esteemed zines) many times before -- the fragility of the Internet. Not even close to bulletproof.
  • by DFossmeister ( 186254 ) <{moc.oohay} {ta} {dlanod_ssof}> on Friday October 21, 2005 @09:50AM (#13843963) Homepage
    Level 3 went down at 22:42 pst and was available around 23:50 pst. Verio started having problems right around the same time that Level 3 was coming back up. The Internet Health Report [theinterne...report.com] from Keynote showed me what was going on, scary that it was.
  • Level 3? (Score:3, Interesting)

    by Greyfox ( 87712 ) on Friday October 21, 2005 @09:52AM (#13843975) Homepage Journal
    Haven't they been hanging on by their fingernails since the dot-com bust? I've known a few guys who got burned working for them just before the bust, and I've seen several recruiters post stuff like "A local communications company (NOT Level 3!)" in their job reqs.

    I don't know that they've replaced Sprint yet on my list of most sucktastic internet companies. Time was you lost connectivity to an important piece of the Internet (Like your favorite Quake TeamFortress server) and a traceroute would show the failure somewhere in the Sprint backbone. So far they've been more reliable than Sprint at their worst, at least for me.

    If they go under, well Tier 1's don't ever really die. Chances are one of the other Tier 1's will buy their assets and it'll be business as usual. Usually the buyer is MCI.

    Of course the true test is pretty easy -- has anyone who works at Level 3 had their paycheck bounce yet? Surely there are a few readers among their employees...

  • Re:flapping (Score:1, Interesting)

    by ScottKin ( 34718 ) on Friday October 21, 2005 @10:12AM (#13844107) Homepage Journal
    I had a similar situation back when I worked at the Computer Center at LBL - mice chewing through the high-voltage supply for our CDC 7600Z back in '80 set off the fire alarms... I was the lucky stiff who had the joy of holding down the "HALON DEFEAT" deadman switch until they could find where the carbonized rat was.

    Ah, those were the days.

    --ScottKin
  • Re:Call me silly? (Score:3, Interesting)

    by w4pso ( 874168 ) on Friday October 21, 2005 @11:02AM (#13844532)
    My understanding is that a Tier 1 ISP is now defined as one that has established free peering points with all other Tier 1 ISPs.

    Considering that free peering is likely only established between 2 networks that have close to a 1:1 bit exchange, this is a very high bar to meet.

  • by Wakko Warner ( 324 ) * on Friday October 21, 2005 @11:29AM (#13844755) Homepage Journal
    The scary thing is it makes you wonder is some terrorist who has intimate knowledge of how Tier 1 ISP's work doing a trial run in the middle of the night by knocking out Level 3 and Verio backbones so later they could try to knock out ALL the backbones in a co-ordinated terrorist attack.

    It doesn't make me wonder that. Terrorists do not give a shit about this kind of thing. To even invoke the word "terror" in this discussion is ludicrous.

    - A.P.
  • Re:Guess not (Score:5, Interesting)

    by Dun Malg ( 230075 ) on Friday October 21, 2005 @11:34AM (#13844818) Homepage
    The scary thing is it makes you wonder is some terrorist who has intimate knowledge of how Tier 1 ISP's work doing a trial run in the middle of the night by knocking out Level 3 and Verio backbones so later they could try to knock out ALL the backbones in a co-ordinated terrorist attack. (eek!)

    Oh please. You know, it's pretty easy to figure out if it's something likely to be attempted by terrorists or not. The simple test is: does it cause mass "terror"? As annoying as it might be, lack of internet access is an annoyance. Perhaps a very expensive and exasperating annoyance, but it won't cause mass terror. Terrorists prefer things like bombs, or poison gas, or disease. Some other things people get worked up about but terrorists are unlikely to attempt: sabotaging bridges and tunnels to cause traffic jams; sabotaging electricity distribution to cause blackouts; sabotaging railroad tracks, making commuters late for work! Think DEATH, not irritation. Quit with the automatic "terrorist hysteria" already, people!

  • Has to be said... (Score:3, Interesting)

    by KC7GR ( 473279 ) on Friday October 21, 2005 @11:35AM (#13844820) Homepage Journal
    Oh, so THAT's why my daily spam load suddenly dropped by about 35% or so...

  • Re:flapping (Score:5, Interesting)

    by rah1420 ( 234198 ) <rah1420@gmail.com> on Friday October 21, 2005 @12:10PM (#13845141)
    halon defeat

    OT, but it brings back memories of working at Purolator Courier in the machine room. IBM mainframe shop.

    We had had trouble with the damn fire suppression all day. On third shift, around 3 AM, the trouble alarm went off (again) for the umpteenth time. One of the operators, a nervous fellow who was a little bit green, went over to the annunciator panel and opened it to see what the Trouble Might Be.

    A fire technician he was not, and he apparently didn't know the difference between the trouble bell and the klaxon that would sound when a halon dump was about to occur; so he reached around the open panel door and hit the halon defeat.

    Or so he thought.

    It was actually the Big Red Switch.

    The whole room (full of 3420 and 3480 tape drives, the 3745s, the 3800 laser printers; and the floor above, containing trivial bits like the DASD and the CPU) all plunged into a deafening silence.

    We all stared at each other and at the newbie BOFHeck.

    A few minutes later, the phone rang. It was the Indianapolis air hub for Purolator, wondering why (when they were about to receive about 150 planes from all over the country) they didn't have anything useful displayed on their green screens.

    That was a fun morning.

    Ah, those were the days indeed.
  • by Cally ( 10873 ) on Friday October 21, 2005 @04:08PM (#13847293) Homepage
    However, L3 has been having "issues" this month that have left a lot of lower-tier ISPs in the uncomfortable position of explaining to their customers "We know the internet is down but there's nothing we can do about it." This outage really can't be good for their reputation, and I can see more potential customers taking their money elsewhere because of this.

    Just because the technical issues have been fixed doesn't mean their finances have been fixed as well.

    See now that's a story, if you add a couple of links to back it up. Not saying there's any problem with L3's finances, though of course we're all still waiting to hear the story behind the Cogent issue. Incidentally (as has also been discussed on NANOG recently) there's increasing pressure on not only ISPs but even corporate networks whose parent orgs are large enough to merit audits and certifications (think NIST, SOx, ISO17799,..) to start thinking that being multi-homed is a necessary precondition to really `being on the Internet`. (And who's to say they're not right?) One thing's for sure - demand for BGP-clued bodies with experience with 'enable' is on an upwards curve. (Interestingly, routing is one of those IT areas that can't easily be distilled into a "...in 28 days" type crash course, i.e., commodified - along with systems programming, solid C++ coding, DBA-dom, and lots of things under the umbrella of 'security'.)

  • by Kenshin ( 43036 ) <kenshin@lunarOPENBSDworks.ca minus bsd> on Friday October 21, 2005 @04:46PM (#13847688) Homepage
    It doesn't speak much for the Slashdot community when Wikipedia has to put this warning at the top of a Slashdot-linked page:

    This article has recently been linked from Slashdot.
    Please keep an eye on the page history for errors or vandalism.
  • Re:flapping (Score:3, Interesting)

    by llefler ( 184847 ) on Friday October 21, 2005 @05:55PM (#13848382)
    I used to work in a large datacenter for a mutual funds company. At a guess, the computer room was 200k sq/ft with about half of that 3480 drives and the tape library.

    Every Sunday night they would switch from mains to batteries to exercise the system. So at around 1am the air conditioners and lights would go out and the silence would be deafening. It always made your heart skip a beat while you checked to make sure the lights on the drives were still on. 30 seconds later the lights and air conditioning would come back, but I never got used to it.

    Oh, and I also worked in a Gov't datacenter for a while. So of course, Halon wasn't allowed. The VAXen were 'protected' by a sprinkler system. The disaster plan was for one operator to hold the sprinkler abort button while another pulled the t-bars and covered the machines with plastic. Then of course, I worked Sundays as the sole operator. Hmm, do you burn holding the button or get electrocuted pulling t-bars. Good thing we never had a fire, because I would have been listening for the explosion from my car.
  • Halon stories (Score:3, Interesting)

    by Ungrounded Lightning ( 62228 ) on Friday October 21, 2005 @09:30PM (#13849885) Journal
    Used to do contract work at an auto company's plant. The main data center's primary job was to feed test programs to a distributor testing line and collect the stats. It was located in the middle of the plant on the second floor, next to the row of test stands.

    Some time after my contract had ended I visited the place and it was a total disaster.

    During the model change shutdown (when most of the plant maintenance and rearrangement was done) the millwrights were welding on some cableways on the ceiling of the plant floor below. The fumes from the welding, of course, rose to the ceiling and escaped through the first hole they could find - around the big fire sprinkler pipe that went up through the floor of the computer room and into the space beneath the raised floor.

    It tripped one ionization smoke alarm and sounded the warning - but nobody was around during the shutdown to hear it. Shortly thereafter it tripped a second one and the halon system went off. The computer power shut down and $10,000 worth of halon blasted into the computer room. Half of it came out through vents under the floor, throwing the raised floor panels and a decade's accumulation of fine dust (much of it byproducts of metal cutting and annealing) all over the room. And finally sounding an alarm at the guard shack.

    The guards came over and found the room in disarray but not the slightest sign of a fire. A couple million bucks worth of computer equipment, slated for replacement in another few months but still critical to the plant's operation, was standing there, covered with dust (likely to cause trouble for the disk drives later) but otherwise intact. So they followed procedure and reset the halon system, switching to the backup cylinder, to protect the computer in case an actual fire made it to the comp room. (Normally that's a good idea, since smouldering that sets off smoke detectors is often followed some time later by an actual fire.)

    Of course the welding was still going on - just not at the moment the guard sniffed the comp room. (Welders out to lunch, pulled out due to the alarm, or having decided to come down off the ceiling for a bit after the blast of gas from above.) And they still had work to do. So of course they went back to it.

    In less than an hour the situation repeated, dumping the SECOND $10,000 worth of halon on the non-fire. B-(
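
Several comments in this thread (Seraphnote's on redundant routes, Cally's on multi-homing) come down to the same point: a network with only one upstream drops off the net when that upstream fails, while one with a second upstream can route around it. A minimal sketch of that idea follows, assuming a made-up topology. The names `single_homed_isp`, `multi_homed_isp` and `other_tier1` are hypothetical, and the plain graph search ignores BGP policy, convergence time and everything else real routing involves.

```python
from collections import deque

# Hypothetical topology for illustration only; not real peering data.
LINKS = [
    ("single_homed_isp", "level3"),
    ("multi_homed_isp", "level3"),
    ("multi_homed_isp", "other_tier1"),
    ("level3", "other_tier1"),
    ("level3", "destination"),
    ("other_tier1", "destination"),
]


def build_graph(links):
    """Turn a list of undirected links into an adjacency map."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def reachable(graph, src, dst, failed=frozenset()):
    """Breadth-first search from src to dst, skipping failed networks."""
    if src in failed or dst in failed:
        return False
    seen = {src}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for neighbor in graph.get(node, ()):
            if neighbor not in failed and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


graph = build_graph(LINKS)
for isp in ("single_homed_isp", "multi_homed_isp"):
    ok = reachable(graph, isp, "destination", failed={"level3"})
    print(isp, "reaches destination with level3 down:", ok)
```

With `level3` marked as failed, the single-homed network has no remaining path while the multi-homed one still gets through via its second upstream, which is roughly the situation the smaller ISPs described above found themselves in.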
