The Internet

Lead Scientist Responds to Questions on Root Server Queries

cidtoday writes "A CircleID interview with the lead scientist, whose study recently found that 98% of a main root server's queries are unnecessary, reveals that spam has little to do with the issue. In fact, he provides two reasons why anti-spam tools cause more unnecessary queries to the root servers than spam emails. Many other questions previously raised by Slashdot readers about the study are also answered."
  • Did anyone else read "Lead" as the metal, and not as "the one in charge"?
  • 98% of... (Score:3, Funny)

    by $$$$$exyGal ( 638164 ) on Thursday February 27, 2003 @08:11PM (#5402251) Homepage Journal
    98% of all Slashdot comments are unnecessary. Should you be concerned?

    --sex [slashdot.org]

  • by pcardoso ( 132954 ) on Thursday February 27, 2003 @08:19PM (#5402302) Homepage
    don't go to the article all at once, or those questions will continue unanswered!
  • by $$$$$exyGal ( 638164 ) on Thursday February 27, 2003 @08:20PM (#5402306) Homepage Journal
    spam emails floating around in people's inboxes, many of which contain broken links that cause bad DNS lookups

    Here's a link [pc-help.org] that lists how some spammers attempt to hide their real identities. This isn't necessarily exactly what the root server query guy was talking about, or maybe it is? Either way, it is very enlightening. Some slashdotters even occasionally try to hide a goatse link this way.

    --sex [slashdot.org]

  • by bourne ( 539955 ) on Thursday February 27, 2003 @08:20PM (#5402311)

    It's BB&N... er, GTEI... er, Genuity that's getting pounded. They provide caching DNS servers to the entire Internet at 4.2.2.1 (.2, ...) and because they're so easily memorizable, I've never met a sysadmin who didn't put them in a host's configuration in a pinch.

    • damn. i'm not the only one that does that?
      • Hah! I thought I was the only one. I've probably spread that to 3 or 4 other admins too. It's easy to remember to set up on a box for testing, and it's always live so it's a good ping test.

        Funny....
    • > It's BB&N... er, GTEI... er, Genuity that's getting pounded. They provide caching DNS servers to the entire Internet at 4.2.2.1 (.2, ...) and because they're so easily memorizable, I've never met a sysadmin who didn't put them in a host's configuration in a pinch.

      Yeah, but if the spam from verizon-dsl.net luzers (spammers and proxies on DSL) in reclaimed chunks of 4.0.0.0/8 doesn't slow up soon, those pounded DNS servers are gonna be the only bits left of BBN that's not blocked.

      Remember when living in 4.0.0.0/8 used to be a badge of honor?

    • i thought i was the only one who used 4.2.2.1 and 4.2.2.2

      easiest IPs in the world to remember, great ping times. I have them set as the secondary and tertiary DNS servers for my company network.

    • The root DNS servers shouldn't be bearing the bulk of the DNS load - the DNS servers at the Tier 1 ISPs (and also smaller ISPs, but especially Tier 1) should, and they should take care of many of the common queries, such as in-addr.arpa for the 192.168.*.*, 172.16-31.*.*, and 10.*.*.* ranges, and zone-transfer caching of "." and ".com" so that those lookups don't need to hit the roots, etc. Also, while the Root Name Servers have a policy [isi.edu] against accepting zone-transfer requests from randoms, they really ought to have at least one server that either accepts zone transfers or at least some variant on FTP from registered addresses at the Tier 1 ISPs (the top ~25) and maybe at Tier 2 ISPs.

      Also, the name servers get a surprising number of queries FROM RFC1918 addresses (10.x, 192.168.x, etc.), and while it may be more efficient to use root server CPU (on big fast computers) than router CPU to dispose of these queries, ISPs have ENTIRELY no business accepting IP packets FROM these addresses, and they should be killing them at the incoming edges of their networks, not carrying them and passing them on to other people.

      • Also, the name servers get a surprising number of queries FROM RFC1918 addresses (10.x, 192.168.x, etc.), and while it may be more efficient to use root server CPU (on big fast computers) than router CPU to dispose of these queries, ISPs have ENTIRELY no business accepting IP packets FROM these addresses, and they should be killing them at the incoming edges of their networks, not carrying them and passing them on to other people.

        I really doubt root servers get queries FROM RFC1918 addresses. Every sane ISP blocks all such packets (not only DNS queries) on its border routers - or else there would be many more spoofed packets floating around. I work at an ISP, and usually all the NATed machines that use our DNS are querying us about x.x.168.192.in-addr.arpa
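
        In iptables terms, the kind of edge filter being described is roughly this (just a sketch - eth0 standing in for the edge interface is an assumption):

        # drop packets arriving from outside that claim RFC1918 source addresses
        iptables -I FORWARD -i eth0 -s 10.0.0.0/8 -j DROP
        iptables -I FORWARD -i eth0 -s 172.16.0.0/12 -j DROP
        iptables -I FORWARD -i eth0 -s 192.168.0.0/16 -j DROP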

    • by zerocool^ ( 112121 ) on Friday February 28, 2003 @12:28AM (#5403783) Homepage Journal
      Heh - anyone remember what the lookups to those used to be?

      ns:root> host 4.2.2.1
      1.2.2.4.in-addr.arpa domain name pointer vnsc-pri.sys.gtei.net.
      ns:root> host 4.2.2.2
      2.2.2.4.in-addr.arpa domain name pointer vnsc-bak.sys.gtei.net.
      ns:root> host 4.2.2.3
      3.2.2.4.in-addr.arpa domain name pointer vnsc-lc.sys.gtei.net.
      ns:root> host 4.2.2.4
      4.2.2.4.in-addr.arpa domain name pointer vnsc-pri-dsl.genuity.net.

      4.2.2.4 used to be i.will.not.steal.dns.sys.gtei.net.

      Now, that was an internet-wide easter egg!
    • Primary the root zone for yourself. Then you don't care if the legacy root servers all get unplugged, your dns will still work just fine. This is a recording... this is a recording... this is a recording...
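
      In practice that looks something like this (a sketch, assuming BIND and the InterNIC FTP layout; the file path is made up):

      # grab a copy of the root zone and serve it locally as a master zone
      wget ftp://ftp.internic.net/domain/root.zone -O /var/named/root.zone
      # then, in named.conf:  zone "." { type master; file "root.zone"; };
      # re-fetch it periodically from cron, or your private root will quietly go stale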
  • If they can identify and quantify explicit networks or IP addresses causing the 'abuse', then why don't they send a warning and then block them? They'll fix the problem real quick.....

    • why don't they send a warning and then block them?
      It's because these problems are being caused by DNS requests that never receive a reply, so blocking them wouldn't make a lot of difference. Any way you look at it, it behaves like a DDoS attack. From the article you forgot to read:

      Approximately 75% of the root server's queries were duplicates. Furthermore, we noticed that most of the repeats occurred at sensible intervals. That is, the agents making queries seemed to be following the protocol specifications.

      From this, it seems most likely that these agents are just not receiving any DNS replies. To the application, it looks like a network outage, so it keeps on retransmitting.

      • Fine. Firewall those IPs from using the root servers.

    • I had a firewall once that was configured to use my ISP's name servers. It would boot up and ask for its host name, but would then drop the DNS replies as they came in. Since the internal connections were properly NATted, there were no ill effects for my programs inside the firewall...

      As a result, I was getting *thousands* of replies that were dropped every day. Funny -- seems like exactly the scenario described in the article.
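
      An easy way to check whether your own border is doing the same thing is to watch for one-way DNS traffic (a sketch; eth0 and the F root address 192.5.5.241 are assumptions here):

      # queries going out with no replies coming back means something is eating the answers
      tcpdump -n -i eth0 udp port 53 and host 192.5.5.241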
  • This sounds interesting but what's a root server query?
    • by radon28 ( 593565 ) on Thursday February 27, 2003 @08:55PM (#5402505)
      When you type in a web page address, say, slashdot.org, your computer needs a way to find the IP address of the server it should contact. That way is DNS. Most ISPs host several of their own DNS servers that keep track of which addresses have been recently resolved, so their customers get faster resolution. If an address hasn't been recently resolved and is no longer (or never was) in the DNS cache, then it's time to hit up one of the 13 root servers with a request.
      • Not quite.

        The root servers are the "invisible" trailing dot in

        www.slashdot.org. <- that one at the end


        The root DNS servers point to the top-level domains (TLDs), both the country-code TLDs (ccTLDs) and the generic TLDs (gTLDs).


        So the root servers point to the servers for the 'org' domain, which is now handled by the Internet Society [isoc.org] and the Public Interest Registry [publicinte...gistry.org], who operate several authoritative DNS servers for the ORG domain. These then point to the authoritative servers for slashdot.org, and we (or our ISP on our behalf) make yet another DNS request, this time to one of the authoritative slashdot.org DNS servers, to look up the IP address of www.slashdot.org or slashdot.org.


        To reduce the number of requests, our ISP's DNS server will normally cache answers for the TLD servers, for specific domains such as slashdot.org, and for specific hostnames such as www.slashdot.org.
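
        You can watch that whole chain with dig's +trace option, which walks the delegation from the root down (a quick sketch):

        dig +trace www.slashdot.org.
        # roughly: the root servers hand back the org. NS records,
        # the org. servers hand back the slashdot.org NS records,
        # and the slashdot.org servers finally return the A record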

      • Nice that you're being helpful and all, but you got trolled. Check out the guy's homepage.

        Succinct answer tho.
      • Close, but not quite right. The root DNS servers have the primary purpose of listing the authoritative name servers for the top-level domains (.com, .net, etc). Those servers then resolve the second-level domain (slashdot.org) and give its authoritative nameserver, which gives an IP.
    • The root servers are responsible for providing the IP addresses of the name servers for the top-level domains such as .com, .edu, and .org. If you want the IP address for slashdot.org, you ask a root nameserver for the IP address of the nameserver responsible for .org, then ask the .org nameserver for the IP address of slashdot.org.
    • The root DNS servers hold the database of all the top-level domains like .com and .net.
      The problem being that there are only 13 root servers accessible to the general public.

  • Eh??! (Score:5, Insightful)

    by FyRE666 ( 263011 ) on Thursday February 27, 2003 @08:30PM (#5402362) Homepage
    reveals that spam has little to do with the issue. In fact, he provides two reasons why anti-spam tools cause more unnecessary queries to the root servers than spam emails...

    So spam has little to do with the extra traffic, but the wealth of tools fighting spam is adding to the load, right? But then, since spam is the reason anti-spam tools exist, it's fair to say spam is the root cause of the problem!
    • That's like saying homosexuality should be abolished because it's the root cause of homophobia.

      Heh.
    • So the innocent people that get shot because incompetent SWAT teams got the wrong address for a drug bust are victims of drug users?

      Yeah, um... right.

      • Re:Eh??! (Score:1, Insightful)

        by BeBoxer ( 14448 )
        Actually, most War On (Some) Drugs supporters could tell you that with a straight face and not bat an eyelash. Most of them could then go on to tell you that any innocent Iraqis killed by American bombs are actually Hussein's responsibility, and conclude by explaining that victims of spousal abuse are responsible for their plight because a good beating was the only response to their poor behavior.
    • But then since spam is the reason anti-spam tools exist, it's fair to say spam is the root cause of the problem!
      So it's God's fault...
    • So Spam has little to do with extra traffic, but the wealth of tools fighting against spam are adding to the load, right? But then since spam is the reason anti-spam tools exist, it's fair to say spam is the root cause of the problem!

      So if the Soviet Union had nuked the US over the U2 incident, and wiped out the human race, the US would have been the cause of the problem? Or would Wilbur and Orville Wright have been, because they caused the airplane to exist?
      • scripsit dvdeug:

        So if the Soviet Union had nuked the US over the U2 incident, and wiped out the human race, the US would have been the cause of the problem? Or would Wilbur and Orville Wright have been, because they caused the airplane to exist?

        Nah, it was Eve.

    • You could add a line to your bind.conf to hardcode the authoritative nameservers for $dnsbl.
  • With all the talk floating around about every household electronic appliance having its own IP, and companies adding everything as some kind of named host within a home network, e.g. yourhomeaddress.personal.ps2.sony or yourhomeaddress.personal.microwave.bosh, what can the root servers actually handle? I'd hate to see someone bring down a root server with a microwave oven - well, without actually putting it in one :)
    • That is certainly true - IPv6 already promises to bring about this sort of deluge - after all, no one is likely to remember a 128-bit number, no matter how it's represented (zeroes taken out, and the like). Sendmail, among other programs, already asks for AAAA records.

      On another note, has anyone thought about the second-level nameservers? Sure, there are only 13 root servers, but heck, there are only 200 or so GTLDs and CCTLDs to deal with. Now look at the 13 nameservers authoritative for the '.com' GTLD - there must be _millions_ of .com domains registered, and each one of these has to be accounted for by these servers. Now that's a lot of traffic...
    • Don't worry, a properly configured microwave oven will only ping when it's finished cooking :-)
  • We have enough geeks and articles about geeks who tinker with things to optimize them even though they work just fine the way they are.

    The root server engineers are busy explaining why not to tinker with things that are clearly and inherently broken.

    Don't complain about useless queries -- FIX THE SYSTEM.
  • by graveyhead ( 210996 ) <fletch@fletchtr[ ]cs.net ['oni' in gap]> on Thursday February 27, 2003 @08:54PM (#5402492)
    Our results showed that 50% of the root server traffic comes from only 220 IP addresses.

    List, please? Hey Bush, forget about Iraq, let's take these bastards out. [grabs ak-47]

    • Re:This is amazing (Score:3, Interesting)

      by BeBoxer ( 14448 )
      Our results showed that 50% of the root server traffic comes from only 220 IP addresses.

      List, please? Hey Bush, forget about Iraq, let's take these bastards out. [grabs ak-47]


      Remember that some of those are perfectly legitimate. Huge ISPs like AOL should be funneling all of their customers' queries from a small number of IP addresses. That's the whole point. On the other hand, some of these are probably losers who are doing dictionary searches on domain names. You are likely to get blacklisted from the Whois servers if you try that. You won't get blacklisted from the DNS servers, it appears. But it should be easy to tell the difference between legitimate query streams and illegitimate ones.
      • ANY ISP should be CACHING on their nameservers, so there should not be _that_ much traffic from them.
        • ANY ISP should be CACHING on their nameservers, so there should not be _that_ much traffic from them.

          How do you suppose that cache gets refreshed? You wouldn't want to change the IP address of your web server and have no AOL users able to reach it for a month, would you?
          • 86400 seconds (==24 hours) is a reasonable time to cache DNS queries. That means that if AOL's DNS server has queried one of the root servers about something (such as "where are the DNS servers for .com?"), it shouldn't make that same query for another day. The data in the root servers (a bunch of names and glue records for all the top-level domains) doesn't change that often, so 24 hours is definitely not too long.

            If you change the IP address of foo.bar.com, that is done in the bar.com DNS servers, and the higher-level DNS servers (.com and root servers) have nothing to do with it. In that case, people won't be able to reach that site for (at worst) 24 hours.
            And if you plan a little ahead, you just set the TTL (time-to-live, the maximum allowed time to cache a record) down a couple of days before you change the IP. If all DNS resolvers do what they should (which they of course don't, hence some of the unnecessary load), "DNS downtime" shouldn't have to be more than a couple of minutes.
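
            You can watch the caching happen by asking your resolver twice and comparing the TTL column (a sketch; the resolver is whatever your ISP hands out):

            dig slashdot.org A
            # run it again a few seconds later: the TTL in the answer section
            # counts down instead of resetting, which means the record is being
            # served from the cache rather than re-fetched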
    • Our results showed that 98% of wars are fought for futile reasons (mostly economic and religious ones), and only interest a handful of people, who will be sleeping happily in their cozy homes while the poor bastards do the dirty job, killing each other.

      The good news is that they found that stem cells can become neurons, thus increasing the intelligence of those who have an IQ deficit, or need to rule a country. Scientists say that once leaders see the light, there'll be no more need for wars.
      Or, at least, they'll be able to develop better excuses.
  • The article ...... (Score:5, Informative)

    by Anonymous Coward on Thursday February 27, 2003 @08:58PM (#5402523)
    Internet's Main Root Server Saturated By 98%: Should You Be Concerned?

    February 26, 2003

    By CircleID

    A recent study by researchers at the Cooperative Association for Internet Data Analysis (CAIDA) at the San Diego Super Computer Center (SDSC) revealed that a staggering 98% of the global Internet queries to one of the main root servers, at the heart of the Internet, were unnecessary. This analysis was conducted on data collected October 4, 2002 from the 'F' root server located in Palo Alto, California.

    The findings of the study were originally presented to the North American Network Operators' Group (NANOG) in October 2002 and later discussed with Richard A. Clarke, chairman of the President's Critical Infrastructure Protection Board and Special Advisor to the U.S. President for Cyber Space Security.

    In this special CircleID interview with Duane Wessels, president of The Measurement Factory and one of the main scientists who led the root server study, we attempt to gain a better sense of what has been discovered, what can be done about it, and how. But most importantly, why? After all, from an end-user's perspective, the Internet appears to be working just fine! Should a business that fully or partially depends on the Internet be concerned? Read on...

    CircleID: Mr. Wessels, could you give us a bit of background about yourself and tell us what initiated the study?

    Duane Wessels: I started doing Internet research in 1994. From 1996 to 2000 I worked for the National Laboratory for Applied Network Research (NLANR)/UCSD on a web caching project, including Squid, funded by the National Science Foundation. These days I am president of The Measurement Factory, where we develop tools for testing performance and compliance.

    For this study I joined up with my old friends at CAIDA. Funding for this work came from WIDE in response to questions from ICANN's Root Server System Advisory Committee (RSSAC).

    CircleID: Could you give us a brief background on the significance of your findings in this study, particularly the unique discoveries that were not already known to the technical and scientific community?

    Duane Wessels: Certain facts about root server traffic have been known for a long time. Earlier studies identified certain problems, and some root server operators publish traffic statistics (number of queries, etc). What is unique about our study is that we developed a simple model of the DNS and used that model to categorize each and every query. This allowed us to say, for example, "this query is valid, because we haven't heard from this client before, but this other query is invalid, because the same client sent the same query a short time ago."

    We also took a much longer trace than earlier studies and spent more time looking at individual abusers.

    CircleID: Why the F root server? Is there a particular reason why this root server, located in Palo Alto, California, was selected for the study rather than the other 12 servers?

    Duane Wessels: Paul Vixie and the Internet Software Consortium were kind enough to give us access to the query stream. ISC has the infrastructure in place to make this happen easily, and without any chance of disrupting the operation of the server. We are currently working with other operators to get data from additional sites.

    CircleID: The report on the study indicates "a detailed analysis of 152 million messages received on Oct 4, 2002." In other words, the final results are based on only one set of data collected within 24 hours. What about comparison to other dates? Why are you confident that findings from this particular day, October 4, 2002, are a sufficient indication of what is happening today -- or tomorrow, for that matter?

    Duane Wessels: We have no reason to believe that October 4, 2002 is special. It just happens to be the first day that we successfully collected a 24-hour trace. We took shorter traces before and after this date, and they have similar characteristics. For example, our talk and paper (PDF) mention a particularly large abuser (the Name Registration Company). While writing the paper, we were curious to see whether they had cleaned up their act yet. Indeed, they had not. They were still abusing the F root server months after we had notified them about the problem.

    CircleID: Why should end-users be concerned about the findings, given that their Internet browsing experience does not appear to be affected in any noticeable way?

    Duane Wessels: It's likely that most end-users are not impacted by root server abusers, for several reasons. One is that most users are going through properly functioning name servers, and their queries rarely reach a root name server. Another is that the root servers are overprovisioned in order to handle the load -- root DNS servers are typically multiple boxes placed behind load balancers, and some are even geographically distributed.

    CircleID: What about companies that are running part or all of their business on the web? How are they being affected by this very high -- unnecessarily high -- root server inquiry rate?

    Duane Wessels: Again, I would bet that most of them are properly configured and not severely impacted by root server abuse. Our results showed that 50% of the root server traffic comes from only 220 IP addresses. It's possible that some of these 220 addresses are experiencing a negative side-effect, but I believe that most of these problems go unnoticed. For example, some web servers are configured to look up IP addresses in the in-addr.arpa domain so they can log a hostname instead of an address. But if the lookup fails (as in-addr.arpa queries often do), nobody really notices. The web server logs the address anyway after a timeout.

    CircleID: Moving on to possible causes -- at this time, what do you think are the main reasons for such a high (98%) inquiry rate? Is it possible to identify them?

    Duane Wessels: The short answer is that we suspect firewalls and packet filters.

    When we initially started the study, our assumption was that there must be some broken software out there causing all the root server traffic. Aside from an old bug with Microsoft's resolver [a system to locate records that would answer a query], we didn't really find any implementation-specific problems.

    Approximately 75% of the root server's queries were duplicates. Furthermore, we noticed that most of the repeats occurred at sensible intervals. That is, the agents making queries seemed to be following the protocol specifications.

    From this, it seems most likely that these agents are just not receiving any DNS replies. To the application, it looks like a network outage, so it keeps on retransmitting. By investigating a few individual abusers, we know that they indeed do not receive replies from the root server.

    CircleID: According to the research firm Radicati Group, more than 2.3 billion spam messages are broadcast daily over the Internet, and this number is expected to rise to 15 billion by 2006. How does spam, particularly at such high rates, affect the root servers -- especially when you take into account the millions, if not billions, of spam emails floating around in people's inboxes, many of which contain broken links that cause bad DNS lookups?

    Duane Wessels: It's entirely possible that spam emails generate an increased load for the root name servers. However, I don't think that simply sending spam increases load. Rather, it's more likely that anti-spam tools do. I can think of two specific examples:

    1. Many anti-spam tools verify "From" addresses and perhaps other fields. If the From address has an invalid hostname, such as "spam.my.domain," the root servers will see more requests, because the top level domain does not exist.

    2. Anti-spam tools also make various checks on the IP address of the connecting client -- for example, the various "realtime blackhole lists" and basic in-addr.arpa checks. These may be causing an increase in root server load, not simply because of the amount of spam, but also because these tools silently ignore failures.

    CircleID: According to the report, "About 12% of the queries received by the root server on October 4 were for nonexistent top-level domains, such as '.elvis,' '.corp,' and '.localhost.'" Many Internet users, in order to avoid spam, are increasingly providing dummy email addresses whenever they are forced to provide personal information on the web. Are all those 'email@lives.elvis'-type fake email addresses triggering part of the 98% problem?

    Duane Wessels: I don't believe so, but I can't be sure.

    Many of the fake email addresses that I've seen are of the form wessels.NOSPAM@example.com or wessels@nospam.example.com.

    Most of the unknown TLD queries probably come from short hostnames. For example, if I set my hostname to "elvis" (instead of "elvis.example.com"), then the root servers are likely to see queries for the short name "elvis."

    CircleID: This is a direct quote from SDSC news release:

    "Researchers believe that many bad requests occur because organizations have misconfigured packet filters and firewalls, security mechanisms intended to restrict certain types of network traffic."

    How far can current unnecessary root server inquiry rates be reduced, considering that organizations such as ISPs will be required to dedicate added time and financial resources to help in the reduction? Do you foresee new regulations and penalties for organizations that are responsible?

    Duane Wessels: Regulations and/or penalties are extremely unlikely. They would be impossible to monitor and enforce.

    I am, unfortunately, skeptical that ISPs and other network operators will take the initiative to reduce root server traffic, for three reasons:

    1. The system works sufficiently well as-is. Many applications use the DNS, but do not depend on it. Unresolved queries go silently unnoticed.

    2. A very small number of sources can cause a significant amount of abuse.

    3. It's often difficult to get people to recognize they have a problem, and even harder to get them to fix it.

    As is often the case with studies such as this, network administrators are left feeling somewhat helpless. That is why we also wrote a tool for examining the DNS traffic leaving a network. People can download our "dnstop" tool from http://dnstop.measurement-factory.com/.

    One of the abusers was misusing packet filters to block incoming, but not outgoing, DNS packets. This prompted us to write a technote for ISC that describes how people should be configuring their authoritative-only name servers. You can find it at http://www.isc.org/tn/.

  • Two words (Score:5, Informative)

    by Gothmolly ( 148874 ) on Thursday February 27, 2003 @09:26PM (#5402678)
    DNS cache.

    My company firewall is a Linux host-based box with some custom logging apps, squid and tinydns. Making your network "Internet friendly" is easy:

    iptables -t nat -I PREROUTING -p udp --dport 53 -j REDIRECT --to-ports 53

    directs all your outbound DNS to your cache. Let users, rogue admins, and anyone else try to resolve from particular nameservers; all they'll get is your own cache.
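
    A quick way to check that the redirect is doing its job (a sketch; 4.2.2.1 is just an arbitrary external resolver here):

    # from a client behind the firewall, ask an "external" server directly;
    # the PREROUTING rule silently hands the query to the local cache instead
    dig @4.2.2.1 slashdot.org
    # a tcpdump on the outside interface will show no DNS packets leaving for 4.2.2.1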
    • even better idea (Score:5, Informative)

      by Indy1 ( 99447 ) on Thursday February 27, 2003 @10:04PM (#5402978)
      set your dhcp server to assign your company dns server to the clients.

      THEN

      iptables -I FORWARD -p udp --dport 53 -j DROP

      let them try to hit any external dns servers :) they'll be scratching their heads : )
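
      One caveat, as a sketch (10.0.0.53 standing in for your internal DNS server is made up): let the resolver itself out before dropping everything else, or it can't recurse either.

      # let the internal resolver query outside servers...
      iptables -A FORWARD -p udp -s 10.0.0.53 --dport 53 -j ACCEPT
      # ...then drop everyone else's direct DNS
      iptables -A FORWARD -p udp --dport 53 -j DROP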

      • by billstewart ( 78916 ) on Friday February 28, 2003 @12:28AM (#5403785) Journal
        Yes, definitely, set your DHCP servers to tell clients about your company's DNS servers, and do a good job of maintaining your DNS servers so they work well. But sometimes people want to ask other servers what's going on, especially if they're trying to track down detailed authoritative information about a name from the real name servers for that name - or if they're spam hunting.
      • by ftobin ( 48814 )

        Yes, let's destroy more of the fundamental end-to-end principles of the net.

        </sarcasm>

        Man, I can't wait for ubiquitous host-to-host IPsec, so these content-based filters are thwarted.

    • I actively practice encrypted firewall piercing, or, at a minimum, running an external socks server. I can't handle castrated networks. The worst of them don't even allow me to get IMAP traffic. Blech.
  • by topologist ( 644470 ) on Thursday February 27, 2003 @09:56PM (#5402922)
    From this article, we've learned the most important truth of our time - elvis is possibly the most popular hostname on the internet (since some large fraction of 12% of the 98% of the queries to the root server are for the top level domain elvis, probably because of a misconfigured resolver). What could this mean? Elvis was the messiah and we just didn't know it? Are there more machines named elvis than Jesus? Are there more elvis impersonators than jesus impersonators? On the other hand, I wonder how many machines are named Gandalf.
  • So what? (Score:5, Interesting)

    by Jordy ( 440 ) <jordan&snocap,com> on Thursday February 27, 2003 @11:40PM (#5403540) Homepage
    I don't understand why this is news or why it required any level of study.

    The root servers handling zone '.', such as F.ROOT-SERVERS.NET, put refresh periods (TTLs) of 48 hours on nearly every answer. That means that at most once every 48 hours, every name server on the planet should re-ask the root servers where to get answers for each of the gTLDs: com, net, org, arpa, etc.

    What they should receive the most queries for are domains that don't exist because everything else is cached for such a long period of time. That is the point of the root servers.

    If the root servers are having trouble handling the query load then they should be upgraded for goodness sake. These are root servers after all and I think the global internet community could spare a few dollars to add some spare capacity if it is required.

    To improve on this, BIND could up the maximum negative RR cache default time to live. Right now I believe it is set to 3 hours and the root servers use a 1 day SOA.MINIMUM instead, so BIND is always lowering it by default.

    Of course, other nameservers are different. Some older versions of BIND, by default, only stored negative RRs for 10 minutes.
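
    You can see the value in question straight from a root server (a sketch; the last field of the SOA record is the negative-caching minimum):

    dig @f.root-servers.net . SOA +noall +answer
    # the trailing number in the SOA is the minimum / negative-cache TTL
    # that resolvers are asked to honor for "no such domain" answers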
    • Why was this article modded up so much? Jordy doesn't seem to have read the original article, or understood the issues behind it.
      If the root servers are having trouble handling ... they should be upgraded
      The article explicitly says this is not the case.
      To improve on this, BIND could up the maximum negative RR cache default time to live.
      If you'd read the analysis you'd know this is completely beside the point. It doesn't explain the single host that was asking for the same non-existent TLD 20 times per second. Also you'll note that the busiest hosts appear to be running Windows: 7.5% of all traffic is attributable to a bug in w2k (for which a patch has been released but evidently not applied).

      The few hundred abusers aren't going to be affected by changes in BIND.
    • I don't understand why this is news or why it required any level of study. The root servers handling zone '.' such as F.ROOT-SERVERS.NET put refresh periods of 48 hours on most every query. That means that at most once every 48 hours every name server on the planet should re-ask the root servers where to get answers for each of the gtlds, com, net, org, arpa, etc.
      Did you actually read the article?

      What they were saying is that they believe most of the excess requests were from systems that were sending out requests but somehow (for instance, because of a misconfigured firewall) the actual replies were not getting back. So it would not matter what the refresh period was, as the reply stating the refresh period would never get through.

  • If there wasn't spam, we wouldn't need anti-spam utilities... so wouldn't you say that the excessive queries are, IN FACT, caused by SPAM?
  • by billstewart ( 78916 ) on Friday February 28, 2003 @12:34AM (#5403817) Journal
    Most DNS queries get handled out of some kind of cache. While it's definitely important to be able to query your favorite root or alternate-root-like server when you really need to, you don't usually need to. If you ask your local vaguely-correctly-configured server for something, then ask it again before the expiration date, it'll cache the answer the first time it sees it, so the second time it can serve it out of the cache (unless the cache entry expired or the cache overflowed). But if the entry's nonexistent, it's not likely to stick around in the cache. So there's a need for a standard way to respond to well-known non-existent names, so the cache has something to keep for popular bogus queries. Obviously "localhost" is "127.0.0.1", and "example.com" can be just about anything not in use but might as well be 127.0.0.1, but it'd be nice if there were some other standard value to use. Maybe 127.0.0.0 or 127.255.255.255 (e.g. yell at yourself :-) ?
  • by blowdart ( 31458 ) on Friday February 28, 2003 @12:54AM (#5403913) Homepage

    Many anti-spam tools verify "From" addresses and perhaps other fields. If the From address has an invalid hostname, such as "spam.my.domain," the root servers will see more requests, because the top level domain does not exist.

    DNS lookups on the sender address were common before there was a major spam problem. It makes sense: why would you want to take email from somewhere you cannot reply to? So I don't think you can blame anti-spam tools for this.

    Anti-spam tools also make various checks on the IP address of the connecting client -- for example, the various "realtime blackhole lists" and basic in-addr.arpa checks.

    in-addr.arpa checks have been standard practice in networking software, not just email, for as long as they have been available. Some FTP servers do it, some web servers do it, your web log analyzer does it, IRC does it. You can't put that one onto anti-spam tools either.

    The use of dnsBL lists will, of course, create extra load, when you look up the name servers for the list(s) you are using. But in all likelihood the NS and A records are cached at your local server. You're not hitting the root server with every lookup.
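
    For reference, a DNSBL check is just an A lookup with the client's address reversed under the list's zone (a sketch; dnsbl.example.net is a made-up list name):

    # is 127.0.0.2 listed? reverse the octets and query under the list zone
    dig 2.0.0.127.dnsbl.example.net A +short
    # any 127.0.0.x answer means "listed"; NXDOMAIN means it isn't. The list's
    # NS records get cached by your local resolver, so repeated checks don't
    # go anywhere near the root servers.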

    This guy seems full of bull. Note that he is not a LEAD scientist for the root servers; he's a lead scientist for the company that produced the report.


  • I seriously doubt extraneous DNS queries rate in the top 10, or hell, even the top 100, culprits of network inefficiency. The fact that it only takes 13 of these servers to keep the entire internet afloat should be a testament to the efficiency of the protocol.

    so obviously it is critical to totally reform the DNS implementation as it exists today. maybe if we free up some traffic, we can look towards more important things... like defending the right for some little prick to be KaZaaing half of the music released in the last 15 years across 2 oceans with it ending up in some 3rd world chinese province where it is pressed into 2 gazillion cds and sold to some guy who has never paid more than 5 cents for something in his whole damn life. geez, I gotta get off this site ;)
  • by bigberk ( 547360 ) <bigberk@users.pc9.org> on Friday February 28, 2003 @02:02AM (#5404164)
    I really think that one of the very nice things happening in anti-spam these days is the increasing use of local, independent processing power rather than centralized network queries (like realtime blacklists).

    A growing number of projects are implementing Bayesian filtering techniques, for example. I personally love spamprobe [sourceforge.net], but there are many others. Some, like spamprobe, run server-side and others are even client-side. They work equally well, filtering spam based on examples you train them with. In the 4 months I've been using it, I've achieved 97.6% accuracy. And no DNS queries, no load on anything but my own disk & CPU.

    Anyway, the advantage of this sort of filtering is that you do all the decision making locally, and no data flies across the internet. Remember, what we have in abundance is processing power. But network resources should be conserved.
  • Things could actually be operating to spec (except for the few abusing the root servers to do dictionary searches etc).

    I see the RFC suggests minimum values of 2 to 5 seconds for retransmissions. What values do implementations pick?

    In many situations the round-trip time between the querying host and the root server could be more than the retransmission timeout; that's why the root server gets more than one request.

    In other cases there could be packet loss.

    And if the reply takes too long (delays etc), firewalls could timeout the stateful filtering rules for the returning DNS reply, requiring yet another query.

    It may be that some DNS implementations go to the root servers more often. Does djbdns's dnscache do that?
  • Think about it. How many new domain registration sites have popped up over the last year or two? For 7.95, you can have your own domain.

    What does this lead to? Millions of people doing searches on Go-Daddy, Verisign, etc for their vanity domain name.....

    And then, there is the spam email about owning your own domain, and spam about increasing traffic to your site, and spam about blocking spam to your site, etc....

    I really hope my tax dollars did not pay this guy. Traffic on the root name servers is way down on my priority list, right under voluntary castration.....
    • Availability checks for domain name registrars never hit the root servers. The registrars connect directly to the SRS (Shared Registry System) and look up records there.

      It would be silly to use the root servers as a basis for availability, especially since the root servers know nothing about individual domains, only TLDs (the root server zone file is less than 50k, iirc). But even assuming you meant the DNS servers one level down (like the gTLD servers that handle .com, .net, and .org), none of them refresh in real-time, so you could be registering a domain that had actually been taken 6 hours previously.

      Thanks,
      Matt
  • All I can think of while reading this is Chekov saying, "Nuclear wessels.. we're looking for the nuclear wessels!"
  • If you think about it, creating a lot of unneeded DNS queries does the internet a favor. When everyone wastes resources, that means that the systems are designed to handle so much traffic that it will be extremely difficult to initiate a DOS attack. Your thousand boxes will simply drown in the noise from the rest. At least that's a theory :)
