Comment Re:Okay kids...(in Ruby) (Score 2, Interesting) 104
I couldn't resist - in Ruby, using the beautiful (but much understated) hpricot library:
doc = Hpricot(open(html_document))(doc/"a").each { |a| puts a.attributes['href'] }
Check it out - I've been using it for a project, and it's really fast and really easy to use (supports both xpath and css for parsing links). For spidering you should check out the ruby mechanize library (which is like perl's www-mechanize, but also uses hpricot, making parsing the returned document much easier).