
Comment Re:Okay kids...(in Ruby) (Score 2, Interesting) 104

I couldn't resist - in Ruby, using the beautiful (but much underrated) Hpricot library:

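require 'hpricot'
require 'open-uri'  # lets open() fetch URLs on older Rubies; use URI.open on Ruby 3.0+
doc = Hpricot(open(html_document))  # html_document: a file path or URL
(doc/"a").each { |a| puts a.attributes['href'] }  # print the href of every <a> element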

Check it out - I've been using it for a project, and it's really fast and really easy to use (it supports both XPath and CSS selectors for picking out links). For spidering, you should check out the Ruby Mechanize library (which is like Perl's WWW::Mechanize, but also uses Hpricot, making parsing the returned document much easier).
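
If you do go the spidering route, keep in mind that the hrefs you pull out of a page are often relative, so you'll probably want to resolve them against the page's URL before following them. Here's a rough sketch using only Ruby's standard uri library. Note that absolute_links, base_url and the example.com addresses are names I made up for illustration; none of this is part of Hpricot or Mechanize.

```ruby
require 'uri'

# Hypothetical helper (not part of Hpricot or Mechanize): turn raw href
# values into absolute URLs. base_url is the address of the page the
# links were extracted from.
def absolute_links(base_url, hrefs)
  cleaned = hrefs.compact.map(&:strip)

  # Skip empty values, in-page anchors and non-HTTP pseudo-links.
  candidates = cleaned.reject do |href|
    href.empty? || href.start_with?('#', 'javascript:', 'mailto:')
  end

  resolved = candidates.map do |href|
    begin
      URI.join(base_url, href).to_s
    rescue URI::Error
      nil # ignore hrefs that cannot be parsed as URIs
    end
  end

  # Drop the failures and any duplicates, keeping first-seen order.
  resolved.compact.uniq
end

# Example: feed it the href values collected from a page.
hrefs = ['page2.html', '/about', '#top', nil, 'http://other.org/']
puts absolute_links('http://example.com/docs/index.html', hrefs)
```

The filtering rules here are just one reasonable choice; a real spider would likely also want to restrict itself to one host and remember which URLs it has already visited.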
