slappyjack's Journal: Since we all hate SPAM

I wrote this a while ago, and decided to pop it up here in the tradition of
http://slashdot.org/comments.pl?sid=141718&cid=11872925

which I found during this thread:
http://it.slashdot.org/article.pl?sid=05/07/18/1214226

It's a totally inelegant thing with a bitchload of bugs, I'm sure, but I used to run it when I was doing 14-hour shifts at work. I'd come home and see that it had sucked up a couple hundred megs of spammy bandwidth.

I am proud of the fact that I thought of sending them different user agents, though.

Er, I mean, it kept checking my favorite email-advertised sites for updates and new products!

whatever. if enough people ran little single scripts like this against sites that they personally confirmed were spam, we might be able to make a fucking dent in that shitty business model.

i'm just saying.

flame away.

#! C:\perl\bin
# aaa.pl 29 Dec 04
#
# slashdot.org carried these stories a few weeks ago:
#
# Lycos Declares War on Spam Servers
# http://it.slashdot.org/article.pl?sid=04/11/26/2129238&tid=111&tid=95
# a followup can be found here:
# Lycos Pulls Vigilante Anti-spam Campaign
# http://slashdot.org/article.pl?sid=04/12/04/1417200&tid=111&tid=95&tid=1
#
# I thought this was a pretty neat little idea - just suck spamvertised
# sites' bandwidth all day long and run up their hosting bills. I
# wondered how hard this would be to hack up as a perl script.
#
# Only took about two days of fucking around a little.
#
# The script uses the following files, all in the same directory as the script
# itself:
# ./aaa.pl ----------- this script
# ./target.txt ------- list of targeted pages
# ./newtarget.txt ---- file to add targeted pages to while script is running
# ./last_targets.txt - backup list of targets. Useful if script is
# terminated mid-loop so you don't lose your targets
# ./rundata.txt ------ some logging
# ./keep_running.txt - very basic flag. can be used in while loop
# as reason to keep running or stop
# ./stopnow.txt ------ CREATE this file in the same directory to make the
# process stop after current page is sucked.
#
# This is a little bit of a better solution than the makelovenotspam
# idea Lycos had. It's not a coordinated thing and there is no central
# repository of places to hit, so some pissed off guy couldn't target
# a valid site like microsoft or something.
#
# Of course, there's nothing stopping you from getting a list from the
# web using LWP and just letting this thing update itself forever.
# Might be a small burden on the guy hosting the list, though. Maybe
# you only wanna check it once a day or at program start or something
#
# What? You DON'T have perl on your machine?
# Shame on you! get it FREE at ActiveState
# http://activestate.com/Products/ActivePerl/
#
# it's FREE, so what's your excuse?
#

$|++;
$Start_Time = time;
print "Starting: ".$Start_Time."\n\n";

# Make Ubiquitous web agent
use LWP;
use HTTP::Headers;
$ua = new LWP::UserAgent;
$ua->agent("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)");

# some preset variables
@targets = ();
@dead = ();
$bytecount = 0;
$runs = 1;

# create keep running flag
open(RUNFILE,">keep_running.txt");
print RUNFILE "Delete this file to close up shop early.";
close(RUNFILE);
# To use the "Runfile" lock, add the following to your while loop
# && -e ./keep_running.txt

# Uncomment this one to loop indefinitely
# OUTERLOOP:while ($runs) {
# Uncomment this one to loop a specific number of times
# OUTERLOOP:while ($runs ){
                chomp();
                (length($_) ){
                chomp();
                (length($_) newtarget.txt");
        close(NEWLIST);

        # Save the list of targets to a backup file in case the script gets
        # shut down for some goofy reason.
        @lasttargets = @targets;
        open(LASTLIST,">last_targets.txt");
        foreach $oldlist (@lasttargets) { print LASTLIST $oldlist."\n"; }
        close(LASTLIST);
        @lasttargets = ();

        @targets = sort(@targets);
        print "opening list for writing..\n\n";
        # hoseout and reopen the list, this is for saving sites that are still
        # alive. We DO NOT save dead sites.
        open(LIST,">target.txt");

        # loop through sites
        GETSITE:foreach $thing (@targets) { # START FOREACH LOOP
                # switch up the user agent by site, for fun
                $Label_roll = int(rand(40));
                if ($Label_roll == 1 ) { $ua_label = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)"; }
                elsif ($Label_roll == 2 ) { $ua_label = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"; }
                elsif ($Label_roll == 3 ) { $ua_label = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0)"; }
                elsif ($Label_roll == 4 ) { $ua_label = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.1.4322)"; }
                elsif ($Label_roll == 5 ) { $ua_label = "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"; }
                elsif ($Label_roll == 6 ) { $ua_label = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.0.3705)"; }
                elsif ($Label_roll == 7 ) { $ua_label = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)"; }
                elsif ($Label_roll == 8 ) { $ua_label = "Mozilla/4.0 (compatible ; MSIE 6.0; Windows NT 5.1)"; }
                elsif ($Label_roll == 9 ) { $ua_label = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.0.3705; .NET CLR 1.1.4322)"; }
                elsif ($Label_roll == 10 ) { $ua_label = "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90)"; }
                elsif ($Label_roll == 11 ) { $ua_label = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.0.3705; .NET CLR 1.1.4322)"; }
                elsif ($Label_roll == 12 ) { $ua_label = "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT;)"; }
                elsif ($Label_roll == 13 ) { $ua_label = "Mozilla/4.0 (compatible; MSIE 6.0; AOL 9.0; Windows NT 5.1)"; }
                elsif ($Label_roll == 14 ) { $ua_label = "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"; }
                elsif ($Label_roll == 15 ) { $ua_label = "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)"; }
                elsif ($Label_roll == 16 ) { $ua_label = "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; Win 9x 4.90)"; }
                elsif ($Label_roll == 17 ) { $ua_label = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705; .NET CLR 1.1.4322)"; }
                elsif ($Label_roll == 18 ) { $ua_label = "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0"; }
                elsif ($Label_roll == 19 ) { $ua_label = "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; .NET CLR 1.1.4322)"; }
                elsif ($Label_roll == 20 ) { $ua_label = "Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.7.3) Gecko/20041001 Firefox/0.10.1)"; }
                elsif ($Label_roll == 21 ) { $ua_label = "Mozilla/4.0 (compatible; MSIE 6.0; AOL 9.0; Windows NT 5.1; .NET CLR 1.1.4322)"; }
                elsif ($Label_roll == 22 ) { $ua_label = "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"; }
                elsif ($Label_roll == 23 ) { $ua_label = "Mozilla/4.0 (compatible; MSIE 6.0; AOL 9.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"; }
                elsif ($Label_roll == 24 ) { $ua_label = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7) Gecko/20040803 Firefox/0.9.3"; }
                elsif ($Label_roll == 25 ) { $ua_label = "Mozilla/4.0 (compatible; MSIE 5.5; Windows 95)"; }
                elsif ($Label_roll == 26 ) { $ua_label = "Mozilla/4.5 [en] (Win98; I)"; }
                elsif ($Label_roll == 27 ) { $ua_label = "Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) AppleWebKit/125.5.5 (KHTML, like Gecko) Safari/125.12"; }
                elsif ($Label_roll == 28 ) { $ua_label = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0) Opera 7.54 [en]"; }
                elsif ($Label_roll == 29 ) { $ua_label = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.0.3705)"; }
                elsif ($Label_roll == 30 ) { $ua_label = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; FunWebProducts)"; }
                elsif ($Label_roll == 31 ) { $ua_label = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.2) Gecko/20040804 Netscape/7.2 (ax)"; }
                elsif ($Label_roll == 32 ) { $ua_label = "Mozilla/2.0 (compatible; Ask Jeeves/Teoma)"; }
                elsif ($Label_roll == 33 ) { $ua_label = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.4) Gecko/20030624 Netscape/7.1 (ax)"; }
                elsif ($Label_roll == 34 ) { $ua_label = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1) Opera 7.54 [en]"; }
                else { $ua_label = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"; }
                $ua->agent($ua_label);

                print "Getting: ".$thing."\n";

                if ($thing =~ /^(http\:\/\/\S+\/)[^\/]*$/) {
                        $base_url = $1;
                        print "\tBASE:".$base_url."\n";
                }
                if ($thing =~ /^(http\:\/\/[^\/]+)/) {
                        $base_domain = $1;
                        print "\tDOMAIN:".$base_domain."\n\n";
                }
                # Get the page!
                $req = new HTTP::Request GET => $thing;
                $res = $ua->request($req);
                #print "Sucess: ".$res->is_success;
                print "Site status line: ".$res->status_line." - ".length($res->as_string)." bytes\n";
                $bytecount += length($res->as_string);
                # Check the status - if it's good, write it to the list for hitting
                # again; otherwise don't save it.
                if($res->status_line eq "200 OK") {
                        #print "status line: ".$res->status_line."\n";
                        print "writing ".$thing." to list.\n\n";
                        print LIST $thing."\n";
                } else {
                        # This was for saving dead sites, but I got bored with it.
                        #print LIST "# ".time()." ".$thing."\n";
                }

                # Here, we find the images in the file for requesting.
                # Why? More bandwidth sucked, that's why.

                # Reset Image array
                @images = ();
                # Split up the page for easy parsing.
                # You could make this nicer/more effective.
                @lines = split("\n",$res->content);
                LINEREAD:foreach $line (@lines) {
                        # Find the images
                        if ($line =~ /img src=\"(\S+)\"/) {
                                #print $1;
                                foreach $image (@images) { ($image eq $1) && next LINEREAD; }
                                push(@images,$1);
                        }
                }

                # Check the image tag and make the image request.
                foreach $image (@images) {
                        if ($image =~ /^http/) {
                        } elsif ($image =~ /^\//) {
                                $image = $base_domain.$image;
                        } else {
                                $image = $base_url.$image;
                        }
                        # go get the image.
                        print "\tGetting Image:".$image."\n";
                        $req = new HTTP::Request GET => $image;
                        $res = $ua->request($req);
                        print "\t\tstatus:".$res->status_line." - ".length($res->as_string)." bytes\n";
                        $bytecount += length($res->as_string);
                }

                print "\n\n------------------------------------\n\n";

                # Rest a little while. Useful if you're using your bandwidth for other
                # important tasks, like downloading pr0n.
                # sleep(5);

                # for ending program quickly
                if(-e "./stopnow.txt") {
                        # Rewrite the target list into target.txt. Yes, there will be
                        # doubles, but the next run of the script will scrub them anyway.
                        # You can just go manually edit target.txt if you want.
                        foreach $thing (@targets) {
                                print LIST $thing."\n";
                        }
                        # Cleanup flagging files.
                        unlink("./stopnow.txt");
                        unlink("./keep_running.txt");
                        last GETSITE;
                }

        } ## END FOREACH LOOP
        # Save the dead sites to the end of the list, if you like
        #foreach $thing (@dead) {
        # print LIST $thing."\n";
        #}
        # Save the list for the next loop.
        print "closing list for saving.\n\n";
        close(LIST);

        # Little message to tell us how well we're doing so far
        print "Got $bytecount bytes!\n\n";

        # Rest a little while. You can just comment this out if you like,
        # or change the delay time.
        $delay = int(rand(5));
        print "sleeping for ".$delay." seconds\n\n";
        sleep($delay);

        # Count the number of times this ran
        $runs++;

        # Some Logging, for fun. You could add a ton of shit to this,
        # were you so inclined. I KNOW these numbers aren't that exact, whatever.
        open(RUNDATA,">>rundata.txt");
        # Append this loop's totals to the log file.
        print RUNDATA "Last Looper was:\nGot ".$bytecount." bytes\nin ".($runs-1)." Loops\nand ran for ".(time - $Start_Time)." seconds\n";
        close(RUNDATA);
} # END WHILE LOOP

# Nice Closing output to show what you've been doing.
print "totals:\nGot ".$bytecount." bytes\nin ".($runs-1)." Loops\nand ran for ".(time - $Start_Time)." seconds\n";

# This shit is so much fun.
#
# Other thought: You could be really complete and toss the
# pre-built Mirror module from CPAN in the GETSITE: loop.
# This would suck a lot more bandwidth from the target, but I think
# it also tries to save the site to disk.
# Of course, you could just erase all the shit you just sucked for the
# next go-round.

__END__
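
The stopnow.txt trick from the header comments (create a file, and the loop bails out after the current item) is handy in any long-running script, so here is a rough standalone sketch of it. The helper name `stop_requested`, the file name `stopnow_demo.txt`, and the placeholder work items are all made up for this demo and are not part of the script above.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): returns true once the
# stop-flag file exists, and removes it so the next run starts clean.
sub stop_requested {
    my ($flag_file) = @_;
    return 0 unless -e $flag_file;
    unlink($flag_file) or warn "could not remove $flag_file: $!";
    return 1;
}

# Demo: loop over placeholder work items and check the flag after each one.
my $flag  = "./stopnow_demo.txt";   # assumed name, used only in this demo
my @items = qw(one two three four);
my @done;

foreach my $item (@items) {
    push @done, $item;
    # Simulate someone creating the flag file while item "two" is in progress.
    if ($item eq 'two') {
        open(my $fh, '>', $flag) or die "cannot create $flag: $!";
        close($fh);
    }
    last if stop_requested($flag);
}

print "processed: @done\n";
```

Checking the flag at the bottom of the loop body means the item in progress always finishes before the loop exits, which is the same behavior the header comments describe.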

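The BASE/DOMAIN regexes and the three-way branch on the image path are the fiddliest part of the script, so here is a small offline sketch of just that logic. `resolve_src` is a hypothetical helper name, example.com is a placeholder host, and the fallback for a URL with no path is my assumption; the script itself has no such fallback, and this sketch makes no requests.

```perl
use strict;
use warnings;

# Hypothetical helper: turn an <img src> value into an absolute URL using
# the same three cases as the script (absolute, root-relative, page-relative).
sub resolve_src {
    my ($page_url, $src) = @_;

    # Everything up to and including the last slash, e.g. http://host/dir/
    my ($base_url)    = $page_url =~ m{^(https?://\S+/)[^/]*$};
    # Scheme plus host only, e.g. http://host
    my ($base_domain) = $page_url =~ m{^(https?://[^/]+)};
    return undef unless defined $base_domain;

    # Assumption: a bare "http://host" has no trailing slash, so use host + "/".
    $base_url = "$base_domain/" unless defined $base_url;

    return $src                if $src =~ m{^https?://};  # already absolute
    return $base_domain . $src if $src =~ m{^/};          # root-relative
    return $base_url . $src;                              # page-relative
}

my $page = "http://example.com/shop/index.html";
print resolve_src($page, $_), "\n"
    for ("img/a.gif", "/img/a.gif", "http://cdn.example.com/a.gif");
```

Plain string joining like this does not handle paths such as `../img/a.gif`. The URI module's `new_abs` constructor resolves those properly, if pulling in a non-core module is acceptable.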