Journal the_mad_poster's Journal: Alpha Code: Journal Archive Script 14
Get it here and rename the extension to
It's Perl, so if you don't have Perl, you'll need to get it. If you're on Windows, ActiveState Perl is as easy to install as any other Windows program.
To invoke it:
perl getslash.pl -a
If you don't include the -a switch, I'll not be held liable for where it puts the archived files...
Optionally, you can specify an archive location with the -a switch right on the command line:
perl getslash.pl -a/home/user/slashstuff
You can also specify your user name on the command line:
perl getslash.pl -Nyou -a/home/user/stuff
So, to archive all my entries, I ran:
perl getslash.pl -Nthe_mad_poster -ac:\foobar
If you have a space in your username, you'll need to wrap the entire chunk, -N included, in double quotes. If you don't specify the archive location or your name, you'll be prompted (but you still need the -a switch regardless). It SHOULD work on Linux, BSD, or Windows, but I've only tried it on Windows at the moment. It will only archive comments that are above the 2 threshold right now - the rest is coming in beta.
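For anyone curious how switches like these can be picked apart, here is a minimal sketch. It is not lifted from getslash.pl: the helper name parse_switches and the hash keys are made up for illustration, and the real script may handle its arguments differently.

```perl
use strict;
use warnings;

# Hypothetical helper (not taken from getslash.pl): split switches of the
# form -Nname and -a/path into a hash. A bare -a yields an empty path,
# which a caller could treat as "prompt for the location".
sub parse_switches {
    my @args = @_;
    my %opt;
    for my $arg (@args) {
        if ($arg =~ /^-N(.+)$/s) {
            $opt{name} = $1;       # everything after -N, spaces included
        }
        elsif ($arg =~ /^-a(.*)$/s) {
            $opt{archive} = $1;    # may be empty
        }
    }
    return %opt;
}

my %opt = parse_switches('-Nthe_mad_poster', '-ac:\foobar');
print "name=$opt{name} archive=$opt{archive}\n";
```

Because the shell hands a double-quoted chunk such as "-Nsome user" to the script as a single argument, a pattern like the one above would keep the space in the name.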
Complaints, requests, and free beer can be sent to my public email address or posted here.
Update: Comment Ripping
Alright, I've been studying the HTML that
How are you going to get around (Score:1)
Re:How are you going to get around (Score:2)
Re:How are you going to get around (Score:2)
Re:How are you going to get around (Score:2)
As long as people don't abuse it individually, it shouldn't be a problem. It's set up not to rip the entire journal every time, so the only big hit would be the first one. After that, it will only pull the new entries that you haven't archived unless you force it to do otherwise (makes it cron-friendly).
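A rough sketch of the "only pull what's new" idea, assuming one file per entry named by its ID with a .jrnl extension. The helper name entries_to_fetch, the file naming, and the force flag are all assumptions for illustration, not the script's actual internals.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: given an archive directory, a force flag, and a list
# of entry IDs, return only the IDs that still need to be pulled. The
# "<id>.jrnl" naming is an assumption.
sub entries_to_fetch {
    my ($dir, $force, @ids) = @_;
    return @ids if $force;    # forced run: re-pull everything
    return grep {
        my $file = File::Spec->catfile($dir, "$_.jrnl");
        !-e $file;            # keep IDs with no archived file yet
    } @ids;
}

my @todo = entries_to_fetch('.', 0, 101, 102, 103);
print "to fetch: @todo\n";
```

Run from cron, a check like this would make every run after the first touch the server only for entries that are not on disk yet.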
Maybe I'll add a "stealth" type of functionality (a la nmap) that delays each action. Maybe put a 5 second delay in between page hits for really big jobs.
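The delay idea could look something like the sketch below. The wrapper name fetch_politely is hypothetical, and the fetcher here is a stand-in sub so the example runs without touching the network; a real run would pass an actual page fetcher and a delay of 5.

```perl
use strict;
use warnings;

# Hypothetical wrapper: call $fetch on each URL in turn, sleeping $delay
# seconds between requests (but not after the last one).
sub fetch_politely {
    my ($fetch, $delay, @urls) = @_;
    my @results;
    for my $i (0 .. $#urls) {
        push @results, $fetch->($urls[$i]);
        sleep $delay if $delay && $i < $#urls;
    }
    return @results;
}

# Stand-in fetcher and zero delay, purely for demonstration.
my @pages = fetch_politely(sub { "fetched $_[0]" }, 0, 'page1', 'page2');
print "$_\n" for @pages;
```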
Error message (Score:2)
Re:Error message (Score:2)
Re:Error message (Score:2)
Right-o. I think I'll just change that permanently to 'my' and "require 5.005" instead.
Suggestion (Score:2)
Anyway, a suggestion: add leading zeroes to the (shorter) filenames so they sort in the correct order.
Re:Suggestion (Score:2)
Heh, that's just because you're a Slashdot fogie and have shorter SIDs than the rest of us whippersnappers ;)
I'll set the default filename to be 12 characters long so that it allows for 1 trillion SIDs (assuming 0 is a valid SID).
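Zero-padding is a one-liner with sprintf. A quick sketch, where the helper name padded_sid is made up and the width of 12 is the default mentioned above:

```perl
use strict;
use warnings;

# Hypothetical helper: left-pad a numeric SID with zeroes to a fixed width
# so that plain string sorting matches numeric order.
sub padded_sid {
    my ($sid, $width) = @_;
    $width = 12 unless defined $width;
    return sprintf("%0${width}d", $sid);
}

print padded_sid(42), "\n";    # 000000000042
```

Twelve digits cover SIDs 0 through 999,999,999,999, which is where the 1 trillion figure comes from.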
Re:Suggestion (Score:1)
Personally, I think you should let sourceforge host the project, and save yourself some bandwidth;)
As far as comments go, I prefer nested myself. Perhaps user toggleable, as well as threshold?

If anybody will trip the sensors on overuse on the first run, it would probably be em emalb or Sam the Butcher. (Sorry Em, too lazy to go back and fix capitalization on your name.)
Re:Suggestion (Score:2)
The problem with using anything other than flat mode is that you have multiple page hits. Deep nested structures - which are common in TechnoLust's, Em's, and StB's entries - would ramp the page hits up on the Slashdot server like you wouldn't believe. I mean, Slashdot pisses me off and all, but I still wanna be nice to their server :)
I'll work on instituting a Nested mode that actually pulls at -1 Flat and then reconstructs the proper order in Nested format using the parent links and CIDs of the pulled co
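Rebuilding a nested view from a flat pull might go roughly like this. It is only a sketch: the cid, pid, and subject field names, the pid of 0 for top-level comments, and both helper names are assumptions, not the format the script actually produces.

```perl
use strict;
use warnings;

# Hypothetical renderer: walk the children of $pid depth-first and return
# one indented subject line per comment.
sub render_thread {
    my ($children, $pid, $depth) = @_;
    my @lines;
    for my $c (@{ $children->{$pid} || [] }) {
        push @lines, ('  ' x $depth) . $c->{subject};
        push @lines, render_thread($children, $c->{cid}, $depth + 1);
    }
    return @lines;
}

# Hypothetical entry point: group flat comments by parent ID (keeping the
# flat order within each group), then render from the top level.
sub nest_comments {
    my @flat = @_;
    my %children;
    push @{ $children{ $_->{pid} } }, $_ for @flat;
    return render_thread(\%children, 0, 0);
}

my @flat = (
    { cid => 1, pid => 0, subject => 'Suggestion' },
    { cid => 2, pid => 0, subject => 'Error message' },
    { cid => 3, pid => 1, subject => 'Re:Suggestion' },
    { cid => 4, pid => 3, subject => 'Re:Suggestion' },
);
print "$_\n" for nest_comments(@flat);
```

The point of doing it this way is that the server sees a single flat page hit per entry, and all the nesting work happens locally.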
662 (Score:1)
(Oh, and any chance of making the extension
Re:662 (Score:2)
Wow.... how many megs of data did it pull, out of curiosity?
.jrnl is going to be the file that gets passed into another script which spits out valid HTML files. It's kinda like an object file for the HTML sanitizer that doesn't exist yet ;)
Re:662 (Score:1)
But, us low UID bastards have been spewing journals for a long time:)
You might want to talk to TechnoLust; I think he was going to work on a similar project.