the_mad_poster's Journal: Getslash Update - Bugfixes

Debian Note: Looks like you can use apt-get to install HTML::Template on Debian. Thanks to gmhowell for this little tidbit.
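A quick way to confirm that Perl can actually see the module is sketched below. The package name comes from gmhowell's comment further down; `have_perl_module` is a made-up helper name for illustration, and the install hint assumes you have root on the box.

```shell
# Hypothetical helper: succeeds only if Perl can load the named module.
have_perl_module() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

if have_perl_module HTML::Template; then
    echo "HTML::Template is installed"
else
    # Debian package name as reported in the comments.
    echo "HTML::Template is missing; try: apt-get install libhtml-template-perl"
fi
```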

Update: Stupid hosts. Save getslash.txt and rename it to getslash.pl; the server was redirecting all "Save As" requests for the .pl file into the cgi-bin.

getslash.pl, Sanitizer.pm, and the Changelog are all here (please don't left-click the Perl script; you'll just cause one of those dreaded 500 errors, and you see enough of them on Slashdot). They can all go in the same directory, or you can put Sanitizer.pm somewhere else on your @INC path if you know what that means.

Okay, I suggest using the new -l switch (that's an ell, for those of you with silly fonts) first to make sure it runs okay. -l[n] limits the number of ripped journal entries (JEs) to n; for example, -l10 stops after ten rips. xtext (that funky mix of text and HTML) is still the only sanitization option, since it preserves all the valuable content even if it doesn't make it purty. Superfluous stuff comes later.

Fixed the spacing issue for you folks who have spaces in your handles. Recommend deleting your entire archive and using this invocation:

perl getslash.pl -Nyour_name -a/home/your/archive -p -l10 -Sxtext

If it works for you, you can rerun it without the -l10 switch to get your whole journal.
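The trial-then-full workflow can be sketched as a small shell helper. This is only an illustration: `getslash_cmd` is a made-up name, it just prints the command line rather than running anything, and the switches are the ones from the invocation above.

```shell
# Hypothetical helper: prints the getslash.pl command line for a trial or full run.
# Arguments: handle, archive directory, optional entry limit for the -l switch.
getslash_cmd() {
    name=$1
    archive=$2
    limit=${3:-}
    if [ -n "$limit" ]; then
        printf 'perl getslash.pl "-N%s" "-a%s" -p -l%s -Sxtext\n' "$name" "$archive" "$limit"
    else
        printf 'perl getslash.pl "-N%s" "-a%s" -p -Sxtext\n' "$name" "$archive"
    fi
}

getslash_cmd your_name /home/your/archive 10   # trial run, ten entries
getslash_cmd your_name /home/your/archive      # full run, no -l switch
```

The quotes around -N and -a in the printed line are there so that a handle or path containing spaces survives being pasted back into the shell.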

Only had a chance to test on Windows. rdewald was having trouble creating an archive directory automatically when he used the home shorthand '~'. Don't know what's up with that... gotta keep digging into the problem to see what's going wrong.
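One possible explanation for the '~' trouble (an assumption, not something confirmed here) is that a POSIX-style shell such as bash only expands a tilde at the start of a word. Glued onto the -a switch, it would reach the script as a literal '~'. A minimal sketch of the difference, using a made-up archive path:

```shell
# The shell expands ~ only at the start of a word, so it stays literal
# when glued to a switch such as -a.
attached=$(printf '%s' -a~/archive)
# $HOME is expanded anywhere inside double quotes.
expanded=$(printf '%s' "-a$HOME/archive")

printf 'with ~     : %s\n' "$attached"
printf 'with $HOME : %s\n' "$expanded"
```

If that is what is happening, spelling out the full path or using "$HOME" in place of '~' would sidestep it.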

Also, the options file was busted as all hell. Fixed that, I think. Let me know if you have issues.

Report errors here. This time I'm going to be sending bugfixes straight up to my server, so if you report a bug and I say it's fixed, you can grab the fixed copy right away.


Comments:
  • I can get changelog and Sanitizer.pm, but trying to save the link target (it's a right-click) for getslash.pl returns some sort of "it ain't here" message.
    • Uh... that's an interesting error seeing as how that's just a directory listing there....

      I must be cursed (the server is redirecting requests to /cgi-bin/ ... if the stupid jackasses running the box had given me my SSH access like I asked for, I wouldn't have these stupid problems with the server...).

      At any rate, it's fixed now. I just changed the name to getslash.txt. You shouldn't HAVE to rename it if you don't want to (just run it as 'perl getslash.txt ...' instead of 'perl getslash.pl ...') but it loo

  • rdewald@Lisa:~/journal$ rm -r archive
    rdewald@Lisa:~/journal$ ls
    changelog Changelog.txt getslash.pl Sanitizer.pm
    rdewald@Lisa:~/journal$ perl getslash.pl -Nrdewald -a/home/rdewald/journal/archive -p -l10 -Sxtext

    /home/rdewald/journal/archive does not exist. Create it? (Y/n)
    ripping...
    Getting UID from rdewald...
    Found UID 229443
    archiving 77809.jrnl...
    archiving 71854.jrnl...
    archiving 79239.jrnl...
    archiving 78756.jrnl...
    archiving 78571.jrnl...
    archiving 70771.jrnl...
    archiving 79137.jrnl...
    archiving 70399.jrnl...
    a

    • Iced coffee is not a good beverage choice on a hot summer night during which you plan to get some sleep.

      Without the -l10 flag I get this output [rdewald.us] to the console. No errors.
  • First of all, thank you very, very much for all your hard work!

    When you do your changelog, would you mind ordering the versions (in the log) in reverse chronological order? That way, we don't have to scroll down to find the latest and greatest stuff. Muchas gracias.

  • for debian, you'll need to:

    apt-get install libhtml-template-perl

    Running with following command: ./getslash.pl -Ngmhowell -a./ -p -l10 -Sxtext

    And things are a-okay.
  • OK, now I'm showing off how little I really know about Perl.

    This happens when I invoke the Perl script:

    [john@firebolt:getslash]$ perl getslash.pl -NEthelred%20Unraed -a/Users/john/Terminal_Scripts/getslash/archive/ -p -Sxtext

    /Users/john/Terminal_Scripts/getslash/archive/ does not exist. Create it? (Y/n)
    ripping...
    Getting UID from Ethelred%20Unraed...
    Use of uninitialized value in concatenation (.) or string at getslash.pl line 294, <STDIN> line 1.
    Found UID
    Use of uninitialized value in concatenation

    • [john@firebolt:getslash]$ sudo perl -V
      Summary of my perl5 (revision 5.0 version 8 subversion 1 RC3) configuration:
      Platform:
      osname=darwin, osvers=7.0, archname=darwin-thread-multi-2level
      uname='darwin hampsten 7.0 darwin kernel version 6.0: fri jul 25 16:58:41 pdt 2003; root:xnu-344.frankd.rootsxnu-344.frankd~objrelease_ppc power macintosh powerpc '
      config_args='-ds -e -Dprefix=/usr -Dccflags=-g -pipe -Dldflags=-Dman3ext=3pm -Duseithreads -Duseshrplib'
      hint=recommended, useposix=true, d_sigaction=def
      • Aha! My first Mac user!

        Try passing in "Ethelred Unraed" for the -N switch (be sure to use the double quotes or you'll confuse the shell) rather than URL-encoding the space. I suspect the % is b0rking things, because it's not getting your user ID. The %20 notation is translated internally whenever it needs to be translated, so it's actually trying to find journals for the user "Ethelred%20Unraed" (it's kind of complicated as to why it works partially...).
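To see what the shell actually hands to the script in each case, here is a minimal sketch. `count_args` is a made-up helper that only reports how many arguments it received; it says nothing about what getslash.pl does with them.

```shell
# Hypothetical helper: reports how many arguments the shell passed along.
count_args() {
    printf '%s\n' "$#"
}

count_args -NEthelred Unraed        # unquoted: the handle is split into two arguments
count_args -N"Ethelred Unraed"      # quoted: one argument, space intact
count_args -NEthelred%20Unraed      # one argument, but %20 stays literal
```

The shell never decodes %20, so the third form arrives as a single argument containing the characters '%20' exactly as typed.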

  • first off, this is very cool, very good work my good man.

    Now, as is my nature, to make some suggestions to a future version :)

    1. It may be a good idea at some point to decouple the format of the HTML from the actual code. You know how these knuckleheads are around here... all they need to do is change the position of an element and, well, it could mean tweaking the program.

    Decoupling, like creating a "rules" file (something akin to WebL) would mean that changes to accommodate format changes would have le
    • While xtext is the only option now (it provides all the necessary data such as URLs and formatting without any frills), I plan to also have Plain text, XML, Valid HTML, and HTML that comes out of user-specified templates. With the exception of XML, this will actually be a very simple process.

      I was thinking today about a meta-language type of ruleset that could go with the parser (at 8 a.m. while I frantically searched for the sunglasses that I had in my bloody pocket...). You would just build a ruleset, sa

      • plan to also have Plain text, XML, Valid HTML, and HTML

        very cool :)

        I was thinking today about a meta-language type of ruleset

        Yeah, seriously take a look at WebL [compaq.com] from Compaq. I started working with it and it seems to be pretty nice. You may just want to look at it just to get some ideas.
