More PDF Blackout Follies 309

georgewilliamherbert writes "The latest installment of "As the PDF Blackouts Turn" hit today, with the U.S. government apparently releasing a redacted version of its court filing in the Balco grand jury leak case that merely stuck a black line over the text, which remains available in the document. As with prior documents, entering text cut/paste mode in a normal PDF viewer such as Acrobat allows a reader to access the concealed text. Previous incidents include an AT&T filing in the NSA case." This works with Xpdf and KPDF, too; for KPDF, use the selection tool (under the Tools menu) around the redacted section, copy to clipboard, then paste into the text-manipulator of your choice.
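
For readers wondering why cut and paste gets through the black bar, here is a toy Python sketch. Everything in it is an assumption for illustration: the content stream is hand-written (not taken from the filing), and the regex is only a stand-in for a viewer's text extraction, handling plain `(...) Tj` strings with no escapes, nested parentheses, or font encodings.

```python
import re

# Hypothetical, hand-written PDF-style content stream: the text is shown
# first, then a filled black rectangle is painted over the same area.
CONTENT_STREAM = """\
BT /F1 12 Tf 72 700 Td (PLACEHOLDER SENSITIVE SENTENCE) Tj ET
0 0 0 rg
72 696 260 16 re f
"""

# Matches a simple literal string followed by the text-showing operator Tj.
SHOW_TEXT = re.compile(r"\(([^()]*)\)\s*Tj")


def extract_text(stream: str) -> list[str]:
    """Return every literal string shown with Tj (simplified, see lead-in)."""
    return SHOW_TEXT.findall(stream)


def drop_text(stream: str) -> str:
    """Remove the text-showing operations instead of merely covering them."""
    return SHOW_TEXT.sub("", stream)


if __name__ == "__main__":
    # The rectangle changes what is painted, not what is stored.
    print(extract_text(CONTENT_STREAM))
    # Only after the text itself is removed does extraction come back empty.
    print(extract_text(drop_text(CONTENT_STREAM)))
```

Run as a script, the first line printed is `['PLACEHOLDER SENSITIVE SENTENCE']` and the second is `[]`. The black rectangle is still drawn in both cases; only the second stream has nothing underneath it.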

  • by TWX ( 665546 ) on Thursday June 22, 2006 @10:45AM (#15582443)
    ...before they are told to just take a print-screen of the document, page by page, then use a graphics program to install the black boxes over words, then import each image as a page into their PDF creator...
  • Re:Maybe (Score:4, Insightful)

    by gEvil (beta) ( 945888 ) on Thursday June 22, 2006 @10:47AM (#15582461)
    You don't even need to go into vector graphics with these people. All you need to do is attempt to convince them that white text is still text, or that black text on a black background is still text. Either way, the text is still there. The only way to ensure that it's gone is to ACTUALLY GET RID OF THE TEXT.
  • by Billosaur ( 927319 ) * <(wgrother) (at) (> on Thursday June 22, 2006 @10:48AM (#15582473) Journal

    You would think that people would have learned after the first time around. Apparently not.

    You're giving people too much credit; as has been noted in this forum many times, the average computer user is not exactly bright and doesn't read Slashdot, so they would have no idea that this is a problem. People just assume that if something appears to work a certain way, it in fact works that way.

  • by cavtroop ( 859432 ) on Thursday June 22, 2006 @10:51AM (#15582495)
    No, more than likely they will just pass a new law, stating that "Copying and pasting of blacked out (redacted) lines is a felony" or somesuch...
  • by richg74 ( 650636 ) on Thursday June 22, 2006 @10:55AM (#15582525) Homepage
    This is in principle a good idea. However, the implementation may suffer from a fundamental problem.

    My grandfather used to say that there is one irreducible requirement for training a dog: you have to be smarter than the dog.

  • by jimktrains ( 838227 ) on Thursday June 22, 2006 @10:55AM (#15582528) Homepage
    "Human beings, who are almost unique in having the ability to learn from the experience of others, are also remarkable for their apparent disinclination to do so." - Douglas Adams
  • by alewar ( 784204 ) on Thursday June 22, 2006 @11:06AM (#15582610)
    "Security by obscurity" :)
  • by gEvil (beta) ( 945888 ) on Thursday June 22, 2006 @11:08AM (#15582630)
    Why not have a handy context menu option, "Redact selection"

    Because management and clueless users will demand that there be an "unredact selection" menu option, also. I'll let you sort out the implications of that. Either that or original copies of documents everywhere will have text permanently blocked out by the above-mentioned clueless users and management types.
  • by milgr ( 726027 ) on Thursday June 22, 2006 @11:11AM (#15582654)
    I googled for redacted documents, chose some PDFs at random, and found that the text is still there behind the black bars.

    When I started searching, I googled for redact. There were two ads for products that remove the text from the PDF as well as create the black bar. One made it clear that the text would be inaccessible to hackers.

    So, why aren't these types of tools being used for all redactions?
  • Re:Maybe (Score:4, Insightful)

    by Nutria ( 679911 ) on Thursday June 22, 2006 @11:15AM (#15582679)
    these fellows at the NSA

    NSA? Since when does the NSA redact subpoenas for the District Attorney?

  • Imitation Blonde (Score:1, Insightful)

    by Anonymous Coward on Thursday June 22, 2006 @11:15AM (#15582680)
    Why do geeks assume that everyone else in the world is an idiot?
    It is more likely that this "mistake" (wink, wink) was intentional.
    Many here are so smug about how much smarter they are than the poor person who didn't understand how PDFs work. In reality, it is those smug people who come out looking gullible and naive. Somebody plays a little bit blonde, and you eat right out of their hands.
  • by Anonymous Coward on Thursday June 22, 2006 @11:17AM (#15582697)
    I think that's called the DMCA
  • by squiggleslash ( 241428 ) on Thursday June 22, 2006 @11:19AM (#15582703) Homepage Journal

    Alternatively, perhaps the technology is at fault. If the same mistake is made over, and over, and over again, many user interface experts would start investigating whether it's the UI, not the user that's at fault. The argument is that the mistake is being made because the correct solution is not intuitively obvious.

    I'd be curious to know what tool the users are using to black out the text. Are they just exporting from Word but, before exporting, "blocking it out" in Word? If so, how? Are they putting black blocks over text, or setting attributes of the relevant text? If these are the wrong techniques, what can be done to make the right techniques obvious (and the wrongness of these techniques equally obvious)?

    I've designed enough crappy UIs in the past and justified them with "It's user error! All they have to do is hit the OK or CANCEL buttons, of course it's not going to work if they close the window instead!" and other such stuff that, with hindsight, was utterly wrong and elitist of me, to know that technically skilled people are not the best judge of intuitiveness. The fact is, I'm a programmer. You're probably technically minded too. The average user isn't. We can't avoid making assumptions about what the user thinks works that are, on occasion, completely, 180 degrees, wrong. What we can do is own up to them and try to determine how to steer the user in the right direction.

  • by gstoddart ( 321705 ) on Thursday June 22, 2006 @11:20AM (#15582716) Homepage
    You're giving people too much credit; as has been noted in this forum many times, the average computer user is not exactly bright and doesn't read Slashdot

    You're giving people too little credit. Most people who use computers are probably fairly bright -- they're lawyers, doctors, accountants, and all sorts of things most people on Slashdot can't do. Reading Slashdot doesn't make you bright (in fact, given much of the drivel, just the opposite.)

    But, they expect computers to work like a friggin' toaster, and to them, if the text is blanked out, it's not readable. They're not going to realize the 'black' is a representation of a rectangle in a different document layer, and that the actual internal tree of the PDF still contains the actual text. Really, how could they?

    They understand computers by metaphor and analogy to the real world. They don't know or care about the actual internal stuff. Since the paradigms have been made to look like the real world, these people assume that the rest of the things also apply.

    Many people use computers who don't have a full grasp on all of their intricacies. However, I haven't looked inside of a TV in 20+ years, but I'm comfortable using one.

  • Re:This proves it: (Score:4, Insightful)

    by Svartalf ( 2997 ) on Thursday June 22, 2006 @11:33AM (#15582822) Homepage
    Excuse me, any electronic format, unless it is a bitmap format, will have this problem unless
    all the viewers 100% honor the redaction as it's intended. In the case of a bitmap format,
    you can burn a black or white rectangle into the original image and then add an annotation
    a la TIFF's annotations that contains the original portion of the image that was redacted
    in an encrypted format so that it's difficult to expose the redaction- IF you need to have
    the redaction exposed. If not, you hand across the redacted image as-is without annotations.

    This has NOTHING to do with PDF or ODF at all- trying to make this a connection to these
    is bogus to say the least. In this case, I believe that the people doing it used the MS Office
    redaction capabilities and then exported the redacted content to PDF, and the export
    carried the same sort of redactions across to the other format. What happened is because
    someone didn't understand the tools they were using, not because of PDF or ODF.
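
The "burn a rectangle into the bitmap" idea can be sketched in a few lines of Python. This is a minimal illustration, assuming a grayscale page held as a list of rows of 0-255 values and non-negative coordinates; the function name `burn_rectangle` is made up for the example, and no real TIFF or annotation handling is attempted.

```python
def burn_rectangle(pixels, left, top, width, height, value=0):
    """Return a copy of a grayscale bitmap with a rectangle overwritten.

    Unlike a box drawn on top of text in a vector document, the covered
    pixel values are replaced, so nothing remains underneath to recover.
    Assumes left and top are non-negative; the rectangle is clipped to
    the bitmap's right and bottom edges.
    """
    redacted = [row[:] for row in pixels]  # copy so the original survives
    for y in range(top, min(top + height, len(redacted))):
        row = redacted[y]
        for x in range(left, min(left + width, len(row))):
            row[x] = value
    return redacted


if __name__ == "__main__":
    page = [
        [255, 10, 200, 255],
        [255, 30, 40, 255],
        [255, 255, 255, 255],
    ]
    for row in burn_rectangle(page, left=1, top=0, width=2, height=2):
        print(row)
```

The redacted copy is what gets handed across; the untouched original stays with whoever is entitled to see it.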
  • Acceptance of Risk (Score:5, Insightful)

    by Kadin2048 ( 468275 ) on Thursday June 22, 2006 @11:39AM (#15582875) Homepage Journal
    While you make a good point, the people who have to use computers to accomplish their jobs, but do not make an attempt to understand how they work (and just treat them like "black boxes") are taking an enormous risk. They are hitching the metaphorical wagon of their livelihood to a team of horses that they don't know shit about.

    If you were somebody who made your living in television, but didn't understand anything about it, you would likewise be taking a great risk. You might, for instance, look like a big idiot when you show up to work at your anchor desk wearing a horizontally pinstriped shirt (which looks like ass on TV because of the Moire effect between the lines on the shirt and the TV scanlines). If you had understood the technology a little better, you might not have done that. That's a trivial example -- undoubtedly if you were a TV anchor, you'd learn or be told at some point not to wear a shirt like that without having to learn about scanlines -- but I hope you see my point.

    Whenever you use a technology without learning about it, you accept a certain amount of risk. Sometimes, you gamble and win: you just use the technology, get your job done, and nobody's the wiser. You're faster, more efficient, more competitive, you look like a hero to your boss, whatever. But if the technology doesn't work, then you're SOL -- but that's the price you pay for not understanding it. That's the risk you accepted when you said to yourself "eh, I don't really care what goes on inside there."

    In the case of PDF, we have a lot of people using a certain technology without knowing anything about how it works, and thus -- like the TV anchor in his pinstriped shirt (or a weatherman wearing chroma-key blue or green) -- you get these gaffes.

    I'm not saying that everybody needs to learn about how everything they use all day works, down to the bare metal. Virtually nobody needs to know that, except perhaps people who are doing things that are so dangerous that they can't afford to fuck up. However, people should be aware of the tradeoff they're making and the risk they're accepting when they forgo figuring out the internal details of a system and simply accept it as a whole, on faith that it will always work a certain way. As long as people are aware of that decision, and make it consciously, and accept the results, you can't ask for more.

    Generally speaking: faith is a fine thing, as long as you know when you're relying on it. It's when you thought you were relying on something else, and find out that you had nothing but faith, that a problem has occurred.
  • by gEvil (beta) ( 945888 ) on Thursday June 22, 2006 @11:44AM (#15582910)
    What happens when I actually want to print white text on a black background? Will I have to go through some convoluted process because setting the background as black doesn't actually change the background to black, but rather also eliminates any text contained within it?
  • I am pretty sure that rasterized PDF documents violate government disability-access guidelines, since they can't be read with screenreaders, braille terminals, or basically anything other than a set of human eyes (or a good OCR program).

    They would be a lot better off going through the document in Word (or Notepad/Textedit/vi/EMACS/whatever) and just selecting the regions of text that they want to remove, and replacing them with [-- TEXT REMOVED --] or even [REDACTED]. If they were really slick, I'm sure somebody could write a little macro to replace the text with an equivalent number of characters of whitespace or random text or dashes, to preserve formatting. (Okay, so to really preserve the formatting it would have to be replaced with characters that have the same width as the deleted characters; maybe there's a font-set containing various widths of whitespace characters that they could use? In TeX it would be trivial.)

    The results would be ugly (but really, were black bars ever very beautiful?) but at least it would actually remove the information, and wouldn't result in an inaccessible, rasterized document.
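
A rough Python sketch of the two replacement styles described above. The function names and the sample sentence are invented for the example, and the same-length version only preserves layout in a monospaced font, which is exactly the width caveat raised in the comment.

```python
def redact_phrase(text, phrase, marker="[REDACTED]"):
    """Replace every occurrence of phrase with a visible marker."""
    return text.replace(phrase, marker)


def redact_spans(text, spans, fill="X"):
    """Overwrite each (start, end) character span with fill characters.

    Whitespace inside a span is kept so word breaks and line length stay
    the same; the original characters are gone from the result.
    """
    chars = list(text)
    for start, end in spans:
        for i in range(start, min(end, len(chars))):
            if not chars[i].isspace():
                chars[i] = fill
    return "".join(chars)


if __name__ == "__main__":
    sentence = "Agent Smith met the source on Tuesday."
    print(redact_phrase(sentence, "the source"))
    print(redact_spans(sentence, [(6, 11)]))
```

The first call prints `Agent Smith met [REDACTED] on Tuesday.` and the second prints `Agent XXXXX met the source on Tuesday.`. Note that a same-length fill still leaks how long the removed text was, which may or may not matter for a given document.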
  • by drc500free ( 472728 ) on Thursday June 22, 2006 @11:53AM (#15582982)
    "If you want my opinion (or even if you don't...:-p) this is the achelle's(sp) heel of our society today, most people are lazy bastards that just want to get done with somethign without learning anything about it."

    "Another thing that pisses me off is incopetence."

    Oh, the irony.
  • by DarkVader ( 121278 ) on Thursday June 22, 2006 @12:04PM (#15583059)
    You know, considering the state our government is in, I would much prefer that someone would build into all software going to the government an "unredact" feature to make it even easier to recover government coverups.

    Barring that, PLEASE don't educate them, or make it easier for them to really redact anything.
  • Circumvention (Score:3, Insightful)

    by Mateo_LeFou ( 859634 ) on Thursday June 22, 2006 @12:13PM (#15583126) Homepage
    If black squares count as a "technical measure" protecting access to a work... ? Someone actually should go ahead and launch this suit, to draw attention to the DMCA's shittiness.
  • by DarkVader ( 121278 ) on Thursday June 22, 2006 @12:13PM (#15583130)
    This is NOT scary. This is refreshing.

    I would much prefer my government be unable to successfully keep secrets from me.
  • by ukemike ( 956477 ) on Thursday June 22, 2006 @12:14PM (#15583138) Homepage
    They say you should open the original document in Word and EDIT the document by replacing the redacted text with a bunch of X's then print it to a PDF. That's a fundamentally different process than redacting. It's editing, and the temptation to ALTER the document would be huge. Also what would you do if you don't have the original Word document?

    Doing it right isn't so hard. You want to end up with a graphical-only PDF of the document that has been redacted. (I can't believe I'm about to give the NSA good advice on how to keep secrets!)

    Use Acrobat to mark out all of the evidence of your wrongdoing (oops, I meant mark out anything classified...). Save it. Open it in a third-party PDF program like FoxitPDF reader. Print it to a new PDF file using your PDFwriter or PDFdistiller print driver. You should now have a completely graphical PDF with no embedded text in the file. This is just as good as printing it, redacting it, then scanning it (which would be another good procedure.)

    It may look all blocky and pixelated but redacted documents from the government always look like crap.
  • by Anonymous Coward on Thursday June 22, 2006 @12:16PM (#15583159)
    The irony here is that you're complaining about people being "so damn lazy that they can't do a little research" when you haven't taken the (very small) amount of time researching how to correctly spell Achilles.
  • by DarkSarin ( 651985 ) on Thursday June 22, 2006 @12:17PM (#15583165) Homepage Journal
    Fortunately this does not apply to humans--not directly.

    I can easily train people that are smarter than myself, if the conditions are right. For instance, I know a fair bit about statistics and data analysis, and would be perfectly comfortable training certain folks in the field, as long as they didn't know more than I do. Even then, it is perfectly possible for me to come up with a unique idea that someone smarter than myself hasn't (note that I didn't say couldn't) considered.

    In the public schools there are frequent cases of a teacher training a student more intelligent than themselves. It is unavoidable, although it could be reduced by making sure only the smartest teachers were hired.

    Smarter? Not a requirement. More experienced? Having unique knowledge? Yes, that is required, but maybe not irreducibly.

  • by nixnutz ( 257339 ) on Thursday June 22, 2006 @12:22PM (#15583213)
    If you use Group 4 tiff encoding, which is standard in the legal industry, there should be no problem with file size. Clean text like a court filing should be no more than 20-30k per page. This is probably how I would do it; print to tiff (no screenshots of course), import to IPro or whatever and redact (any litigation database software should support redaction), then export PDF again. The problem is that if you need searchable pdf you need to OCR the tiffs at some point after you've redacted them and the quality of the OCR is not as good as extracted text from the original doc.

    I'm sure that in this case whoever redacted the PDF didn't have access to the original file, and while it's easy to draw boxes in Acrobat, there's no easy way to delete the underlying text. Whoever was responsible for this should have had access to the tools to do this correctly, or if not they should have hired a vendor; I think I'd charge about $35 for this job.

    Also, why is this hosted on the SFGate site? Where was it originally?
  • by wiredlogic ( 135348 ) on Thursday June 22, 2006 @12:46PM (#15583364)
    FWIW you could very easily write up some VBA code that converts highlighted text (maybe just one specific color like red) into Xs. Then you would just have to highlight the redacted sections and run the macro when you're finished. The highlighting could be optionally kept in place to make it more visible in the PDF. It also would be useful for an actively changing document to make the author more aware of where the sensitive bits are.
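
The highlight-then-run-a-macro workflow could look roughly like this. It is a Python stand-in rather than VBA, and the `Run` structure (a stretch of text with an optional highlight colour) is a hypothetical model invented for the sketch, not Word's actual object model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Run:
    """Hypothetical stretch of text with an optional highlight colour."""
    text: str
    highlight: Optional[str] = None


def redact_highlighted(runs, colour="red", keep_highlight=True):
    """Return new runs where text highlighted in `colour` becomes Xs.

    Spaces are kept so the layout roughly survives. The highlight can be
    kept so the redacted areas stay visible after export.
    """
    result = []
    for run in runs:
        if run.highlight == colour:
            masked = "".join(c if c.isspace() else "X" for c in run.text)
            result.append(Run(masked, run.highlight if keep_highlight else None))
        else:
            result.append(Run(run.text, run.highlight))
    return result


def to_text(runs):
    """Flatten runs back into plain text."""
    return "".join(run.text for run in runs)


if __name__ == "__main__":
    draft = [Run("Payment went to "), Run("Jane Roe", "red"), Run(" in 2003.")]
    print(to_text(redact_highlighted(draft)))
```

With the sample draft this prints `Payment went to XXXX XXX in 2003.`. Because the function returns new runs, the working copy keeps its highlighted originals and only the exported copy carries the Xs.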
  • by BandwidthHog ( 257320 ) <inactive.slashdo ...> on Thursday June 22, 2006 @12:51PM (#15583397) Homepage Journal
    /. doesn't host with AT&T, so no worries.
    Doesn’t necessarily matter. Just because I don’t purchase services directly from NSAT&T, that doesn’t mean that my data isn’t flowing through their network at some point on its journey. So while I am immune (for now?) from NSAT&T’s content ownership bullshit, I can’t count on not having them dump my packets into Cheney’s inbox.

  • by SydShamino ( 547793 ) on Thursday June 22, 2006 @12:53PM (#15583410)
    They are hitching the metaphorical wagon of their livelihood to a team of horses that they don't know shit about.

    Millions of Americans hitch the physical "wagon" (or SUV, or sedan, or minivan) of their livelihood to a bundle of "horsepower" that they don't know shit about every single day, and then they drive that wagon at 75 MPH.*

    In the case of their cars, the consequences for misuse are serious injury or death. In comparison, the consequences for learning next to nothing about their computers seem slight.

    * It seems to me that knowing how to redact text in Acrobat is like knowing why you are supposed to turn on your headlights around dusk. Yes, you think you can still see just fine - the headlights are for others to see you. And no, I can't see your dim low-set parking lights if you turn those on alone.
  • Re:Maybe (Score:4, Insightful)

    by indifferent children ( 842621 ) on Thursday June 22, 2006 @12:54PM (#15583422)
    Just because you're releasing the 12th printing of the 4th edition, does not make this a 'new book'.
  • Re:Circumvention (Score:3, Insightful)

    by jZnat ( 793348 ) * on Thursday June 22, 2006 @01:28PM (#15583636) Homepage Journal
    I don't think that court documents like these are copyrighted, so you can't even apply the DMCA to it. The leading source of public domain material these days seems to be the government itself...
  • by gstoddart ( 321705 ) on Thursday June 22, 2006 @04:03PM (#15584692) Homepage
    I'm commuting a lot these days post Katrina...and it seems very few people understand the left lane is the passing lane...get the fsck out of the way of a driver coming up behind you faster than you're travelling. If possible, you should do the majority of your driving on the rt. lane (US).

    And, I'm equally amazed at how many people are too damned ignorant and intent on driving at Max 0.6 to realize I'm in the middle of fscking passing this guy (as evidenced by the fact that I'm going faster than him), and the fact that you want to go 2x speedlimit vs my 1.2x speedlimit doesn't mean I'm suddenly going to accelerate to your speed to complete my pass, or abandon my pass so you can fly by at insane speeds.

    When I finish passing the guy, I will get out of the passing lane, I've already factored that in. It doesn't mean I'm gonna relinquish the lane to you or speed even more to keep you happy.

    The passing lane isn't a free pass to drive like an asshole at the highest rate of speed you can manage. You need to cut the rest of us some slack when we're actually passing too. I've seen far too many people who, even though I'm in the middle of actually passing cars, expect me to just scrape and grovel and get completely out of their fscking way -- those people might see my brake lights rather unexpectedly!

    People are bad drivers on both ends of that spectrum -- both the people who never move, and the people who expect you to move immediately as if they're the friggin emperor or something.
  • Re:Maybe (Score:5, Insightful)

    by frdmfghtr ( 603968 ) on Thursday June 22, 2006 @04:34PM (#15584906)
    Perhaps the people making these "blacked out documents" should be taught a little about Vector Graphics and that a black box is not the same as a sharpie. One word for them 'n00b'!!

    Sometimes I wonder if these incidents are really "accidents" or somebody's way of feigning ignorance of technology to get the facts out to the public.
