Comment Re:The article should use "ridiculous" 0 times. (Score 1) 292 292

You only think it's trivial because you haven't gotten into it yet. You are also assuming that somebody isn't trying to prevent webscraping. You run into problems such as cookie counter overflows, dropped sessions, rate limiting, and bogus redirects. I don't know if these things are deliberate, but you run into them if you just try manually scraping the web pages too.

Comment Re:The article should use "ridiculous" 0 times. (Score 1) 292 292

People say that screen scraping is easy, and they even attempt it, but it is not as simple as it looks. It's definitely not trivial. Try it.

But, more importantly, why should the citizenry put up with this nonsense? The law belongs to the people, and the government should not put up barriers blocking full public access equal to that of any attorney.

Comment Re:The article should use "ridiculous" 0 times. (Score 1) 292 292

If you're conducting research for an actual case, being restricted to a search tool like this is like knitting a sweater through a keyhole. The state knows this. That's why they sell (and only sell) access to the full set of current law text. They know lawyers (who can afford it) will spring for the $615 year after year. But mere citizens are kept in outer darkness.

Comment Re:The article should use "ridiculous" 0 times. (Score 1) 292 292

Still, as far as I can tell, there is no way to get a PDF of the current law, only one that is a year old. That's not helpful for case research. Even if the law were updated daily (which it is not), it would be simple to generate a time-stamped PDF that represents the current law.

Comment Re:The article should use "ridiculous" 0 times. (Score 1) 292 292

Alas, not even close. The current law page still only displays a tiny fragment at a time -- virtually useless for researching the law for an actual case. And the search is only by citation number (!), not natural language, or even keyword. You have to know the exact citation of any particular passage in order to display it by search.

I would flunk any freshman computer science student delivering an interface this broken. It's clear that no competent programmer would build such a restricted access interface unless that was the specification provided by the government.

The historical PDFs are useless for current cases, as they can't legally be cited in any proceeding, and you'd be crazy to depend on them given the arbitrary routine changes that occur. But they do serve to prove that the State of Oregon is fully capable of delivering the current law as PDFs. They simply choose not to. I think it's obvious that the reason for that choice is a too-cozy relationship with the seller of the $614 "complete" current edition.

Comment Re:That's copyright for you (Score 1) 292 292

You at least need a prefix step to click the "I agree" copyright notice blocking page or you will get megabytes of just that notice.

Another poster (Jesse) has a working bash+wget script that still needs to be extended to follow annotation links. He may be posting it to this Slashdot story.

Comment Re:That's copyright for you (Score 1) 292 292


Alas, that quick fix doesn't work. Part of the problem is that the HTML links all have file:// URL prefixes, and they need to be http://. But another problem is that even with http://web.lexisnexis.com hardcoded into the link, you get redirected to a sign-in page. There must be some state the server is expecting to be set first.
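For the first half of that problem, here is a minimal Python sketch of the prefix rewrite. It is only an illustration: the function name `rewrite_file_links` and the `base` parameter are my own inventions, not part of the script under discussion, and it does nothing about the sign-in redirect, which presumably needs whatever session state the server expects.

```python
import re

# Matches the opening of an href attribute whose URL starts with file://
# (including file:///), with either quote style.
_FILE_HREF = re.compile(r'''(href\s*=\s*["'])file://+''', re.IGNORECASE)


def rewrite_file_links(html: str, base: str) -> str:
    """Replace file:// prefixes in href attributes with an http base URL.

    `base` is a hypothetical parameter: pass the scheme and host the
    links should point at, e.g. "http://example.org".
    """
    prefix = base.rstrip("/") + "/"
    # A function replacement avoids backslash-escape surprises in `base`.
    return _FILE_HREF.sub(lambda m: m.group(1) + prefix, html)


if __name__ == "__main__":
    sample = '<a href="file:///gacode/title15.html">Title 15</a>'
    print(rewrite_file_links(sample, "http://example.org"))
```

Links that already use http:// are left alone, so it should be safe to run over the same file twice.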

However, you've built an excellent proof of concept. I'm happy to do the next round of revisions. In the interest of openness, would you be willing to post the bash script in this Slashdot thread?

It's great, of course, that citizens can get around the government's attempts to lock down the law. But the real fix is to delete the bureaucracy that is blocking citizen access in the first place! So I am supporting public.resource.org's fight against Georgia's lawsuit.

Comment Re:The article should use "ridiculous" 0 times. (Score 1) 292 292


Wget has been tried and is not sufficient, for various technical reasons. Bash with wget has been shown to work in principle, but there are corner cases, such as the critical history and cross reference links, that are tricky.

But this ignores the fundamental issue: why should citizens have to resort to such technical gymnastics at all, especially if they are easily blocked (e.g., by rate limiting) by a government intent on locking the people out? The "bug" in this system is the obfuscating bureaucracy. The fix is to delete that.

Comment Re:The article should use "ridiculous" 0 times. (Score 1) 292 292

I've found wget to be very useful in such situations.

But easily defeated by a government intent on locking out the people. Simple rate limiting can render screen scraping impractical. More importantly, why should citizens have to resort to such tactics? The law belongs to the people. Let the government put away its cronyism and be open and transparent as it should.

I don't know if Oregon's law is available via other channels, or if it's locked up by the state deliberately the way Georgia's is. If the latter, then Oregon needs to relinquish its restriction as well.

Comment Re:That's copyright for you (Score 1) 292 292

I downloaded your HTML extract and found one problem: it did not follow many of the links to subsidiary pages such as "Title Note" and "Article Note". For an example, see 15-10-26, which has the following Title Note:

CROSS REFERENCES. --Criminal Justice Coordinating Council, 35-6A-1 et seq. Establishment of county law libraries, 36-15-1 et seq. Court-martial jurisdiction, 38-2-370 et seq. Designation of courts which possess jurisdiction over traffic offenses, and procedure in such courts, 40-13-1 et seq. Indictment and punishment of judge of probate court for malpractice, partiality, conduct unbecoming office, and other offenses, 45-11-4.

LAW REVIEWS. --For article, "The Majority That Wasn't: Stare Decisis, Majority Rule, and the Mischief of Quorum Requirements," see 58 Emory L. J. 831 (2009).


Am. Jur. Trials. --Judicial Technology in the Courts, 44 Am. Jur. Trials 1.

According to the US Supreme Court, these notes are part of the official code and thus not protected by copyright. Citizens are held accountable to the interpretations given in these notes, and Georgia has made them part of the "official" code, and thus they must be available to all citizens.

Can you update your code to extract these notes as well? Thanks!
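In case it helps, here is a rough Python sketch of one way to find those note links in a downloaded page, using only the standard library. It assumes the notes are ordinary anchor tags whose visible text is exactly "Title Note" or "Article Note"; I have not confirmed that against the real markup, and the names `NoteLinkFinder` and `find_note_links` are made up for this example.

```python
from html.parser import HTMLParser


class NoteLinkFinder(HTMLParser):
    """Collect (label, href) pairs for anchors whose text is a note label."""

    def __init__(self, labels=("Title Note", "Article Note")):
        super().__init__()
        self.labels = set(labels)   # assumed link texts, not verified
        self.found = []             # (label, href) in document order
        self._href = None           # href of the anchor being read, if any
        self._text = []             # text fragments inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace before comparing labels.
            label = " ".join("".join(self._text).split())
            if label in self.labels:
                self.found.append((label, self._href))
            self._href = None


def find_note_links(html):
    finder = NoteLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.found
```

Each href returned would then be fetched the same way as the main pages, so the same "I agree" and session issues apply.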

Comment Re:Banks vs Manchester. Law, no. Indexes by publis (Score 1) 292 292

From https://en.m.wikipedia.org/wik...

Requiring a license before allowing citizens to read or speak the law would be a violation of deeply-held principles in our system that the laws apply equally to all. This principle was strongly set out by the U.S. Supreme Court under Chief Justice John Marshall when they stated “the Court is unanimously of opinion that no reporter has or can have any copyright in the written opinions delivered by this Court, and that the judges thereof cannot confer on any reporter any such right.” Wheaton v. Peters, 33 U.S. (8 Pet.) 591 (1834). The Supreme Court specifically extended that principle to state law, such as the Official Code of Georgia Annotated, in Banks v. Manchester, 128 U.S. 244 (1888), where it stated that “the authentic exposition and interpretation of the law, which, binding every citizen, is free for publication to all, whether it is a declaration of unwritten law, or an interpretation of a constitution or a statute.”
