Tracking the Congressional Attention Span 89

Turismo writes "Ars Technica covers a new research project that uses computers to look at 70 million words from the Congressional Record. The project's goal was to track what our representatives were talking about at any given time, and researchers were able to do it without human training or intervention. From the article: '...researchers found, for instance, that "judicial nominations" have consumed steadily more Congressional attention between 1997 and 2004. In fact, the topic produced the most number of words published in a single "day" of the Congressional Record: 230,000 on November 12, 2003.' It looks like automated topic analysis has truly arrived."
This discussion has been archived. No new comments can be posted.

  • TheyWorkForYou.com (Score:5, Informative)

    by Bogtha ( 906264 ) on Friday August 04, 2006 @08:17AM (#15845654)

    Even with a large team of grad students at their disposal, researchers find it difficult to tag more than a small subset of the speeches in question

    Are there really that many speeches? TheyWorkForYou.com [theyworkforyou.com] offer a similar service for the UK's Houses of Parliament, except it's done manually, and there's only a dozen volunteers working on it.

  • by mapkinase ( 958129 ) on Friday August 04, 2006 @08:34AM (#15845705) Homepage Journal
    The data generating process that motivates our model is the following. On each day that
    Congress is in session a legislator can make speeches. These speeches will be on one of a finite
    number K of topics. The probability that a randomly chosen speech from a particular day will be
    on a particular topic is assumed to vary smoothly over time. At a very coarse level, a speech can
    be thought of as a vector containing the frequencies of words in some vocabulary. These vectors of
    word frequencies can be stacked together in a matrix whose number of rows is equal to the number
    of words in the vocabulary and whose number of columns is equal to the number of speeches. This
    matrix is our outcome variable. Our goal is to use the information in this matrix to make inferences
    about the topic probabilities and how they change over time as well as the topic membership of
    individual speeches.


    Word frequency? That is primitive, given that there are already tools that can parse the grammar of a sentence and find relations between words.
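
For illustration, here is a minimal sketch of the word-frequency matrix the quoted passage describes: one row per vocabulary word, one column per speech. The whitespace tokenization, the lowercasing, and the function name `term_speech_matrix` are assumptions made for this example; the excerpt does not say how the researchers tokenized the text.

```python
from collections import Counter


def term_speech_matrix(speeches):
    """Build a word-by-speech count matrix.

    Returns (vocabulary, matrix), where matrix[i][j] is the number of
    times vocabulary[i] occurs in speeches[j]. Tokenization here is a
    plain lowercase whitespace split (an assumption, not the paper's method).
    """
    # One Counter of word frequencies per speech.
    counts = [Counter(speech.lower().split()) for speech in speeches]
    # The vocabulary is every distinct word seen in any speech, sorted.
    vocabulary = sorted(set().union(*counts))
    # Rows follow the vocabulary, columns follow the speeches.
    matrix = [[c[word] for c in counts] for word in vocabulary]
    return vocabulary, matrix


# Hypothetical two-speech example.
vocabulary, matrix = term_speech_matrix(["the bill the vote", "judicial vote"])
```

A matrix like this discards word order and grammar entirely, which is the limitation the comment above objects to.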
  • by sgtrock ( 191182 ) on Friday August 04, 2006 @08:52AM (#15845768)
    30 years ago, I learned in my high school civics class that any Senator or Representative can insert anything he or she wants into it at any time. Examples that were pointed out to us were speeches on the floor of the Senate that were never made, modifications to committee meetings, etc. The CR is by no means an accurate measure of anything. Except maybe the size of their combined egos.
  • by jejones ( 115979 ) on Friday August 04, 2006 @09:13AM (#15845856) Journal
    They know, don't they, that a representative can have arbitrary text inserted in CR as if it had been read?

    Also, if you watch CSPAN while Congress is in session, in the evenings you'll see long stretches with just a few people who are delivering their rants into a nearly empty room. Can that be separated from the rest of the text?
  • by Peyna ( 14792 ) on Friday August 04, 2006 @10:37AM (#15846338) Homepage
    You're half-right there. They can get anything they want into the record without actually having to say it in front of everyone. This is good in some respects, because it allows that person to be officially on the Congressional Record on a particular point without having to tie up the time of the congressional body.

    However, they can't modify things that are already in the record (at least, not without being subjected to censure or other punishment).
  • by joeljkp ( 254783 ) <joeljkparker.gmail@com> on Friday August 04, 2006 @11:07AM (#15846562)
    As I understand it, they're searching through the Congressional Record, not simply transcripts of congressional speeches. The CR is full of pages upon pages of stuff that doesn't get spoken anywhere, except for saying "please insert this into the Record" (or something to that effect). The CR has full text of speeches, letters, reports, amendments, textual evidence, etc.

  • by joeljkp ( 254783 ) <joeljkparker.gmail@com> on Friday August 04, 2006 @11:10AM (#15846587)
    A lot of this is substantive debate in disguise. They may literally be arguing whether Bill 1 gets an hour of debate or a day of debate, but what they're really trying to do is either kill it or give it room to breathe.
  • by carlivar ( 119811 ) on Friday August 04, 2006 @12:02PM (#15846935)
    I just finished reading John Stossel's new book (quite good, though not as good as his first). He has a section in it about the Congressional Record.

    If you think the Congressional Record is an accurate account of what happens in Congress, you are dead wrong. Congressmen use taxpayer dollars to manipulate the Record because there is nothing that says they can't. They insert bogus info, like "Congressman Bob Blowhard addressed the House with a commendation for the 4-H Club of Woohah, Oklahoma", which never really happened but makes Congressman Blowhard look good with his constituents. They also change the words of what they really said on the floor to make themselves sound better.

    Here is a blog post mentioning the problem Stossel brings up and a small excerpt [powerblogs.com]

    Carl
  • by General Wesc ( 59919 ) <slashdot@wescnet.cjb.net> on Friday August 04, 2006 @01:14PM (#15847437) Homepage Journal

    Progress = Walk forward
    Congress = Walk together/with

    '-gress' is from the Latin 'gradi' (to walk)/gradus (a step). 'ghredh' comes from the same place, but 'go' obviously makes less sense than 'walk' (which it also means).
