The Internet

Web Log 'Word Bursts' Could Identify New Crazes 242

Zorgatron writes "New Scientist reports that a researcher from Cornell University has come up with a clever method of identifying what's cool by automatically searching weblogs. Sudden increases or "bursts" in the usage of particular words may reflect a new craze, according to Jon Kleinberg. He has demonstrated the technique by searching through State of the Union addresses given since 1790." I wonder how long before this can be done real time enough to really make this useful.
This discussion has been archived. No new comments can be posted.

  • Google? (Score:5, Insightful)

    by troll ( 593289 ) on Wednesday February 19, 2003 @09:29AM (#5334122) Journal
    Could this be what Google wants with Blogger?
    They have the capacity to do this, I don't see why they wouldn't.
    • Re:Google? (Score:4, Insightful)

      by tmark ( 230091 ) on Wednesday February 19, 2003 @09:59AM (#5334305)
      Except that Google already has the de-facto capability of rapidly searching as many weblogs as they care to. Sure, it takes long to spider them across the web, but it takes a long time to spider damn well near every single page in the world.

      As for how long it will be before we can do this in "real time", this all depends on what your definition of "real time" is. If you're happy with doing a few thousand blogs and getting results back in a few minutes, since at most only a few pages change on the average blog a day, I'd say any decent Perl guy could do that for you now.
    • Re:Google? (Score:3, Informative)

      by zeno_2 ( 518291 )
      Although it's not really what the story is about, I always thought that the Google Zeitgeist was a good indication of "new crazes".
  • Blogdex (Score:5, Informative)

    by nob ( 244898 ) on Wednesday February 19, 2003 @09:29AM (#5334125) Homepage
    There's another "what's popular on blogs" webpage at Blogdex. It tracks links, showing which pages are most linked to.
  • by flokemon ( 578389 ) on Wednesday February 19, 2003 @09:29AM (#5334126) Homepage
    In a simple historical test of the technique, Kleinberg analysed all the annual State of the Union addresses given by US Presidents since 1790. He found that particular word "bursts" could indeed be linked to important events at the time the speeches were delivered.

    Has an important increase of the use of the word "nukular" been reported in the last few weeks then?
    • Re:Nukular weapons (Score:4, Insightful)

      by tmark ( 230091 ) on Wednesday February 19, 2003 @10:06AM (#5334341)
      He found that particular word "bursts" could indeed be linked to important events at the time the speeches were delivered.

      Does anyone else find this painfully obvious? Certainly you wouldn't expect to hear the word "computer" much in FDR's State of the Union addresses, just as you wouldn't expect to hear "icebox" in GWB's addresses.

      The idea isn't as revolutionary as the author makes it out to be. People have been searching for terms in literature and using counts as indices of "importance" for a long time. Just to cite one example, researchers commonly use citation indexes to find out which fields are/were "hot".
      • But just think! (Score:3, Insightful)

        by jabber01 ( 225154 )
        What will future searchers make of Slashdot (and by extension, the net as a whole), what with the waxing and waning in the popularity of Natalie Portman, Hot Grits, Soviet Russia, All your base, gonads and strife, MEEEEEEPT!, and the ever-present FIST PROST.

        This is a significant tool for the post-information age. It could reliably gauge the effectiveness of viral marketing. It could also intercept sub-culture developments before they become popular, and introduce them to the general population in association with a corporate brand.

        Imagine if Nike or Pepsi, or *shudder* Microsoft, had caught the "All Your Base" thing on the upswing. They'd have a better slogan than the top down "Dude, you're gettin a Dell".
  • Google (Score:4, Informative)

    by Citizen of Earth ( 569446 ) on Wednesday February 19, 2003 @09:30AM (#5334129)
    Google can do much the same thing, on a real-time basis, by examining what phrases are searched for.
    • Re:Google (Score:5, Informative)

      by ccweigle ( 25237 ) on Wednesday February 19, 2003 @09:38AM (#5334187)
      Google can do much the same thing, on a real-time basis, by examining what phrases are searched for.

      And they do that much already ... on their Zeitgeist page.

      But this is different. The article is about monitoring the blogs, not the searches. As suggested in another comment, this may be related to Google's acquisition of Blogger.

      • Actually, their news page is even more up to date. The zeitgeist is updated infrequently compared to the news page. Of course, the news page is biased, since it only gets its info from news sites which have been defined already.

        Of course they could do something similar with weblogs.


    • Re:Google (Score:3, Insightful)

      by XCondE ( 615309 )

      I'm eager to see what will come up next with Google's recent entry in weblog world.

      It's just what I thought when someone said " Blogs are like dreams; they're only interesting to the people they belong to".

    • Re:Google (Score:4, Insightful)

      by gmuslera ( 3436 ) on Wednesday February 19, 2003 @09:59AM (#5334301) Homepage Journal
      Is more subtle than that, is not what you are searching for, but it tracks how you (or society) changes it way to express itself based in current trends, news, etc. That can be related or not with what you are currently searching in google.

      In a way, it should track even how languages evolve, how new meanings are given to existing words (i.e. in the past would anyone think that defensive attack were not opposite words? :)

      I wonder if this kind of analysis can be affected by people like me that without proper knowledge of english write in it :)
  • Great.... (Score:5, Funny)

    by xtermz ( 234073 ) on Wednesday February 19, 2003 @09:30AM (#5334130) Homepage Journal
    ..Now we're going to see Pepsi ads slinging "in soviet russia, you drink pepsi", and Nike yelling about "all your sports belong to us..."...
    • God Help Us (Score:4, Funny)

      by Oculus Habent ( 562837 ) <oculus DOT habent AT gmail DOT com> on Wednesday February 19, 2003 @09:33AM (#5334152) Journal
      And think, the DMCA will become the most popular piece of legislation in existence - at least on slashdot.

      And CowboyNeal is the most popular man alive!
    • by Surak ( 18578 )
      The Dell Dimension, with the powerful Intel Pentium 4 processor.

      "Dude, imagine if you had a Beowulf cluster of these things!"

    • "in soviet russia, you drink pepsi"

      No, that would be misleading advertising, because obviously:

      In Soviet Russia, Pepsi drinks YOU!
    • Perhaps this emerging trend early warning system could be used to prevent such tragedies as the chronic overuse of the word "uber."

      The first time I remember seeing "uber" being used was in the days when Microsoft's plan for world domination was described as "Windows uber alles." Since then, it's snowballed and these days, the word has been so overused it's simply become an annoying cliche.

      If only we'd had an early warning system back then, we might have been able to prevent the uber-ification of Slashdot.

    • Actually, I was out at a retirement lunch the other day, at a chinese restaurant, and the new thing is to read the fortune cookie thusly:

      "In Soviet Russia. . ." (text of fortune cookie).

      Which is a refreshing change, but often not as funny as:
      (text of fortune cookie) ". . . in bed."
  • Conspiracy (Score:3, Funny)

    by Scott Hussey ( 599497 ) <sthussey+slashdo ... m minus math_god> on Wednesday February 19, 2003 @09:30AM (#5334136)
    I can already see the collusion of weblog editors.

    "Okay, everyone write about polka dot socks tomorrow. And throw in something about drinking rotten milk. I bet we can start a new fad..."
  • "What's cool"? (Score:5, Insightful)

    by ites ( 600337 ) on Wednesday February 19, 2003 @09:31AM (#5334138) Journal
    By my definition "cool" is that which most people have not yet discovered. Example: that... ah, but I'm not going to tell you. Perhaps this method can tell you what just became cool, but it's hard to track something that is by definition under the radar. Otherwise, just track Google searches. You'll soon see what's popular.
    • Re:"What's cool"? (Score:5, Insightful)

      by deanc ( 2214 ) on Wednesday February 19, 2003 @10:31AM (#5334501) Homepage
      That's what the researchers seem to track. Not the commonality of a phrase, but the "burstiness" of a certain word or phrase... ie, the delta of the word use over time. High delta values indicate something is starting to take off, though it may not yet have become popular or mainstream. That's a decent metric of "coolness."
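
The "delta of the word use over time" idea in the comment above can be sketched in a few lines of Python. This is only the naive version the comment describes, a difference in relative frequency between consecutive time periods; it is not claimed to be the algorithm from Kleinberg's paper. The function names and the list-of-token-lists input format are assumptions made for illustration.

```python
from collections import Counter


def relative_frequency(docs_by_period, word):
    """Share of all tokens in each period that equal `word` (case-insensitive)."""
    target = word.lower()
    freqs = []
    for tokens in docs_by_period:
        counts = Counter(t.lower() for t in tokens)
        total = sum(counts.values())
        freqs.append(counts[target] / total if total else 0.0)
    return freqs


def deltas(freqs):
    """Change in frequency from each period to the next."""
    return [later - earlier for earlier, later in zip(freqs, freqs[1:])]


def burstiest_period(docs_by_period, word):
    """0-based index of the period that saw the largest jump, or None."""
    d = deltas(relative_frequency(docs_by_period, word))
    if not d:
        return None
    # d[i] is the jump from period i to period i + 1, hence the + 1.
    return max(range(len(d)), key=d.__getitem__) + 1
```

A word that is rare but climbing scores higher here than a word that is common but flat, which is the distinction the comment draws between "bursty" and merely popular.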
    • But Marketable is what is becoming cool, and things stay marketable for longer than a few instants. This is not really about cool, it's about marketably cool.
    • Re:"What's cool"? (Score:4, Insightful)

      by Fishstick ( 150821 ) on Wednesday February 19, 2003 @11:24AM (#5334924) Journal
      Ever see Merchants of Cool on Frontline?

      A Report on the Creators & Marketers of Popular Culture for Teenagers

      Yeah, that's right. Popular Culture is manufactured -- everything the teenies think is "cool" or "hot" is identified months in advance by a highly sophisticated machine that probes the minds of kids to predict what will be the next trend so that the marketing establishment can gear up to take advantage of the short window where the "thing" is "cool" and can be sold to teens in such a way that they don't even realize what is going on.
  • Applications (Score:2, Interesting)

    by benjiboo ( 640195 )
    This work has been around for a long time in the data mining literature. For instance, searching the logs of customer service calls to identify common problems etc.

    These techniques could easily be expanded to searching weblogs - I imagine the findings could be very interesting for content providers - eg a simple measure of what people want to read about.

  • Apache Logs too (Score:4, Interesting)

    by josephgrossberg ( 67732 ) on Wednesday February 19, 2003 @09:32AM (#5334147) Homepage Journal
    "Joe Millionaire winner" and "Bubb Rubb" have generated most of my personal blog []'s hits.

    I, myself, am a distant third.

    Write about enough things and then check your referral logs for Google and Yahoo searches (which include the query in the URL), and you get an imperfect idea of what people are interested in this week.
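
For anyone who wants to try the referral-log trick described above, here is a rough Python sketch. It assumes Google puts the search phrase in a `q` parameter and Yahoo in a `p` parameter of the referring URL; the helper names and the host-matching rule are made up for this example, and real logs would need more careful parsing.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs

# Assumed query-string parameter that carries the search phrase, per engine.
QUERY_PARAMS = {"google": "q", "yahoo": "p"}


def search_query(referrer):
    """Return the lower-cased search phrase in a referrer URL, or None."""
    parsed = urlparse(referrer)
    host = parsed.netloc.lower()
    for engine, param in QUERY_PARAMS.items():
        if engine in host:
            values = parse_qs(parsed.query).get(param)
            return values[0].lower() if values else None
    return None


def top_queries(referrers, n=10):
    """Most common search phrases among a list of referrer URLs."""
    queries = (search_query(r) for r in referrers)
    return Counter(q for q in queries if q).most_common(n)
```

Feeding it the referrer column of an access log gives the "imperfect idea of what people are interested in this week" mentioned above.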
  • I wonder how long before this can be done real time enough to really make this useful.

    Define "useful."
      It might give us the ability to make intelligent statements about what people are thinking and talking about - a cultural barometer, if you will. One that's tied to something more valid and more immediate than the Letters section of People magazine.

      Someone once wrote (I'm really sorry, I forgot who - public thank you to the person who knows) something like "individually, nobody knows what's going on but collectively, we know exactly what's going on." This kind of meta-information is a social scientist's wet dream, I bet. I admit I'm fascinated. It's very...William Gibson? Was he the guy who wrote that line? Damn it.
  • Useful? (Score:4, Insightful)

    by Longjmp ( 632577 ) on Wednesday February 19, 2003 @09:32AM (#5334150)
    I wonder how long before this can be done real time enough to really make this useful.

    Yes, I bet the spammers can't wait until they can use it...
  • Imagine (Score:3, Insightful)

    by jos091 ( 570342 ) <joseph AT ctcgsc DOT org> on Wednesday February 19, 2003 @09:33AM (#5334151)
    Imagine the feedback loop that could develop...
    • by skillet-thief ( 622320 ) on Wednesday February 19, 2003 @09:47AM (#5334235) Homepage Journal
      It is kind of like the stock market craze and the theory that "all the information you need to know about a stock is contained in the market itself" (ie. in the stock's chart). Enough people start believing that theory, and the stocks quit behaving rationally.

      The analysis only works if your tool doesn't start modifying the data you are analyzing. If this thing ever caught on, it would quickly become meaningless, because everybody wants to be part of whatever craze is going on. Every morning you check which words are hip, you put them on your website... etc. etc.

      You are right about feedback: the buzz would become a terrible din. That said, it is a cool idea.

    • Re:Imagine (Score:2, Funny)

      by kalidasa ( 577403 )
      Re: Imagine the feedback loop that could develop...
  • 'has come up with clever method of identifying what's cool'

    So is this guy like Screech in Saved by the Bell, constantly looking for a way to impress Zack and the guys?
  • Great! Now I'll know sooner what the latest pop culture craze is so I can "be different" and follow everyone else to stay popular! Then I can put that information in my blog and let everyone else know I'm following the latest trend!

    Seriously, just read /. if you want to know the important stuff of the day. :)

    • by Duds ( 100634 ) <dudley@enter[ ] ['spa' in gap]> on Wednesday February 19, 2003 @09:38AM (#5334184) Homepage Journal
      Seriously, just read /. if you want to know the important stuff of the day. :)

      Twice usually.
    • Great! Now I'll know sooner what the latest pop culture craze is so I can "be different" and follow everyone else to stay popular!

      Except now popularity will last about 6 hours, tops, before some new wave of pop culture replaces it. By the time craze "X" hits the craze detector, all the really cool people will already be onto craze "Y", which will be detected a few hours later.

      It's like the whole "avant-garde/in-style/out-of-style/retro/back-in-style" cycle managed by a Perl script in an infinite loop.

  • This system should be run over all speeches too. ".Net" "Drag & Drop" "Point & Click" they are full of it.
  • Daypop (Score:5, Informative)

    by Apreche ( 239272 ) on Wednesday February 19, 2003 @09:36AM (#5334169) Homepage Journal

    It's got the top 40 every day. Doing it some other way would only catch memes sooner. And if the system doesn't catch it until it's popular, it really doesn't help. What we need is a large and complete database of all meme type things.
  • Oh, great (Score:4, Funny)

    by ZoneGray ( 168419 ) on Wednesday February 19, 2003 @09:36AM (#5334171) Homepage
    Whoopeee. The marketers will start using this to identify trends, and next thing you know, we'll have some fast food named "Cheese-Eating Surrender Monkeys."

    Not In Our Brand Name, say I.
  • by rubberpaw ( 202337 ) on Wednesday February 19, 2003 @09:36AM (#5334172) Homepage Journal
    Of course, since there is only a very specific socioeconomic subset of the world population weblogging, what real usefulness does this give us? Honestly, even if you did ranking based on the most popular weblogs, that wouldn't help you very much.

    Furthermore, this thing isn't telling me anything I don't know. So it finds the word "Vietnam" during the Vietnam years. Hooray. I bet it finds the word Iraq today, or the phrase "Bin Ladin" last year.

    Whoopdie-do. I'm impressed :P. Unless this thing actually can find out the things that people are excited about that aren't well-known, it's pretty much just another search tool limited to blogs.
    • by barnaclebarnes ( 85340 ) on Wednesday February 19, 2003 @10:08AM (#5334348) Homepage
      Unless this thing actually can find out the things that people are excited about that aren't well-known, it's pretty much just another search tool limited to blogs.

      That's the whole point. Weblogs are not the mainstream media, so he is betting that a new craze (or a refresh of an old one) will show up there before the mainstream sites get hold of it. Face it, once it has hit CNN it is already past its sell-by date.

      Take the whole potato gun thing for instance. If this was appearing on people's weblogs 6 months ago and an underground following had started then it would pick this up. Could be a perfect time for one of the toy companies to start producing a parent friendly version (Not sure how...but hey!). By the time the craze hits CNN Toys 'R Us is stocked with a version that fires water balloons, only uses compressed air and comes in 10 different plastic colours. Then they would have the advantage before the other companies jump on the bandwagon.

      Of course, since there is only a very specific socioeconomic subset of the world population weblogging, what real usefulness does this give us?

      A lot! Let me see, I have a large group of people who are rich, computer owning, and probably Middle/Upper Middle Class all saying they want X. Now who is your target audience again? Not low income, no disposable cash types.


    • Out of the six billion people on the planet, only 3 percent can afford one. Of those that can afford one, half decide they actually want one. Combine that half with the lonely few in cyber cafes and markets and you have the world's top spenders in one place, perfect for advertisers.
    • Well, I don't think anyone's going to start a marketing campaign based solely on the information retrieved from blogs. However, it can act as a nice supplement to other information one might have, in that it can 1. reinforce, 2. show a rise in popularity of a specific item, and 3. show a decline in popularity of another item.

      Viral marketing is something marketing officers have been trying to tap into forever, and this might help them determine how that sort of thing works; how information and trends are passed between people.

      Also, maybe they could pinpoint characteristics of market leaders, i.e. those who talk about major trends before those trends get major.
  • It's useful *now* (Score:3, Insightful)

    by backlonthethird ( 470424 ) on Wednesday February 19, 2003 @09:38AM (#5334183)
    I wonder how long before this can be done real time enough to really make this useful.

    Why have to wait until it's realtime? Historical analysis is very useful, and not just to historians. Linguists, anthropologists, social scientists, etc. Taking such a body of texts is called studying a "corpus," and such studies often yield surprising and interesting results (better than "atomic" showing up in the Cold War). A new method like this would be very useful to nearly every discipline in the humanities I can think of.

    Not all geeks are computer geeks. Not all nerds care only about the future.

  • The ultimate way of watching trends on a month-to-month basis has to be Zeitgeist from Google.
  • No Kidding? (Score:2, Insightful)

    by BuBu_ ( 72690 )
    Did anyone read the article? Amazingly enough this wonderful software with its POWERFUL algorithms proved a true point of "no shit". While running this gem of coding genius, the authors managed to find recurring references to "Depression" while scanning texts from the 1930's. Imagine that, finding the word depression from a time period that's been nicknamed "The Great Depression". I would have never linked the word "Depression" with "The Great Depression". Have we really reached the point where we can just do the same shit over and over again and it's magically a new invention?

    MS is bringing out 3 Degrees which is reinventing IRC, this guy is telling us the painfully obvious, and I've been working on this little trick that's gonna really change the way we think of food, get this guys: I take two pieces of bread, a piece of cheese, and a piece of meat and stack it together.. I call this wonderful new life shaping discovery "The meat-and-cheese-on-bread". I really think it's gonna change how we eat!
    • By your reasoning, the war taking place in Europe from 1914-1918 was already named World War I by the participants. They were aware that this naming scheme was easily extended to incorporate future conflicts such as WWII, WWIII, etc.

      But, if memory serves me correctly, this war was actually called "the big war" or "the war to end all wars" by its participants. It was only years later, when WW II erupted that they renamed the earlier conflict to WW I.

      Over 100 years hindsight is 20/20. I think the goal of this technology is to provide hindsight over a span of days/weeks/months.
  • I can see a nice distributed implementation for burst-searching - a "mod_ephemera" module for apache.

    The module would count the words/phrases most commonly served (less tags and the top-n most common words in the language-encoding), then serve out the top 10 as HTTP header messages. That way, the results are unobtrusive and easy to recover.

    Of course, this approach would inevitably be easy to skew/cheat. Anyway, that's my sixpenn'orth :)
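
The counting half of the "mod_ephemera" idea is easy to mock up outside Apache. The Python sketch below strips tags, drops a tiny hard-coded stop-word list (standing in for "the top-n most common words in the language"), and formats the most-served words as a header line. The `X-Ephemera` header name, the stop-word list, and the function names are all invented for this example; no such Apache module is known to exist.

```python
import re
from collections import Counter

# Tiny stand-in for a real list of the language's most common words.
STOP_WORDS = frozenset({"the", "a", "an", "and", "of", "to", "in", "is", "it", "that"})
TAG_RE = re.compile(r"<[^>]+>")
WORD_RE = re.compile(r"[a-z']+")


def top_words(pages, n=10):
    """Most frequent non-stop-words across the served pages, tags removed."""
    counts = Counter()
    for html in pages:
        text = TAG_RE.sub(" ", html).lower()
        counts.update(w for w in WORD_RE.findall(text) if w not in STOP_WORDS)
    return [word for word, _ in counts.most_common(n)]


def ephemera_header(pages, n=10):
    """Format the top words as a single (hypothetical) HTTP header line."""
    return "X-Ephemera: " + ", ".join(top_words(pages, n))
```

As the comment says, anything this simple is trivially skewed by a site that wants to game it.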
  • by djupedal ( 584558 ) on Wednesday February 19, 2003 @09:41AM (#5334199)
    ...Yahoo, today, was accused of seeding 2.5 million user blogs with keywords designed to influence/fool/skew robots that attempt to identify what's cool by automatically searching weblogs for so called 'word bursts'.
  • by stinky wizzleteats ( 552063 ) on Wednesday February 19, 2003 @09:42AM (#5334204) Homepage Journal

    I guess this pretty much lays to rest the article about how nerds don't work to be popular. We automate it!

  • They have a realtime search mechanism that can also search within chat rooms, and TV and radio streams. (Kevin Kelly is on the board.) There used to be a downloadable personal edition; there is a free trial. Not a plug!!! They became a corporate (financial and others) company, turning their back on their "Free Information Now" roots, but at least it works :)
  • Zeitgeist and Memes (Score:3, Informative)

    by mrmiasma ( 650879 ) on Wednesday February 19, 2003 @09:47AM (#5334231)

    Sounds like a combination of Google's Zeitgeist and LiveJournal's MemeTracker. In other words, nothing that new.

    It's also the basis for Computational Lexicography. Doing analysis on large corpora. One of the interests people have in this field is introduction of new words in society. The field used to use corpora such as the British National Corpus, but since the explosion of the Web, sites such as Google can far exceed that size. Weblogs are simply a good example of a more natural form of language. The interesting thing would be not so much to find new trends through words... but if we can truly solve the whole natural language parsing problem and use such information to extract higher-level knowledge.

  • It's no secret that the most commonly searched item on the internet is pornography. Only once has this top ranking ever been dethroned - September 11, 2001. It returned to the top spot shortly thereafter. So, by examining web logs, we will find that - year after year - we are all interested in pornography. While this is likely the case, it is also a trapping of the medium through which the research is being conducted. You'll excuse my complete lack of surprise.
  • New article title (Score:2, Insightful)

    by Samus ( 1382 )
    Should have been entitled "Nerds Find Automatic Method to Enable Them to Talk to Other People." I have this picture in my head of some poor guy who is a social outcast that wants to figure out a way to be able to talk to a girl about things she might be interested in.
  • The approach could also be applied to sifting through other types of information. Identifying word bursts within email messages sent to a company's customer support address might help maintenance staff spot a major new problem.

    I'm sure customer support employees are going to love this idea... This way you can keep up an appearance of actually having read the customer emails, while really just redirecting to /dev/null (through the filter of course).
  • I attended a conference last year, where they proposed a similar method to find trends in scientific fields, and more importantly, link them and predict future connections. For instance, when words from two unrelated fields start showing up associated in many papers, there is possibly a trend for those fields to meet and merge in the near future. Of course Informatics doesn't replace traditional methods, because it needs the input data, but it's a helpful tool.
  • Oh great. Just what we need. "Well, after careful computer analysis with my powerful algorithms, I have concluded that break-dancing is now cool. I will be the first nerd in history to be atop this new trend."
  • Now we have the facts to say, "Dude, that's so 2001".
  • So, in this article, the examples are:

    The words 'militia', 'British' and 'savages' were used a lot around the time the American 'militia' tended to fight the 'British' and what they called the 'savages'.

    The word 'depression' was used a lot during the 'depression'.

    The word 'atomic' was used a lot during the cold war, and 'Vietnam' was used a lot during the Vietnam war.

    I am utterly at a loss as to how such a seemingly interesting field as tracking word usage (well, it interests me) could possibly yield such stupefyingly, numbingly, almost frighteningly obvious and dull results.

    I can only assume the true significance of Dr. Kleinberg's results was simply too terrifying to be revealed...

  • Individual words will be useful showing some trends, but maybe counting phrases or how n-tuples of words could be better (AND harder). Sometimes "what's cool" is not a single isolated word.

    With common words, the language or the way society expresses itself could change in a way that simple word counting would not show, or at least not show clearly.
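
Counting n-tuples of words instead of single words is not much harder to write, though the number of distinct tuples grows quickly. A minimal Python sketch, assuming plain whitespace tokenization (a real tool would want to handle punctuation); the function names are illustrative only:

```python
from collections import Counter


def ngrams(tokens, n):
    """All runs of n consecutive tokens, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_ngrams(text, n=2):
    """Count each n-word phrase in `text`, ignoring case."""
    return Counter(ngrams(text.lower().split(), n))
```

With this, a phrase like "all your base" shows up as one countable item rather than three unremarkable words.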
  • strange... (Score:3, Interesting)

    by dotgod ( 567913 ) on Wednesday February 19, 2003 @10:13AM (#5334375)
    Our definition of "cool" is the output of a computer analysis of weblogs, and then we sit there wondering why nerds are so unpopular?!?
  • something similar on yahoo already? how do they figure out their buzz index? because i thought it was based on buzzwords. mind you, it's buzz words in hyperlink format, but it's still the same concept i think
  • (Score:2, Interesting)

    Is this news considered "new"? This is exactly what Amazon did in order to forecast which book titles would sell the most. They became the biggest web retailer because of this very same idea -- but many years ago. And now somebody at Cornell copies the idea but uses weblogs instead of IRC and newsgroups and suddenly he's "clever"? I know lots of people are complaining that the information gleaned from this is not useful; but it is! It's an amazing way to forecast what will sell.
  • by JPMH ( 100614 ) on Wednesday February 19, 2003 @10:25AM (#5334469)
    Since the early '90s, the Economist has occasionally published tongue-in-cheek articles about its "Recession Index", a useful leading indicator of the state of the US economy -- namely, the number of times the 'R-word' appears per month in the New York Times and the Washington Post. This appears to correlate strongly with the future state of the economy...


    Dec 10, 1998 []

    Nov 21, 2002 []
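
The counting step behind an index like this is simple enough to sketch in Python. The version below assumes articles arrive as `(date, text)` pairs with ISO `YYYY-MM-DD` dates and tallies whole-word, case-insensitive mentions per month. How the Economist actually compiles its index is not described here, so treat the input format and function name as assumptions.

```python
import re
from collections import Counter


def monthly_mentions(articles, word="recession"):
    """Map 'YYYY-MM' to the number of times `word` appears that month."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    counts = Counter()
    for date, text in articles:
        # The first seven characters of an ISO date are the year and month.
        counts[date[:7]] += len(pattern.findall(text))
    return dict(sorted(counts.items()))
```

Months with no mentions still appear in the result with a count of zero, so a quiet month is visible rather than missing.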

  • what's george michel and a pair of wellies got in common?

    They both get sucked off in blogs.
  • by tazochai ( 213288 ) on Wednesday February 19, 2003 @10:33AM (#5334523)
    .... one more time why don't you. And I quote,

    "For example, identifying word bursts in the hundreds of thousands of personal diaries now on the web could help advertisers quickly spot an emerging craze."

    Gonfonit!!! Why does cool new social technology have to be related to ways to help people sell things to Americans! Why is it okay for us to be considered a nation of consumers, otherwise basically useless biological skinsacks?!

    I'll just strap my wallet to my chest with duct tape now and write my social security number in huge numbers on the back of my t-shirt for fast credit checks.
  • Like harvesting the info about some (rand(10)+15) year old person writing bullshit about (boyfriends|girlfriends|music|movies|stupid online quizzes|webrings) with a site design that's usually so horrible they could be successfully sued for crimes against humanity. All the marketing companies would encounter is the hype they created a few days before the harvest. So it might work to check if hypes/trends work out, but looking at "blogs" (the very word disgusts me) for something new and innovative is about as futile as trying to comprehend Bush's ramblings. The few remaining web logs or journals, as I prefer to call them without retching, are mainly technical. What trends are they going to squeeze out of the journal of a team of developers who want to keep the outside world up to date about what has happened lately? That such and so compiler sucks? That the network admin is a bitch? That the coffee tastes like sewage waste?

    Heck, if any of those marketing companies are GOOD, they'll MAKE their own trends, not ride around on the success of others.

  • Paper is here (Score:2, Informative)

    J. Kleinberg. Bursty and Hierarchical Structure in Streams. Proc. 8th ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, 2002.

    Data from state of the union addresses here. []

  • I read the heading as:

    Web Log 'Word Bursts' Could Identify New

    and thought it must just go through blogs looking for long rambling outbursts about black helicopters, FBI, greys and aluminium beanies. Blimey, that's half the bloggers out there - you don't need a program to identify the crazies!

  • Well, that idea was my entry attempt for the Google programming contest, inspired by the Google Zeitgeist, which I personally found too infrequent (not to say static).

    But finally they've put exactly that system to use in Google News. Keywords that suddenly appear in many news sources get sent to the top of the front page. That's where I learnt of Columbia, a few minutes after it happened, and the first headlines didn't make sense at first.

    So as usual, just search google !

  • I've noticed that certain words and phrases come and go in the news media, and sometimes other areas. For a while a few years ago, I was seeing the phrase vis-a-vis (I forgot an accent somewhere) all over the place. I even had a history professor at the time who couldn't use the word often enough. But I haven't seen it used for years now.

    One thing I've been noticing recently is `N.B.' I don't really know what it means, but people use it to insert extra comments when writing or updating something.
  • I wonder how long before this can be done real time enough to really make this useful.

    About 3 weeks after the patent expires.

    * Note: I don't actually know if the guy patented the idea, this is a joke.
  • Why do you think this would be useful? Enough time is already wasted on the "latest thing". We need some gadget which tends to start people thinking, not to identify what is making them stop thinking.
  • The Economist has an index of sorts that has a similar idea to this. Here's an article that describes it.

    It looks for the occurrence of the word recession in major newspapers, and it's a pretty good predictor (better than most economists).

    Unfortunately, a lot of the related articles are subscribed content.

  • And next week he discovers that water is wet, death sucks and Republicans are evil.
  • It seems inevitable that as information technology, anthropology, marketing and research technologies (and everything else) are brought to bear on the tasks of predicting, identifying, and capitalizing on an emerging fad ASAP, this will inspire the creative forces that generate new culture to avoid generating it. Creative forces (artists, "the hip", those who do this kind of stuff) want to be differently expressive. If ideas are co-opted for mass exposure and profit as soon as they begin to emerge, those ideas will stop emerging. Those creative forces will inevitably learn to generate anti-fads (new, different, difficult to co-opt in the current culture), whatever that turns out to be.

    The requirement to exploit emerging creative difference will change those fads to something else.

    Let's figure out what it will be and sell it!
  • Carnivore was so they'd have the edge on being cool.

    As you can see, even with advanced technology and a huge corpus of email and search requests, coolness is all about the mirrored shades and gold embroidery.

  • Damn we been sussed!

  • Doesn't Dilbert already have a copyright on that one?
  • by Parsec ( 1702 ) on Wednesday February 19, 2003 @03:33PM (#5337346) Homepage Journal

    What a great opportunity for culture jamming! We just need a few thousand webloggers to start using weird words designed to repel "normal" people.

    Obviously this could backfire and we could actually start a real trend. So, I propose that the first words we need to put out are ( geek || nerd ) && sexy. (And if you understood that, you must be hot stuff.) I'm willing to take this risk if you are.

  • all about bragging about your method of searching online blogs... ...cribbing memes... ...and knowing where to find all of the State of the Union addresses since 1790...

  • I just tested it on feeds from the major news media. The highest bursti-ness readings are on the words "Bush", "Iraq" and "terrorism." The software package I'm using was developed by a team of millions of independent non-programmers, and is called "Duh!".
  • something something.. COOL!!!!!!! something something FUNNY!!!!!!
    Which Garbage Pail Jr Kid are U?! TRY IT!! I'M SMELLY MCGEE!!!!
    I'm gonna move out at 18!!! MOM SUCKS!!!!
    New Homestar Runner this week. SO COOL!!!!
    Dog bites man ;-P HAHA IRONY!!!
    My mom totally hates my hair!!! SUCH A BITCH!!!
    Wow you can block pop-unders!! KEWL ITS FREE!!
    Flash RULEZ!! This movie is wack ya'all!!!
    Buffy something something!!! !!!!!!
    Man bits dog back!! SUPER IRONY!!


  • "Word bursts" is too boring a term, moreover it's not very academic. I suggest "blogolalia" instead.
  • The algorithms used to identify these sudden bursts are relatively simple, but very powerful, says Christos Papadimitriou, at the University of California at Berkeley. OK, show us! Why all the talk and no examples? If these simple algorithms exist, why doesn't the article give us a site that actually uses these algorithms, so we can see what's popular today for ourselves?
