The Almighty Buck

Close but no Cigar for Netflix Recommender System 114

Posted by CmdrTaco
from the fifty-grand-aint-bad dept.
Ponca City, We Love You writes "In October 2006, Netflix, the online movie rental service, announced that it would award $1 million to the first team to improve the accuracy of Netflix's movie recommendations by 10 percent, based on personal preferences. Each contestant was given a data set from which to make three million predictions of how certain users rated certain movies; Netflix compared each list with the actual ratings and generated a score for each team. More than 27,000 contestants from 161 countries submitted entries, and some got close, but not close enough. Today Netflix announced that it is awarding an annual progress prize of $50,000 to a group of researchers at AT&T Labs, who improved the current recommendation system by 8.43 percent. The $1 million grand prize is still up for grabs, and a $50,000 progress prize will be awarded every year until the 10 percent goal is met. As part of the rules of the competition, the team was required to disclose their solution publicly. (pdf)"

Comments Filter:
  • Re:Moving target? (Score:5, Informative)

    by illegalcortex (1007791) on Wednesday November 14, 2007 @10:30AM (#21349469)
    It's not a moving target. It's a fixed number (RMSE = 0.8563) that the winning algorithm has to reach. The Netflix algorithm never gets re-run on the data for the prize.

    Netflix is free to merge any improvements into their actual system in the meantime.
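
    The fixed target above is just a root mean squared error threshold on a held-out set of ratings. As a minimal sketch (the function and variable names here are illustrative, not the contest's actual scoring code):

    ```python
    import math

    def rmse(predictions, actuals):
        """Root mean squared error between predicted and actual ratings."""
        assert len(predictions) == len(actuals)
        squared_error = sum((p - a) ** 2 for p, a in zip(predictions, actuals))
        return math.sqrt(squared_error / len(predictions))

    # Cinematch's published baseline RMSE on the quiz set was 0.9514; a 10%
    # improvement means reaching 0.9514 * 0.9 = 0.85626, i.e. the fixed
    # target of roughly 0.8563 mentioned above.
    ```

    A perfect submission scores 0; the contest only required beating the fixed baseline by 10%, not perfection.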
  • Re:Bad title (Score:3, Informative)

    by dintech (998802) on Wednesday November 14, 2007 @10:32AM (#21349497)
    I don't think I'm mistaken in saying that $50,000 buys quite a few cigars too.
  • by hashmap (613482) on Wednesday November 14, 2007 @10:34AM (#21349529)

    The most noteworthy aspect of the winning entry is that their method works by combining 107 different prediction strategies.

    They state that you can get pretty far by blending just the 3-4 best strategies, but of course doing so would not have netted them the progress prize.

    It is a somewhat sad realization that there is no single better method: your best bet is brute force, searching for a weighting scheme that combines known methods. This is a well-known issue in protein structure prediction competitions too; for many years now, so-called meta-servers (whose predictions work by merely combining other predictions) win all the time. The joke is that we now need meta-meta-servers to combine the results of the combiners.

    Also a clarification on the progress prize: to win it you need at least a 1% improvement over the previous result. Considering that only 1.57% remains, there is room for only one more progress prize before the Grand Prize (a 10% improvement over the original results) is reached.
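
    The "combine 107 strategies" idea is usually implemented as a linear blend: fit one weight per model so the weighted sum of predictions minimizes squared error on a held-out set. A hedged sketch (this is the generic technique, not the AT&T team's exact code):

    ```python
    import numpy as np

    def blend_weights(preds, actuals):
        """Least-squares weights for combining several predictors.

        preds:   (n_models, n_ratings) array, one row per model's predictions.
        actuals: (n_ratings,) array of true ratings.
        Returns one weight per model minimizing squared error of the blend.
        """
        weights, *_ = np.linalg.lstsq(preds.T, actuals, rcond=None)
        return weights

    def blend(preds, weights):
        """Apply the learned weights to get blended predictions."""
        return preds.T @ weights
    ```

    With 107 models this is just a 107-column least-squares problem; the hard part is producing 107 usefully different predictors in the first place.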

  • Not how it works (Score:5, Informative)

    by illegalcortex (1007791) on Wednesday November 14, 2007 @10:42AM (#21349599)
    That's not how the contest works. It's based on the RMSE that the original Netflix algorithm achieved at the beginning of the contest. This is fixed and does not change. See the contest site for more details.
  • Re:I'd say... (Score:3, Informative)

    by flynt (248848) on Wednesday November 14, 2007 @10:59AM (#21349859)
    From the Netflix Prize FAQ, they say how they currently do it:

    "Straightforward statistical linear models with a lot of data conditioning."

    The Netflix programmers shouldn't necessarily get special recognition for using least-squares modeling, but feel free to pass on your praise to Gauss, Legendre, Galton, and Fisher.

    What's amazing is how hard it is to improve drastically on these 150-year-old statistical techniques.
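
    The exact Cinematch model is not public, but a classic example of the "straightforward linear model with data conditioning" family is predicting each rating as a global mean plus a per-user bias plus a per-movie bias. A hedged illustration (names and structure are my assumption, not Netflix's code):

    ```python
    def fit_bias_model(ratings):
        """Fit mean + user-bias + movie-bias from (user, movie, rating) triples."""
        mu = sum(r for _, _, r in ratings) / len(ratings)
        user_sum, user_cnt, movie_sum, movie_cnt = {}, {}, {}, {}
        for u, m, r in ratings:
            user_sum[u] = user_sum.get(u, 0.0) + (r - mu)
            user_cnt[u] = user_cnt.get(u, 0) + 1
            movie_sum[m] = movie_sum.get(m, 0.0) + (r - mu)
            movie_cnt[m] = movie_cnt.get(m, 0) + 1
        user_bias = {u: user_sum[u] / user_cnt[u] for u in user_sum}
        movie_bias = {m: movie_sum[m] / movie_cnt[m] for m in movie_sum}
        return mu, user_bias, movie_bias

    def predict(model, user, movie):
        """Predict a rating; unseen users/movies fall back to the global mean."""
        mu, user_bias, movie_bias = model
        return mu + user_bias.get(user, 0.0) + movie_bias.get(movie, 0.0)
    ```

    Baselines like this already capture a surprising share of the signal, which is part of why squeezing out the last few percent is so hard.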
  • Re:I'd say... (Score:4, Informative)

    by Fnord666 (889225) on Wednesday November 14, 2007 @11:03AM (#21349911) Journal

    That's why they need multiple data points to make a recommendation. If you rented a lot of the "I'm a Mac Guy" movies and rated them highly, then there's a better chance that's the reason you liked this movie too. If you refused to rent, or rated poorly, "Rugrats Gone Wild", then you probably aren't a rabid Bruce Willis fan, etc. The entire goal of the project is to find films you'll like without you having to write a mini-review of every movie you have rented or seen.
    One of the common approaches to recommender systems is SVD, or Singular Value Decomposition [wikipedia.org]. SVD tries to isolate "features" in the training set that best represent particular traits of the data, such as the examples above. You may not have any idea what a feature actually represents, but that is fairly common in machine learning. It is an iterative process: once you have fit one feature as well as you can, you move on to a new one. There are diminishing returns with this approach, though, and identifying too many features can overspecialize your system and yield worse results. If your results are not good enough, you can try a different approach. Once you have several approaches that are almost good enough, you can try combining their results, in varying proportions, to get a hopefully better result. That is what the leaders have done so far.
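
    On a small, fully observed toy matrix (the real contest data is large and sparse, so contestants used iterative approximations instead), the "features" idea can be shown directly with a truncated SVD: each retained singular vector pair is one latent feature of users and movies. The matrix below is made up for illustration:

    ```python
    import numpy as np

    def low_rank_approx(R, k):
        """Approximate ratings matrix R with its top-k singular 'features'."""
        U, s, Vt = np.linalg.svd(R, full_matrices=False)
        # Scale the first k user vectors by their singular values,
        # then project back through the first k movie vectors.
        return (U[:, :k] * s[:k]) @ Vt[:k, :]

    # Rows are users, columns are movies; the first two users share a taste
    # that the third does not.
    R = np.array([[5.0, 4.0, 1.0],
                  [4.0, 5.0, 1.0],
                  [1.0, 1.0, 5.0]])
    R1 = low_rank_approx(R, 1)  # a single feature already splits the two tastes
    ```

    Keeping more features reduces reconstruction error but, as the comment notes, on held-out data too many features overfit.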
  • Re:I'd say... (Score:5, Informative)

    by DerekLyons (302214) <(fairwater) (at) (gmail.com)> on Wednesday November 14, 2007 @12:21PM (#21351167) Homepage

    Look at Amazon's recommendation system: pretty good overall but still makes some egregious errors.

    Egregious errors? It's downright useless unless you pretty much buy only one genre of book/music/whatever. Their system is heavily weighted towards whatever you most recently bought, and drops huge slabs of quasi-related stuff into your recommendation list at the slightest provocation.
     
    I buy (among other things) serious works of culinary history, sociology, etc. Yet my recommendation list is clogged with food porn (coffee-table cookbooks) and the latest crap offerings from whichever TV chef is the current flavor of the moment. It also doesn't recognize the difference between editions: if you buy a hardback, it'll happily recommend the paperback; if you buy a frequently reprinted SF novel, it'll happily add each new printing/edition to your queue.
