
Close but no Cigar for Netflix Recommender System

Ponca City, We Love You writes "In October 2006, Netflix, the online movie rental service, announced that it would award $1 million to the first team to improve the accuracy of Netflix's movie recommendations, based on personal preferences, by 10%. Each contestant was given a data set from which to make three million predictions about how certain users rated certain movies; Netflix compared each list with the actual ratings and generated a score for each team. More than 27,000 contestants from 161 countries submitted entries, and some got close, but not close enough. Today Netflix announced that it is awarding an annual progress prize of $50,000 to a group of researchers at AT&T Labs who improved the current recommendation system by 8.43 percent. The $1 million grand prize is still up for grabs, and a $50,000 progress prize will be awarded every year until the 10 percent goal is met. As part of the rules of the competition, the team was required to disclose its solution publicly. (pdf)"
  • Moving target? (Score:5, Interesting)

    by ktappe ( 747125 ) on Wednesday November 14, 2007 @11:13AM (#21349319)
    Will Netflix incorporate the near-winners' ideas into their current system? If so, won't future teams be aiming at a moving (improving) target? If not, won't current Netflix customers know that their recommendations could be better if Netflix just incorporated a now publicly-disclosed algorithm into their servers?
  • Re:I'd say... (Score:3, Interesting)

    by Billosaur ( 927319 ) * <wgrother@oEINSTE ... minus physicist> on Wednesday November 14, 2007 @11:13AM (#21349327) Journal

    It's hard to say. On the one hand, it could be that the current system is good enough that improvements are minutely incremental, though 8+% is pretty good if you ask me. On the other, it may be that the system is so fraught with dependencies and/or the relationships are so variable that it's hard to make gigantic leaps in sophistication. Look at Amazon's recommendation system: pretty good overall but still makes some egregious errors. Add the tagging system to the mix and it's possible to lead the recommendations astray. I don't know how Netflix works (I don't have to watch movies on TV, let alone rent them), but I have to think it would be very similar to Amazon.

  • Re:Moving target? (Score:3, Interesting)

    by Ngarrang ( 1023425 ) on Wednesday November 14, 2007 @11:19AM (#21349389) Journal
    I had the same thought. And to what extent is the accuracy of a suggestion system important? Sometimes, throwing in a completely different suggestion might garner you a rental and possibly more rentals because you might like other movies of that type.
  • Re:I'd say... (Score:5, Interesting)

    by bigbigbison ( 104532 ) on Wednesday November 14, 2007 @11:36AM (#21349545) Homepage
    I think the problem is that (and I may be wrong) any new system the researchers come up with isn't allowed to ask the user for more information. This would make it very hard for any system to be accurate if it is based solely on what DVDs you rented and how you rated them.
    If I liked Die Hard 4, for example, did I like it because of Bruce Willis, the "I'm a Mac" guy, the special effects, the plot, or some other reason that even I don't know?
    Personally, I know that I have rated something like 900 movies on the Netflix site, and nearly all the recommendations are things I have no interest in, or they simply say, "Sorry, we have no recommendations for you at this time."
    I would like to think that if they could ask me why I rated one movie a 4 and another a 1, they might make more accurate recommendations. Even a drop-down menu with something like "I liked this movie because of a) the stars, b) the plot, c) the genre" and so on would make recommendations a lot easier.
  • Re:I'd say... (Score:3, Interesting)

    by antifoidulus ( 807088 ) on Wednesday November 14, 2007 @11:45AM (#21349639) Homepage Journal
    That's why they need multiple data points to make a recommendation. If you rented a lot of the "I'm a Mac" guy's movies and rated them highly, then there is a better chance that he is the reason you liked that movie. If you refused to rent "Rugrats Gone Wild", or rated it poorly, then you probably aren't a rabid Bruce Willis fan, etc. The entire goal of the project is to find films you like without you having to write a mini-review of every movie you have rented or seen.
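    The idea above can be sketched as a simple evidence-pooling scheme: average a user's ratings across every film sharing an attribute (actor, genre, ...), so attributes supported by many ratings stand out. The film titles, tags, and ratings below are invented for illustration; real systems would need far more data per attribute.

    ```python
    def attribute_scores(ratings, attributes):
        """ratings: {film: stars}; attributes: {film: set of tags}.
        Returns {tag: (average rating, number of supporting films)}."""
        totals = {}
        for film, stars in ratings.items():
            for tag in attributes.get(film, ()):
                s, n = totals.get(tag, (0.0, 0))
                totals[tag] = (s + stars, n + 1)
        # Average rating per tag; more supporting films = a more reliable signal.
        return {tag: (s / n, n) for tag, (s, n) in totals.items()}

    # Invented example: action films rated highly, a Bruce Willis film rated poorly.
    ratings = {"Die Hard 4": 5, "Some Action Flick": 4, "Rugrats Gone Wild": 1}
    attributes = {
        "Die Hard 4": {"Bruce Willis", "action"},
        "Some Action Flick": {"action"},
        "Rugrats Gone Wild": {"Bruce Willis"},
    }
    print(attribute_scores(ratings, attributes))
    ```

    Here "action" averages 4.5 over two films while "Bruce Willis" averages only 3.0, suggesting the genre, not the actor, drives the high ratings.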
  • diminishing returns (Score:3, Interesting)

    by caffeinemessiah ( 918089 ) on Wednesday November 14, 2007 @01:00PM (#21350763) Journal
    From my experience with the Netflix Prize, and ML/stat.learning techniques in general, that last 1.57% is going to be the hardest. There is a diminishing returns effect going on here, i.e. the effort required for each successive 1% increase gets progressively larger.
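    For context on the percentages: the contest scored entries by RMSE against Netflix's own Cinematch system, whose reported baseline RMSE on the quiz set was 0.9514, and "improvement" meant proportionally lower RMSE. A quick back-of-the-envelope check of the targets:

    ```python
    # Netflix Prize scoring: RMSE relative to the Cinematch baseline (0.9514,
    # as published in the contest rules). An X% improvement = X% lower RMSE.
    cinematch_rmse = 0.9514
    grand_prize_target = cinematch_rmse * (1 - 0.10)    # 10% better
    progress_2007 = cinematch_rmse * (1 - 0.0843)       # the 8.43% entry
    print(round(grand_prize_target, 4), round(progress_2007, 4))
    ```

    So the winning team's RMSE of roughly 0.871 still had to fall to about 0.856, and that remaining 1.57% is exactly the stretch where diminishing returns bite hardest.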
  • by illegalcortex ( 1007791 ) on Wednesday November 14, 2007 @02:23PM (#21352255)
    I think you have to consider that netflix is working off a very large user base with a very large list of titles. In this sense, computation time is going to go way up the more you keep adding all these factors. I'm sure they've had projects internal to netflix to use more data, but found that it just didn't pay off with the increased computation time. It's much better to get good recommendations onto the page instantly than make the user wait 2 seconds for great recommendations. The same is possibly true for doing recommendations ahead of time and having to spend the extra compute time and storage space.

    Plus, I think there's always going to be some level of "noise" in the system. People rating things incorrectly (clicked on the wrong number), people changing their minds, etc. And then there's the cases where it makes no logical sense that if I liked movie A, B and C that I should hate movie D. The question is, how good can a recommendation system get when it will always be thrown off by the noise.


    So while I agree with you in theory, I think it may not work out to be such a great thing in practice.
  • Re:I'd say... (Score:3, Interesting)

    by IceFox ( 18179 ) on Wednesday November 14, 2007 @03:30PM (#21353277) Homepage
    Most everyone who tried was able to beat Netflix's existing system. I put together a little framework to help people get up and running faster. A lot of people seemed to be spending time just getting all the data into memory before they got to play with any algorithm ideas. I include a few algorithms, including Simon Funk's, which should be enough to get you started.
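    Simon Funk's published approach was matrix factorization trained by stochastic gradient descent: learn a small vector of latent factors per user and per movie so their dot product approximates the rating. A minimal sketch (real entries used dozens of factors, biases, and far more data; the ratings below are invented):

    ```python
    import random

    def funk_svd(ratings, n_factors=2, lr=0.02, reg=0.02, epochs=500, seed=0):
        """ratings: list of (user, item, stars). Returns user and item
        factor vectors learned by SGD on squared rating error."""
        rng = random.Random(seed)
        users = {u for u, _, _ in ratings}
        items = {i for _, i, _ in ratings}
        p = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)] for u in users}
        q = {i: [rng.gauss(0, 0.1) for _ in range(n_factors)] for i in items}
        for _ in range(epochs):
            for u, i, r in ratings:
                pred = sum(pu * qi for pu, qi in zip(p[u], q[i]))
                err = r - pred
                for f in range(n_factors):
                    pu, qi = p[u][f], q[i][f]
                    # Gradient step with L2 regularization on both factors.
                    p[u][f] += lr * (err * qi - reg * pu)
                    q[i][f] += lr * (err * pu - reg * qi)
        return p, q

    # Tiny invented rating set: two users, two movies.
    ratings = [("alice", "m1", 5), ("alice", "m2", 1),
               ("bob", "m1", 4), ("bob", "m2", 2)]
    p, q = funk_svd(ratings)
    predict = lambda u, i: sum(a * b for a, b in zip(p[u], q[i]))
    print(round(predict("alice", "m1"), 1))
    ```

    The appeal of this method for the contest was that it trains one rating at a time, so the full 100-million-rating data set never needs to be held as a dense matrix.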
  • Buried gem (Score:3, Interesting)

    by vrmlguy ( 120854 ) on Wednesday November 14, 2007 @04:20PM (#21353985) Homepage Journal
    The most interesting part of the research paper was this: "More specifically, if movie i was rated x days later than movie j, we multiply their similarity by exp(-x/600). The denominator 600 (days) was determined by cross validation, and reflects the fact that after two years, similarity decays by approximately a factor of 3." Apparently Joe Average's tastes in movies slowly evolve over time, and something you liked three years ago may not be that attractive today.
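    The quoted rule is simple enough to check directly, using the exp(-x/600) decay from the paper:

    ```python
    import math

    def decayed_similarity(sim, days_apart):
        """Scale a movie-pair similarity by the paper's time-decay factor:
        ratings x days apart are down-weighted by exp(-x / 600)."""
        return sim * math.exp(-days_apart / 600.0)

    # After two years (~730 days), similarity shrinks by roughly a factor of 3:
    print(round(math.exp(730 / 600.0), 2))  # ≈ 3.38
    ```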

    This raises the question, should someone's age affect the denominator? People in or just out of college generally see their tastes evolve quickly, while people in retirement homes might take decades to get tired of something.

    I also wonder if this decay factor applies to other fields. Not just books or music, but toothpaste or politicians. In the US, your representative is presumably re-elected before your opinion has time to change much; the president just as you're getting tired of him. It makes me wonder how Senators get re-elected at all.
