Close but no Cigar for Netflix Recommender System

Posted by CmdrTaco on Wed Nov 14, 2007 10:00 AM
from the fifty-grand-aint-bad dept.
Ponca City, We Love You writes "In October 2006, Netflix, the online movie rental service, announced that it would award $1 million to the first team to improve the accuracy of Netflix's movie recommendations by 10% based on personal preferences. Each contestant was given a set of data from which three million predictions were made about how certain users rated certain movies and Netflix compared that list with the actual ratings and generated a score for each team. More than 27,000 contestants from 161 countries submitted their entries and some got close, but not close enough. Today Netflix announced that it is awarding an annual progress prize of $50,000 to a group of researchers at AT&T Labs, who improved the current recommendation system by 8.43 percent but the $1 million grand prize is still up for grabs and a $50,000 progress prize will be awarded every year until the 10 percent goal is met. As part of the rules of the competition, the team was required to disclose their solution publicly. (pdf)"

Related Stories

[+] News: Wal-Mart Closes Online Movie Download Service 136 comments
eldavojohn writes "A year after opening its movie download service, Wal-Mart has abandoned the endeavor. They claim this is a result of HP's decision to stop supporting its video download store software. The article also notes that, unlike iTunes, Wal-Mart offered variable pricing which attracted a lot of studios. 'The world's largest retailer instead turned its rental service over to Netflix Inc. Wal-Mart still operates a music download service and continues to sell CDs and DVDs at retail stores and over the Internet for shipping by mail.' Is this evidence of the strength of unified pricing in media downloads or just another company being squished by the giant Netflix & Apple?"
[+] Developers: Psychologist Beating Math Nerds in Race to Netflix Prize 205 comments
s1d writes "An almost-anonymous British psychologist named Gavin Potter has suddenly risen to the top of the Netflix prize charts. With his very first attempt, he got a score which took the BellKor team seven months to reach. Currently at a score of 8.07, he has only five teams ahead of him now in the race for the ultimate Netflix algorithm. 'Potter says his anonymity is mostly accidental. He started that way and didn't come out into the open until after Wired found him. "I guess I didn't think it was worth putting up a link until I had got somewhere," he says, adding that he'd been seriously posting under the name of his venture capital and consulting firm, Mathematical Capital, for two months before launching "Just a guy." When he started competing, he posted to his blog: "Decided to take the Netflix Prize seriously. Looks kind of fun. Not sure where I will get to as I am not an academic or a mathematician. However, being an unemployed psychologist I do have a bit of time."'"
[+] Science: Interest Still High In the Netflix Algorithm Competition 77 comments
circletimessquare brings us an update to the status of the million-dollar Netflix competition to develop a better algorithm for movie recommendations. We've discussed aspects of the competition since it started two years ago, but the New York Times has a lengthy overview of where it stands now. "The Netflix competition is still going strong, with a vibrant, competitive roster of some 30,000 programmers around the globe hard at work trying to win the prize. The Times provides a look at some of the more obsessive searchers, such as Len Bertoni, a semi-retired computer scientist near Pittsburgh who logs 20 hours a week on the problem, oftentimes with the help of his children. There's also Martin Chabbert in Montreal: 'After the kids are asleep and I've packed the lunches for school, I come down at 9 in the evening and work until 11 or 12.' The article gets into the history of the search algorithm Netflix currently uses, and explores the hot commodity called 'singular value decomposition' that serves as the basis for most of the algorithms in competition."
[+] Technology: Netflix Prize May Have Been Achieved 83 comments
MadAnalyst writes "The long-running $1,000,000 competition to improve on the Netflix Cinematch recommendation system by 10% (in terms of the RMSE) may have finally been won. Recent results show a 10.05% improvement from the team called BellKor's Pragmatic Chaos, a merger between some of the teams who were getting close to the contest's goal. We've discussed this competition in the past."
  • Is the new margin of improvement for victory then?
    • I would think they would not implement the AT&T team's solution given it did not reach the 10% mark; however, AT&T has the lead in reaching that mark unless someone comes up with some quantum leap in design.

    • Actually, and my math will suck here. If they implement the new AT&T data, and then ask for 10%, it would be much harder than what AT&T themselves did.

      If they were at 50% accuracy and AT&T gave them 8.63%, if they implement that they are now at 58.63% accuracy. If they require a 10% increase then a new person will have to bring them up to 68.63% accuracy, much harder than the 60% AT&T was aiming for. Assuming that it becomes harder as you get closer to 100% accuracy.
      • I gave AT&T an extra 0.2% because I love Big Brother. Not because my math sucks.

        -Dick Cheney
      • My guess is that they're giving the award for cutting out 10% of the inaccuracy, ie if they're at 50% and you can get them to 55%. At that point, another 10% would be 4.5% instead of 5%. This is because there's almost no chance of getting to 100% probability, so you're going to the limit of 100% without any chance of getting there.
      • Not how it works (Score:5, Informative)

        by illegalcortex (1007791) on Wednesday November 14 2007, @10:42AM (#21349599)
        That's not how the contest works. It's based on the RMSE that the original netflix algorithm got at the beginning of the contest. This is fixed and does not change. See the contest site for more details.
    • From my experience with the Netflix Prize, and ML/stat.learning techniques in general, that last 1.57% is going to be the hardest. There is a diminishing returns effect going on here, i.e. the effort required for each successive 1% increase gets progressively larger.
  • Any chance of not tagging this story with this meme?
  • I'd say... (Score:5, Insightful)

    by Otter (3800) on Wednesday November 14 2007, @10:05AM (#21349251) Journal
    If the people who created Netflix's system are still with the company, I'd say they deserve some retroactive recognition (and bonuses). That's pretty damn good optimization if it's that hard to improve upon, and there seem to have been some really sophisticated people trying to beat them.
    • by wattrlz (1162603) on Wednesday November 14 2007, @10:11AM (#21349307)
      Perhaps they should look at whatever chooses the slashdot page-bottom quotes for inspiration: Mosher's Law of Software Engineering: Don't worry if it doesn't work right. If everything did, you'd be out of a job.
    • Re: (Score:3, Interesting)

      It's hard to say. On the one hand, it could be that the current system is good enough that improvements are minutely incremental, though 8+% is pretty good if you ask me. On the other, it may be that the system is so fraught with dependencies and/or the relationships are so variable that it's hard to make gigantic leaps in sophistication. Look at Amazon's recommendation system: pretty good overall but still makes some egregious errors. Add the tagging system to the mix and it's possible to lead the recommen

      • by Anne_Nonymous (313852) on Wednesday November 14 2007, @10:41AM (#21349579) Homepage Journal
        Customers who bought the items in your shopping cart also bought:

        Empress Charmeuse Silk Sheet Set - Queen - Ivory ~ $399.00
        Black Leather Victorian Vintage Shaper Corset Boned Lace Up corset ~ $119.99
        Orgazyme Clitoral Stimulation Gel, Topical, 0.8 oz ~ $20.79
        Pampers Cruisers, Size 4, Economy Plus Pack, 140 Cruisers ~ $38.99

        I don't think Amazon has much room to improve their recommendation technology.

      • Re:I'd say... (Score:5, Informative)

        by DerekLyons (302214) <fairwater@NOsPAM.gmail.com> on Wednesday November 14 2007, @12:21PM (#21351167) Homepage

        Look at Amazon's recommendation system: pretty good overall but still makes some egregious errors.

        Egregious errors? It's downright useless unless you pretty much buy only one genre of book/music/whatever. Their system is heavily weighted towards whatever you most recently bought - and drops huge slabs of quasi related stuff into your request list at the slightest provocation.
         
        I buy (among other things) serious works of culinary history, sociology, etc... Yet my recommendation list is clogged with food porn (coffee table cookbooks) and the latest crap offerings from whichever TV chef is the current flavor of the moment. It also doesn't recognize the difference between editions - if you buy a hardback, it'll happily recommend you buy the paperback. If you buy a frequently reprinted SF novel, it'll happily add each new printing/edition to your queue.
      • Re: (Score:3, Interesting)

        Most everyone who tried was able to beat Netflix's existing system. I put together a little framework to help people get up and running faster. A lot of people seemed to be spending time just getting all the data into memory before they got to play with any algorithm ideas. I include a few algorithms including Simon Funk's which should be enough to get you started. http://www.icefox.net/programs/?program=NetflixRecommenderFramework [icefox.net]
    • At the risk of getting marked redundant: I totally agree, and I don't work for Netflix

      This is a great contest, considering they have to publicly release the solution.

      Although what is AT&T doing working on this problem?
      • Re: (Score:2, Insightful)

        by Anonymous Coward
        AT&T Labs = bunch of people from former Bell Labs = welfare for AI researchers ;)
      • It's too bad they didn't win the full prize, but at least they now have lucrative job opportunities in the booming spam industry. I keep getting spammed with Viagra ads and illegal replica watches, but I really want all of the hardcore pr0n spam.
    • Re:I'd say... (Score:5, Interesting)

      by bigbigbison (104532) on Wednesday November 14 2007, @10:36AM (#21349545) Homepage
      I think the problem is that (and I may be wrong) any new system that researchers come up with isn't allowed to ask the user for more information. This would make it very hard for any system to be accurate if it is based solely on what DVDs you rented and how you rated them.
      If I liked Die Hard 4, for example, did I like it because of Bruce Willis, the "I'm a Mac" guy, the special effects, the plot, or some other reason that even I don't know?
      Personally, I know that I have rated something like 900 movies on the netflix site and nearly all the recommendations are things I've no interest in or they simply say, "Sorry we have no recommendations for you at this time."
      I would like to think that if they could ask me why I rated one movie a 4 and another a 1 then they might have more accurate recommendations. Even if they just had a drop down menu with something like, "I liked this movie because of a) the stars, b) the plot, c) the genre and so on" it would make recommendations a lot easier.
      • Re: (Score:3, Interesting)

        That's why they need to get multiple data points to make a recommendation. If you rented a lot of the "I'm a Mac Guy" movies and rated them highly, then there is a bigger chance that that is the reason you liked that movie. If you refused to rent, or rated poorly, the movie "Rugrats Gone Wild" then you probably aren't a rabid Bruce Willis fan etc. The entire goal of the project is to find films you like without you having to do a mini-review of every movie you have rented or seen.
        • Re:I'd say... (Score:4, Informative)

          by Fnord666 (889225) on Wednesday November 14 2007, @11:03AM (#21349911) Journal

          That's why they need to get multiple data points to make a recommendation. If you rented a lot of the "I'm a Mac Guy" movies and rated them highly, then there is a bigger chance that that is the reason you liked that movie. If you refused to rent, or rated poorly, the movie "Rugrats Gone Wild" then you probably aren't a rabid Bruce Willis fan etc. The entire goal of the project is to find films you like without you having to do a mini-review of every movie you have rented or seen.
          One of the common approaches to recommender systems is SVD, or Singular Value Decomposition [wikipedia.org]. SVD tries to isolate "features" in the training set that best represent a particular trait of the data and its value, such as the examples above. You may not have any idea what the feature actually represents, but that is fairly common in machine learning. It is an iterative process. Once you have defined one feature as well as you can, you move on to a new one. There are diminishing returns with this approach though, and identifying too many features can overspecialize your system and yield worse results. If your results are not good enough, you can try a different approach. Once you have tried several approaches that are almost good enough, you can try combining the different results to varying degrees to get a hopefully better result. That is what the leaders have done so far.
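The feature-at-a-time process described above can be sketched in a few lines. This is only an illustration under stated assumptions: it is gradient-descent matrix factorization in the style often attributed to Simon Funk (mentioned elsewhere in this thread), not a textbook SVD and not necessarily what any contest team ran. The function names, the toy ratings, and the hyperparameters (`epochs`, `lrate`, `reg`, `init`) are all made up for the example.

```python
import math

def predict(user_f, item_f, u, i):
    """Predicted rating: sum over features of user value * item value."""
    return sum(uf[u] * itf[i] for uf, itf in zip(user_f, item_f))

def train_features(ratings, n_users, n_items, n_features=2,
                   epochs=300, lrate=0.01, reg=0.02, init=0.1):
    """Fit features one at a time to (user, item, rating) triples."""
    user_f, item_f = [], []
    for _ in range(n_features):
        # Residual left over by the features that are already trained.
        resid = [r - predict(user_f, item_f, u, i) for u, i, r in ratings]
        uf = [init] * n_users
        itf = [init] * n_items
        for _ in range(epochs):
            for (u, i, _r), target in zip(ratings, resid):
                err = target - uf[u] * itf[i]
                old_u = uf[u]
                # Regularised gradient step on both sides of the product.
                uf[u] += lrate * (err * itf[i] - reg * old_u)
                itf[i] += lrate * (err * old_u - reg * itf[i])
        user_f.append(uf)
        item_f.append(itf)
    return user_f, item_f

def training_rmse(ratings, user_f, item_f):
    """RMSE of the factor model on the ratings it was trained on."""
    sq = sum((r - predict(user_f, item_f, u, i)) ** 2 for u, i, r in ratings)
    return math.sqrt(sq / len(ratings))

# Hypothetical toy data: (user, item, rating on a 1-5 scale).
toy_ratings = [(0, 0, 5), (0, 1, 4), (0, 2, 1), (1, 0, 4), (1, 1, 5), (1, 3, 1),
               (2, 2, 5), (2, 3, 4), (2, 0, 1), (3, 2, 4), (3, 3, 5), (3, 1, 2)]

user_f, item_f = train_features(toy_ratings, n_users=4, n_items=4)
print("training RMSE:", training_rmse(toy_ratings, user_f, item_f))
```

Raising `n_features` lowers the training error further, which is exactly where the overspecialization warning applies: the number to watch is the error on ratings held out from training, not the one printed here.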
      • Re: (Score:2, Insightful)

        The idea is that with enough data, you could extract the "why" automatically. For example, if you rated all Arnold Schwarzenegger movies 5, then it's probably because you like Arnold. If however you gave a rating of 1 to Kindergarten Cop, as well as "The Game Plan" and a bunch of similar movies, the system could also infer that, as much as you like Arnold, you don't like kids' movies starring washed-up "action" movie stars.

        This is the whole idea behind the field of "machine learning": inferring causes/relationship/
    • Re: (Score:3, Informative)

      From the Netflix Prize FAQ, they say how they currently do it:

      "Straightforward statistical linear models with a lot of data conditioning."

      The Netflix programmers shouldn't necessarily get special recognition for using least-squares modeling, but feel free to pass on your praise to Gauss, Legendre, Galton, and Fisher.

      What's amazing is how hard it is to improve drastically on these 150-year-old statistical techniques.
      • "Straightforward statistical linear models with a lot of data conditioning."

        There's a whole lot of devil in those details, though.

    • Netflix's system is already 90.3% accurate!
  • Moving target? (Score:5, Interesting)

    by ktappe (747125) on Wednesday November 14 2007, @10:13AM (#21349319)
    Will Netflix incorporate the near-winners' ideas into their current system? If so, won't future teams be aiming at a moving (improving) target? If not, won't current Netflix customers know that their recommendations could be better if Netflix just incorporated a now publicly-disclosed algorithm into their servers?
    • Re: (Score:3, Interesting)

      I had the same thought. And to what extent is the accuracy of a suggestion system important? Sometimes, throwing in a completely different suggestion might garner you a rental and possibly more rentals because you might like other movies of that type.
      • Accuracy in this contest is defined as the user rating highly the movies that the system would suggest to them. The whole point of it is trust. If you're throwing out lots of suggestions that the user doesn't like just to try to find one they might like, you're destroying their trust in the system. They won't bother even reading the recommendations if they know they're filled with garbage.
    • Re:Moving target? (Score:5, Informative)

      by illegalcortex (1007791) on Wednesday November 14 2007, @10:30AM (#21349469)
      It's not a moving target. It's a very fixed number (RMSE = 0.8563) that the winning algorithm has to come up with. The netflix algorithm never gets re-run on the data for the prize.

      Netflix is free to merge any improvements into their actual system in the meantime.
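For readers following the arithmetic, here is a small sketch of how a fixed RMSE target works. The 0.8563 target and the 8.43% figure come from this page; the baseline is not quoted anywhere above, so it is derived here from the target under the assumption that "10% improvement" means a 10% reduction in RMSE. The function names and the three sample ratings are invented for illustration.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    total = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(total / len(actual))

def improvement_percent(baseline_rmse, new_rmse):
    """Percentage reduction in RMSE relative to a fixed baseline."""
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse

# Target quoted in the comment above.
TARGET_RMSE = 0.8563

# Assumption: target = 0.9 * baseline, so the baseline is derived, not quoted.
implied_baseline = TARGET_RMSE / 0.9

# RMSE implied by the 8.43% improvement mentioned in the story summary.
progress_rmse = implied_baseline * (1 - 0.0843)

print("implied baseline RMSE:", round(implied_baseline, 4))
print("RMSE at 8.43% improvement:", round(progress_rmse, 4))
print("toy RMSE:", round(rmse([3.5, 4.0, 2.0], [4, 4, 1]), 4))
```

Because the baseline never changes, every entry is scored against the same number no matter what Netflix deploys in the meantime.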
  • Bad title (Score:5, Funny)

    by markov_chain (202465) on Wednesday November 14 2007, @10:25AM (#21349433) Homepage
    The prize was clearly a million dollars, not a cigar! I guess the editors don't even read the summary.
  • by hashmap (613482) on Wednesday November 14 2007, @10:34AM (#21349529)

    The most noteworthy aspect of the winning entry is that their method works by combining 107 different types of prediction strategies.

    They state that you can get pretty far by blending the 3-4 best strategies, but of course doing so would not have netted them the progress prize.

    It is kind of a sad realization that there actually is no better method. Your best bet is to use brute force and attempt to find some weighting methodology that combines known methods. By the way, this is a well-known issue in protein structure prediction competitions: for many years now, so-called meta-servers (whose predictions work by merely combining other predictions) have won all the time. The joke is that we now need meta-meta-servers that combine the results of combiners.

    Also a clarification on the progress prize: to get it you need to have at least 1% improvement over the previous result. Considering that there is only 1.57% to go there is room for only one more progress prize until it hits the Grand Prize (10% improvement over the original results).
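To make the "weighting methodology" idea concrete, here is a minimal sketch that blends just two predictors with a single least-squares weight. It is an assumption-laden toy, not the team's procedure: the comment above does not say how the 107 predictors were combined, and the function names and sample numbers are invented. The weight comes from minimizing the squared error of `w*a + (1-w)*b` against the known ratings.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length lists."""
    total = sum((p - y) ** 2 for p, y in zip(predicted, actual))
    return math.sqrt(total / len(actual))

def best_blend_weight(pred_a, pred_b, actual):
    """Weight w that minimises the squared error of w*a + (1-w)*b."""
    num = sum((a - b) * (y - b) for a, b, y in zip(pred_a, pred_b, actual))
    den = sum((a - b) ** 2 for a, b in zip(pred_a, pred_b))
    if den == 0:
        return 0.5  # predictors are identical, so any weight gives the same blend
    return num / den

def blend(pred_a, pred_b, w):
    """Weighted combination of two prediction lists."""
    return [w * a + (1 - w) * b for a, b in zip(pred_a, pred_b)]

# Hypothetical ratings and two imperfect predictors that err in different ways.
actual = [4, 2, 5, 1, 3]
pred_a = [4.6, 2.5, 4.2, 1.8, 3.1]
pred_b = [3.5, 1.4, 4.9, 0.9, 3.6]

w = best_blend_weight(pred_a, pred_b, actual)
mixed = blend(pred_a, pred_b, w)
print("weight on A:", w)
print("RMSE A, B, blend:", rmse(pred_a, actual), rmse(pred_b, actual), rmse(mixed, actual))
```

On the data used to fit the weight, the blend can never be worse than either input, because `w = 1` and `w = 0` are both candidates. On unseen ratings that guarantee disappears, so the weight should be fitted on held-out data.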

  • hmmm .... (Score:5, Funny)

    by Average_Joe_Sixpack (534373) on Wednesday November 14 2007, @10:36AM (#21349543)
    if ($director eq "Michael Bay") {
            print "Not recommended";
    }

    That should improve the system by at least 20%
  • I'm skeptical about these sorts of prizes. The X prize, Top Coder, Clay Institute Millennium Prizes-- if those were the only reasons to do something, few would. Seems pretty risky to do a lot of work for what amounts to a lottery ticket. So, who got 2nd place, and how well did they do? 1 group wins a paltry $50K and a little publicity and recognition, maybe even an endorsement or two, and the other 27000 plus get what? Nothing much. It's cool and fun to work on such problems, but people have bills to

    • I can say I played with it because I found it fun. I'm a coder, it's what my brain is interested in. There have been contests for ages simply because human beings like to compete, even if second place gets nothing.

      And FYI, netflix doesn't get any "ideas" from anyone but the winner. You only have to submit your code if you win.
    • Well, I'll tell you something. Most criminals suck at judging risks. They simply tend to forget that robbing a bank is very likely to put them in jail. On the other hand, some people are afraid of speeding even though they're driving on a road that is extremely unlikely to be patrolled by cops. My point is that different people have completely different views on risks and I think it's extremely rare that people actually back up their actions with maths and statistics; instead it's all about emotions.

      I think its good t
    • by MBCook (132727) <foobarsoft@foobarsoft.com> on Wednesday November 14 2007, @11:09AM (#21350003) Homepage

      Two reasons I can think of. One is the challenge. I like to code but I'm not great with coming up with projects to do myself. This kind of thing would be nice for that.

      The other is the experience. If you get second in this, no, you won't win the prize. But you can bet that having that on your resume would make getting many jobs much easier. Amazon would like your skills. So would many other retailers.

      Also, as a side note, it's not a lottery. There is a three prong legal test in the US to determine if something is a lottery. I think the three parts are you have to pay to get it, everyone has an equal chance of winning, and there is no skill involved. I'm not positive about the second part. This is free to enter and is based quite a bit on skill, so it's not like a lottery.

      Don't exaggerate.

      This isn't a way to get free work. It's a way to get very smart job candidates to find you. It's a recruiting tool. You don't honestly think that they will take the winning idea, pay the $1m, and then just say "bye" do you? They will offer that person a job if at all reasonable (if it's a team of 500 students, obviously they couldn't).

    • That's true, but the prize is just icing to most (all?) of these groups. Many will spend much more than the prize to get it, and everyone involved knows this and still goes through with it. Sometimes it's enough to do it to advance the technology. You can consider it a prize that millions of people will make use of what you produced. Also don't underestimate the fun factor. It's a big drive for what people do. It's a cliche but money isn't everything. Also, research groups could be working on this, while ge
    • by AdamTrace (255409) on Wednesday November 14 2007, @12:04PM (#21350837)
      "Any contestants reading this? Maybe you could enlighten the rest of us on why you bothered competing?"

      There are two immediate reasons I can think of why anyone would bother competing:

      1) To win money.
      2) Because they enjoy the challenge of trying to solve an interesting problem.

      I'm just a simple coder, and knew that I didn't have any realistic chance of winning money. But I still found it very satisfying to try to come up with a solution and send it in and see how I did. I don't regret spending hours of my own leisure time on the project.

      That said, eventually I gave it up. It was very clear that I'm not smart enough to meet the challenge. I had my fun, and it was time to move on to the next project. In summary, I don't think it's safe to assume that everyone is in it for the money.

      "Pardon my cynicism, but seems like contests like this are a way to get a lot of ideas and work for very little money."

      I call it "brilliant". Netflix probably put some pricetag on what it would pay to get >10% improvement on their system. That pricetag is probably more than $1 million. That means profit!
  • They should give AT&T $843,000.

  • some

    if age 18 and male then hard porn and south park
    if age > 18 and male and lives at home then any sci-fi movie (plus points if it's a sequel)
    if age > 18 and female then any movie with Princess in the title

    100% match up !
  • that's what i get for listening to, uh, slashdot: http://slashdot.org/article.pl?sid=06/10/09/1344235 [slashdot.org]
  • Here's an idea... (Score:4, Insightful)

    by the JoshMeister (742476) on Wednesday November 14 2007, @03:18PM (#21353957) Homepage Journal
    Why not give users more control over their recommendations? Heck, even a bunch of checkboxes would be useful.

    For example, Netflix frequently recommends rated R movies to my family, but we have never rented a single R-rated movie and have no desire to do so. Moreover, every time we get a recommendation for an R-rated movie, we rate it "Not Interested." I've probably marked dozens of R-rated movies "Not Interested," but they continue to be recommended. (Either Netflix is trying to tell me to just give in and rent one already, or they really don't understand my family's movie preferences.)

    A simple checkbox for "Do not recommend R-rated movies" would be all Netflix needs to substantially improve its accuracy for my family. I imagine Netflix could add checkboxes for similar criteria as well. In any case, I think a key point is giving more control over recommendations to the users themselves.
  • Buried gem (Score:3, Interesting)

    by vrmlguy (120854) <samwyse@gmail.YEATScom minus poet> on Wednesday November 14 2007, @03:20PM (#21353985) Homepage Journal
    The most interesting part of the research paper was this: "More specifically, if movie i was rated x days later than movie j, we multiply their similarity by exp(-x/600). The denominator 600 (days) was determined by cross validation, and reflects the fact that after two years, similarity decays by approximately a factor of 3." Apparently Joe Average's tastes in movies slowly evolve over time, and something you liked three years ago may not be that attractive today.

    This raises the question, should someone's age affect the denominator? People in or just out of college generally see their tastes evolve quickly, while people in retirement homes might take decades to get tired of something.

    I also wonder if this decay factor applies to other fields. Not just books or music, but toothpaste or politicians. In the US, your representative is presumably re-elected before your opinion has time to change much; the president just as you're getting tired of him. It makes me wonder how Senators get re-elected at all.
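The quoted decay rule is easy to check numerically. The sketch below uses only the `exp(-x/600)` formula from the quote; the function name is made up, and taking the absolute value of the gap is an added assumption so the order of the two ratings does not matter.

```python
import math

DECAY_DAYS = 600.0  # denominator quoted from the paper in the comment above

def decayed_similarity(similarity, days_apart):
    """Scale a movie-movie similarity by exp(-x/600) for ratings x days apart."""
    # Assumption: use the absolute gap so the result is symmetric in the two ratings.
    return similarity * math.exp(-abs(days_apart) / DECAY_DAYS)

two_years = decayed_similarity(1.0, 730)
print("similarity left after two years:", two_years)
print("decay factor:", 1.0 / two_years)
```

The two-year factor comes out a little above 3, which matches the paper's "approximately a factor of 3".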
    • You do know netflix already HAS a recommendation engine, right? Supposedly a very good one. The whole point of the contest was to significantly improve that engine.
    • I think you have to consider that netflix is working off a very large user base with a very large list of titles. In this sense, computation time is going to go way up the more you keep adding all these factors. I'm sure they've had projects internal to netflix to use more data, but found that it just didn't pay off with the increased computation time. It's much better to get good recommendations onto the page instantly than make the user wait 2 seconds for great recommendations. The same is possibly tr