Close but no Cigar for Netflix Recommender System
Ponca City, We Love You writes "In October 2006, Netflix, the online movie rental service, announced that it would award $1 million to the first team to improve the accuracy of Netflix's movie recommendations by 10% based on personal preferences. Each contestant was given a set of data from which three million predictions were made about how certain users rated certain movies and Netflix compared that list with the actual ratings and generated a score for each team. More than 27,000 contestants from 161 countries submitted their entries and some got close, but not close enough. Today Netflix announced that it is awarding an annual progress prize of $50,000 to a group of researchers at AT&T Labs, who improved the current recommendation system by 8.43 percent but the $1 million grand prize is still up for grabs and a $50,000 progress prize will be awarded every year until the 10 percent goal is met. As part of the rules of the competition, the team was required to disclose their solution publicly. (pdf)"
1.57% (Score:2)
Re: (Score:2)
I would think they would not implement the AT&T team's solution, given that it did not reach the 10% mark. However, AT&T has the lead in reaching that mark unless someone comes up with some quantum leap in design.
Re:1.57% (Score:5, Funny)
Re: (Score:2)
If they were at 50% accuracy and AT&T gave them 8.43%, then once they implement that they are at 58.43% accuracy. If they require a 10% increase, a new person will have to bring them up to 68.43% accuracy, much harder than the 60% AT&T was aiming for, assuming that it becomes harder as you get closer to 100% accuracy.
Re: (Score:2)
-Dick Cheney
Re: (Score:2)
Not how it works (Score:5, Informative)
diminishing returns (Score:3, Interesting)
Don't meme me bro (Score:2)
Re: (Score:1, Redundant)
Re: (Score:2)
That was funny
I'd say... (Score:5, Insightful)
Re:I'd say... (Score:4, Funny)
Re: (Score:3, Interesting)
It's hard to say. On the one hand, it could be that the current system is good enough that improvements are minutely incremental, though 8+% is pretty good if you ask me. On the other, it may be that the system is so fraught with dependencies and/or the relationships are so variable that it's hard to make gigantic leaps in sophistication. Look at Amazon's recommendation system: pretty good overall but still makes some egregious errors. Add the tagging system to the mix and it's possible to lead the recommen
Re:I'd say... (Score:5, Funny)
Empress Charmeuse Silk Sheet Set - Queen - Ivory ~ $399.00
Black Leather Victorian Vintage Shaper Corset Boned Lace Up corset ~ $119.99
Orgazyme Clitoral Stimulation Gel, Topical, 0.8 oz ~ $20.79
Pampers Cruisers, Size 4, Economy Plus Pack, 140 Cruisers ~ $38.99
I don't think Amazon has much room to improve their recommendation technology.
Re:I'd say... (Score:5, Informative)
Egregious errors? It's downright useless unless you pretty much buy only one genre of book/music/whatever. Their system is heavily weighted towards whatever you most recently bought, and it drops huge slabs of quasi-related stuff into your request list at the slightest provocation.
I buy (among other things) serious works of culinary history, sociology, etc. Yet my recommendation list is clogged with food porn (coffee table cookbooks) and the latest crap offerings from whichever TV chef is the current flavor of the moment. It also doesn't recognize the difference between editions: if you buy a hardback, it'll happily recommend you buy the paperback. If you buy a frequently reprinted SF novel, it'll happily add each new printing/edition to your queue.
That's funny... (Score:1)
Re: (Score:2)
Re: (Score:3, Interesting)
AT&T Labs? (Score:2)
This is a great contest, considering they have to publicly release the solution.
Although what is AT&T doing working on this problem?
Re: (Score:2, Insightful)
Re: (Score:2)
Re:I'd say... (Score:5, Interesting)
If I liked Die Hard 4, for example, did I like it because of Bruce Willis, the "I'm a Mac" guy, the special effects, the plot, or some other reason that even I don't know?
Personally, I know that I have rated something like 900 movies on the netflix site and nearly all the recommendations are things I've no interest in or they simply say, "Sorry we have no recommendations for you at this time."
I would like to think that if they could ask me why I rated one movie a 4 and another a 1, then they might have more accurate recommendations. Even if they just had a drop-down menu with something like, "I liked this movie because of a) the stars, b) the plot, c) the genre" and so on, it would make recommendations a lot easier.
Re: (Score:3, Interesting)
Re:I'd say... (Score:4, Informative)
Re: (Score:1)
Your predictions will be more accurate if you don't try to match with other people who liked Die Hard 4 and Finding Nemo, unless they only rent movies that have CGI/special effects.
Re: (Score:2)
Re: (Score:1)
Re: (Score:2, Insightful)
This is the whole idea behind the field of "machine learning": inferring causes/relationship/
Re: (Score:1)
Re: (Score:1)
Re: (Score:1)
I'm probably wronger...
If you rated Die Hard 4 highly, and rated lowly some other movie featuring the "I'm a Mac" guy, but rated highly another featuring Bruce, would it be hard to figure that out?
You rate some movie well, it should recommend a similar movie, see how you rate that and keep building the data set.
It wouldn't even have to know anything about the movie in particular, just how other people rated the same movies.
Find somebody who rated movies rather similarly to how you have rated your movies
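The "find somebody who rated movies similarly" idea above can be sketched in a few lines. This is only an illustration of nearest-neighbour matching, not how Netflix or any contestant actually does it. The function names, the similarity formula (one over one plus the average star gap on shared movies), the users, and every rating are made up for the example.

```python
def similarity(a, b):
    """Score two users by how closely their ratings agree on shared movies.

    Assumed formula for illustration: 1 / (1 + mean absolute star gap).
    Returns 0.0 when the two users have no movies in common.
    """
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    mean_gap = sum(abs(a[m] - b[m]) for m in shared) / len(shared)
    return 1.0 / (1.0 + mean_gap)


def recommend(user, ratings, min_stars=4):
    """Find the most similar other user and return their well-rated unseen movies."""
    others = [u for u in ratings if u != user]
    nearest = max(others, key=lambda u: similarity(ratings[user], ratings[u]))
    unseen = [
        movie
        for movie, stars in ratings[nearest].items()
        if movie not in ratings[user] and stars >= min_stars
    ]
    return nearest, sorted(unseen)


# Invented sample data: user -> {movie title: stars}.
ratings = {
    "alice": {"Die Hard 4": 5, "Finding Nemo": 2, "Saw": 1},
    "bob": {"Die Hard 4": 5, "Finding Nemo": 2, "Saw": 1, "Speed": 5, "Cars": 2},
    "carol": {"Die Hard 4": 1, "Finding Nemo": 5, "Cars": 5},
}

nearest, picks = recommend("alice", ratings)
print(nearest, picks)
```

Note that the code never looks at who stars in a movie or what genre it is. It only uses how other people rated the same titles, which is the point being made above.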
Re: (Score:3, Informative)
"Straightforward statistical linear models with a lot of data conditioning."
The Netflix programmers shouldn't necessarily get special recognition for using least-squares modeling, but feel free to pass on your praise to Gauss, Legendre, Galton, and Fisher.
What's amazing is how hard it is to improve drastically on these 150-year-old statistical techniques.
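For anyone who hasn't seen least squares written out, here is a minimal sketch of the simplest case: fitting a straight line that predicts one user's rating from a movie's average rating. It is not the Netflix model or the AT&T one, just ordinary least squares with one input. The function name `fit_line` and the sample numbers are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and the x-y cross term.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Invented data: each movie's average rating, and one harsh user's rating of it.
movie_average = [2.0, 3.0, 4.0, 5.0]
user_rating = [1.0, 2.0, 3.0, 4.0]

intercept, slope = fit_line(movie_average, user_rating)
# Predict this user's rating for a movie whose average is 3.5 stars.
predicted = intercept + slope * 3.5
print(intercept, slope, predicted)
```

The "data conditioning" mentioned in the quote is everything this sketch leaves out: deciding which inputs to use and how to clean them before fitting.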
Re: (Score:2)
There's a whole lot of devil in those details, though.
Re: (Score:2)
There's a whole lot of devil in those details, though.
Absolutely. With the size, sparsity, self-censoring and ordinal nature of their dataset, a 'straightforward statistical linear model' would not get very far.
It's a scam. (Score:2)
Re: (Score:1)
90% of the movies I rate are either 3 or 4 stars. I already pre-filter, so I don't often watch movies that would get a 2 or 1, and 5s are hard to find. Trying to differentiate the emotions generating "meh" and "yeah" is gonna be tough. I don't know if most people rate similarly but I imagine they do.
A 10 star system would add more data points and might be better. But a simple system with multiple axes would be a lot better, I bet
Re: (Score:1)
Re: (Score:1)
This is why... (Score:2)
When I did use NetFlix, I spent a good amount of time flagging as many movies I did NOT want and would never, ever rent as those I did or would. The result was a pretty consistent selection that reasonably matched my taste.
Moving target? (Score:5, Interesting)
Re: (Score:3, Interesting)
Trust (Score:2)
Re: (Score:2)
-nB
Re: (Score:2)
Re: (Score:1)
Re: (Score:2)
It's very, very important. If it isn't highly accurate, you're just going to completely ignore what it suggests, and get no benefits from it.
And your analogy is extremely flawed. If it's a movie you would like, then it SHOULD be recommended. That's what the system is there for. The odds that recommending a random movie to someone will inadvertently
Re:Moving target? (Score:5, Informative)
Netflix is free to merge any improvements into their actual system in the meantime.
Re: (Score:1)
Bad title (Score:5, Funny)
Re: (Score:3, Informative)
Re: (Score:2)
Re:Bad title (Score:5, Funny)
Re: (Score:2)
Marijuana affects the memory.
Re: (Score:2)
Re: (Score:2)
Re: (Score:1)
Huh? (Score:2)
Re: (Score:1)
Well I think that shows that I know they have one! I was merely saying that there are better ones in my opinion, and it will be very hard for netflix to match them.
Re: (Score:1)
Plus IMDb doesn't rent out DVDs, so really they need to stay one up on the competition in another area. Maybe someone should suggest that they start renting them?
no breakthrough - just blending (Score:5, Informative)
The most noteworthy aspect of the winning entry is that their method works by combining 107 different types of prediction strategies.
They state that you can get pretty far by blending the 3-4 best strategies, but of course doing so would not have netted them the progress prize.
It is kind of a sad realization that there actually is no better method. Your best bet is to use brute force and attempt to find some weighting methodology that combines known methods. By the way, this is a well-known issue in protein structure prediction competitions: for many years now, so-called meta-servers (whose predictions work by merely combining other predictions) have won all the time. The joke is that we now need meta-meta-servers, combine the results of combiners
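To make "find some weighting methodology that combines known methods" concrete, here is a toy sketch that blends just two predictors with a single least-squares weight. This is not the winning team's procedure (they combined 107 strategies). The function names, the use of root-mean-square error as the accuracy measure, and all the numbers are assumptions for illustration.

```python
from math import sqrt


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length lists."""
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


def blend_weight(p1, p2, actual):
    """Least-squares weight w for the blend w * p1 + (1 - w) * p2."""
    diffs = [a - b for a, b in zip(p1, p2)]
    den = sum(d * d for d in diffs)
    if den == 0:
        return 0.5  # the predictors are identical, so any weight gives the same blend
    num = sum((y - b) * d for y, b, d in zip(actual, p2, diffs))
    return num / den


# Invented ratings and two invented predictors that err in different places.
actual = [4.0, 3.0, 5.0, 2.0, 1.0]
pred_a = [4.5, 3.5, 4.0, 2.5, 2.0]
pred_b = [3.0, 3.5, 5.5, 1.0, 1.5]

w = blend_weight(pred_a, pred_b, actual)
blended = [w * a + (1 - w) * b for a, b in zip(pred_a, pred_b)]

print(rmse(pred_a, actual), rmse(pred_b, actual), rmse(blended, actual))
```

On the data it was fitted to, the blend can never do worse than either input, because w = 1 and w = 0 reproduce the two originals. Whether the gain holds up on ratings it has not seen is a separate question.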
Also a clarification on the progress prize: to get it you need to have at least 1% improvement over the previous result. Considering that there is only 1.57% to go there is room for only one more progress prize until it hits the Grand Prize (10% improvement over the original results).
Re: (Score:1)
Also a clarification on the progress prize: to get it you need to have at least 1% improvement over the previous result. Considering that there is only 1.57% to go there is room for only one more progress prize until it hits the Grand Prize (10% improvement over the original results).
Where did you get that? The rules (http://www.netflixprize.com/) state:
To qualify for a year's $50,000 Progress Prize the accuracy of any of your submitted predictions that year must be less than or equal to the accuracy value established by the judges the preceding year.
You just have to be better.
Re: (Score:1)
hmmm .... (Score:5, Funny)
print "Not recommended";
}
That should improve the system by at least 20%
like trying to win the lottery (Score:2, Insightful)
I'm skeptical about these sorts of prizes. The X prize, Top Coder, Clay Institute Millennium Prizes-- if those were the only reasons to do something, few would. Seems pretty risky to do a lot of work for what amounts to a lottery ticket. So, who got 2nd place, and how well did they do? 1 group wins a paltry $50K and a little publicity and recognition, maybe even an endorsement or two, and the other 27000 plus get what? Nothing much. It's cool and fun to work on such problems, but people have bills to
The thrill of victory (Score:2)
And FYI, netflix doesn't get any "ideas" from anyone but the winner. You only have to submit your code if you win.
Re: (Score:2)
I think its good t
Re: (Score:2)
Re:like trying to win the lottery (Score:5, Insightful)
Two reasons I can think of. One is the challenge. I like to code but I'm not great with coming up with projects to do myself. This kind of thing would be nice for that.
The other is the experience. If you get second in this, no, you won't win the prize. But you can bet that having that on your resume would make getting many jobs much easier. Amazon would like your skills. So would many other retailers.
Also, as a side note, it's not a lottery. There is a three prong legal test in the US to determine if something is a lottery. I think the three parts are you have to pay to get it, everyone has an equal chance of winning, and there is no skill involved. I'm not positive about the second part. This is free to enter and is based quite a bit on skill, so it's not like a lottery.
Don't exaggerate.
This isn't a way to get free work. It's a way to get very smart job candidates to find you. It's a recruiting tool. You don't honestly think that they will take the winning idea, pay the $1m, and then just say "bye" do you? They will offer that person a job if at all reasonable (if it's a team of 500 students, obviously they couldn't).
Re: (Score:2)
1) Prize
2) Chance
3) Consideration - you pay something to enter.
Eliminating item 3 is typically how sweepstakes are made legal.
Re: (Score:2)
Re: (Score:2)
Re:like trying to win the lottery (Score:5, Insightful)
There are two immediate reasons I can think of why anyone would bother competing:
1) To win money.
2) Because they enjoy the challenge of trying to solve an interesting problem.
I'm just a simple coder, and knew that I didn't have any realistic chance of winning money. But I still found it very satisfying to try to come up with a solution and send it in and see how I did. I don't regret spending hours of my own leisure time on the project.
That said, eventually I gave it up. It was very clear that I'm not smart enough to meet the challenge. I had my fun, and it was time to move on to the next project. In summary, I don't think it's safe to assume that everyone is in it for the money.
"Pardon my cynicism, but seems like contests like this are a way to get a lot of ideas and work for very little money."
I call it "brilliant". Netflix probably put some pricetag on what it would pay to get >10% improvement on their system. That pricetag is probably more than $1 million. That means profit!
Re: (Score:2)
partial credit? (Score:2)
Man, that's too easy... (Score:2)
some
if age < 18 and male then hard porn and south park
if age > 18 and male and lives at home then any sci-fi movie (plus points if it's a sequel)
if age > 18 and female then any movie with Princess in the title
100% match up !
Re: (Score:2)
if age > 40 and male and single, porn
i thought this had already concluded (Score:2)
only yourself to blame (Score:2)
Re: (Score:2)
Yes, please improve it. (Score:2)
I, for one, have really never found Netflix's recommendations all that useful. It sometimes recommends movies that I've already Netflixed. But to be fair I think they fixed that. It has recommended movies that I already have in my queue, but most of the time it will be movies that I have no interest in at all. Then there was the time I turned in a 'G' rated movie, Disney I think, and it recommended either Saw I or Saw II.
Not really sure where it got that one from. Nothing I had turned in that week had
Throwing away too much information (Score:2)
User ratings are a deeply flawed way of getting this information. They're one-dimensional and prone to serious randomizations based on the user's mood; a 5 today might have been a 3 tomorrow. Since most of the movies that a user rates will be between 3 and 5 (it's just not that hard to spot a movie you're going to hate, so why would you rent it in the first place?) that makes the data... well, not valueless, but containing a lot
Sometimes throwing stuff away is good (Score:3, Interesting)
Re: (Score:1)
Re: (Score:2)
Perhaps I should have phrased some of this better: what ELSE did they look at and decide against? What did they rent and change their minds about? (That's the second one and I really should have proof-read that better.)
I said in the original thread announcing the co
They're not interested in "truth" necessarily (Score:2)
Re: (Score:1)
Some of it might be useful, but a lot of it seems like noise.
* What did the user look at? - Noise, looking is way too weak an endorsement, and if not queued, it's probably not an endorsement at all
* What did the user rent? - Slightly more useful, but because the user wasn't explicit about their feelings (and most users don't rate a lot of movies), it's hard to come up with a convincingly reasonable way to rate them by default.
* How did they order their queu
I rate 1 & 2 stars all the time (Score:2)
I'd like to see 3 improvements, though:
1) Half-star ratings. I'm given recommendations in fractional star increments, and there are times where I think half-stars make sense -- there's been movies that haven't totally sucked, and 2.5 stars seems app
Here's an idea... (Score:4, Insightful)
For example, Netflix frequently recommends rated R movies to my family, but we have never rented a single R-rated movie and have no desire to do so. Moreover, every time we get a recommendation for an R-rated movie, we rate it "Not Interested." I've probably marked dozens of R-rated movies "Not Interested," but they continue to be recommended. (Either Netflix is trying to tell me to just give in and rent one already, or they really don't understand my family's movie preferences.)
A simple checkbox for "Do not recommend R-rated movies" would be all Netflix needs to substantially improve its accuracy for my family. I imagine Netflix could add checkboxes for similar criteria as well. In any case, I think a key point is giving more control over recommendations to the users themselves.
Re: (Score:1)
It seems worth setting this and try. It seems to me that they'd definitely limit what mov
Buried gem (Score:3, Interesting)
This raises the question, should someone's age affect the denominator? People in or just out of college generally see their tastes evolve quickly, while people in retirement homes might take decades to get tired of something.
I also wonder if this decay factor applies to other fields. Not just books or music, but toothpaste or politicians. In the US, your representative is presumably re-elected before your opinion has time to change much; the president just as you're getting tired of him. It makes me wonder how Senators get re-elected at all.
Theoretical Limit? (Score:1)
1. inherent randomness in each individual's ratings
If you give me a list of movies to rate today, then the same list a week later, I would probably give inconsistent scores. The more randomness in it, the less predictable it is. Heck, Netflix could have deliberately introduced some randomness in it so that nobody could ever get the prize.
2. sample size
imagine there is an underlying theoretical model that drives us to rate t
So, there's money in movie ratings, eh? (Score:1)
I'm a subscriber--why not give me a little of that cash?
I mean, if my opinion of a movie is this valuable, I expect to be compensated for participating in the system.
And that's why I never rate anything on the internet--they haven't made it worth my time.
Actually, what I'd prefer is if Netflix would give me a list of movies recommended by a group of professional reviewers I tend to agree with.
And at $4.99/month, that list doesn't have to be mo
netflix (Score:1)