Build a Better Netflix, Win a Million Dollars?
An anonymous reader writes "In a quest to better movie recommendations, Netflix is opening their database (nytimes, registration and first child required) to users to try to craft a better recommendation technology. The problem is not easy. Says one researcher: 'You're competing with 15 years of really smart people banging away at the problem.'" Recommender systems are really an interesting problem, and that is likely very interesting data to play with.
Seems like a free gift for Netflix to me... (Score:3, Insightful)
But if someone does win within a year they will still have the ability to use others' code, free of charge, as part of their product.
The article doesn't say, but how will you know if your code is making better choices than their existing system? I wouldn't submit my code unless I was sure I was going to win. Then again, I'm not a gambler or a coder.
Suggestion (Score:5, Insightful)
RSSTimes (Score:5, Insightful)
Why is it that the Slashdot editors are just too damn lazy to look up the RSS feed links to these pages?
While this may be true, I wouldn't let it deter you. Collaborative filtering is a field that is far from dead. The interesting thing about collaborative filtering is that on the surface it seems pretty straightforward, but once you dig into the mechanics of it, there is actually a lot of playing you can do. Ironically, the way you display the data to the end user is often what determines how good a job you did.
Allow me to take a naïve approach to this topic and say we generate a movie index for each person. I would have A Clockwork Orange and Koyaanisqatsi at 5 while The Ring 2 would be at the very low end. My friend might have similar movies. If he has A Clockwork Orange up there, you might be able to compute a Euclidean distance between us. However, this approach falls apart because no one has seen Koyaanisqatsi, and the 20 movies I've ranked highly are hard to find.
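To make the naive version concrete, here is a rough sketch in Python. The function name and the friend's ratings are made up for illustration; only my own three ratings come from the example above. It measures Euclidean distance over the movies both people rated, and it shows the failure mode too: with no overlap there is nothing to measure.

```python
import math

def rating_distance(a, b):
    """Euclidean distance over movies both users rated; None if no overlap."""
    shared = a.keys() & b.keys()
    if not shared:
        return None  # the sparse-data problem: nothing in common to compare
    return math.sqrt(sum((a[m] - b[m]) ** 2 for m in shared))

# Hypothetical ratings on a 1-5 scale.
me = {"A Clockwork Orange": 5, "Koyaanisqatsi": 5, "The Ring 2": 1}
friend = {"A Clockwork Orange": 4, "The Ring 2": 2}
stranger = {"Some Other Movie": 3}

print(rating_distance(me, friend))    # distance over the two shared movies
print(rating_distance(me, stranger))  # None: no shared movies
```

Note that Koyaanisqatsi contributes nothing to the distance because my friend never rated it, which is exactly where this approach starts to fall apart.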
You don't have to stop there, however. You could also database the movies I marked as "uninterested" or the movies that were presented to me but I didn't vote on. Like if I had seen the offer to mark J-Lo's latest flop but didn't, wouldn't that tell you something about me?
So these caveats present themselves all along the way and, at the end of the computation, you have many different strategies for this data. For example, while you might not be able to link my friend and I through movies, how far apart are we on a node network? What I mean is, if you plotted every user in their own dimension depending on the movies they ranked and attempted to compute as good a distance as possible between all users, how far would I be away from my friend by hopping on these nodes? There's a lot of information to be gleaned in this sort of friend-of-a-friend collaborative approach.
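A quick sketch of the hopping idea, again in Python. I'm assuming you have already decided which pairs of users count as "close enough" to be linked (by rating distance or whatever you like); the graph and the user names below are invented. A plain breadth-first search then gives the fewest hops between two users.

```python
from collections import deque

def hops(graph, start, goal):
    """Breadth-first search: fewest hops between two users, None if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        user, dist = queue.popleft()
        for neighbor in graph.get(user, ()):
            if neighbor == goal:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None

# Hypothetical graph: an edge means two users rated enough movies alike.
graph = {
    "me": ["alice"],
    "alice": ["me", "bob"],
    "bob": ["alice", "friend"],
    "friend": ["bob"],
    "loner": [],
}

print(hops(graph, "me", "friend"))  # linked through alice and bob
print(hops(graph, "me", "loner"))   # None: no chain of similar users
```

So even though my friend and I share no directly comparable ratings here, we are still connected through other people, and the hop count is one crude measure of how far apart we are.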
Now you need to present this information to the user. Do you just up and recommend him a movie? Do you take Amazon's approach and say "Other people did this -- so should you."? Or do you give them some sort of three-dimensional Flash plot of themselves versus the people nearest to them? Do you allow the user to contact those closest to them? Those farthest away?
My point is that while 15 years of research has been done, that doesn't mean there have been 15 years of testing and implementation, which, when it comes to creating products, is where most of the importance lies.
About no-login links on /. (Score:3, Insightful)
I think it is the reason.
Slashdot can't send thousands of users with a fake referrer to the NY Times. The link you provided is for people who use RSS readers and are subscribed to the NY Times RSS feed.
I think they should talk with the NY Times web team about letting Slashdot readers in with referrer=slashdot, without needing a login. They can surely arrange it; this isn't a "no name" site.
It would be good for the NY Times' statistics too. I bet they currently have to adjust their statistics for the "fake" RSS links coming from Slashdot.
About the "no ads" version: it would be like the NY Times mentioning Slashdot and sending people to some other domain (slashdot sux? I forget) that doesn't carry the Slashdot ads that keep this site running and pay its costs. That also means hundreds of thousands of users.
I am not apologising for the NY Times or trying to start a discussion about advertising; I'm just giving my end-user point of view and some plain guesses.
Remove Artificial Supply Limitations (Score:3, Insightful)
5 star rating is flawed (Score:4, Insightful)
The problem is that that is my rating system. It works for me. But it does little good to anybody else, because they are rating based purely on something else.
I think they need to implement the ability to rate more aspects of the movie. I'm sure some people out there rate the movie poorly if their disc is scratched, or even if the transfer quality is poor. A simple 1 to 5 system doesn't cut it. People rate things that aren't "Was the (romance) plot good?", "Do you like this director?", "Do you like these actors?". People rate things that aren't on the box.
Re:only a million? (Score:2, Insightful)
So, you could take the money from Netflix, use it to start your business, then license it to the other players, too.
Re:Suggestion (Score:3, Insightful)
I was about to mention that I mark things as Not Interested when I own them, to avoid being recommended the rest (usually because I prefer to buy series I like, and rent actual movies), but then I realized that fits into what you said perfectly.
Point conceded.
Re:Privacy issues? (Score:3, Insightful)
Re:Adapting their business model (Score:2, Insightful)
Re:uh huh (Score:1, Insightful)
The problem with "anonymous" search queries is that when aggregated, they provide data that can expose the searcher's identity. Taken as a whole, searches like "Tupelo Mississippi", "STD symptoms", and "Dr. Smith" reveal quite a bit of information about the searcher.
However, the knowledge that user #1234567 rated "Beach Babes From Beyond" 5 stars, and yet gave "The Godfather" a paltry 2, while interesting, does nothing to help me determine who user #1234567 actually is.
Re:Seems like a free gift for Netflix to me... (Score:3, Insightful)