
Amazon's AI Product Reviews Seen Exaggerating Negative Feedback (bloomberg.com) 72
An anonymous reader shares a report: Shopping on Amazon.com has long entailed scrolling through pages and pages of often redundant customer feedback. In an effort to make the task less onerous, the company in August began using artificial intelligence to convert billions of reviews into brief summaries consisting of a few sentences apiece. As is often true with generative AI, the results aren't perfect. In some cases, the summaries provide an inaccurate description of a product. In others, they exaggerate negative feedback. This has potential implications not just for customers, but for Amazon merchants who depend on positive reviews to boost sales. Making matters worse, merchants say, the summaries were deployed just as they were headed into the crucial holiday shopping season -- giving them one more thing to worry about besides inflation-battered shoppers.
Most shoppers can probably tell when the AI has misclassified a product. For example, the home fitness company Teeter sells an inversion table designed to ease back pain. Amazon's AI-generated summary calls it a desk: "Customers like the sturdiness, adjustability and pain relief of the desk." The technology's tendency to overplay negative sentiment in some reviews is less obvious. The $70 Brass Birmingham board game, for instance, boasts a 4.7-star rating based on feedback from more than 500 shoppers. A three-sentence AI summary of reviews ends with: "However, some customer have mixed opinions on ease of use." Only four reviews mention ease of use in a way that could be interpreted as critical. That's fewer than 1% of the overall ratings, yet the negative sentiment accounts for about a third of the AI-generated blurb.
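For a sense of how lopsided that is, here is a rough back-of-the-envelope calculation; the figures (four critical mentions, 500-plus ratings, one negative sentence out of three) come from the article and are only approximate:

```python
# Approximate arithmetic behind the Brass Birmingham example.
critical_reviews = 4
total_ratings = 500            # "more than 500 shoppers", so this is a lower bound
negative_share_of_reviews = critical_reviews / total_ratings   # about 0.8%
negative_share_of_summary = 1 / 3                              # one sentence of three

print(f"Critical share of reviews:  {negative_share_of_reviews:.1%}")
print(f"Critical share of summary:  {negative_share_of_summary:.1%}")
print(f"Over-representation factor: {negative_share_of_summary / negative_share_of_reviews:.0f}x")
```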
Reviews are only barely useful at best (Score:3)
I mostly look at the negative ones and try to make sense of what they are saying. I also assume that all reviews are fake or biased, and try to find some kind of useful pattern. In many cases, useful and informative patterns can be found. Also, the product category is important: high-priced industrial stuff tends to have more useful reviews than low-priced mass-market stuff.
Re:Reviews are only barely useful at best (Score:5, Insightful)
"Amazon sent me the wrong item. 1 star"
Re: (Score:3)
Re:Reviews are only barely useful at best (Score:5, Insightful)
My favorite is when they say things like "1 star, arrived late". You just reviewed UPS, not the product.
Re: (Score:3)
"Haven't received yet, but I'm really excited." 4 stars. You're so excited, you just reviewed yourself. Settle.
Re: (Score:3)
Re:Reviews are only barely useful at best (Score:4, Funny)
Re: (Score:2)
What positive things are mentioned and at what frequency over what period of time.
What number of posts have high ratings but fail to mention any positive features.
What negative things are mentioned and at what frequency over what period of time.
That would help me decide and possibly detect bots/terrible reviews.
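A rough sketch of that kind of tally, assuming each review is already available as a record with a rating, date, and text; the field names and keyword lists here are made up for illustration, and a real version would need something better than keyword matching:

```python
from collections import Counter
from datetime import date

# Hypothetical review records; a real tool would pull these from a scraper or API.
reviews = [
    {"rating": 5, "date": date(2023, 6, 1), "text": "Sturdy and easy to assemble."},
    {"rating": 5, "date": date(2023, 6, 3), "text": "Great!!!"},
    {"rating": 1, "date": date(2023, 7, 9), "text": "Motor died after two weeks."},
]

POSITIVE = {"sturdy", "easy", "great", "reliable"}
NEGATIVE = {"died", "broke", "flimsy", "late"}

positive_mentions = Counter()   # (month, term) -> count
negative_mentions = Counter()
high_rated_but_vague = 0        # high rating, but no concrete positive feature named

for r in reviews:
    words = set(r["text"].lower().replace(".", " ").replace("!", " ").split())
    month = r["date"].strftime("%Y-%m")
    hits_pos, hits_neg = words & POSITIVE, words & NEGATIVE
    for w in hits_pos:
        positive_mentions[(month, w)] += 1
    for w in hits_neg:
        negative_mentions[(month, w)] += 1
    if r["rating"] >= 4 and not hits_pos:
        high_rated_but_vague += 1

print("Positive mentions by month/term:", dict(positive_mentions))
print("Negative mentions by month/term:", dict(negative_mentions))
print("High ratings that name no positive feature:", high_rated_but_vague)
```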
Re: (Score:2)
It's just bots all the way down, just make the AI recommend you a product at that point
Re: (Score:3)
Negative reviews give the best hints about whether a supposedly good product is actually good.
Same applies to restaurants. If the complaints are about a misunderstanding, where the restaurant actually cooked a dish correctly but the guest is used to the "wrong" way, it can really tell you what level the restaurant is at.
Re: (Score:2)
Or when the customer leaves a negative review because they couldn't get a seat without a reservation on Valentine's day.
Grandma's recipe not Taco Bell's (Score:4, Funny)
complaints are about a misunderstanding where the restaurant actually cooked a dish correctly but the guest is used to the "wrong" way
Not an online review, but an angry woman at the register complaining about her tacos not having any lettuce or cheese. The owner replied with "we use my grandmother's recipe not Taco Bell's".
Re: (Score:2)
Re: (Score:2)
What seems strange to you about reviews for banks? Given how important that relationship tends to be, I would think you'd want to hear about the experiences of existing customers before you opened an account.
Re: (Score:2)
Re: (Score:2)
Re: (Score:3)
Yup. I look at patterns over time periods and whether the review is using marketing speak, e.g. "This brand delivers value!" Even with things like apartment reviews you can find a bunch of negative reviews, but they may be from many years ago and clustered together, reflecting a possible change in management. I also have to assume that most people who are happy with a product don't leave reviews, so a few negative ones are usually outliers.
You can't just go on stars either. If the product has a large number of revi
Re: (Score:2)
Negative reviews/rebuttals useful in politics too (Score:2)
I mostly look at the negative ones and try to make sense of what they are saying.
I find it useful to evaluate the worst the reviews can say. Sometimes the worst complaints tell you the product is pretty good, if "that" is about as bad as it gets.
This approach works for voting too. My state issues a voter's guide with arguments for and against, but the most meaningful content is the rebuttals to the arguments for and against. Less BS and hand-waving in the rebuttals, usually.
Be sure to consider review dates (Score:2)
I mostly look at the negative ones and try to make sense of what they are saying.
Be sure to include looking at the dates of the reviews. A bunch of reviews claiming the thing died after a couple of weeks of use, all in 2017, nothing like those since then. OK, they had manufacturing problems one year long ago.
Re: (Score:2)
I bought this for my husband on my husband day and he loves it@!!!111
The critical reviews are the most important (Score:5, Interesting)
But it is the negative reviews that are the most important. [xkcd.com] When I look for reviews, I order by the star ratings then pick the 3-star reviews. The 1-star reviews tend to be "It was broken in shipment" and the 5-star reviews are "I can't wait for this product to arrive!" It's the balanced reviews and the negative reviews that tell me the good and bad of the product. So picking out the small number of negative reviews is probably the right thing to do. And especially since we are comparing them to other products, which may also have negative reviews too. So it isn't like this is happening in a vacuum.
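A minimal sketch of that sorting habit, assuming reviews come in as (stars, text) pairs; the sample data is made up:

```python
# Skip the 1-star shipping gripes and the 5-star "can't wait" posts;
# pull out the mid-range reviews, which tend to carry the substance.
reviews = [
    (1, "It was broken in shipment."),
    (5, "I can't wait for this product to arrive!"),
    (3, "Solid build, but the battery only lasts about two hours."),
    (3, "Works fine; the manual is useless, figure it out from YouTube."),
]

for stars, text in reviews:
    if stars == 3:
        print(text)
```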
Re: (Score:2)
Imagine if there was a good version of this AI.
"Several reviewers had the item broken during shipping but were able to get a replacement item sent with no trouble. The second shipment did on average take a while so be prepared for a possibly longer wait."
Re: (Score:2)
Yes, that would be better.
Re: (Score:2)
>5-star reviews are "I can't wait for this product to arrive!"
Omg these trigger me. Then there are the people who think they are sending a personal message to someone in the Q&A section. "I don't know, I haven't gotten it yet." Must be their first day on the internet.
Re: (Score:2)
Some smartphone app developers must be thinking of people like that when they set the program to ask for a five-star review within the first five minutes of it being used.
Re: (Score:2)
That generally results in a 1-star "why are you annoying the shit out of me already" review.
Re: (Score:2)
Re: The critical reviews are the most important (Score:3)
The "I don't know" responses stem from folks getting a notification to answer a question. They treat it as if a person had asked them directly, rather than realizing they are posting a public answer to the question.
Re: (Score:2)
Then there are the people who think they are sending a personal message to someone in the Q&A section. "I don't know, I haven't gotten it yet." Must be their first day on the internet.
I used to think that. These days I think that is just the level of "insight" far too many people operate on.
Re: (Score:2)
The problem is that what you are describing is more like reading more than a couple of reviews, which gathers more info in a more balanced way than what that LLM is doing. It's more useful to skim-read what people are actually saying. It would be useful, though, if some LLM corrected grammar and spelling in reviews. Reading reviews in my language hurts, especially when I know that the people writing them learned grammar for at least eight years and are mostly just being lazy.
LLM should read psych papers on reviewers (Score:2)
Re: (Score:2)
But it is the negative reviews that are the most important. [xkcd.com] When I look for reviews, I order by the star ratings then pick the 3-star reviews. The 1-star reviews tend to be "It was broken in shipment" and the 5-star reviews are "I can't wait for this product to arrive!" It's the balanced reviews and the negative reviews that tell me the good and bad of the product. So picking out the small number of negative reviews is probably the right thing to do. And especially since we are comparing them to other products, which may also have negative reviews too. So it isn't like this is happening in a vacuum.
I'd agree on this. I also think negative reviews tend to be more detailed, since the reviewer has something specific to talk about.
Honestly, the big thing missing from Amazon reviews is the price. You're likely to rate a $50 item more generously than a $75 item since you'd expect a bit more quality for that extra $25.
But you don't actually know the history of the price or how much the previous reviewers paid for the item. So you might buy the $75 item thinking it had 4.5 stars as a $75 item, but the price went
Re: The critical reviews are the most important (Score:2)
Camelcamelcamel. Has a browser extension even.
Re: (Score:3)
Camelcamelcamel. Has a browser extension even.
Will do.
As further evidence of what I was talking about, I'm looking at a $200 speaker with a 4.6-star rating that's 70% off for a 1-day sale [amazon.ca]; that looks like a really good deal.
So I decided to look for other 3rd party reviews, oh, it was 73% off on June 1st [thestreet.com] and 70% on November 2nd [yahoo.com] and 75% off a year ago [reddit.com].
So are they 4.6 stars for a $200 speaker or 4.6 stars for a $60 speaker?
Re: (Score:2)
Exactly. Most reviews are useless and you find much more of the few useful ones in the midrange and low range. That said, I doubt "AI" can identify actually useful reviews because that requires fact-checking capabilities.
Re: (Score:2)
But it is the negative reviews that are the most important. [xkcd.com] When I look for reviews, I order by the star ratings then pick the 3-star reviews. The 1-star reviews tend to be "It was broken in shipment" and the 5-star reviews are "I can't wait for this product to arrive!" It's the balanced reviews and the negative reviews that tell me the good and bad of the product. So picking out the small number of negative reviews is probably the right thing to do. And especially since we are comparing them to other products, which may also have negative reviews too. So it isn't like this is happening in a vacuum.
Negative reviews are often as bad as positive ones. A lot of them are just pointless complaining about a perceived slight.
The problem with user reviews is that we have no point of reference with the reviewer, so no way to compare their experiences to what we can expect.
The other problem is with the way review systems are run, they are inherently untrustworthy. As soon as you order something you get hounded to give a review even if you haven't received the product yet, let alone had time to use it, lea
Re: (Score:2)
Agreed. And sometimes we must write the review within 30 days, which isn't enough time to know if it will be reliable.
Re: (Score:2)
Negative reviews are often as bad as positive ones, A lot of them are just pointless complaining about a perceived slight.
That's fine. The AI will then put a perceived slight into the summary, and the reader will decide if that slight matters to them or not.
"A+ perfect cable, great signal. 5/5 stars."
"Awful: the insulation had a weird texture. 1/5 stars"
"Good snug connector, thick insulation. 5/5 stars."
"Worked. 5/5 stars."
"No problem. 5/5 stars."
"Great. 5/5 stars."
Summary:
"Great signal, snug connection, but the insulation has a weird texture."
^^ What's wrong with that? I don't want the AI to filter pointless complaints.
AI Reviews (Score:2)
I started noticing the AI reviews on Amazon a couple weeks ago. All I can say is that they're comically bad. How long until Amazon has a lawsuit on their hands for misrepresenting products carried on their site, either by consumers, or by vendors?
Re: (Score:2)
But that's where this gets hairy. This is Amazon using AI to generate content about a product, and like the summary says, it is not fully indicative of users' ACTUAL reviews. This gets into an untested legal gray area.
"Who taught you to do this?" (Score:3)
At least it admits to being machine generated (Score:3)
Re: (Score:3)
The ones I like are the ones which repeat the product name several times in the review as well as use the exact same wording from the company site.
"This Smth Corona typewriter is the easiest to use typewriter ever! Smith Corona always has the best products and I will never buy anything other than Smith Corona. With their patented anti-sticking keys I know my Smith Corona typewriter won't fail me when I need it most."
Re: (Score:2)
Re: (Score:2)
Could this be part of a SEO effort?
Re: (Score:2)
Re: (Score:3)
Some are written by people paid to write reviews who just copy-paste them. I was going to make a comment about how bad English might be a sign of this kind of review, but then I read some of my own comments on this site.
Re: (Score:2)
Question (Score:3)
Do these AI-generated reviews only apply to non-Amazon products, or do Amazon products get the same treatment?
Re: (Score:2)
I searched several products when I first noticed this "feature" - Amazon Basics products tend to have a more favorable AI-generated "review" compared to competitors. This of course was a very small sample size that I checked by hand, so don't take it as absolute fact. But there was a very noticeable lack of negative wording at all in Amazon Basics products.
How do you represent 1% of reviews in a blurb? (Score:2)
Re: (Score:3)
Brass Birmingham does have ease of use issues in the sense that the first time you play it you'll probably realise halfway through that you've been getting some rule badly wrong. Probably the second and third times as well. It's still a good game.
Re: (Score:2)
This game asks you to make painful choices. Amazon AI determined that it would be 5 stars if that wasn't the case.
Meanwhile, at boardgamegeek: 3.89 / 5 complexity rating.
GIGO (Score:2)
Amazon reviews are a pointless waste of time. Most are not even for the products being sold; others are written with the intent to sabotage rivals or prop up sales. Reviews for obvious scams involving worthless garbage, such as solar-powered molecular heaters, have a majority of people selecting 5-star evaluations. Amazon is a cesspool of bullshit and fraud.
This is a good good thing. (Score:2)
If customers complain then the AI might not be working, but if the seller is complaining I think the AI might be regurgitating buried truths, haha too bad.
negative is positive (Score:2)
Most of the time, only negative reviews are useful. First, look at the percent of negative reviews. If it's over about 5%, there is probably an issue, but only if there are a large total number of reviews, i.e. decent sample size. If the negative reviews all cite the same or similar problem, then they are probably right, and you can figure there is a good chance you will encounter it also, since most other victims don't waste their time writing reviews. You can really see this effect with disk drives.
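A hedged sketch of that heuristic, assuming you already have the star breakdown and the text of the low-star reviews; the 5% cutoff and minimum sample size are the rules of thumb from the comment above, and the "same complaint" check here is just naive word counting:

```python
from collections import Counter

def flag_product(star_counts, negative_texts, min_reviews=100, threshold=0.05):
    """star_counts: {stars: count}; negative_texts: texts of the 1- and 2-star reviews."""
    total = sum(star_counts.values())
    negative = star_counts.get(1, 0) + star_counts.get(2, 0)
    if total < min_reviews:
        return "sample too small to judge"
    if negative / total <= threshold:
        return "negative share is within normal noise"
    # Crude stand-in for "do the complaints cite the same problem": recurring words.
    words = Counter(w for text in negative_texts for w in text.lower().split() if len(w) > 3)
    recurring = [w for w, n in words.most_common(5) if n > 1]
    return f"likely a real issue; recurring complaint terms: {recurring}"

print(flag_product({5: 400, 4: 60, 1: 40},
                   ["drive died after a month", "clicking noise, then it died"]))
```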
So
Re: (Score:3)
Most of the time, only negative reviews are useful. First, look at the percent of negative reviews. If it's over about 5%, there is probably an issue, but only if there are a large total number of reviews, i.e. decent sample size. If the negative reviews all cite the same or similar problem, then they are probably right, and you can figure there is a good chance you will encounter it also, since most other victims don't waste their time writing reviews. You can really see this effect with disk drives.
Sometimes, other reviews are useful if they give details such as actual instructions as opposed to the none supplied by the Chinese mfr, or if they explain why the negative reviews were user error.
I haven't seen the AI summaries, but unless they offer citations to their claims, I would discount them.
Of course, if you really want reliable data, you should look at the return rate. Amazon won't provide that, of course, but that's what would be most valuable to consumers — whether it sucked enough early enough for people to ship it back.
Re: (Score:3)
The thing is that quite often it is the negative reviews that explain things, give facts and actually justify the review. Sure, not all (or even most) negative reviews are like that. But you should read at least the negative reviews. Sometimes you find they literally tell you why not to buy, sometimes they just show that some insight into the product is needed (and the review is bogus) and sometimes they tell you that a specific use runs into specific issues. Sometimes they are even "good product, but bad deliver
For once the AI seems accurate (Score:2)
The review summaries are bad, inaccurate and overwhelmingly negative. That's pretty much exactly what Amazon reviews are once you discard the fake ones.
Re: (Score:2)
The review summaries are bad, inaccurate and overwhelmingly negative. That's pretty much exactly what Amazon reviews are once you discard the fake ones.
Or at least when you discard the content-free reviews (which need not be fake to be useless).
The thing is, what I want is to see everything that more than one reviewer pointed out, whether positive or negative, that doesn't appear in the product description, along with interesting use cases that don't appear in the product description, and the ability to adjust the threshold to say that there's too much crap, so only show me things that at least five people said, or ten, or a hundred. But put the person br
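A rough sketch of that adjustable-threshold idea, assuming each review has already been boiled down to a list of short "points" (a real system would need actual language processing to extract and deduplicate them; these are hand-written):

```python
from collections import Counter

# Hypothetical extracted points, one list per review.
review_points = [
    ["battery lasts two days", "case feels cheap"],
    ["battery lasts two days", "great screen"],
    ["case feels cheap", "battery lasts two days"],
    ["great screen"],
]

def recurring_points(points_per_review, min_mentions=2):
    counts = Counter(p for points in points_per_review for p in points)
    return [(p, n) for p, n in counts.most_common() if n >= min_mentions]

# Tighten or loosen the threshold depending on how much noise you can stand.
print(recurring_points(review_points, min_mentions=2))
print(recurring_points(review_points, min_mentions=3))
```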
Re: (Score:2)
I totally agree
et tu Brute? (Score:3)
The summary spends half of its time talking about poor summaries, pointing out that, in one single case, negative feedback accounting for less than 1% of reviews occupied a third of the AI-generated summary.
But what fraction of the AI summaries are actually bad? Did TFS do the same thing it criticizes Amazon's automated tool for doing and over-represent cherry-picked negative outcomes?
GIGO (Score:2)
An ages-old problem when non-intelligent data processing systems are used to simulate insightful behavior.
Amazon is very unfriendly to sellers (Score:2)
inaccurate summary (Score:1)