How an Empty S3 Bucket Can Make Your AWS Bill Explode (medium.com) 70
Maciej Pocwierz, a senior software engineer at Semantive, writing on Medium: A few weeks ago, I began working on the PoC of a document indexing system for my client. I created a single S3 bucket in the eu-west-1 region and uploaded some files there for testing. Two days later, I checked my AWS billing page, primarily to make sure that what I was doing was well within the free-tier limits. Apparently, it wasn't. My bill was over $1,300, with the billing console showing nearly 100,000,000 S3 PUT requests executed within just one day! By default, AWS doesn't log requests executed against your S3 buckets. However, such logs can be enabled using AWS CloudTrail or S3 Server Access Logging. After enabling CloudTrail logs, I immediately observed thousands of write requests originating from multiple accounts or entirely outside of AWS.
Was it some kind of DDoS-like attack against my account? Against AWS? As it turns out, one of the popular open-source tools had a default configuration to store their backups in S3. And, as a placeholder for a bucket name, they used... the same name that I used for my bucket. This meant that every deployment of this tool with default configuration values attempted to store its backups in my S3 bucket! So, a horde of misconfigured systems is attempting to store their data in my private S3 bucket. But why should I be the one paying for this mistake? Here's why: S3 charges you for unauthorized incoming requests. This was confirmed in my exchange with AWS support. As they wrote: "Yes, S3 charges for unauthorized requests (4xx) as well[1]. That's expected behavior." So, if I were to open my terminal now and type: aws s3 cp ./file.txt s3://your-bucket-name/random_key. I would receive an AccessDenied error, but you would be the one to pay for that request. And I don't even need an AWS account to do so.
Another question was bugging me: why was over half of my bill coming from the us-east-1 region? I didn't have a single bucket there! The answer to that is that the S3 requests without a specified region default to us-east-1 and are redirected as needed. And the bucket's owner pays extra for that redirected request. The security aspect: We now understand why my S3 bucket was bombarded with millions of requests and why I ended up with a huge S3 bill. At that point, I had one more idea I wanted to explore. If all those misconfigured systems were attempting to back up their data into my S3 bucket, why not just let them do so? I opened my bucket for public writes and collected over 10GB of data within less than 30 seconds. Of course, I can't disclose whose data it was. But it left me amazed at how an innocent configuration oversight could lead to a dangerous data leak! Lesson 1: Anyone who knows the name of any of your S3 buckets can ramp up your AWS bill as they like. Other than deleting the bucket, there's nothing you can do to prevent it. You can't protect your bucket with services like CloudFront or WAF when it's being accessed directly through the S3 API. Standard S3 PUT requests are priced at just $0.005 per 1,000 requests, but a single machine can easily execute thousands of such requests per second.
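As a rough sanity check on the figures quoted above, here is a minimal Python sketch of the request arithmetic. The only price used is the $0.005 per 1,000 PUT requests from the article; the function names and the 1,000 requests-per-second rate are assumptions for illustration, and the sketch ignores the redirect charges the author mentions, so it is not a reconstruction of the actual bill.

```python
# Back-of-the-envelope request cost, using only the price quoted in the article.
PUT_PRICE_PER_1000_USD = 0.005  # "$0.005 per 1,000 requests" (from the article)
SECONDS_PER_DAY = 24 * 60 * 60


def put_request_cost(requests: int, price_per_1000: float = PUT_PRICE_PER_1000_USD) -> float:
    """Cost in USD of a given number of PUT requests at a flat per-1,000 price."""
    return requests / 1000 * price_per_1000


def requests_per_day(requests_per_second: int) -> int:
    """Number of requests a client sends in one day at a steady rate."""
    return requests_per_second * SECONDS_PER_DAY


# Nearly 100,000,000 PUT requests in one day, as reported on the billing console.
print(f"{put_request_cost(100_000_000):.2f}")

# Assumed rate: one machine sending 1,000 requests per second for a full day.
print(f"{put_request_cost(requests_per_day(1_000)):.2f}")
```

At the quoted price, the PUT requests by themselves come to about $500 per 100 million, which fits the author's remark that a large share of the bill came from somewhere else (the us-east-1 redirects).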
The real lesson here (Score:5, Insightful)
Create a cloud account at your own risk!
When you're subject to usage pricing, you never know what sort of unintended interaction will cause your bill to go nuts.
I would not be surprised if this weren't the only example of somebody not doing anything wrong, and yet incurring huge charges for things they didn't do.
Re: (Score:3)
Amazon might be on to something! I am thinking of charging my customers for every HTTP request my servers handle. Then, I'll set up bots to make HTTP requests to their web sites. Then, infinite profits!
Re: (Score:3)
I mean sure, if you want to spend time in jail for fraud your plan is brilliant.
Re: (Score:2)
Correct, thinking of it twice, I don't even need to charge by HTTP request since you say it's illegal. I simply need to charge for bandwidth usage just like Amazon does.
Re: (Score:1)
Did you remember to get it translated from the Chinese into English by a Czech who speaks neither Chinese nor English? "Golden fingers" time.
Re: (Score:2)
All you need to cause problems is to flood their cloud accounts and hope that they get billed by the request.
Re: (Score:2)
The small engineering team I used to be on would sometimes get $30k bills one month but sub-$1k another. Managers were obviously upset with us, but it's not like we were doing it on purpose. We were just reproing bugs and developing tests to fix issues, mostly customer reported ones that needed some peculiar AWS setup. In the end we make the money back when we continue our relationship with customers, but managers didn't like my excuse of "cost of doing business".
I feel like working out Amazon's billing mag
Was the offending tool written by an Amazon dev? (Score:4, Interesting)
Perhaps you have a remedy there on the excessive charges.
Re:How to be an idiot (Score:5, Informative)
That should be the name of the article. Logging or not, dumb fuck left his S3 bucket publicly accessible and writable. Just another "tech writer" on medium without any tech chops.
Read closer. Initially it was not writable. Just the attempts were being charged to their account.
Re: (Score:3)
It sounds like you didn't make it to the second paragraph, but it's also possible that in AWS-land "private" means "publicly accessible and writable?"
If so, you can't really blame the author for the confusion.
at some time dident all users = any one in AWS or (Score:2)
At some time, didn't "all users" mean anyone in AWS, or anybody online, and not just any user in your domain?
Re: at some time dident all users = any one in AWS (Score:1)
Re: (Score:3)
That's very nice of you to provide a demonstration on how to be an idiot. My key takeaway on your demonstration of "how to be an idiot" is to act super duper smart but be incapable of reading.
Re: (Score:3, Informative)
That should be the name of the article. Logging or not, dumb fuck left his S3 bucket publicly accessible and writable. Just another "tech writer" on medium without any tech chops.
Lesson 1: Anyone who knows the name of any of your S3 buckets can ramp up your AWS bill as they like.
No, they can't if you know what you are doing. This guy clearly has no tech know-how at all and is simply trying to make a big todo about nothing at all. There are plenty of beginner AWS classes out there, he should take a few of them.
No, you are wrong. That is **NOT** what happened. His S3 bucket was **NOT** publicly accessible and writable. It just happened to have the same name as the default name used by some shitty "open source tool".
His S3 bucket was hit by hundreds of millions of requests, **THE REQUESTS WERE REJECTED** because they were unauthorized, but he was still billed for them anyway. This is a major fuckup by Amazon.
Re: (Score:2)
If you hosted everything yourself, on your own hardware, you'd still have to pay for rejected requests... something has to process them.
Re: (Score:3)
Re: (Score:2)
for which the customer pays directly. That sounds like Amazon, indeed.
Not New (Score:3)
Also, make sure to set the max compute instances on your AWS account. Because having unhappy customers is a slower path to bankruptcy.
Goldmine (Score:2)
It might be "worth it" for a criminal to allow the backups and harvest private data.
Now skulduggerous types will be executing GitHub searches for default bucket names and putting filters in for .bitcoin directories, browser password files, and similar nuggets of gold.
I'm glad I pay for vm's by the data rate available!
Re: (Score:2)
From a security perspective, this looks REALLY bad. I wonder what the processing order is for bucket collisions?
The cynic in me wonders if Amazon did this so they could charge arbitrary amounts whenever they want. It looks like a wonderful way to keep your financials appearing rosy.
Re: (Score:2)
From a security perspective, this looks REALLY bad. I wonder what the processing order is for bucket collisions?
The cynic in me wonders if Amazon did this so they could charge arbitrary amounts whenever they want. It looks like a wonderful way to keep your financials appearing rosy.
I find this part a bit troubling:
"Yes, S3 charges for unauthorized requests (4xx) as well[1]. That's expected behavior."
To me that reads, "We know you may screw up, but we DEFINITELY screwed up. And if we can find a way, we're going to charge you for it."
do some collect call systems bill an fee for even (Score:2)
Do some collect call systems bill a fee even when the call is unauthorized and the person being called says no?
Re: (Score:2)
Do some collect call systems bill a fee even when the call is unauthorized and the person being called says no?
Not as far as I've ever seen. If I say no, which I have here or there, I never see a bill for it. Granted, I haven't had a collect call in several years, so that may have changed with the way the current crop of telecom companies work. There's probably some auto-triggered "service fee" for enabling the automatic collect call system or some nonsense.
Cost-based shutdown (Score:5, Insightful)
This is why cloud service should have a cost-based shutdown option.
It's easily /the/ most obvious piece of functionality that should have been put in place on the second day of coding things like AWS and Azure. And yet it is missing. The reason is obvious: the companies owning those cloud services make money by denying people this option. Specifically individuals using the environments for development, or perhaps just learning, and who obviously do not have the means to legally fight such companies.
And it's really not that difficult: make resources unavailable from the internet, or stop or outright deprovision them, when usage leading to a cost above some limit is detected.
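To make the idea concrete, here is a minimal sketch of the kind of threshold check being asked for. Everything in it is hypothetical: `BudgetGuard`, its fields, and the "ok"/"alert"/"shutdown" actions are illustrative names rather than any AWS or Azure API, and the default price is simply the PUT price quoted in the article.

```python
from dataclasses import dataclass


@dataclass
class BudgetGuard:
    """Hypothetical cost guard: tracks request counts and suggests an action."""

    limit_usd: float                   # hard spending limit chosen by the user
    price_per_1000_usd: float = 0.005  # PUT price quoted in the article
    alert_fraction: float = 0.8        # assumed: warn at 80% of the limit
    requests_seen: int = 0

    def record(self, request_count: int) -> None:
        """Add newly observed requests to the running total."""
        self.requests_seen += request_count

    def estimated_cost_usd(self) -> float:
        """Estimated spend so far at the flat per-1,000 price."""
        return self.requests_seen / 1000 * self.price_per_1000_usd

    def action(self) -> str:
        """Return 'ok', 'alert', or 'shutdown' based on estimated spend."""
        cost = self.estimated_cost_usd()
        if cost >= self.limit_usd:
            return "shutdown"
        if cost >= self.alert_fraction * self.limit_usd:
            return "alert"
        return "ok"


guard = BudgetGuard(limit_usd=10.0)
guard.record(1_000_000)
print(guard.action())  # ok
guard.record(700_000)
print(guard.action())  # alert
guard.record(500_000)
print(guard.action())  # shutdown
```

The check itself is the easy part. What "shutdown" would actually do on the provider side, and whether requests against a shut-down resource stop being billed, is exactly what the rest of this thread argues about.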
Re:Cost-based shutdown (Score:5, Interesting)
This is why cloud service should have a cost-based shutdown option.
They do, all sorts of metric quotas and handler actions.
One big detail that jumped out at me was:
"Other than deleting the bucket, there's nothing you can do to prevent it. You can't protect your bucket with services like CloudFront or WAF when it's being accessed directly through the S3 API"
A shutdown bucket can still have queries made against it.
The response is "400 bad request - Invalid target"
A billable event
Worse, redirect requests aside, all errors that are due to your customer configuration, no matter how critical, are 400 errors.
Amazon reserves 500 errors for critical infrastructure failures, aka problems on their end.
I'm betting the only reason deleting the bucket works to stop this, given that a PUT to a deleted bucket (invalid name) is also a 400 error, is that without a valid token (unauthenticated) Amazon has no way to know who to bill...
Re: (Score:3)
This needs to change.
If there is a legitimate business need to bill a 4xx error to the owner of the bucket, I'd like to hear the rationale.
Re: (Score:1)
It hits the infrastructure of the service provider.
So the hit itself costs computing power and electricity.
And then all 40x errors together might trigger human investigation which might be costly.
On the other hand, you could say they should be included in the basic operation costs and hence somehow covered by base fees or something.
Re: (Score:2)
That's true, but the act of returning an error to an API call is infinitesimally small. I feel like they might be overcharging for failed API calls. I mean, technically, they aren't "failed" but they are "errors."
I use a half-dozen other S3-compatible storage providers and only one of them charges API fees and they are a tiny fraction of what AWS charges.
Re: (Score:2)
It's easily /the/ most obvious piece of functionality
For whom? Why is this in the interest of Amazon to implement?
Re: (Score:1)
As I have shared in several other venues, we agree that customers should not have to pay for unauthorized requests that they did not initiate. We’ll have more to share on exactly how we’ll help prevent these charges shortly.
Do any other storage systems not bill for unauth? (Score:1)
This article made me wonder, do any other storage providers not bill for unauthorized access attempts? That seems like a pretty big potential billing hazard.
I can see why providers might charge for that, it is taking up resources, but I would also hope providers would do some work to help drop unauthorized requests in a way that would not bill you, or maybe some very minimal fee.
At the very least a good lesson to monitor an S3 bucket access logs as soon as you create one to make sure nothing else is pointi
I am going to roasted for this but is Amazon wrong (Score:2)
Let's say you hosted your own web infrastructure. Chances are your bandwidth costs are fixed and so is the size of the total pipe. If you start getting 'packeted' you will:
still be on the hook for the extra power because that CPU never gets to idle, as you have to keep pumping out the 404s
still have your other finite resources like log storage consumed
be facing a potentially costly loss of business if the request rate is high enough that it effectively DoSes you. That might be more than the AWS bill depending on what you
Re:I am going to roasted for this but is Amazon wr (Score:4, Interesting)
I'd classify it as "overhead" to AWS rather than any single account. Individual accounts don't get billed for unauthorized access attempts, at least not directly. AWS totals up the cost of handling those unauthorized requests and factors it into the costs of their services. Then those costs get spread across all accounts, so for each account it amounts to a fixed percentage of their bill that goes to paying for "overhead".
Or permit us to isolate the service endpoints behind a firewall or vnet, like I'd do going the traditional route, so the Internet at large couldn't hit them with unauthorized requests.
Re: (Score:2)
That would be a reasonable approach, I agree, but it's not without its own set of challenges.
Now Amazon would have powerful incentives to cut off any clients who are, say, controversial and likely to trigger DoS attacks etc. We'd be able to add being 'unhosted' to the list of being unbanked, deplatformed, and canceled. You'd expose every client to the heckler's veto, no matter how deep their pockets.
I am not sure that is a good thing either.
Re: (Score:3)
Or permit us to isolate the service endpoints behind a firewall or vnet, like I'd do going the traditional route, so the Internet at large couldn't hit them with unauthorized requests.
THIS!
I don't know if S3 has this as an option or not (based on the comments, it's not an option). If it's not an option / if there is no way to prevent random and unauthenticated users from abusing this, IMO Amazon should be eating this cost. If they can't/won't provide the user the means of preventing it, the user shouldn't be responsible for those invalid requests.
One can deploy their own S3 compatible service on a VM in the cloud, and put that VM on an internal network (RFC1918). That would do the trick,
Re: (Score:2)
There is a way to create an S3 endpoint inside a VPC that only your internal services and Amazon CloudFront can access.
This is how to avoid 4xx "attacks" like OP describes, but I feel like in this circumstance it's a freak accident.
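For anyone wondering what that looks like in practice, below is a minimal sketch that builds a bucket policy denying every request that does not arrive through a given VPC endpoint, using the `aws:SourceVpce` condition key. The helper name, bucket name, and endpoint ID are placeholders I made up. Note that this exact policy would also deny CloudFront, and nothing in this thread settles whether requests rejected by such a policy are still billed, so treat it as an access-control sketch only.

```python
import json


def vpce_only_policy(bucket: str, vpce_id: str) -> dict:
    """Build a bucket policy that denies all S3 actions unless the request
    comes through the given VPC endpoint (hypothetical helper)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnlessFromVpce",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",    # the bucket itself
                    f"arn:aws:s3:::{bucket}/*",  # every object in it
                ],
                "Condition": {
                    "StringNotEquals": {"aws:SourceVpce": vpce_id}
                },
            }
        ],
    }


# Placeholder bucket name and endpoint ID, for illustration only.
policy = vpce_only_policy("example-bucket", "vpce-0123456789abcdef0")
print(json.dumps(policy, indent=2))
```

A blanket deny like this also locks out the console and anything else outside the VPC, so it is worth trying on a throwaway bucket first.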
Re: (Score:2)
Thank you!
This is how to avoid 4xx "attacks" like OP describes, but I feel like in this circumstance it's a freak accident.
Agreed, though I wonder if Amazon is aware of this instance and if they blocked others from creating an S3 bucket with this name, since there is this known issue of tons of clients using it as a default. IMO, it does seem like something that should have a better default (defaulting to only visible to your stuff) or should not result in charges for the user for unauthorized attempts from random clients/addresses. As is, it feels adjacent to security through obscurity.
Re: (Score:2)
I guess what I am saying is that there is an actual cost to events like this and someone has to pay them.
Yes, it's reasonable for a provider to charge for this overhead. However, it sounds bad. Amazon can get away with charging the customer because he didn't know about it; hence, no bad publicity. Well, at least no bad publicity until Slashdot.
Now, if Microsoft or Google provided some sort of protection for this (e.g., an option for a max limit for these unauthorized requests coupled with an email alert) or even just absorbed the costs and pointed out how Amazon doesn't, that could be a competitive advantage.
"Bezos! Got them bezos! First one's free!" (Score:5, Interesting)
"Okay, I can't find a job in software that doesn't require me to know AWS backwards and forwards, so I guess you got me. I'm signing up for the free tier."
"Excellent. Let's get you set up. We'll just need a credit card number."
"Wait, why do you need a credit card number for the free tier?"
[points and screeches like a pod person]
Re: (Score:3)
I guess it's fine.. but you'd better name all your S3 Buckets, and any images you create with max length names that come out of a RNG, so nobody can guess your resource names.
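If you want to take that literally, a minimal sketch of generating such a name follows. It assumes S3's general-purpose bucket naming limits (3 to 63 characters; lowercase letters, digits, dots, and hyphens; starting and ending with a letter or digit) and sticks to lowercase letters and digits only. The function name is made up for illustration.

```python
import secrets
import string

# Lowercase letters and digits only: a conservative subset of what S3 allows.
ALPHABET = string.ascii_lowercase + string.digits
MIN_BUCKET_NAME_LENGTH = 3
MAX_BUCKET_NAME_LENGTH = 63


def random_bucket_name(length: int = MAX_BUCKET_NAME_LENGTH) -> str:
    """Return an unguessable bucket name from a cryptographically secure RNG."""
    if not MIN_BUCKET_NAME_LENGTH <= length <= MAX_BUCKET_NAME_LENGTH:
        raise ValueError("bucket names must be 3 to 63 characters long")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(random_bucket_name())
```

This only makes the name hard to guess. It does nothing once the name leaks into a config file, a URL, or somebody's public repository.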
Re: (Score:2)
Actually, naming of resources is quite a tricky business - there are a few resources that, if guessed, can compromise your security (usually if other conditions allow). Name them carefully, and then keep the names secret, and make sure all your other security is in place too. S3 is the oldest and crustiest of their resources though, so by the looks of it, it's the worst by a good margin.
Then again, there are things like AWS account numbers, which you can't name or change but you have to use in a number of places
Missed opportunity (Score:2)
Don't leave your S3 bucket empty (Score:2)
Always have some S3 Virges, some trios and a savage or two in it.
What about the random part? (Score:2, Insightful)
As it turns out, one of the popular open-source tools had a default configuration to store their backups in S3. And, as a placeholder for a bucket name, they used... the same name that I used for my bucket. This meant that every deployment of this tool with default configuration values attempted to store its backups in my S3 bucket!
So, if I were to open my terminal now and type: aws s3 cp ./file.txt s3://your-bucket-name/random_key. I would receive an AccessDenied error, but you would be the one to pay for that request.
If there is a "random_key" in your path, how are other people defaulting to that?
Re:What about the random part? (Score:4, Informative)
The random_key is just a meaningless file name they are trying to upload to the bucket.
The upload will always be denied because they are not an authorized user, but you are charged for one request anyway.
This is like a shared hosting provider where one of the billing line items is a charge for every time you log in to the console, except they decide to also charge for every failed SSH login attempt that gives your username.
Some poor soul decided to pick "root" as their username, and gets a bill for $10000 due to all the SSH brute forcers out there scanning every IP on the internet.
Re: (Score:2)
There isn't necessarily a random key in the path. I just checked and you can download some files we make publicly available in one of our buckets and the form is:
https://regionname.amazonaws.c... [amazonaws.com]
That's a "GET" request.
Our AWS account ID / random key is not needed.
Re: (Score:2)
If there is a "random_key" in your path, how are other people defaulting to that?
What part of charging for failed requests and "Access Denied error" do you not understand here? The whole point is they *aren't* defaulting to your random key.
Can't? (Score:2)
If all those misconfigured systems were attempting to back up their data into my S3 bucket, why not just let them do so? I opened my bucket for public writes and collected over 10GB of data within less than 30 seconds. Of course, I can't disclose whose data it was.
Sure you can. They gave it to you; you can do whatever you want with it.
It would be a great way to shame Amazon into changing their policies.
Re: (Score:1)
They gave it to you; you can do whatever you want with it. ... You certainly should check up on your ethics and moral compass.
Legally? Nope!
Ethically
Re: Can't? (Score:1)
Re: (Score:2)
> It would be a great way to shame Amazon into changing their policies.
Amazon are in trouble here for charging for failed auth attempts the customer cannot prevent.
What people choose to do with the bucket - that's nothing to do with Amazon. There's some open source project somewhere which has a default config with "my-awesome-bucket" or some such in it, and a whole lot of people are using it without changing it. It's all those users that need to change their policies - but you're unlikely to reach any of
killing your cloud-based competition (Score:2)
1. Discover competitor S3 buckets.
2. Get a virtual server in a far-away place using crypto.
3. Make useless requests relentlessly to said S3 buckets.
4. Hold this pattern for a few weeks.
5. Profit!
Re: (Score:2)
7. Prison
8. Buttsecks
Re: (Score:1)
I'll bet you think about that last one often.
One more person discovers the cloud is terrible (Score:3)
Maybe, just maybe, just like in the 80s when the personal computer finally broke the mainframe monopolies and freed us from insufferable BOFHs on power trips and insane pricings, someone or something will come along to break the cloud monopolies.
And then we'll be free again, until the next bunch of suckers lets history repeat itself once more. But I'll be long dead by then.
Re: (Score:1)
And then we'll be free again
Free from what? Free from easy collaboration? Free from public access to our services? Free from the ability of our phones to work as well as our laptops when accessing a service despite being connected to wildly different networks?
You're delusional. The cloud isn't going away, it provides too much utility.
Re: (Score:2)
You're right: the cloud isn't going away.
What should go away is for-hire cloud services from monopolistic and abusive vendors. My hope is that people will eventually be able to deploy and manage their own clouds without paying a fortune to, or having their data pilfered by, Big Data giants, thereby giving the Microsofts, Amazons and other Googles the middle finger they so richly deserve.
PUT attack? (Score:2)
So how long before someone takes advantage of this information to do a PUT attack (or a Distributed PUT attack?) on some business's bucket, causing them thousands of dollars in expenses?
This should not be possible, with PUTs being restricted by default, and with only authorised or unprotected PUTs being charged.
I ran into this using Cyberduck/Mountain Duck (Score:2)
I ran into this very same problem using Cyberduck and the paid version called Mountain Duck. AWS charges for API usage and it can be very expensive, especially when S3 is used as an online filesystem for which it is almost certainly not intended to be used. The s3fs application also suffers from this.
I'm very interested to hear how the new, official Mountpoint for Amazon S3 application impacts API usage costs.
There are other S3-compatible providers that don't charge API usage. Also, for a time our AWS S3
How does this happen? (Score:2)
Really? (Score:2)
Working as designed (Score:2)
In my experience, AWS services *always* bill you exponentially more than you'd expect. Their pricing model is full of hidden fees and fees you have no control over.
Re: (Score:2)
See also, Cory Doctorow's introduction of the word "enshittification".
ArsTechnica has more on this (Score:2)
ArsTechnica has more on this: https://arstechnica.com/information-technology/2024/04/aws-s3-storage-bucket-with-unlucky-name-nearly-cost-developer-1300/
Amazon waived the bill, and agrees that it shouldn't be this way.