GCP – Secure your storage: Best practices to prevent dangling bucket takeovers
Storage buckets are where your data lives in the cloud. Much like digital real estate, these buckets are your own plot of land on the internet. When you move away and no longer need a specific bucket, someone else can reuse the plot of land it refers to — if the old address is still accessible to the public.
This is the core idea behind a dangling bucket attack. It happens when you delete a storage bucket, but references to it still exist in your application code, mobile apps, and public documentation. An attacker can then simply claim the same bucket name in their own project, effectively hijacking your old address to potentially serve malware and steal data from users who unknowingly still rely on a bucket that is no longer officially in use.
Fortunately, you can protect your applications and users from this threat with the following four steps.
First, implement a safe decommissioning plan
When you delete a bucket, do so carefully. A deliberate decommissioning process is critical. Before you type gcloud storage rm, follow these steps:
Step 1: Audit and learn
Before deleting anything, take the time to understand who and what are still accessing the bucket. Use logs to check for recent traffic. If you see active requests coming from old versions of your app, third-party services, or users, investigate them. Pay extra attention to requests attempting to pull executable code, machine learning models, dynamic web content (such as JavaScript), and sensitive configuration files.
You might see a lot of requests coming from bots, data crawlers, and scanners, which you can identify by checking the requester's user agent. This traffic is essentially background noise: it doesn't indicate that the bucket is actively required for your systems to function correctly, and because it doesn't represent legitimate traffic from your applications and users, it can be safely disregarded.
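As a rough sketch of this triage, you can separate likely bot noise from potentially legitimate traffic by inspecting user agents. The markers and log entries below are illustrative assumptions, not a definitive classifier:

```python
import re

# Common crawler/scanner user-agent markers (an illustrative, non-exhaustive list)
BOT_MARKERS = re.compile(r"bot|crawler|spider|scanner|scan", re.IGNORECASE)


def is_background_noise(user_agent: str) -> bool:
    """Return True if the request looks like a bot/crawler rather than a real client."""
    return bool(BOT_MARKERS.search(user_agent))


def split_traffic(entries):
    """Split (user_agent, path) log entries into legitimate traffic and noise."""
    legit, noise = [], []
    for user_agent, path in entries:
        (noise if is_background_noise(user_agent) else legit).append((user_agent, path))
    return legit, noise


# Hypothetical access-log sample
sample = [
    ("Googlebot/2.1 (+http://www.google.com/bot.html)", "/old-bucket/app.js"),
    ("MyMobileApp/3.2 (Android 14)", "/old-bucket/config.json"),
]
legit, noise = split_traffic(sample)
```

Only the entries that survive this filter (here, the mobile-app request) warrant the investigation described above.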
Step 2: Delete with confidence
Many automated processes and user activities don’t happen every day, so it’s important to wait at least a week before deleting the bucket. Waiting for at least a week increases the confidence that you’ve observed a full cycle of activity, including:
- Weekly reports: Scripts that generate reports and perform data backups on a weekly schedule.
- Batch jobs: Automated tasks that might only run over the weekend or on a specific day of the week.
- Infrequent user access: Users who may only use a feature that relies on the bucket’s data once a week.
After you’ve verified that no legitimate traffic has hit the bucket for at least a week, and you’ve updated all of your legacy code, you can proceed with deleting the bucket. Keep in mind that deleting a Google Cloud project deletes all resources associated with it, including all Cloud Storage buckets.
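The quiet-period check above can be sketched as a simple function over the timestamps of legitimate requests you collected from your logs (the seven-day window follows the guidance above; the data here is hypothetical):

```python
from datetime import datetime, timedelta, timezone

QUIET_PERIOD = timedelta(days=7)


def safe_to_delete(request_timestamps, now=None):
    """Return True if no legitimate request hit the bucket within the quiet period."""
    now = now or datetime.now(timezone.utc)
    return all(now - ts >= QUIET_PERIOD for ts in request_timestamps)


# Hypothetical observation: one old request vs. one from yesterday
now = datetime(2024, 6, 15, tzinfo=timezone.utc)
old_only = [datetime(2024, 6, 1, tzinfo=timezone.utc)]
recent = old_only + [datetime(2024, 6, 14, tzinfo=timezone.utc)]
```

A bucket with only the two-week-old request passes the check; the one with yesterday's request does not.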
Next, find and fix code that references dangling buckets
Preventing future issues is key, but you may have references to dangling buckets in your environment right now. Here’s a plan to hunt them down and fix them.
Step 3: Proactive discovery
Analyze your logs: This is one of your most powerful tools. Query your Cloud Logging data for server and application logs showing repeated 404 Not Found errors for storage URLs. For example, a high volume of failed requests to the same non-existent bucket name is a major red flag (to remediate it, continue with Step 3 and then proceed to Step 4).
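As a minimal sketch of that log analysis (the log-line format and the threshold of three are assumptions for illustration), you can count 404 responses per referenced bucket name:

```python
import re
from collections import Counter

# Matches a bucket name in a storage.googleapis.com URL
BUCKET_URL = re.compile(r"storage\.googleapis\.com/([a-z0-9-._]+)")


def flag_suspect_buckets(log_lines, threshold=3):
    """Count 404 responses per bucket name and flag those at or above a threshold."""
    counts = Counter()
    for line in log_lines:
        if " 404 " not in line:
            continue
        match = BUCKET_URL.search(line)
        if match:
            counts[match.group(1)] += 1
    return {bucket: n for bucket, n in counts.items() if n >= threshold}


# Hypothetical access-log lines
logs = [
    "GET https://storage.googleapis.com/old-assets-bucket/app.js 404 -",
] * 4 + ["GET https://storage.googleapis.com/live-bucket/logo.png 200 -"]
```

A repeated 404 against the same bucket name, as in this sample, is exactly the red flag described above.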
Scan your codebase and documentation: Perform a comprehensive scan of all your private and open-source code repositories (including old and archived ones), configurations, and documentation for any references to your storage bucket names that may no longer be in use. One of the ways to find them is to look for the following patterns:
gs://{bucket-name}
storage.googleapis.com/{bucket-name}
{bucket-name}.storage.googleapis.com
commondatastorage.googleapis.com/{bucket-name}
{bucket-name}.commondatastorage.googleapis.com
You can check whether a bucket still exists by querying https://storage.googleapis.com/{your-bucket-name}. If you receive a NoSuchBucket response, you have identified a dangling bucket reference.
```
<Error>
  <Code>NoSuchBucket</Code>
  <Message>The specified bucket does not exist.</Message>
</Error>
```
If the bucket exists (you do not get a NoSuchBucket error), you should verify that it actually belongs to your organization, because a threat actor may have already claimed the name. The easiest way to check ownership is to try to read the bucket's Identity and Access Management (IAM) permissions. If you run a command like gcloud storage buckets get-iam-policy gs://{bucket-name} and receive an Access Denied or 403 Forbidden error, that is a sign the bucket is claimed by someone else: it proves the bucket exists, but your account doesn't have permission to manage it, indicating it has been taken over. Treat this reference as a risk and remove it.
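The triage above can be summarized in a small decision function. The status-code interpretation (404 from an unauthenticated HEAD means unclaimed, a failed IAM read on an existing bucket means someone else controls it) is our reading of the checks described here:

```python
def classify_bucket(head_status: int, can_read_iam: bool) -> str:
    """Triage a referenced bucket.

    head_status: HTTP status from an unauthenticated HEAD of
                 https://storage.googleapis.com/{bucket-name}
    can_read_iam: whether reading the bucket's IAM policy succeeded
                  (e.g. via `gcloud storage buckets get-iam-policy`)
    """
    if head_status == 404:
        return "dangling"  # name is unclaimed: remove the reference or reclaim it
    if can_read_iam:
        return "owned"  # bucket exists and your account can manage it
    return "claimed-by-other"  # exists, but you cannot manage it: treat as a risk
```

For example, an existing bucket whose IAM policy you cannot read falls into the risky "claimed-by-other" category.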
For your convenience, we provide a script below that can find dangling references in a given file.
```python
import re
import sys
from typing import Optional, Set

import requests
from requests.exceptions import RequestException


def check_bucket(bucket_name: str) -> Optional[requests.Response]:
    try:
        with requests.Session() as session:
            response = session.head(f"https://storage.googleapis.com/{bucket_name}")
            return response
    except RequestException as e:
        print(f"An error occurred while checking bucket {bucket_name}: {e}")
        return None


def sanitize_bucket_name(bucket_name: str) -> Optional[str]:
    # Remove common prefixes and quotes
    bucket_name = bucket_name.replace("gs://", "")
    bucket_name = bucket_name.replace("\"", "")
    bucket_name = bucket_name.replace("'", "")
    bucket_name = bucket_name.split("/")[0]

    # Validate the bucket name format according to GCS naming conventions
    if re.match("^[a-z0-9-._]+$", bucket_name) is None:
        return None
    return bucket_name


def extract_bucket_names(line: str) -> Set[str]:
    all_buckets: Set[str] = set()

    pattern = re.compile(
        r'gs://([a-z0-9-._]+)|'
        r'([a-z0-9-._]+)\.storage\.googleapis\.com|'
        r'storage\.googleapis\.com/([a-z0-9-._]+)|'
        r'([a-z0-9-._]+)\.commondatastorage\.googleapis\.com|'
        r'commondatastorage\.googleapis\.com/([a-z0-9-._]+)',
        re.IGNORECASE
    )

    for match in pattern.finditer(line):
        # The first non-empty group is the bucket name
        if raw_bucket := next((g for g in match.groups() if g is not None), None):
            if sanitized_bucket := sanitize_bucket_name(raw_bucket):
                all_buckets.add(sanitized_bucket)

    return all_buckets


def main(filename: str) -> None:
    with open(filename, "r") as f:
        for i, line in enumerate(f, 1):
            bucket_names = extract_bucket_names(line)
            for bucket_name in bucket_names:
                response = check_bucket(bucket_name)
                # check_bucket may return None on a network error
                if response is not None and response.status_code == 404:
                    print(f"Dangling bucket found: {bucket_name} (line {i}), {line.strip()}")


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: python find_dangling_buckets.py <filename>")
        sys.exit(1)

    main(sys.argv[1])
```
Please be aware that this script and these recommendations can only find hardcoded references, not those generated dynamically at runtime. Also, your codebase might contain hardcoded bucket names that don't follow these patterns but are still used by Cloud Storage clients.
Step 4: Reclaim and secure
If you find a dangling bucket name that might represent a security risk to you or your clients, act fast.
If you do not own the dangling bucket:
Use the data gathered in the previous step to remove any hardcoded references to it from your code and documentation, then deploy the fix to your users to permanently resolve the issue.
If you own the dangling bucket:
- Reclaim the name: Create a new storage bucket with the exact same name in a secure project you control. This prevents an attacker from claiming it.
- Lock it down: Apply a restrictive IAM policy to the reclaimed bucket. Deny all access to allUsers and allAuthenticatedUsers, and enable Uniform Bucket-Level Access. Enable Public Access Prevention to turn the bucket into a private “sinkhole.”
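As a sketch, the reclaim-and-lock-down steps above might look like the following gcloud commands. The bucket name and location are placeholders, and you should verify the flags against your gcloud version before relying on them:

```shell
# Reclaim the name in a secure project you control (placeholder name and location)
gcloud storage buckets create gs://your-old-bucket-name \
    --location=us-central1 \
    --uniform-bucket-level-access \
    --public-access-prevention

# Inspect the bucket to confirm the settings took effect
gcloud storage buckets describe gs://your-old-bucket-name
```

With Public Access Prevention enforced and no IAM grants to allUsers or allAuthenticatedUsers, the reclaimed bucket behaves as the private sinkhole described above.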
By building these practices into your development lifecycle and operational procedures, you can effectively close the door on dangling bucket takeovers. Securing your cloud environment is a continuous process, and these steps will add powerful layers of protection for you and your users.
To learn more about managing storage buckets, you can review the Cloud Storage documentation.