Google’s John Mueller on Managing Legacy AMP Subdomains: Advice and Best Practices
Google’s John Mueller offers practical solutions for handling legacy AMP subdomains and managing crawl budgets. Learn how to efficiently redirect or remove old AMP URLs and keep your site optimized as you transition away from AMP, without worrying about crawl budget impacts for mid-sized sites.
Google Search Advocate John Mueller recently offered guidance to website owners grappling with legacy AMP (Accelerated Mobile Pages) subdomains and their potential impact on crawl budgets. The issue often arises when a website has moved away from AMP but Googlebot continues to crawl the old AMP URLs. Mueller’s advice is relevant for sites that adopted AMP in the past and still carry a significant number of legacy URLs.
Background: The AMP Transition Challenge
AMP was introduced by Google to improve mobile page load speeds by creating lightweight, fast-loading versions of web pages. Many websites, especially those with large traffic volumes, implemented AMP pages to improve their mobile performance and visibility in Google Search.
However, as web technologies have evolved, many publishers are phasing out AMP. This process can be tricky, particularly when it comes to managing the legacy AMP URLs that were once part of a site’s structure. Even after moving away from AMP, site owners often find that Googlebot continues to crawl these pages, raising concerns about crawl budget efficiency.
Case Study: A Site with 500,000 URLs
In a recent discussion on Reddit, a website owner managing around 500,000 URLs sought help regarding an AMP subdomain they had abandoned. Despite implementing 301 redirects from the AMP subdomain to the main domain three years ago, Googlebot was still crawling hundreds of thousands of AMP URLs. The owner had also removed the AMP sitemap file and ensured both HTTP and HTTPS versions were redirected, but the crawling persisted.
This situation highlights a common issue for websites transitioning away from AMP: despite technical efforts, legacy AMP pages may still attract unwanted crawling, which could theoretically impact the website’s crawl budget.
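Before deciding whether the crawling is a problem, it helps to measure it. The following is a minimal sketch, not a definitive tool, that counts Googlebot requests to a legacy AMP hostname in server access logs. The hostname `amp.example.com`, the sample log lines, and the log layout (virtual host first, then a combined-log-style entry) are all assumptions for illustration; adjust the pattern to your own server’s format.

```python
import re
from collections import Counter

# Assumed log layout: virtual host first, then a combined-log-style entry.
# This is an illustration; real servers may log fields in a different order.
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"$'
)

GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

# Hypothetical sample lines; amp.example.com stands in for the legacy AMP host.
SAMPLE_LINES = [
    f'amp.example.com 66.249.66.1 - - [10/Oct/2024:13:55:36 +0000] "GET /article-1 HTTP/1.1" 301 0 "-" "{GOOGLEBOT_UA}"',
    f'amp.example.com 66.249.66.1 - - [10/Oct/2024:13:55:37 +0000] "GET /article-2 HTTP/1.1" 301 0 "-" "{GOOGLEBOT_UA}"',
    'amp.example.com 203.0.113.9 - - [10/Oct/2024:13:55:38 +0000] "GET /article-1 HTTP/1.1" 301 0 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
    f'www.example.com 66.249.66.1 - - [10/Oct/2024:13:55:39 +0000] "GET /article-1 HTTP/1.1" 200 5120 "-" "{GOOGLEBOT_UA}"',
]


def count_googlebot_hits(lines, amp_host="amp.example.com"):
    """Count requests per status code sent to amp_host by Googlebot-like user agents.

    The user-agent string can be spoofed, so treat the result as an estimate;
    confirming genuine Googlebot traffic needs a separate check such as reverse DNS.
    """
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if not match:
            continue  # skip lines that do not fit the assumed layout
        if match["host"] != amp_host:
            continue  # only the legacy AMP hostname is of interest
        if "Googlebot" not in match["agent"]:
            continue
        counts[match["status"]] += 1
    return counts


if __name__ == "__main__":
    print(count_googlebot_hits(SAMPLE_LINES))
```

If nearly all of the counted requests return 301, the redirects are doing their job and the remaining crawling is mostly Googlebot rechecking URLs it already knows about.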
Key Advice from John Mueller
Mueller provided two practical solutions for handling legacy AMP subdomains:
- Maintain the 301 redirects: Keeping the redirects in place ensures that any remaining requests to AMP URLs will lead users and crawlers to the correct, up-to-date content on the main domain.
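- Remove the AMP subdomain from DNS: Removing the hostname from DNS retires the AMP subdomain entirely. Once the hostname no longer resolves, requests to the old AMP URLs fail at lookup, so there is nothing left to crawl. The trade-off is that the redirects stop working as well.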
Mueller emphasized that for most websites, particularly those with around 500,000 pages, the crawl budget is not a major concern. He reassured the site owner that if the AMP URLs are hosted on a separate subdomain, that subdomain would likely have its own crawl budget. As a result, the impact on the crawl budget for the primary domain would be minimal.
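For the redirect option, the rule is usually a simple host swap. The sketch below shows one way to compute the URL a legacy AMP URL should 301 to, which can be handy for spot-checking redirect rules. It assumes that AMP paths mirror the paths on the main site, which the original discussion does not confirm, and the hostnames `amp.example.com` and `www.example.com` are hypothetical placeholders.

```python
from urllib.parse import urlsplit, urlunsplit

AMP_HOST = "amp.example.com"   # hypothetical legacy AMP hostname
MAIN_HOST = "www.example.com"  # hypothetical main hostname


def redirect_target(url, amp_host=AMP_HOST, main_host=MAIN_HOST):
    """Return the HTTPS main-domain URL a legacy AMP URL should redirect to.

    Assumes AMP paths mirror main-site paths. Returns None when the URL
    is not on the AMP hostname, so unrelated URLs are left alone.
    """
    parts = urlsplit(url)
    if parts.hostname != amp_host:
        return None
    # Keep path and query, drop any fragment, and always land on HTTPS.
    return urlunsplit(("https", main_host, parts.path or "/", parts.query, ""))


if __name__ == "__main__":
    for legacy in ("http://amp.example.com/news/story?id=7", "https://amp.example.com"):
        print(legacy, "->", redirect_target(legacy))
```

Sending both the HTTP and HTTPS variants straight to the final HTTPS URL, as this sketch does, avoids chaining two redirects for a single request.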
Understanding Crawl Budget and AMP
Crawl budget refers to the set of URLs that Googlebot can and wants to crawl on your website within a given timeframe. It is shaped by how much crawling your server can handle without slowing down and by how much demand Google has for your content, which in turn depends on factors like the size of your site and its health. While large sites with millions of URLs need to optimize their crawl budget carefully, Mueller’s comments suggest that for mid-sized sites this is less of an issue, especially if the AMP content is on a separate subdomain.
Google’s own documentation, “Large site owner’s guide to managing your crawl budget,” provides further insights into optimizing crawl budget, but Mueller’s advice indicates that complicated technical solutions are rarely needed in cases like this.
Why This Matters Now
Mueller’s guidance comes at a time when many publishers are reevaluating their AMP usage. With responsive design now standard and Core Web Vitals offering a way to measure page experience regardless of framework, the benefits of AMP are less clear-cut than they were when the framework was first introduced. As a result, publishers are increasingly choosing to drop AMP and focus on optimizing their main web properties for mobile.
For site owners in this position, it’s crucial to manage the legacy AMP content effectively to avoid unnecessary crawl activity and to ensure that Googlebot focuses on the current, most valuable parts of their site.
Next Steps for Website Owners
If you’re facing similar challenges with legacy AMP URLs, you have two main options:
- Continue using 301 redirects: This is the simplest solution. Users and bots that request an old AMP URL are sent to the matching page on the main domain, so links that still point to the AMP URLs keep working.
- Remove the AMP subdomain from DNS: If you no longer want Googlebot to crawl the AMP URLs at all, removing the DNS entry for the AMP subdomain means the hostname stops resolving and requests to it fail.
In addition, consulting Google’s official documentation on crawl budget management can provide more tailored advice depending on your site’s size and needs.
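If you choose the DNS option, a quick check can confirm that the hostname has actually stopped resolving. This is a minimal sketch using Python’s standard `socket` module; the hostname `amp.example.com` is a placeholder, and the `resolver` parameter is an illustrative hook added here so the function can be exercised without network access.

```python
import socket


def hostname_resolves(hostname, resolver=socket.getaddrinfo):
    """Return True if the hostname resolves to at least one address.

    The resolver argument defaults to socket.getaddrinfo and exists only
    so a stand-in can be supplied for offline checks.
    """
    try:
        return len(resolver(hostname, None)) > 0
    except socket.gaierror:
        # Raised when the name cannot be resolved, for example after
        # the DNS record has been removed.
        return False


if __name__ == "__main__":
    host = "amp.example.com"  # placeholder for the retired AMP hostname
    state = "still resolves" if hostname_resolves(host) else "no longer resolves"
    print(f"{host} {state}")
```

Results can lag behind the actual DNS change because resolvers cache records until they expire, so a hostname may appear to resolve for a while after the entry is removed.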
Final Thoughts
John Mueller’s response highlights the value of simplicity in handling legacy AMP subdomains. While crawl budget is a genuine concern for very large sites, mid-sized sites transitioning away from AMP don’t need to worry as much. Keeping the redirects or retiring the subdomain in DNS can help ensure that your site remains efficient and focused on the right content, with minimal technical overhead.