Site Migrations for Publishers: Best Practices and Pitfalls
Changing the technology and design of your publishing site can be a daunting affair. Here's a look at what to do and, more importantly, what not to do.
It’s been a while since my last newsletter, but the subscriber numbers have continued to grow. At the start of July, SEO for Google News surpassed 10,000 subscribers! I’m immensely grateful. Thanks to all of you for making this newsletter one of the most rewarding endeavours I’ve ever undertaken.
The reason for the long gap between newsletter editions is that I’ve taken some time off from work, and also wanted to let the dust settle a bit on the AI Overviews hullabaloo.
As I predicted in my last newsletter, the hype around AI Overviews has turned out to be a storm in a teacup. Rather than heralding the ‘death of publishing on the web’ as some proclaimed, AI Overviews are just another search feature (and a relatively uncommon one).
AIO may cannibalise some traffic for some publishers, but that’s nothing new - it’s a trend continued from almost all previous search features Google has introduced.
Plus, it seems likely this current generation of AI technology (Large Language Models) could be more hype than substance. To improve LLMs you need more data, and frankly humans are incapable of generating the volume of data that LLMs need to keep improving.
So, in order to improve, LLMs will need to be trained on data (i.e. content) generated by other LLMs. And that, as it turns out, leads to model collapse. Perhaps LLMs aren’t the panacea of AI that some may have hoped for.
Now that the madness has calmed down and media commentators are dedicating fewer column inches to AI, with a better signal-to-noise ratio, let’s dig into another aspect of SEO that many publishers struggle with: site migrations.
Site Migration Best Practices
Rather than rehash what others have written about how to execute site migrations properly, let me start by sharing some of the best content out there on this topic.
Migration Guides:
Site Migration Fundamentals - Wix / Chris Green
40-Step SEO Migration Checklist - SEOSLY / Olga Zarr
SEO Guides, Checklists and Tools For Web Migrations - Aleyda Solis
Now, let’s look at the specifics of site migrations for publishing sites.
Types of Site Migrations
I keenly remember my friend Jono Alderson proclaiming, on stage at a major SEO conference, that there is no such thing as a site migration.
I raised my eyebrows at that one, but he had a point. ‘Site migration’ is a broad term that encompasses a wide range of different projects, and we need to clarify what type of site migration we’re dealing with if we want to define the best possible approach.
You can classify site migrations in many different ways. For the sake of simplicity, I’ll define four types of site migrations:
Redesign
Restructure
Replatforming
Relocation
Each of these has different types of changes applied to the site, which results in different levels of risk for any negative SEO repercussions.
Many migration projects will fit neatly into one of these four categories, but others can be hybrids. For example, a recent migration project I consulted on was a replatforming with redesign and restructure components.
Redesign
What changes: Webpage HTML
In a redesign, the site’s front-end design and layout may change, but the underlying technology remains more or less the same. This type of site migration carries the fewest risks, though there are still some areas where SEO could be impacted.
With any redesign, it’s important to ensure that the new HTML is semantically optimised and clean (i.e. easily parsable for Google), and that structured data and critical meta tags are properly implemented (title, meta description, Open Graph, canonicals, mobile alternates, hreflang, robots tag, etc.).
Beyond those aspects, redesigns are generally low-risk and don’t offer much opportunity for ruining the site’s existing SEO performance.
Nonetheless, you should never underestimate the inventiveness of developers. I recently saw a website where many of the very basics of SEO, such as <title> tags, were omitted as part of a redesign.
For even the smallest migration projects, it’s useful to have a detailed spec document that outlines everything the site needs to have in terms of features and capabilities. Use this to ensure all necessary SEO elements are in place.
Ideally, SEO is an integral part of the QA process that all website updates should undergo before go-live.
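As a rough illustration of what an automated SEO check in a QA process could look like, here is a minimal Python sketch that reports which basic elements are missing from a page's HTML. The `REQUIRED` checklist and the function names are my own assumptions for illustration, not an official or complete list; a real spec document would cover far more (structured data, hreflang, mobile alternates, and so on).

```python
from html.parser import HTMLParser

# Hypothetical minimum checklist, chosen for illustration only.
REQUIRED = {"title", "meta_description", "canonical", "og_title"}


class SeoTagAudit(HTMLParser):
    """Collects which basic SEO elements appear in a page's HTML."""

    def __init__(self):
        super().__init__()
        self.found = set()
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # Attribute values can be None for valueless attributes.
        a = {k: (v or "") for k, v in attrs}
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            name = a.get("name", "").lower()
            prop = a.get("property", "").lower()
            has_content = bool(a.get("content", "").strip())
            if name == "description" and has_content:
                self.found.add("meta_description")
            elif prop == "og:title" and has_content:
                self.found.add("og_title")
        elif tag == "link":
            rels = a.get("rel", "").lower().split()
            if "canonical" in rels and a.get("href", "").strip():
                self.found.add("canonical")

    def handle_data(self, data):
        # Only count a <title> that actually contains text.
        if self._in_title and data.strip():
            self.found.add("title")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False


def missing_seo_elements(html):
    """Return a sorted list of checklist items not found in the HTML."""
    audit = SeoTagAudit()
    audit.feed(html)
    return sorted(REQUIRED - audit.found)
```

Run against a sample of templates (homepage, section page, article page) on the staging site, a check like this would have flagged the missing <title> tags before go-live.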
Restructure
What changes: Site navigation & internal links
Changes to the site’s navigation can impact organic traffic. Top-level navigation and internal linking are key elements of a site’s SEO. Navigation links indicate the site’s topical specialities, and internal links determine how link value flows through the site.
A restructure of the site could result in drastic changes to these aspects - sometimes for the better, sometimes for worse - and you need to be aware of this and the potential repercussions before you push the new site structure into production.
I’ve previously written about how categories and tags play a major part in topic authority signals, so make sure you read that and consider the impact of navigational changes.
A site’s navigation is often taken for granted, and probably doesn’t get the attention it deserves. Navigation links can also be subject to corporate politics, with middle managers fighting to have their section represented in the nav bar.
Ideally, site navigation is designed as part of a taxonomy and/or information architecture project. Don’t make drastic changes to your site’s navigation links, section pages, and tag pages without having a detailed understanding of best practices and the potential effects.
Replatforming
What changes: Technology stack, webpage HTML, URLs (probably)
With a replatforming, the site’s technology stack is altered. A new off-the-shelf CMS is implemented, or perhaps the site moves to a headless CMS with an entirely new codebase.
What matters is that the HTML code on every page is changed and, most likely, the URLs of existing pages are changed as well.
Sometimes with a replatforming, existing URLs can be maintained. This makes for a much smoother process, but in my experience it’s quite rare. Most replatforming projects result in site-wide URL changes, with almost every page having its URL altered and requiring redirects.
It’s important to realise that even a tiny change in the URL - for example, changing from a trailing slash to a non-trailing slash - has the same impact as a radical URL change. The end result is the same: The URL changes, and a redirect has to be put in place from the old URL to the new one.
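To make that concrete, here is a small Python sketch that compares old and new URLs as exact strings and lists every pair that needs a redirect. The function name and the example.com URLs are hypothetical. Strict string comparison is a simplification: it would also flag differences that are technically equivalent, such as letter case in the hostname.

```python
def redirects_needed(url_pairs):
    """Return {old_url: new_url} for every pair whose URLs differ at all."""
    return {old: new for old, new in url_pairs if old != new}


# Hypothetical old/new URL pairs from a migration mapping.
pairs = [
    # Only the trailing slash is dropped: still a changed URL.
    ("https://example.com/news/story-1/", "https://example.com/news/story-1"),
    # Unchanged URL: no redirect required.
    ("https://example.com/news/story-2", "https://example.com/news/story-2"),
    # Radical change: date-based path replaced by a section path.
    ("https://example.com/2019/05/story-3", "https://example.com/world/story-3"),
]

for old, new in redirects_needed(pairs).items():
    print(f"{old} -> {new}")
```

The trailing-slash change and the radical change both end up in the redirect list; only the untouched URL escapes.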
Note that URLs don’t have to be hierarchical, nor fully keyword-rich. A URL is a tiny ranking factor and not worth obsessing about. I often see migration projects used as a reason to implement site-wide URL changes to make them more ‘SEO-friendly’, but this is a misguided effort. A ‘suboptimal’ URL that is indexed by Google and ranks for relevant searches is infinitely more valuable than an ‘optimised’ URL would be if it means changing the existing URL.
A key reason for this is that redirects are a major source of SEO issues. While Googlers have gone on record saying that a redirect doesn’t result in diminished PageRank, in my experience there is actually some loss of link value.
I see it like this: A redirect acts like an extra link. When you redirect an old URL to a new one, you are inserting an extra link hop between the original page and the new page. PageRank diminishes with every link hop (the ‘PageRank Damping effect’ which I explain here), so a redirect causes a small loss of link value as if it was an extra link hop for Googlebot to crawl through.
This is confirmed by anecdotal evidence and some A/B tests, where reducing internal redirects on a large scale resulted in improved traffic to key pages on the site. It seems to support the theory that reducing redirects helps optimise the flow of link value and can drive improved rankings.
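The arithmetic behind that theory can be sketched in a few lines of Python. This assumes a redirect is damped exactly like a link hop, which is my theory rather than anything Google has confirmed, and it uses the 0.85 damping factor from the original PageRank paper; the value Google uses today is unknown.

```python
# Damping factor from the original PageRank paper. Assumption: Google's
# current value is not public, so this is for illustration only.
DAMPING = 0.85


def value_after_hops(hops, damping=DAMPING):
    """Fraction of link value left after a number of hops.

    Assumes each hop (a link, or a redirect treated as a link)
    multiplies the passed value by the damping factor.
    """
    return damping ** hops


direct = value_after_hops(1)         # link straight to the new URL
via_redirect = value_after_hops(2)   # link to the old URL, then a redirect
print(f"direct: {direct:.4f}, via redirect: {via_redirect:.4f}")
```

Under these assumptions a link that passes through one redirect delivers about 72% of its value instead of 85%, and every extra redirect in a chain compounds the loss.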
Redirects also create extra crawl effort. When Googlebot sees a redirect on the original URL, it then also has to crawl the new URL that the redirect points to. Twice the crawl effort for the same page.
So, for replatforming projects there is an additional level of care that needs to be applied. First of all, redirects should be in place for all changed URLs. And, to prevent unnecessary waste of link value, all internal links should be updated to point directly to the new URLs so that no internal link results in a redirect hop.
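One way to approach both tasks is to flatten the redirect map first, so that every old URL points straight at its final destination, and then use that flattened map to rewrite internal links. The Python sketch below assumes you can export your redirects as a simple source-to-target dictionary; the function names are hypothetical.

```python
def flatten_redirects(redirects):
    """Map every old URL straight to its final destination.

    `redirects` is a dict of {source_url: target_url}. Chains such as
    A -> B -> C become A -> C and B -> C. Loops raise ValueError.
    """
    flat = {}
    for source in redirects:
        seen = {source}
        target = redirects[source]
        # Follow the chain until the target is no longer redirected.
        while target in redirects:
            if target in seen:
                raise ValueError(f"redirect loop involving {target}")
            seen.add(target)
            target = redirects[target]
        flat[source] = target
    return flat


def rewrite_link(href, flat):
    """Return the final URL for an internal link, or the link unchanged."""
    return flat.get(href, href)


# Hypothetical example: a legacy redirect stacked on a migration redirect.
legacy_and_new = {
    "/2015/story": "/news/story/",   # redirect from the legacy site
    "/news/story/": "/news/story",   # redirect added by the new platform
}
flat = flatten_redirects(legacy_and_new)
print(rewrite_link("/2015/story", flat))
```

The flattened map can serve both purposes: as the redirect rules on the server (one hop each) and as the lookup table for updating links in article bodies and templates.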
I also believe that Google uses a site’s primary category pages as crawl entries. Googlebot will crawl your main categories frequently to find new content associated with each category’s topic. When you change a category URL, you are essentially ‘resetting’ that category’s place in Google’s understanding of your site structure, and its value needs to be rebuilt from the ground up. Best to leave the URL alone in the first place to avoid that issue.
Relocation
What changes: Hostname
The most difficult site migrations are relocations. This is when the site changes its hostname (i.e. the domain name or subdomain) which means that for Google it’s basically a new site.
The reason for a relocation can vary. For example, a publisher undergoes a rebrand and a domain name change is part of that. Or, some publishers originally started as a country-specific brand with a ccTLD and they want to broaden their international appeal by switching to a generic TLD. Or a foundational business decision is made to split a site up across multiple subdomains with each serving a specific purpose.
Relocations can be accompanied by redesigns, restructures, and/or replatformings, which makes them the most intensive form of migration projects with the most moving parts and greatest risks.
In an effort to reduce risk, often a relocation will keep the site’s design, structure, and technology the same (at least initially) and just move the existing site to a new domain name. This is what The Times & Sunday Times did recently, when they moved from thetimes.co.uk to thetimes.com.
The biggest risk with a relocation is that the site loses Google News inclusion.
Getting into Google News (and, by extension, into Top Stories carousels) is a tough challenge in the first place. With the algorithmic inclusion process introduced in 2019, it can take two years or more for a newly launched publisher to accumulate sufficient authority and trust signals for Google to rank it in their news ecosystem.
This inclusion is based on the site’s hostname. When the hostname is changed, Google News inclusion is not automatically moved across to the new hostname.
I’ve seen it happen all too frequently: A site moves to a new domain name, and most of their organic traffic disappears because the site’s articles stop showing in Top Stories on Google. The site has to re-earn that inclusion, taking two years or more, which can be disastrous.
Fortunately there seems to be a way to prevent losing Google News inclusion. Shared with me by Oleg Korneitchouk, a fellow news SEO consultant who first discovered this tactic, it works like this:
In the existing (Google News included) site’s Publisher Center, go to the Settings (the gear icon in the drop-down where you select your publication). In Publication Settings, you can scroll down to ‘Publication URLs’ where you can add ‘Additional URLs’.
This is where you should add the new hostname for the site before you migrate, ideally a few weeks before, to allow Google to understand that the new domain is an alternate URL for the existing domain.
Oleg tried this tactic with one of his clients, and it worked seamlessly; the new domain got Top Stories inclusion within two weeks.
Recently I’ve seen it work myself in another instance, where a new domain name was added to Publisher Center as an additional URL several months before the migration. When the site was finally moved to the new domain, it took only two days for the new domain’s articles to start appearing in Top Stories.
This seems to work independently of the Change of Address function in Search Console intended to facilitate domain changes in Google. I’ve used the Publisher Center tactic once on a partial site migration (so Change of Address wasn’t applicable, as the majority of the site remained on the original domain), and it worked smoothly with articles published on the new domain appearing in Top Stories within a few days.
Now, I should stress that this is not an official Google-approved tactic for ensuring Google News inclusion with domain changes. As far as I know, there is no ‘proper’ tactic for maintaining news-inclusion in Google when changing hostnames. We lost that when Google retired the manual Google News inclusion process which also featured a domain change form you could submit.
Use this tactic at your peril. I’m pretty confident that it works, but I can’t guarantee it.
What About Content?
One common aspect of site migrations I haven’t discussed is content. For most non-news websites, site migration projects also involve changing the site’s content in some form.
Stakeholders can get entranced by the shine of a new website and forget what made their old site successful. The best-performing content is at risk of being left behind, which may cause drastic traffic losses that are hard to recover.
For publishers, this is less common. News websites especially shouldn’t want to go back into their archives and alter article content. After all, those stories were the news at the time they were published and are part of historical record.
In the context of SEO, changing and/or removing content can incur other risks. Google uses a metric called topic authority to determine the editorial specialities of a publisher. A key component of topic authority is the history of original journalism that the site has published on a given topic.
If you change or remove large numbers of articles in your archive, you risk undermining your topic authority signals, which will impact your rankings and traffic. Best to leave older content alone.
Sometimes a level of content pruning is necessary, for example when a replatforming project is unable to import all the old site’s content. In this case, it’s important to ensure the site’s most valuable content in terms of traffic, external links, and Googlebot crawl effort is definitely migrated to the new platform.
Access to the old site’s server logs can be very helpful here, as this will tell you which articles Googlebot crawls the most frequently. Make sure you migrate those across to the new platform.
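As a starting point, a log analysis like this can be done with a short Python script. The sketch below assumes the common "combined" access log format used by Apache and Nginx; the regular expression and function name are my own and will need adjusting for other log formats. It also trusts the user-agent string, which can be spoofed, so for a rigorous analysis you would verify that the requests really come from Google.

```python
import re
from collections import Counter

# Assumes the "combined" log format: request line, status, bytes,
# referrer, then user agent. Adjust the pattern for your server.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def googlebot_hits(lines):
    """Count requests per URL path where the user agent mentions Googlebot."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and "Googlebot" in match.group("ua"):
            hits[match.group("path")] += 1
    return hits


# Hypothetical log lines for illustration.
sample = [
    '66.249.66.1 - - [10/Aug/2024:10:00:00 +0000] "GET /news/a HTTP/1.1" 200 5123 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [10/Aug/2024:11:00:00 +0000] "GET /news/a HTTP/1.1" 200 5123 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [10/Aug/2024:11:05:00 +0000] "GET /news/b HTTP/1.1" 200 4200 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.9 - - [10/Aug/2024:11:06:00 +0000] "GET /news/b HTTP/1.1" 200 4200 "-" '
    '"Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]

for path, count in googlebot_hits(sample).most_common():
    print(count, path)
```

With real logs you would pass the open file object to `googlebot_hits` and take the top few thousand paths as the must-migrate list, alongside your traffic and backlink data.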
More on Site Migrations
There are several types of migration projects I haven’t discussed, such as site mergers, site divisions, and partial migrations. At the upcoming 2024 News and Editorial SEO Summit on 29 & 30 October, I will be delivering a detailed session on site migrations for publishers.
If you attend the live online NESS event you’ll be able to ask me specific questions about your own site migration challenges. Make sure you get your ticket!
The Future of Media Technology Conference
On September 12, the 2024 edition of The Future of Media Technology Conference organised by Press Gazette will be held in London, UK. I’m delighted to have been invited to be part of the event.
I will be part of the panel discussion on “Platforms versus publishers: sign or sue?”, moderated by Charlotte Tobitt and also featuring ex-Googler Madhav Chinnappa, Mark Watkins from AWS, and one more panelist yet to be confirmed.
The event will boast speakers and attendees from most major publishers in the UK plus many from abroad, and is a great opportunity to network and stay up to date on the latest developments in the media technology landscape.
Details and tickets available on the event website here.
Miscellanea
I may have taken a bit of a break, but the world of SEO and digital publishing sure hasn’t. Here’s a roundup of the most interesting pieces of content of the last two months.
Interesting Articles:
Digital News Report 2024 - Reuters Institute
Newsroom themes for 2024: Reuters Digital News Report at a glance - Press Gazette
‘Google Is a Monopolist,’ Judge Rules in Landmark Antitrust Case - NYT
Generative AI Hype Cycle Is Hitting ‘Trough of Disillusionment’ - 404 media
Websites are Blocking the Wrong AI Scrapers (Because AI Companies Keep Making New Ones) - 404 media
Google cancels plans to kill off cookies for advertisers - CNBC
OpenAI testing prototype search engine with news publishers - Press Gazette
Latest in SEO:
2024 Zero-Click Search Study - Sparktoro
SEO in the newsroom: Tips from the SEO for News meetup - Wix
Half the world votes in 2024. Our guide to election SEO - WTF is SEO
SEO for News Publishers With Jessie and Shelby [Video] - Vixen Digital
Google Leak: 71 News SEO Ranking Factors Revealed & Impact on News Publishers - NewzDash
20 SEOs Share Their Key Takeaways From the Google API Leaks - Moz
How Google handles JavaScript throughout the indexing process - Vercel
Google On How It Manages Disclosure Of Search Incidents - SEJ
Google: Our Search Results Do Not Always Show Original Source - SER
An Analysis of AI Overview Visibility in Google Search for Trending News - NewzDash
AI Overviews Research: Comparing pre and post-rollout results on 100K keywords - SE Ranking
Navigating the AI Search Revolution: Vector Search and Knowledge Bases - Man of Many
The Best SEO Tools For Content Publishers - Rankalyzer
Let me conclude by shamelessly promoting some of my own recent articles and podcast appearances:
Google and publishers: An unpredictable animal that could eat you at any time - Press Gazette
News SEO and Google News Optimisation with Barry Adams [Video] - Don’t Panic It’s Organic
SEO, Google and web traffic: is there a light at the end of the tunnel? [Podcast] - I Need More Coffee And Better Ideas
News SEO for Publishers with Barry Adams [Podcast] - Earned Media
That’s it for another edition. Thanks for reading, and mega thanks to all 10,145 of you (as of right now) who’ve subscribed to this newsletter. Your support means the world to me.
Hello, what are your thoughts on large-scale content removals from a website? I have a website that has content in three languages, and I want to keep just one language. Do I have to redirect the other two languages to the corresponding pages in the main language after removal?
It does not make sense if someone looks for Russian content and we redirect them to Farsi. I am a little bit confused.
During migration, we also reviewed our taxonomy, and lots of old URLs were mapped to new topics.
As a result, we ended up with a huge list of redirect chains for internal links in article bodies: on top of the existing redirects on the legacy site, we now also have redirects to the new URLs on the new platform.
Should I try to fix them all? Fix only for recent (one-year-old content) stories? Or just leave it as it is and don't worry about it?