The Real Impact of AI Overviews
AIO will likely mean less traffic for many sites, but news appears to be exempt (for now). Plus, a major leak might have revealed many of Google search's internal workings.
Earlier today I presented to a full room at the WAN-IFRA World News Media Congress about the impact of AI Overviews. The first part of this newsletter is a summary and expansion of that talk. You can view the slides from my presentation here.
Since Google announced the rollout of AI Overviews at the I/O event earlier this month, there has been a deluge of stories and thought-pieces speculating what this new feature will mean for publishers.
The general tone of these articles has been one of doom, with some proclaiming the death of search as a viable traffic acquisition channel.
I believe this is inaccurate and easily disproven, and actively harmful to a healthy debate. Hyperbolic criticism of AIO makes it easy for Google to dismiss all criticism of AIO as merely panicked shouting.
There is a real issue here, but it’s not a new one. Google has been slowly chipping away at traffic to websites for years, with one new search feature after another, increasing the time users spend on Google’s own results and properties and reducing the clicks to websites that provide the actual information.
The search engine started out as a highway connecting users to websites. It didn’t take long for that highway to start featuring toll booths in the form of ads.
Then universal search was introduced in 2007. Google’s results started integrating various other elements in addition to the 10 blue links we were used to. Every new feature that appeared on Google has been another brick that helped turn Google into a walled garden.
That should be the real story. Not how AIO will destroy search (it won’t), but how AIO is another step on Google’s path to becoming the end destination.
Google’s mission is to organise the world’s information and make it universally accessible. There is no mention of sending traffic to the creators of the world’s information. Google simply doesn’t see it as part of their remit to generate traffic to publishers. The sooner we fully grasp this distinction, the more effectively we can hold them to account.
So let’s look at what AI Overviews actually mean for search.
What are AI Overviews?
An AI Overview is an AI-generated answer to the user’s query that shows at the top of a Google result. This is typically what it would look like:
Clicking on the ‘Show more’ button folds out the AI Overview, providing more detailed information:
As part of most AI Overviews, there will be one or more source articles cited for users to click on:
AIO will show between 0 and 10 source links, with an average of around 6 cited sources.
When do AI Overviews show?
At the time of writing this, AI Overviews only trigger if the following conditions are met:
The user is based in the USA (no signs of a global rollout yet)
The user is logged in to their Google account
The user is on the Chrome browser
The current browsing session is not in Incognito mode
Because of these conditions, 3rd party search data providers are struggling to get accurate large-scale information about AI Overviews. To avoid personalised results, most data providers scrape Google’s results without a Google account and using an Incognito browsing session, so they won’t trigger an AI Overview.
The data we do have is mostly anecdotal, based on relatively small query sets, and may not be statistically significant. Nonetheless, the data is starting to paint an interesting picture:
Estimates of how often AI Overviews (AIO) are shown range between 1.5% and 42% of Google queries;
On average 14% of queries show AIO by default
28% of queries have a Generate option to create an AIO
58% of queries have no AIO feature
These numbers are very tentative and likely don’t fully reflect reality. The presence of AIO depends entirely on the intent of the query and whether an AIO would be beneficial to the user’s experience.
A small study done by Kevin Indig showed that informational queries that begin with ‘how’ and ‘best’ are much more likely to show an AIO than commercial queries that start with ‘buy’:
There’s a whole separate issue about the accuracy of AI Overviews, which I won’t dig into here.
AI Overviews and News
Using a VPN, I’ve been experimenting with AI Overviews since their launch, trying to trigger an AIO alongside a Top Stories box. I didn’t manage it.
I’ve tried hundreds of news-related queries, and I didn’t get a single AI Overview. It got to the stage where I had to double-check I was getting AI Overviews for non-news queries (I was), because I simply could not get them to show for news topics that featured a Top Stories newsbox.
My theory is that AIO and QDF are mutually exclusive.
QDF (Query Deserves Freshness) is a classifier that Google uses to determine if a query / topic requires a news box of some description. When the QDF classifier is active for a topic, searches on that topic will always show a Top Stories or Latest News box.
I suspect that when QDF is on, AIO is off for that query.
This makes sense, as the Gemini LLM that powers AIO isn’t being trained in real time on the latest news (yet). For news topics, the AIO would not be able to provide accurate up-to-date information, which is why Google defaults to a Top Stories box instead.
Traffic Impact of AIO
Again, the following is pure speculation based on early anecdotal data. In Kevin Indig’s latest Growth Memo, his small study showed that AIO has a small negative impact on traffic:
If AIO is present and your page is not cited as a source, on average you see a 2.8% decrease in Google traffic.
If AIO is present and your page is a cited source, on average you see an 8.9% decrease in your Google traffic.
This indicates that AIOs may satisfy a user’s query in many cases. While users will be inclined to click through to other sources, they are likely to skip past the AIO’s cited pages and click on a regular organic result.
On the other hand, Aleyda Solis shared on X that product queries may actually benefit from being part of AIO. Her data shows that for AIOs where the site is included as a source, traffic can increase by as much as 23%:
This could mean that for product-related searches it may be beneficial to show up in an AI Overview, and pages that are not part of the AIO will lose out.
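To put these early figures in perspective, here is a rough back-of-the-envelope sketch in Python. The baseline of 10,000 monthly Google clicks and the function name are illustrative assumptions. The percentages are the tentative averages quoted above, with 23% being a reported upper bound for product queries, so treat the output as a thought experiment rather than a forecast.

```python
# Tentative figures quoted above; all based on small, early studies.
SCENARIOS = {
    "AIO shown, page not cited": -0.028,
    "AIO shown, page cited": -0.089,
    "Product query, page cited (reported upper bound)": 0.23,
}


def estimate_clicks(baseline_clicks: float, relative_change: float) -> float:
    """Apply a relative change (e.g. -0.028 for a 2.8% drop) to a baseline."""
    return baseline_clicks * (1 + relative_change)


if __name__ == "__main__":
    baseline = 10_000  # hypothetical monthly Google clicks for one page
    for label, change in SCENARIOS.items():
        print(f"{label}: {estimate_clicks(baseline, change):,.0f} clicks")
```

For a page with 10,000 monthly clicks, that works out to roughly 9,720, 9,110 and 12,300 clicks respectively. The average losses are modest, and the product-query gain is a best case.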
You can prevent your site from being used as a source in AI Overviews, as explained here, but this may have additional repercussions for your visibility in Google’s wider ecosystem.
What This Means For Publishers
If your site is focused heavily on news content, you’re probably going to be fine (for now). The fact that AI Overviews don’t seem to show for news queries means that traffic from Top Stories is likely unaffected.
However, many publishers also get traffic from evergreen content like buying guides and explainers. Traffic to that type of content is likely to decrease due to the presence of AIO.
Is it time to panic? No.
AIO is not going to destroy search. Nonetheless, it is another brick in Google’s wall. For years Google has been trying to capture its users and keep them within Google’s own ecosystem for maximum monetisation. AIO is another step on that path.
We need to have a proper discussion about the power Google wields and how the media can effectively hold them to account. Panicked proclamations about how AIO is going to destroy publishing are not helpful. Nuanced, fact-based arguments need to be put forward.
So keep a close watch on AI Overviews and what their long term impact is, and try to stay calm.
Google Search API Data Leak
Another story I want to share with you is a purported leak of API data that relates directly to the inner workings of Google search. Rather than re-hash the whole story, please read Rand Fishkin’s article on the leak which includes his initial analysis.
If you have an appetite for a deeper dive into the leak, Michael King has published a lengthy article exploring many of the leaked attributes and what they could potentially mean.
In the next few weeks we’ll undoubtedly see many more amateur analyses of the leaked data. Read these with caution; we simply don’t know how accurate the leaked data is, how (or even if) Google uses the data in their algorithms, and whether the SEO industry can use the data for more effective tactics.
With that caveat out of the way, I do believe the leak is genuine and that it tells us some interesting things about Google search.
Two things I want to highlight from the leak which Rand and Mike also noticed:
As it turns out, Google likely has a siteAuthority attribute that indicates a site’s overall authority score. This is despite the fact that Googlers have been vocally denying such a score exists.
A site-wide authority score is likely used to provide a newly published page with an initial authority score, inherited from the site, until Google is able to calculate an individual authority score for the URL once enough link data has been gathered.
This explains why articles published on highly authoritative websites tend to easily rank in Google; they inherit their ranking potential from the site itself, even without having acquired any direct links themselves.
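The inheritance idea can be sketched in a few lines of Python. To be clear, this is only an illustration of the theory described above, not Google’s actual logic: the function name, the `min_links` threshold and the score values are all hypothetical, and only the siteAuthority attribute name comes from the leak.

```python
from typing import Optional


def effective_authority(
    site_authority: float,
    url_authority: Optional[float],
    link_count: int,
    min_links: int = 10,  # hypothetical threshold, not from the leak
) -> float:
    """Return the score a page might be ranked with under this theory.

    A new page with too little link data falls back to the site-wide
    score; once enough links are known, its own score takes over.
    """
    if url_authority is None or link_count < min_links:
        return site_authority
    return url_authority


# A freshly published article inherits the site's score...
print(effective_authority(site_authority=0.9, url_authority=None, link_count=0))
# ...while an older article with enough link data uses its own.
print(effective_authority(site_authority=0.9, url_authority=0.4, link_count=25))
```

In this sketch, a brand-new article on a strong site starts with the site’s score, which matches the observation that such articles can rank before they have earned any links of their own.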
Another attribute worth noting is sourceType, which reflects the indexing tier a webpage is stored in.
Google has tiered storage for its index of the web (something I explained in this conference talk last year). It seems like the sourceType attribute indicates that the value of links on a webpage depends on which indexing tier the page is stored in.
Basically, links from pages that are stored in the highest indexing tier (RAM storage) carry more weight than links from pages stored in lower indexing tiers (SSD and HDD).
News articles, due to their immediate value and ranking potential, initially tend to be stored in that highest indexing tier. As the story loses its news value and drops out of Top Stories, I suspect it eventually also moves from the first indexing tier to a lower tier, at which point any links the article contains will also lose much of their SEO value.
This shows the importance of ensuring your article has the right links - internal and external - before it’s published. Adding links to older articles isn’t as valuable, though may still be worthwhile in some contexts.
As we explore the data in this leak, expect more findings and confirmations (or refutations) of the validity of established SEO tactics.
Google Search Console Masterclass
There’s still time to book your ticket to the upcoming Google Search Console Masterclass that I’m doing together with WTF is SEO? on June 11th.
In this masterclass we’ll be digging into the most useful reports in Search Console, and what you can do with the data. Book your seat today!
Miscellanea
Here’s a list of recent interesting articles, developments in SEO, and new Google docs.
Official Google Docs:
Robots meta tag, data-nosnippet, and X-Robots-Tag specifications [updated]
Structured data for subscription and paywalled content [updated]
Interesting Articles:
Google CEO Sundar Pichai on AI-powered search and the future of the web - The Verge
AI, search, and publishers... How worried should we be? - Baekdal
Google AI Overviews breaks search giant’s grand bargain with publishers - Press Gazette
Google Researchers Say AI Now Leading Disinformation Vector - 404 Media
Google adds “web” filter to only show text-based links in Google Search results - SEL
Latest in SEO:
Lastly, the early bird offer for the 2024 News and Editorial SEO Summit ends this week. If you want to attend our virtual online event dedicated entirely to SEO for news publishers, early bird tickets are the cheapest tickets we’re going to sell so make sure you get yours today.
That’s it for another edition. Thanks for reading and subscribing, and I’ll see you at the next one.
Appreciate the link, mate!
Love your angle about QDF being mutually exclusive with AIOs. It makes sense. Could you see Google ever launching a “news version” of AIOs, like an AI summary?
Woo, I hadn't seen your presentation about the tiers where pages are stored and it makes so much sense! Thank you for your newsletters, always a joy to read.
Also, it seems they are already testing AIO in the UK. I have seen it popping up for "what is" and "meaning of" types of queries, so it seems they have given access to some test accounts :(