The Impact of Article Comments on SEO
Allowing readers to post comments and contribute to your site is great for engagement and loyalty, but it does come with some SEO repercussions.
For this newsletter I’d intended to write about Discover, collecting everything I know about maximising visibility in Google’s personalised recommendation feed. As part of that process I reached out to Lily Ray, who is one of the industry’s true experts on all things Discover.
Turns out she had an article in the works about Discover, which has now been published on Moz with the title How to Drive Traffic in Google Discover: The Ultimate Guide.
And yes, it’s an ultimate guide. It covers everything I wanted to cover and loads more. So for now I don’t need to write about Discover, as Lily’s piece is quite simply the definitive piece on that topic.
So instead I moved on to the next topic on my list, which is an area I get many questions about whenever I deliver training for clients: user-generated content, and specifically comments underneath articles.
UGC is intrinsic to the web
User-generated content (UGC for short) has been a staple of the internet since the beginning. From bulletin boards to guestbooks and comments, the internet and the web have always allowed users to post and participate in various ways.
On the modern web, UGC is everywhere: comments, Q&As, forums and bulletin boards, guest articles, wikis, reviews - UGC comes in an endless variety of flavours.
Social media platforms are pretty much entirely UGC. Sites like Reddit and Quora are built on the contributions of their audience, with (almost) all their content generated by their users. Every page on Wikipedia is a combined effort from thousands of volunteers.
Essentially, UGC is open source for the web’s content.
UGC can serve as a deliberate tactic to generate vast amounts of content - which can rank in Google and drive traffic to the website - with relatively little effort. Instead of having to pay journalists and writers, you can just outsource the production of content to your audience.
The biggest risk is the potential abuse of UGC for unsavoury purposes. Oversight is required, which means you still need to invest in moderation resources to safeguard the quality of your site’s UGC.
For the purpose of this newsletter, I’ll focus on the most common form of user-generated content on publisher sites: reader comments underneath articles.
Can Google see comments?
The first question I usually get asked is ‘can Google see comments?’ And the answer is: it depends.
You can easily test if Google can ‘see’ comments on your site. Simply copy a sentence from one of your user comments, paste it in a Google search in double quotes, and add the site:[example.com] operator to limit the results to your own domain.
An example:
This query is looking for a specific sentence from a comment on this article, and indeed Google shows it:
You may need to try snippets from comments on several different articles, and should always select articles that are at least a few weeks old. For Google to see and index comments, it usually needs to render pages as part of its tiered indexing process. The rendering phase of content is often delayed - it can take weeks for Google to fully render a newly published webpage.
So pick an old(ish) article that was relatively popular (so has a higher probability of being rendered), copy a snippet from a top comment underneath the article, paste it in a Google search box with quotes, and use the site: operator to focus the search on your own website.
If Google shows the article the comment belongs to, then yes, Google indexes your comments.
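If you need to run this check for many articles, building the search URL by hand gets tedious. Here is a minimal Python sketch that assembles the query described above. The function name, the sample snippet and example.com are placeholders of my own, not anything Google provides.

```python
from urllib.parse import quote_plus


def build_comment_check_url(snippet: str, domain: str) -> str:
    """Build a Google search URL for an exact-match snippet on one domain."""
    # Remove double quotes from the snippet so they can't break the phrase match.
    cleaned = snippet.replace('"', "").strip()
    # Double quotes ask for an exact phrase; site: limits results to the domain.
    query = f'"{cleaned}" site:{domain}'
    return "https://www.google.com/search?q=" + quote_plus(query)


print(build_comment_check_url("hello world", "example.com"))
# prints https://www.google.com/search?q=%22hello+world%22+site%3Aexample.com
```

Open the resulting URL in a browser and check whether the article appears. The sketch only builds the query; you still judge the results yourself.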
Should Google see comments?
When we’ve answered the first question, the logical follow-up question is whether Google should be able to index comments. And, again, it depends.
Google has confirmed that it sees comments as part of the webpage’s content. In fact, Google has gone on record to say they like seeing comments. In the words of Google’s John Mueller:
“What I think is really useful there with those comments is that oftentimes people will write about the page in their own words and that gives us a little bit more information on how we can show this page in the search results. So from that point of view I think comments are a good thing on a page.”
I’ve had this experience myself, where one of my blog posts ranked top in Google results for a specific query that appeared only in the comments and not in the actual piece:
When I migrated my site to a new platform I chose not to migrate the comments across, and as a result I lost this particular ranking (which I’m genuinely a bit sad about).
But this also brings up a second point: Not all comments add value.
If you moderate your comments closely, you can ensure the published comments adhere to minimum standards and avoid spam, abuse, and profanity. With good moderation practices, it’s definitely worth allowing Google to index your comments as these can add value and enable broader rankings for your article in Google’s ‘ten blue links’ evergreen rankings.
However, if you have imperfect (or no) moderation, or your comment section has a certain community culture that features language which may be interpreted as abusive and profane, it might be safer to make your comments invisible for Google.
I have seen websites suffer from algorithm updates when their comment sections had specific features that were acceptable within the cultural norms of that site’s community, but to outsiders could appear to be abusive and harmful.
Google prefers to err on the side of caution. When Google sees an abundance of content on a site that could be interpreted as harmful, it can downgrade that website’s visibility in its search results - including in Top Stories if it’s a news publisher.
So you’ll have to take an objective look at the comments underneath your articles, and judge whether they’re acceptable for a broad audience. If the answer is ‘maybe not’, you should consider removing your comments from Google’s sight.
How can I block my comments from Google?
Which brings us to the third question: How can Google be prevented from seeing the comments on your site? The answer is, you guessed it… it depends.
Blocking Google from indexing comments is a technical issue. The best approach depends on how your site has implemented its comments function.
A common approach is to load comments through JavaScript. When specific JS files need to be loaded for your comments to appear on your articles, you could prevent Google from loading those JS files by blocking them with a robots.txt disallow rule. For example:
User-agent: Googlebot
Disallow: /*comments.js
Some websites load comments with a separate URL attached to an article, for example with a URL parameter added to the article URL or in a subfolder (like Substack’s comment function). These too can be blocked in robots.txt:
User-agent: Googlebot
Disallow: /*?comments=yes
Disallow: /*/comments
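Before deploying rules like these, it helps to check which URLs they would actually catch. The sketch below is a simplified matcher of my own that mimics Google-style wildcards (* for any run of characters, a trailing $ for end of URL). It ignores Allow rules and rule precedence, and the sample paths are hypothetical. As far as I'm aware, Python's built-in urllib.robotparser does not interpret the * wildcard this way, which is why the sketch uses its own translation to a regular expression.

```python
import re


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate a robots.txt path rule into a regex (simplified sketch)."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # '*' matches any run of characters; every other character is literal.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    # Rules match from the start of the path; '$' pins the end as well.
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(path: str, disallow_rules: list[str]) -> bool:
    """Return True if any non-empty Disallow rule matches the path."""
    return any(rule_to_regex(rule).match(path) for rule in disallow_rules if rule)


# The Disallow rules from the examples above.
RULES = ["/*comments.js", "/*?comments=yes", "/*/comments"]

# Hypothetical URL paths, for illustration only.
for path in [
    "/static/js/comments.js",
    "/news/my-article?comments=yes",
    "/p/my-article/comments",
    "/news/my-article",
    "/news/comments-policy",
]:
    print(path, "->", "blocked" if is_disallowed(path, RULES) else "allowed")
```

Note the last sample path: because rules match as prefixes, /*/comments also catches a URL like /news/comments-policy. Testing your rules against a list of real URLs from your site is a good way to spot that kind of collateral blocking.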
Alternatively, if your comments are on a separate URL you can add a noindex meta tag to this URL to prevent Google from indexing them:
<meta name="robots" content="noindex,nofollow">
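To confirm that the tag is actually present in the HTML your server returns, a quick check can help. This is a minimal sketch using only Python's standard library; the class and function names are my own, and it inspects an HTML string you supply rather than fetching anything.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if name in ("robots", "googlebot"):
            content = attr.get("content") or ""
            # Directives are comma-separated and case-insensitive.
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


sample = '<html><head><meta name="robots" content="noindex,nofollow"></head></html>'
print(has_noindex(sample))  # prints True
```

One thing to keep in mind: Google can only obey a noindex tag on a URL it is allowed to crawl. If the same URL is also disallowed in robots.txt, Googlebot never fetches the page and never sees the tag, so pick one mechanism per URL.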
If your comments are implemented on your site with a hashed URL, for example /article-url-here#comments, then blocking becomes more challenging. Google ignores hashes (and everything after a hash symbol) in URLs, so as a first step you’ll need to test whether Google can see the comments.
If Google indexes your comments, you may need to change how comments are implemented on your site to enable a specific blocking mechanism.
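To see why a hashed URL can't be targeted separately, it helps to look at what is left once the fragment is dropped. A small sketch with Python's standard library, where example.com and the helper name are placeholders:

```python
from urllib.parse import urldefrag


def url_without_fragment(url: str) -> str:
    """Return the URL with the #fragment removed."""
    return urldefrag(url).url


print(url_without_fragment("https://example.com/article-url-here#comments"))
# prints https://example.com/article-url-here
```

The article URL and the #comments URL collapse into the same address, so a rule that blocks one blocks the other. That is why a fragment-based comment section usually needs a different implementation (a separate JS file or a separate URL) before it can be blocked on its own.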
Some sites use a third-party comments feature like Disqus. I’m generally not a fan of those: Not only are such external comment features often terribly slow, with a negative impact on your site’s Core Web Vitals, they can also come with additional problematic ‘features’.
The only upside is that, by virtue of trying to hide their footprint from Google, third-party comment features frequently make themselves unindexable for search engines. I’m not sure that payoff is worth it, though.
Blocking Google may not be enough
After the email edition of this newsletter went out, Will reached out and reminded me that comments are explicitly mentioned as page quality signals in Google’s Quality Raters Guidelines.
In these guidelines, Google states: “If a specific page on a website has unrelated ‘spammed’ comments, the page should be rated Lowest.”
So even if Google can’t see bad comments, showing them to users can still affect your site’s quality signals as perceived by the machine learning systems that the quality raters are improving.
If you lack the resources to properly moderate your comments, it might be safer to get rid of comments altogether.
Comments and Crawling
So far I haven’t discussed an additional aspect to consider with comments: how they impact Googlebot’s crawling of your site. Once again, this depends on exactly how comments have been implemented.
Some websites generate a unique URL for every comment that is posted. This could create an enormous number of new URLs for Googlebot to discover, crawl, and potentially index. This represents a potential crawl waste issue, and adds another element to your SEO considerations.
If you’re finding that Google is spending an inordinate amount of crawl effort on comment URLs, you could consider blocking Googlebot from your comments with the aforementioned robots.txt disallow rules or a similar mechanism.
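One way to find out how much crawl effort goes to comment URLs is to look at your server logs. The sketch below is a rough starting point, not a finished tool. It assumes access logs in the common 'combined' format, a hypothetical URL pattern for comments that you would need to adapt to your own site, and it identifies Googlebot by user-agent string only, which can be spoofed.

```python
import re
from collections import Counter

# Hypothetical pattern for comment URLs; adapt it to your own URL structure.
COMMENT_URL = re.compile(r"/comments?(?:/|\?|$)|[?&]comments?=")

# Pulls the request path and the user agent out of a 'combined' format log line.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} .*"(?P<agent>[^"]*)"$'
)


def comment_crawl_share(log_lines) -> float:
    """Return the fraction of Googlebot requests that hit comment URLs."""
    counts = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line.strip())
        if not match or "Googlebot" not in match.group("agent"):
            continue  # skip unparseable lines and other user agents
        kind = "comment" if COMMENT_URL.search(match.group("path")) else "other"
        counts[kind] += 1
    total = sum(counts.values())
    return counts["comment"] / total if total else 0.0


GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
FIREFOX = "Mozilla/5.0 (X11; Linux x86_64; rv:118.0) Gecko/20100101 Firefox/118.0"


def log_line(path: str, agent: str) -> str:
    """Build a made-up log line for the demo below."""
    return (
        f'203.0.113.5 - - [01/Nov/2023:10:00:00 +0000] '
        f'"GET {path} HTTP/1.1" 200 5123 "-" "{agent}"'
    )


SAMPLE_LOG = [
    log_line("/news/article-1", GOOGLEBOT),
    log_line("/news/article-1/comments/98765", GOOGLEBOT),
    log_line("/news/article-2?comments=yes", GOOGLEBOT),
    log_line("/news/article-3", GOOGLEBOT),
    log_line("/news/article-1/comments/98765", FIREFOX),
]

print(f"{comment_crawl_share(SAMPLE_LOG):.0%} of Googlebot requests hit comment URLs")
```

On real logs, a share that is consistently high relative to your article URLs is a hint that comment URLs deserve a closer look. What counts as 'inordinate' is a judgment call for your own site.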
Comments on Comments
If you have any thoughts of your own on the value of comments and how you would approach it, please leave a comment. ;)
NESS 2023 Wrap-Up
The third annual News and Editorial SEO Summit was held last month on October 11th and 12th, and I’m proud to say it was another smashing success. We had 546 registered attendees, of which 490 joined the live event at various stages.
The event had a truly global audience, with attendees from 53 different countries; from New Zealand to the USA, from Norway to Argentina! Every continent was represented at NESS 2023 (except for Antarctica - perhaps one day!). It's amazing to have such a diverse audience from around the world be part of the event.
The chat was very lively on both days - there were over 1600 messages posted in the chat in total, and a whopping 578 questions asked with the Q&A feature across all sessions! Amazing engagement from all our attendees.
The ever-awesome Jessie & Shelby have written roundups of both days of the conference, which you can read on their excellent WTF is SEO? website:
We’re already planning the 2024 edition and have dates reserved: October 29th & 30th 2024. Save the dates in your calendar and keep an eye on the NewsSEO.io website for details.
Miscellanea
As usual I’ll end with a roundup of the most interesting updates, resources, and stories since the last newsletter.
Official Google Docs:
Mobile-first indexing has landed - thanks for all your support
Googlebot (updated: “When crawling from IP addresses in the US, the timezone of Googlebot is Pacific Time.”)
Interesting Articles:
Pervasive Unauthorized Use of Publisher Content to Power Generative AI Technologies - News Media Alliance
Why Google’s generative AI gamble is a game of chicken it could lose - Press Gazette
From headlines to algorithms: how newsrooms are approaching creating AI guidelines - The Fix
Publishers' page views plummet 40% after latest Google tweak - Future Media
20 most profitable Google queries revealed during antitrust trial - SEL
Latest in SEO:
How to Drive Traffic in Google Discover: The Ultimate Guide - Moz
7 must-see Google Search ranking documents in antitrust trial exhibits - SEL
What is an Explainer? - WTF is SEO?
Inside Google’s massive 2023 E-E-A-T Knowledge Graph update - SEL
Does Google AMP Still Matter for Publishers in 2023? - NewsDashboard
SEO & Site Migrations: Common Pitfalls To Avoid [Podcast] - SEJ
That’s it for this edition. Thanks for reading and subscribing, and I’ll see you at the next one!
This was a great read - I don't see comments getting mentioned much within the industry. UGC is great too - Tory did a great post on this at Moz recently: https://moz.com/blog/ugc-strategy-guide-for-seo
On the comments itself I've noticed some publishers disabling comments and preferring to move discussion over to Twitter/X or such. I guess with what's happening there this may not be the smartest move, with people moving off the platform (yourself included).
I feel that there's a gap in the market for a proper/decent commenting system. Substack seem to do this nicely, but have never been a fan of Disqus and some of the others. I think a simple Reddit style system (upvote / downvote comments) would work nicely - I think it would add a very interesting angle to lots of articles.
nice article