How to use canonical tags for SEO
Duplicate content is one of the most frequently discussed topics in SEO. Due to the immense popularity of content management systems like WordPress and Magento, many websites have pages that are accessible using several different URLs.
This can present a problem for search engines. Since the same content is accessible at two or more different URLs, which version should Google display in its results to provide the most relevant content for searchers?
In 2009, Google (and other search engines) introduced a new tag intended to solve the issue of content accessible via more than one URL. It’s called the canonical tag, and its purpose can often be a point of confusion for people new to SEO.
In this guide, we’ll explain how the canonical tag works and how to use it for SEO, and share best practices for using canonical tags on your website.
What are canonical tags?
The canonical tag is a short snippet of code that tells search engines which version of a page is the original, or canonical, copy. When used properly, a canonical tag helps prevent the wrong version of your page from appearing in search results.
If all of the URLs on your website direct people to static content – for example, your entire website is developed using simple HTML and CSS without any dynamic URLs – it’s obvious to Google which URL should be used for each page.
However, if your website uses dynamic URLs or serves the exact same content for the www subdomain as it does for the plain root domain, Google’s indexing system might not be able to determine which version of a page is the canonical one.
This can lead Google to treat your website’s content as duplicate content, which reduces search engine visibility and splits link juice across your page’s various URLs instead of concentrating it on one.
The canonical tag lets you tell Google which version of your page is the master copy – the URL that should appear in search engines and receive link juice from links that point towards different URLs leading to the same content.
Sound confusing? While canonical tags can seem complicated at first, in practice it’s fairly simple to use them effectively to help search engines identify the best version of your content to display.
Canonical tags and duplicate content
Blogs and e-commerce websites often use dynamic URLs – URLs created after a user sends a query to a website’s database – to display content. Dynamic URLs mean that a single page of content can sometimes be accessed using several different URLs.
For example, an e-commerce website that sells men’s shirts might have a static page, a search result and a category page, all listing the same items at different URLs.
Since all three pages have the same content, it’s difficult for Google to know which one is the canonical copy that should appear in the search results for keywords like “men’s shirts”.
By adding a canonical tag that points to one of the pages above (in this case, you’d probably use the static HTML page) you can signal to Google which page is the original – the page that should appear in search results for its target keywords.
Canonical tags go inside the <head> section of the canonical page and of any variations that lead to the same content. For the pages above, we’d use a <link rel="canonical"> tag whose href attribute is set to the first URL, signalling to Google that it is the original.
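As a rough illustration of the tag's shape, the sketch below renders a canonical link element and shows the same tag being placed on every variant of a page. The helper name canonical_link_tag and the two dynamic URLs are hypothetical, made up for this example; the static URL is borrowed from the example.com address used later in this guide.

```python
from html import escape


def canonical_link_tag(canonical_url: str) -> str:
    """Return a <link rel="canonical"> element pointing at canonical_url."""
    # Escape the URL so characters such as & and " are safe inside the attribute.
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}">'


# One static page and two hypothetical dynamic variants with the same content.
CANONICAL = "http://example.com/shirts.html"
VARIANTS = [
    CANONICAL,
    "http://example.com/search?q=mens+shirts",
    "http://example.com/category.php?id=12",
]

# Every variant carries the same tag in its <head>, pointing at the static page.
head_tags = {url: canonical_link_tag(CANONICAL) for url in VARIANTS}

for url, tag in head_tags.items():
    print(url, "->", tag)
```

The point of the design is that the tag's href never changes from one variant to the next: only the page it sits on does.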
Canonical tags aren’t just used for dynamic URLs. If your server is configured to serve the same content on the www subdomain as on the root domain, canonical tags can be used to show the search engines that the version without www is the original URL.
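One way to derive the non-www canonical URL for a page is sketched below. The function name strip_www is hypothetical, and the sketch assumes the version without www is the preferred one, as in the paragraph above; it also ignores any username or password embedded in the URL.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_www(url: str) -> str:
    """Return url with a leading "www." removed from the host name."""
    parts = urlsplit(url)
    host = parts.hostname or ""  # hostname is already lowercased by urlsplit
    if host.startswith("www."):
        host = host[len("www."):]
    # Keep an explicit port if the URL has one.
    netloc = f"{host}:{parts.port}" if parts.port else host
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


print(strip_www("http://www.example.com/shirts.html"))
```

The result can then be used as the href of the canonical tag on both the www and non-www versions of the page.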
Reminder: If you switch your site to https:, make sure you update your rel=canonical tags as well.
— Dr. Pete Meyers (@dr_pete) August 18, 2014
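A small check along these lines can catch canonical tags that were left behind after a move to https. This is only a sketch: canonical_scheme_mismatch is a hypothetical helper, and it assumes you already have each page's URL and the href from its canonical tag.

```python
from urllib.parse import urlsplit


def canonical_scheme_mismatch(page_url: str, canonical_url: str) -> bool:
    """True when the page is served over https but its canonical still points at http."""
    return (
        urlsplit(page_url).scheme == "https"
        and urlsplit(canonical_url).scheme == "http"
    )


# Hypothetical page that moved to https while its tag kept the old http address.
print(canonical_scheme_mismatch(
    "https://example.com/shirts.html",
    "http://example.com/shirts.html",
))
```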
Can canonical tags hurt your rankings?
Google usually pays attention to canonical tags and uses them to establish the URLs to list in its search results. However, canonical tags that are used incorrectly may be ignored, leaving Google to pick a URL itself and putting your site at risk of being penalised for duplicate content.
Seemingly small errors are often all it takes to invalidate a canonical tag. We found an example of incorrect canonical tags on a client’s website, where “.html” had been appended to the end of the URL in each tag, resulting in the tags referencing non-existent URLs.
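Errors like this can be caught by comparing each page's canonical href against a list of URLs you know exist. The sketch below uses Python's standard html.parser module. The names CanonicalFinder, find_canonicals and broken_canonicals, the sample page and the known-URL set are all assumptions made for illustration.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        # rel may hold several space-separated values, so split before checking.
        rel = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel and attr.get("href"):
            self.hrefs.append(attr["href"])


def find_canonicals(html: str) -> list:
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.hrefs


def broken_canonicals(html: str, known_urls: set) -> list:
    """Return canonical hrefs that are not in the set of URLs known to exist."""
    return [href for href in find_canonicals(html) if href not in known_urls]


# Hypothetical page whose tag has a stray ".html" appended to a real URL.
page = (
    "<html><head>"
    '<link rel="canonical" href="http://example.com/mens-shirts.html">'
    "</head></html>"
)
known = {"http://example.com/mens-shirts"}
print(broken_canonicals(page, known))
```

In practice the set of known URLs might come from a sitemap or a crawl; the sketch deliberately avoids network requests so that it stays self-contained.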
Self-referencing canonical tags
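Canonical tags are often added to pages whose own URL is the canonical one, so the tag refers to the page it sits on. For example, a tag on “http://example.com/shirts.html” referencing it as the original page for a certain selection of content will look like the following:
<link rel="canonical" href="http://example.com/shirts.html" />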
From Google’s perspective, this is fine. In this case, the canonical tag achieves two things: it lists the static URL as the original content on this specific domain, and it also lists the page as the original source if the content appears on another website.
In e-commerce, it’s quite common for your website’s content to appear on a third-party website. A scraped product feed, for example, might source its descriptions and images from your website.
Self-referencing canonical URLs are only effective when all of the pages containing the same content reference one canonical URL. If each version of the page contains its own self-referencing URL, it’s impossible for Google to determine which is the original.
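The condition above is easy to test once you have the canonical href declared by each version of a page. In the sketch below, conflicting_canonicals is a hypothetical helper and the URLs are invented. It reports the competing targets when the versions disagree.

```python
def conflicting_canonicals(declared: dict) -> set:
    """Given {page_url: canonical_href} for pages with the same content,
    return the distinct canonical targets if there is more than one."""
    targets = set(declared.values())
    return targets if len(targets) > 1 else set()


# Every version points at itself: Google cannot tell which is the original.
each_self_referencing = {
    "http://example.com/shirts.html": "http://example.com/shirts.html",
    "http://example.com/search?q=shirts": "http://example.com/search?q=shirts",
}

# Every version points at the static page: one unambiguous canonical URL.
all_agree = {
    "http://example.com/shirts.html": "http://example.com/shirts.html",
    "http://example.com/search?q=shirts": "http://example.com/shirts.html",
}

print(len(conflicting_canonicals(each_self_referencing)))
print(len(conflicting_canonicals(all_agree)))
```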
Canonical tags and “noindex”
Adding the canonical tag to pages that contain duplicate content helps Google find the original version of the content on your website. As well as using the canonical tag, it’s also worth adding the “noindex” value to the page’s robots meta tag.
This prevents Google from indexing pages that contain duplicate content, reducing the risk of your website facing a penalty due to the same content appearing at more than one URL.
Does your website use canonical tags correctly?
It’s easy to make mistakes when adding canonical tags, especially if you’ve installed a canonical URLs extension. Many extensions automatically generate canonical tags that reference different URLs as sources of the same “original” content.
If your website uses dynamic URLs or your server is configured to display the same content on multiple subdomains, it’s important that you use canonical tags to show Google which version of your content is the copy you’d like it to index and display.