What exactly is ‘duplicate content’?
When blocks of content, and more specifically text content, appear in different places on the internet in identical or nearly identical form, we are dealing with duplicate content. By ‘different places’ we mean different URLs, either on the same website (the domain name is identical, for example https://www.domainnname.co.uk/webpage1.html and https://www.domainnname.co.uk/identical-to-webpage1.html) or on different websites (for example https://www.website1.co.uk/webpage1.html and https://www.website2.co.uk/identical-to-webpage1.html).
But it doesn’t end here. Sometimes a website exists both with ‘www’ and without it, with (mostly) the same content. This also qualifies as duplicate content. The same goes for an http version of a website that exists next to an https version.
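To make the ‘www’ and ‘http’ variants concrete, here is a minimal Python sketch that groups a list of URLs by host and path while ignoring the protocol and a leading ‘www.’. The function names are made up for this illustration, and the sketch deliberately ignores query strings; it is a rough audit aid under those assumptions, not a description of how a search engine groups URLs.

```python
from urllib.parse import urlsplit


def variant_key(url: str) -> tuple[str, str]:
    """Return (host, path), ignoring the protocol and a leading 'www.'."""
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    # Treat an empty path as the site root. Query strings are ignored here.
    return host, parts.path or "/"


def group_variants(urls: list[str]) -> dict[tuple[str, str], list[str]]:
    """Return only those pages that are reachable under more than one URL."""
    groups: dict[tuple[str, str], list[str]] = {}
    for url in urls:
        groups.setdefault(variant_key(url), []).append(url)
    return {key: found for key, found in groups.items() if len(found) > 1}
```

Feeding it the URLs from a crawl or a sitemap gives a quick list of pages that exist in more than one variant.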
Sometimes duplicate content across different websites is unavoidable
It’s an issue that web shops often struggle with when they literally copy technical and/or other content from the websites of their product suppliers. In practice, Google easily recognises these scenarios and will not treat them as duplicate content. But the suppliers’ URLs hold the original content and will usually rank higher in the search results. To reach high ranking positions, web shop owners will have to invest in extra, unique content to optimise their webpages.
The same applies to press releases that are picked up by different websites, and even to blog posts: as long as Google can identify the original version, this will not cause problems.
Why does Google consider duplicate content to be an issue?
For search engines such as Google, every URL represents a separate webpage. Google isn’t fond of two or more URLs with near-identical content. Attempts to manipulate the search results through duplicate content aren’t the only reason why Google frowns on content duplication. That strategy no longer works anyway, because Google has become a lot smarter in the meantime.
More importantly, Google has to decide which page gets the highest ranking position in the search results. Placing the pages next to each other isn’t an option: Google users who click a link, review the content, return to the search results, click the next link and then see the same content have what Google calls a ‘bad user experience’. This is something the search engine tries to avoid.
In the best-case scenario, Google will see to it that one page or URL, usually the oldest one, gets a normal and deserved ranking. The other page or URL with duplicate content will only appear a few result pages further down. If there is a surprising number of pages or URLs with duplicate content, especially within the same website, Google will become very suspicious, which will definitely affect the value of the website.
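If you want to check your own pages for near-identical text, a simple comparison can already help. The sketch below splits two texts into overlapping groups of three words (‘shingles’) and measures how many they share. This is one common textbook approach, chosen here for illustration only; it is not a description of how Google measures duplication, and the function names and the group size of three are assumptions.

```python
import re


def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """Split text into overlapping word groups of the given size."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        # Very short texts become a single group (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a: str, text_b: str, size: int = 3) -> float:
    """Jaccard similarity: shared groups divided by all distinct groups."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a and not b:
        return 1.0  # two empty texts are treated as identical
    return len(a & b) / len(a | b)
```

A score close to 1.0 means the two texts are nearly identical, a score close to 0.0 means they have little in common. Where to draw the line is a judgment call for your own content.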
A bunch of websites with different domain names, but with identical content, is that a good idea?
Some website owners believe they have discovered the holy grail for dominating the first page of the Google search results: create one website with outstanding content for vital keywords and then reuse that content on 9 other websites with different domain names. This will not, however, result in multiple listings on the first page of Google for those keywords. It simply doesn’t work.
With a bit of luck the oldest domain might reach a higher position in Google, but all the other pages will definitely be pushed back by the search engine, often several of them at the same time.
Accidental duplicate content
A few cases where website owners with the best intentions struggle with duplicate content:
1. A new website with a new domain name
Sometimes a drastic change to a website cannot be avoided, for example when you want to make your site mobile-friendly. Some people immediately opt for a new domain name. Often the notion is: “I will leave the old website up. Once the new one goes online, it will gradually climb in the search results until it catches up with the old one.” Unfortunately, Google has a different point of view and will keep the new website as far away from the old one as possible.
Changing the domain name when your website already holds high ranking positions for important keywords is in fact never a good idea: chances are you will irrevocably lose these good positions. In this case, it is better to keep the URLs of the old website online, WITHOUT the original content, but with a permanent redirect (301) to the corresponding URL of the new website.
If all goes well, Google will in due time register that the old URL has been replaced by a new one and will make the switch itself. Bear in mind that this doesn’t always work, and if it does, there will likely be serious fluctuations in the positions.
When setting up a new website, it is actually far more worthwhile to thoroughly optimise the textual content.
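How the redirect is configured depends on your web server or CMS, but the underlying logic is simple. The following Python sketch only works out which answer the old website should give for a requested URL. The new host name and the table of renamed pages are hypothetical placeholders, and the sketch assumes that the new site is served over https.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical values for this illustration only.
NEW_HOST = "www.new-domain.example"
RENAMED_PAGES = {"/about-us.html": "/about/"}  # old path -> new path


def redirect_for(old_url: str) -> tuple[int, str]:
    """Return the status code and target URL for a request to the old site."""
    parts = urlsplit(old_url)
    # Use the new path if the page was renamed, otherwise keep the old path.
    path = RENAMED_PAGES.get(parts.path, parts.path) or "/"
    target = urlunsplit(("https", NEW_HOST, path, parts.query, ""))
    return 301, target  # 301 = moved permanently
```

Every old URL gets its own matching target here. Redirecting everything to the new homepage would not tell Google which new page replaces which old one.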
2. Switching from a www website to one without www, or from http to https
By now you could consider the www story to be ancient history: most browsers are smart enough to add the www prefix themselves, should it be required at all. But website owners often want to get rid of the www version of their site while leaving the www version of their pages online. The same happens when a website switches from http to the safer https.
Both scenarios work out as described above: Google “sees” two different URLs, one with ‘www’ and one without, and will give the highest site value to the oldest one, the one with ‘www’, while the website owner wants the one without ‘www’ to rank on top. The same goes for ‘http’ versus ‘https’. Here too, the best approach is to keep both versions of the URL: the one with ‘www’ and/or ‘http’ WITHOUT content and WITH a permanent redirect to the version without ‘www’ and/or with ‘https’. But here as well, positions might fluctuate quite a bit.
3. Unavoidable duplicate content within your website / web shop
Sometimes it is nearly impossible to avoid having sections with near-identical content on different webpages of your website. This can be caused by the CMS (content management system) you use: not every CMS is capable of avoiding duplicate content. But there are plenty of other conceivable scenarios in which duplicate content is hard to avoid.
No need to worry: this can be resolved with the so-called ‘canonical tag’, but more about this in another blog article.
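As a brief preview: the canonical tag is a link element with rel="canonical" in the head section of a page, pointing to the URL you would like search engines to treat as the preferred version. The small Python helper below builds such an element. The function name is made up for this sketch, and in practice a CMS or template would normally generate the tag for you.

```python
from html import escape


def canonical_link(preferred_url: str) -> str:
    """Build the link element that names the preferred URL of a page."""
    # Escape the URL so characters such as & are valid inside the attribute.
    return f'<link rel="canonical" href="{escape(preferred_url, quote=True)}">'
```

The same element, with the same preferred URL, would then be placed on every page that shows the near-identical content.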