How does Google decide what content is "similar" or "duplicate"?
-
Hello all,
I have a massive duplicate content issue at the moment with a load of old employer detail pages on my site. We have 18,000 pages that look like this:
http://www.eteach.com/Employer.aspx?EmpNo=26626
http://www.eteach.com/Employer.aspx?EmpNo=36986
and Google is classing all of these pages as similar content, which may result in a bunch of them being de-indexed. Now, although they all look rubbish, some of them are ranking on search engines, and looking at the traffic on a couple of these, it's clear that people who find these pages want to find out more about the school (because everyone seems to click on the local information tab on the page). So I don't want to just get rid of all these pages; I want to add content to them.
But my question is...
If I were to make up, say, 5 templates of generic content, with different fields being replaced by the school's name, location and headteacher's name so that each page varies from the others, would this be enough for Google to realise that they are not similar pages and stop classing them as duplicates?
e.g. [School name] is a busy and dynamic school led by [headteachers name] who achieve excellence every year from Ofsted. Located in [location], [school name] offers a wide range of experiences both in the classroom and through extra-curricular activities, we encourage all of our pupils to "Aim Higher". We value all our teachers and support staff and work hard to keep [school name]'s reputation to the highest standards.
Something like that...
Anyone know if Google would slap me if I did that across 18,000 pages (with 4 other templates to choose from)?
-
Hi Virginia,
Maybe this Whiteboard Friday can help you out.
-
Hey Virginia
That is essentially what we call near-duplicate content: the kind of page that is easily created by pulling fields out of a database and dropping the name, address etc. into placeholders.
Unique content is exactly that, unique, so this approach is probably not going to cut it. You could still pull certain elements, such as the address, from the database, but you need to either remove these duplicate blocks and keep the pages simpler (like a business directory) or, ideally, add some unique elements to each page.
These kinds of pages often still rank for very specific queries. Another strategy is to build well-thought-out landing pages that link to pages like these, which have value for users but are not search friendly.
So, assess how well these work as landing pages from search: are visitors arriving from search, or coming in from elsewhere? If they come in from elsewhere, you could noindex these pages or block them in robots.txt. Then target the bigger search terms higher up the tree and create good search landing pages that link to these other pages for users.
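As a rough illustration of the robots.txt option, here is a minimal Python sketch that checks whether a rule would stop crawlers from fetching the employer pages. The `Disallow` rule is an assumption based on the URLs in the question, not something taken from the site's actual robots.txt, and the check uses Python's standard-library parser rather than Google's own. Bear in mind that robots.txt blocks crawling, whereas a noindex tag has to be crawled to be seen, so the two are normally alternatives rather than something to combine on the same URL.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules (an assumption for illustration);
# the path is taken from the employer URLs quoted in the question.
ROBOTS_TXT = """\
User-agent: *
Disallow: /Employer.aspx
"""

# The alternative option: a standard robots meta tag placed in each page's <head>.
NOINDEX_META = '<meta name="robots" content="noindex">'

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

employer_url = "http://www.eteach.com/Employer.aspx?EmpNo=26626"
home_url = "http://www.eteach.com/"

# can_fetch() returns False when the rules disallow the URL for that user agent.
employer_allowed = parser.can_fetch("Googlebot", employer_url)
home_allowed = parser.can_fetch("Googlebot", home_url)

print("Employer page crawlable:", employer_allowed)
print("Home page crawlable:", home_allowed)
```

The rule is a simple path prefix, so it covers every `EmpNo` value without listing the 18,000 URLs individually.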
This is a real good read to get a better handle on duplicate content types and the relevant strategies:
http://moz.com/blog/fat-pandas-and-thin-content
Hope that helps
Marcus
-
Hi Virginia,
If you take your pages as a whole, code and all, the only slight difference in those pages is the tag and the sidebar info with the school address. The rest of the page code is exactly the same.
If you were to create 5 templates similar to:
[School name] is a busy and dynamic school led by [headteachers name] who achieve excellence every year from Ofsted. Located in [location], [school name] offers a wide range of experiences both in the classroom and through extra-curricular activities, we encourage all of our pupils to "Aim Higher". We value all our teachers and support staff and work hard to keep [school name]'s reputation to the highest standards.
If all you are doing is changing the [school name] and [location] etc., I'm sure Google will still flag these pages as duplicate content.
Unique content is the best way. If there's not a lot of competition for the school name and the page has enough content about each individual school, head teacher etc., then "templates" might work. You can try it out, but I'd say unique content is the best way. It's the nature of the beast with so many pages.
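To get a feel for why swapping a few fields leaves pages looking alike, here is a small Python sketch that fills a shortened version of the template for two schools and compares them using word shingles and Jaccard similarity. This is a common textbook way to measure near-duplication, not a description of how Google actually scores pages, and the school names, head teachers and locations are made up for the example.

```python
import re

# Shortened version of the template from the question; field names are hypothetical.
TEMPLATE = (
    "{school} is a busy and dynamic school led by {head}. "
    "Located in {location}, {school} offers a wide range of experiences "
    "both in the classroom and through extra-curricular activities."
)

def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"[a-z0-9'-]+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Invented example data, not real schools from the site.
page_a = TEMPLATE.format(school="Hilltop Primary", head="Jane Smith", location="Leeds")
page_b = TEMPLATE.format(school="Riverside Academy", head="Tom Brown", location="Bristol")
unrelated = (
    "The autumn term timetable lists swimming lessons on Tuesday "
    "and choir practice on Thursday afternoon."
)

template_score = jaccard(shingles(page_a), shingles(page_b))
unrelated_score = jaccard(shingles(page_a), shingles(unrelated))

print(f"Two templated pages: {template_score:.2f}")
print(f"Templated vs. genuinely different text: {unrelated_score:.2f}")
```

On a real page the shared navigation, footer and markup would push the overlap between templated pages higher still, which is why a few swapped fields are unlikely to be enough on their own.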
Hope this helps.
Robert