Best method for blocking a subdomain with duplicated content
-
Hello Moz Community
Hoping somebody can assist.
We have a subdomain, used by our CMS, which is being indexed by Google.
http://www.naturalworldsafaris.com/
https://admin.naturalworldsafaris.com/The page is the same so we can't add a no-index or no-follow.
I have both set up as separate properties in webmaster toolsI understand the best method would be to update the robots.txt with a user disallow for the subdomain - but the robots text is only accessible on the main domain. http://www.naturalworldsafaris.com/robots.txt Will this work if we add the subdomain exclusion to this file?
It means it won't be accessible on https://admin.naturalworldsafaris.com/robots.txt (where we can't create a file). Therefore won't be seen within that specific webmaster tools property.
I've also asked the developer to add a password protection to the subdomain but this does not look possible.
What approach would you recommend?
-
Many thanks for taking the time to reply.
Ryan - I am talking to CMS Source about these options; it's not possible with the current configuration but it looks like we may be able to change this with a bit of development work.
Sean - Thanks, I had considered the canonical, which would at least prevent a duplicate content issue although I think we need to take measures to stop this admin. subdomain being accessible which this won't help with.
Lynn - Thanks, I'll do that as an interim measure.
Thanks
Kate -
HI,
If you have the subdomain set up in GWT then you could do a url removal request (Google Index -> Remove URLs). Just put in the root admin subdomain (ie https://admin.naturalworldsafaris.com/). This usually works pretty quickly and given your description will likely be the quickest way to drop admin urls that are showing up in the serps.
It is not an ideal solution and the tool will tell you that for permanent exclusion you should do it through your robots.txt file. in regards your question about that though - the robots.txt file for the subdomain has to be placed on the subdomain, adding the subdomain to the main domains robot.txt file wont work.
-
One alternative you could try is to place a
while this is not as ideal as adding the noindex nofollow it could help tell Google that the admin is not the preferred domain.
-
That's very strange behavior for the admin portion of a site to not be behind password protection. If the current developer is unable to do so, I'd ask around on that because if you block the admin portion via password protection, even if it's still indexed in Google it will rank much lower.
Are you able to contact someone at CMS Source to get some support? That might help resolve this as well as provide guidance on getting the robots.txt uploaded and being able to add noindex to admin only.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Possible duplicate content issue
Hi, Here is a rather detailed overview of our problem, any feedback / suggestions is most welcome. We currently have 6 sites targeting the various markets (countries) we operate in all websites are on one wordpress install but are separate sites in a multisite network, content and structure is pretty much the same barring a few regional differences. The UK site has held a pretty strong position in search engines the past few years. Here is where we have the problem. Our strongest page (from an organic point of view) has dropped off the search results completely for Google.co.uk, we've picked this up through a drop in search visibility in SEMRush, and confirmed this by looking at our organic landing page traffic in Google Analytics and Search Analytics in Search Console. Here are a few of the assumptions we've made and things we've checked: Checked for any Crawl or technical issues, nothing serious found Bad backlinks, no new spammy backlinks Geotarggetting, this was fine for the UK site, however the US site a .com (not a cctld) was not set to the US (we suspect this to be the issue, but more below) On-site issues, nothing wrong here - the page was edited recently which coincided with the drop in traffic (more below), but these changes did not impact things such as title, h1, url or body content - we replaced some call to action blocks from a custom one to one that was built into the framework (Div) Manual or algorithmic penalties: Nothing reported by search console HTTPs change: We did transition over to http at the start of june. The sites are not too big (around 6K pages) and all redirects were put in place. Here is what we suspect has happened, the https change triggered google to re-crawl and reindex the whole site (we anticipated this), during this process, an edit was made to the key page, and through some technical fault the page title was changed to match the US version of the page, and because geotargetting was not turned on for the US site, Google filtered out the duplicate content page on the UK site, there by dropping it off the index. What further contributes to this theory is that a search of Google.co.uk returns the US version of the page. With country targeting on (ie only return pages from the UK) that UK version of the page is not returned. Also a site: query from google.co.uk DOES return the Uk version of that page, but with the old US title. All these factors leads me to believe that its a duplicate content filter issue due to incorrect geo-targetting - what does surprise me is that the co.uk site has much more search equity than the US site, so it was odd that it choose to filter out the UK version of the page. What we have done to counter this is as follows: Turned on Geo targeting for US site Ensured that the title of the UK page says UK and not US Edited both pages to trigger a last modified date and so the 2 pages share less similarities Recreated a site map and resubmitted to Google Re-crawled and requested a re-index of the whole site Fixed a few of the smaller issues If our theory is right and our actions do help, I believe its now a waiting game for Google to re-crawl and reindex. Unfortunately, Search Console is still only showing data from a few days ago, so its hard to tell if there has been any changes in the index. I am happy to wait it out, but you can appreciate that some of snr management are very nervous given the impact of loosing this page and are keen to get a second opinion on the matter. Does the Moz Community have any further ideas or insights on how we can speed up the indexing of the site? Kind regards, Jason
Intermediate & Advanced SEO | | Clickmetrics0 -
Mixing up languages on the same page + possible duplicate content
I have a site in English hosted under .com with English info, and then different versions of the site under subdirectories (/de/, /es/, etc.) Due to budget constraints we have only managed to translate the most important info of our product pages for the local domains. We feel however that displaying (on a clearly identified tab) the detailed product info in English may be of use for many users that can actually understand English, and may help us get more conversions to have that info. The problem is that this detailed product info is already used on the equivalent English page as well. This basically means 2 things: We are mixing languages on pages We have around 50% of duplicate content of these pages What do you think that the SEO implications of this are? By the way, proper Meta Titles and Meta Descriptions as well as implementation of href lang tag are in place.
Intermediate & Advanced SEO | | lauraseo0 -
Duplicate content with URLs
Hi all, Do you think that is possible to have duplicate content issues because we provide a unique image with 5 different URLs ? In the HTML code pages, just one URL is provide. It's enough for that Google don't see the other URLs or not ? Example, in this article : http://www.parismatch.com/People/Kim-Kardashian-sa-securite-n-a-pas-de-prix-1092112 The same image is available on: http://cdn-parismatch.ladmedia.fr/var/news/storage/images/paris-match/people/kim-kardashian-sa-securite-n-a-pas-de-prix-1092112/15629236-1-fre-FR/Kim-Kardashian-sa-securite-n-a-pas-de-prix.jpg http://resize-parismatch.ladmedia.fr/img/var/news/storage/images/paris-match/people/kim-kardashian-sa-securite-n-a-pas-de-prix-1092112/15629236-1-fre-FR/Kim-Kardashian-sa-securite-n-a-pas-de-prix.jpg http://resize1-parismatch.ladmedia.fr/img/var/news/storage/images/paris-match/people/kim-kardashian-sa-securite-n-a-pas-de-prix-1092112/15629236-1-fre-FR/Kim-Kardashian-sa-securite-n-a-pas-de-prix.jpg http://resize2-parismatch.ladmedia.fr/img/var/news/storage/images/paris-match/people/kim-kardashian-sa-securite-n-a-pas-de-prix-1092112/15629236-1-fre-FR/Kim-Kardashian-sa-securite-n-a-pas-de-prix.jpg http://resize3-parismatch.ladmedia.fr/img/var/news/storage/images/paris-match/people/kim-kardashian-sa-securite-n-a-pas-de-prix-1092112/15629236-1-fre-FR/Kim-Kardashian-sa-securite-n-a-pas-de-prix.jpg Thank you very much for your help. Julien
Intermediate & Advanced SEO | | Julien.Ferras0 -
Best anchor text strategy for embeddable content
Hi all We provide online services, and as part of this we provide our clients with a javascript embeddable 'widget' to place on their website. This is fairyly popular (100s-1000s of inserts on websites). The main workings of this are javascript (they spit html iframe onto the page) but we also include both a <noscript>portion (which is purely customer focused, it deep links into a relevant page on our website for the user to follow) and also a plain <p><a href=''></a></p> at the bottom, under the JS. This is all generated and inserted by the website owner. Therefore, after insertion we can dynamically update whatever the Javascript renders out, but the <noscript> and <a> at the bottom are there forever.</p> <p>Previously, this last plain link has been used for optimisation, with it randomly selecting 1 out of a bank of 3 different link anchor texts when the widget html is first generated.</p> <p>We've also recently split our website into B2B and B2C portions, so this will be linking to a newer domain with much established backlinks than the existing domain. I think we could get away with optimised keyword links on the old domain but the newer domain they will be more obvious.</p> <p>In light of recent G updates, we're afraid this may look spammy. We obviously want to utilise the link as best as possible, as it is used by hundreds of our clients, but don't want it to cause any issues. </p> <p>So my question, would you just focus on using brand name anchor text for this? Or could we mix it up with a few keyword optimised links also? If so, what sort of ratio would you suggest?</p> <p>Many thanks</p></noscript>
Intermediate & Advanced SEO | | benseb0 -
Duplicate content URLs from bespoke ecommerce CMS - what's the best solution here?
Hi Mozzers Just noticed this pattern on a retail website... This URL product.php?cat=5 is also churning out products.php?cat=5&sub_cat= (same content as product.php?cat=5 but from this different URL - this is a blank subcat - there are also unique subcat pages with unique content - but this one is blank) How should I deal with that? and then I'm seeing: product-detail.php?a_id=NT001RKS0000000 and product-detail.php?a_id=NT001RKS0000000&cont_ref=giftselector (same content as product-detail.php?a_id=NT001RKS0000000 but from this different URL) How should I deal with that? This is a bespoke ecommerce CMS (unfortunately). Any pointers would be great 🙂 Best wishes, Luke
Intermediate & Advanced SEO | | McTaggart0 -
Best-of-the-web content: Graphical Tips
This question is for EGOL (if he's willing) and anyone else who wants to partake. EGOL is the best content writer I've ever run into, really. I'm wondering what his top 3 to 5 tips are on how to use graphical layout (font, images, graphics, organization, menu, etc) to make content irresistable. A couple of assumptions: The content is written really well from a perspective of authority. Also, we're not including video on this one. Again, anyone is welcome to answer this. Thanks!
Intermediate & Advanced SEO | | BobGW1 -
How much (%) of the content of a page is considered too much duplication?
Google is not fond of duplication, I have been very kindly told. So how much would you suggest is too much?
Intermediate & Advanced SEO | | simonberenyi0 -
How are they avoiding duplicate content?
One of the largest stores in USA for soccer runs a number of whitelabel sites for major partners such as Fox and ESPN. However, the effect of this is that they are creating duplicate content for their products (and even the overall site structure is very similar). Take a look at: http://www.worldsoccershop.com/23147.html http://www.foxsoccershop.com/23147.html http://www.soccernetstore.com/23147.html You can see that practically everything is the same including: product URL product title product description My question is, why is Google not classing this as duplicate content? Have they coded for it in a certain way or is there something I'm missing which is helping them achieve rankings for all sites?
Intermediate & Advanced SEO | | ukss19840