Google indexing staging / development site that is redirected...
-
Hi Moz Fans! - Please help.
We had a staging site at acme.stagingdomain.com while a site was in development. When the site went live, the staging domain was redirected (302) to acmeprofessionalservices.com (real names redacted!).
There are no known external links to the staging site,
although the staging site URL has been emailed from Google Apps(!).
We have now found that the staging site is in the index even though it redirects to the proper public site,
and some (but not all) of its pages are in the index too. They all redirect to the proper public site when visited.
It is convenient for the team to have a redirect from the staging site to the new one, since Chrome etc. remember frequently visited sites. It would be a shame to lose that.
Yes, these pages can be removed using webmaster tools.
But how did they get into the index to start with? And if we're building a new site and a customer has an existing site, is there a danger of duplicate content etc. penalties caused by the staging site?
We had a similar incident recently when a PDF that was not linked anywhere on the site appeared in the index. The link had been emailed through Google Apps, and visited in Chrome, but that was it.
So 3 questions.
Why is the staging site still in the index despite the redirects?
How did they get in the index in the first place?
Will a new staging site affect the rank of the existing site, e.g. through duplicate content penalties?
-
Hi There
1. It could still be in the index because the redirects are 302s and not 301s. A 302 is temporary, and therefore Google may not de-index those URLs. It also takes time. I've seen Google take months to de-index redirecting URLs. Also, make sure you are not blocking crawling of the dev site, or Google will not see the redirects.
2. I am not sure how they got there to begin with. I can pretty much always find some sort of error: maybe someone tweeted a staging URL, maybe crawling wasn't blocked, maybe there was one link to staging from the live site, etc. Regardless, somehow Google crawled it.
To prevent this in the future, always block crawling of staging servers well before you ever put anything on them.
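As a rough illustration of "block crawling", here is a minimal Python sketch that checks whether a given robots.txt body would keep a crawler off a staging URL. It uses the standard library's `urllib.robotparser`; the helper name `is_crawl_blocked` and the example page path are assumptions made for illustration, and the staging hostname is the redacted one from the question.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that asks all crawlers to stay away from the whole staging host.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

def is_crawl_blocked(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt body disallows user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)

# Hypothetical staging page, using the redacted hostname from the question.
STAGING_PAGE = "http://acme.stagingdomain.com/some-page"
```

Calling `is_crawl_blocked(BLOCK_ALL, STAGING_PAGE)` reports whether the rules block that page. Note the tension raised in point 1: while crawling is blocked, Google cannot see any redirects on those URLs.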
3. Usually Google tries to sort this out. They won't give you a penalty for "technical" duplicate content (penalties are more for "malicious" duplicate content, i.e. stealing people's content). So you won't get penalized, but the more you can help Google out by sorting it out, the more time Google can spend crawling the correct site etc.
What I would do now depends on whether you want the staging URLs to keep redirecting (which might not be the best solution if you ever want to go back and work on the staging server again). If you do, use 301 redirects and make sure you are allowing crawling of the staging site. Keep it registered in webmaster tools so that you can monitor the indexation levels.
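To make the 301-versus-302 distinction concrete, here is a hedged Python sketch using only the standard library. It shows one way a staging host could answer every request with a permanent redirect to the same path on the live site, plus a small checker for what a URL currently returns. All names (`LIVE_ORIGIN`, `live_url`, `PermanentRedirect`, `classify_status`, `fetch_redirect`) are hypothetical, and the HTTPS scheme on the live domain is an assumption; a real site would more likely configure this in its web server.

```python
import http.client
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Optional, Tuple
from urllib.parse import urlsplit

# Assumption: the live site is served over HTTPS at the redacted domain from the question.
LIVE_ORIGIN = "https://acmeprofessionalservices.com"

def live_url(path: str) -> str:
    """Map a staging path (including any query string) onto the live site."""
    if not path.startswith("/"):
        path = "/" + path
    return LIVE_ORIGIN + path

class PermanentRedirect(BaseHTTPRequestHandler):
    """Answer every GET/HEAD with a 301 pointing at the same path on the live site."""

    def _redirect(self) -> None:
        self.send_response(301)  # permanent, unlike the 302 described in the question
        self.send_header("Location", live_url(self.path))
        self.end_headers()

    do_GET = _redirect
    do_HEAD = _redirect

def serve(port: int = 8080) -> None:
    """Run the redirecting server until interrupted."""
    HTTPServer(("", port), PermanentRedirect).serve_forever()

def classify_status(status: int) -> str:
    """Label an HTTP status code as a permanent redirect, a temporary one, or neither."""
    if status in (301, 308):
        return "permanent"
    if status in (302, 303, 307):
        return "temporary"
    return "not a redirect"

def fetch_redirect(url: str, timeout: float = 10.0) -> Tuple[int, Optional[str]]:
    """Send a HEAD request without following redirects; return (status, Location header)."""
    parts = urlsplit(url)
    is_https = parts.scheme == "https"
    conn_cls = http.client.HTTPSConnection if is_https else http.client.HTTPConnection
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
        conn.request("HEAD", target)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

Running `serve()` starts the redirecting server, and `classify_status(fetch_redirect(url)[0])` gives a quick way to confirm that a staging URL now answers with a permanent rather than a temporary redirect.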