How did my dev site end up in the search results?
-
We use a subdomain for our dev site. I never thought anything of it because the only way you can reach the dev site is through a VPN, yet Google has somehow indexed it. Any ideas on how that happened? I am adding the noindex tag; should I also use a canonical tag? Is there anything else you can think of?
-
Personally, I'd still recommend using robots.txt to disallow all crawlers, even if you take the other steps as well.
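For reference, a blanket disallow is only a couple of lines in the robots.txt served at the root of the dev subdomain (dev.example.com is just a placeholder here):

    User-agent: *
    Disallow: /

Keep in mind this only tells compliant crawlers not to fetch the pages; it doesn't by itself pull already-indexed URLs out of the results.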
-
Don't use the removal tool; it can indeed go wrong. Now, are you sure that there are no external links coming from anywhere?
For now I'd recommend putting noindex, nofollow on that dev subdomain and doing a manual recrawl through GWT.
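To be clear, noindex, nofollow can go in as a meta tag in the head of every dev page, or as an X-Robots-Tag response header sent site-wide; something along these lines (the nginx line is just an illustration, adjust for whatever server you actually run):

    <meta name="robots" content="noindex, nofollow">

    add_header X-Robots-Tag "noindex, nofollow";

Either way, Google has to be able to fetch the pages to see the directive, so it only takes effect on URLs the bot can actually reach.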
-
It just uses internal links. Do you think I should try the Webmaster Tools removal tool? That seems like it could go wrong.
-
I've never used Screaming Frog; does it check both external and internal links?
-
I have run Screaming Frog to see if there are any links to any of the pages, but I couldn't see any. Even if Google did try to follow a link, the firewall would stop it. It is so strange.
-
Then my first assumption is that it's linked from somewhere - read my comment a little above.
-
Then there is a leak somewhere and Google's bots can "see" your subdomain.
Or it's simply been linked from somewhere. Google will then try to follow the link, and that is what gets it indexed.
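A quick way to see exactly what has been picked up is a site: query in Google (dev.example.com standing in for your actual dev subdomain):

    site:dev.example.com

If the results show real titles and snippets, Googlebot has fetched the pages at some point; if they are URL-only entries with no description, it has most likely only seen links to them and never got past the firewall.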
-
They are telling me that there are no holes, and I have tried getting to the pages but cannot do it unless I am on my VPN.
-
We never updated the robots.txt because the site was behind a firewall. If you click on any of the results, the page will not load unless you are on my VPN.
-
Robots.txt won't help here anyway. Bots can still see that such a subdomain or directory exists; they just won't see what's inside it.
-
Hi there.
If what you say is true, then there are only two answers: you've got a leak somewhere, or your settings/configuration are messed up. I'd say go talk to your system admin and make sure that everything that's supposed to be closed is closed, and that only the IPs that are supposed to have access are open.
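If it helps that conversation, the usual belt-and-braces setup is to whitelist the VPN's address range at the web server itself rather than relying on the firewall alone. A rough nginx sketch, assuming nginx is in front of the dev site (the hostname and the 10.8.0.0/24 pool are made-up placeholders):

    server {
        listen 80;
        server_name dev.example.com;   # placeholder dev hostname

        allow 10.8.0.0/24;             # placeholder VPN address pool
        deny  all;                     # everyone else, including Googlebot, gets a 403

        root /var/www/dev;             # placeholder document root
    }

Apache rules, HTTP basic auth, or a hard IP rule on the firewall all achieve the same thing; the point is that nothing outside the VPN range should ever get a 200 back.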
-
Have you updated the dev site's robots.txt to disallow everything? It is up to the bot to honor it, but that, combined with removing all of the dev URLs via Google Webmaster Tools, should do the trick.