Removing indexed pages
-
Hi all, this is my first post so be kind
I have a one-page WordPress site with the Yoast plugin installed. Unfortunately, when I first submitted the site's XML sitemap to Google Search Console, I hadn't checked the Yoast settings, so the sitemap included some example pages from a theme demo I was using. These got indexed, which is a pain, so now I am trying to remove them. Originally I set up a bunch of 301s, but that didn't remove them from the index (at least not after about a month), so now I have set up 410s. These don't seem to be working either, and I am wondering if it is because I re-submitted the sitemap with only the index page on it (as it is just a single-page site). Could that have stopped Google from revisiting the original pages and actually seeing the 410s?
Thanks in advance for any suggestions.
-
Thanks for all the responses!
At the moment I am serving the 410s using the .htaccess file, as I removed the actual pages a while ago. The pages don't show in most searches; however, two of them do show up in some instances under the sitelinks, which is the main pain. I manually asked for them to be removed using 'remove urls', but that only lasts a couple of months and they are now back.
So I guess the best way is to recreate the pages and insert a noindex?
Thanks again for everyone's time, it's much appreciated.
-
I agree with ViviCa1's methods, so go with that.
One thing I just wanted to bring up, though, is that unless people are actually visiting the pages you don't want indexed, or those pages do some type of brand damage, you don't really need to make it a priority.
Just because they're indexed doesn't mean they're showing up for any searches - and most likely they aren't - so people will realistically never see them. And if you only have a one-page site, you're not wasting much crawl budget on those.
I just bring this up since sometimes we (I'm guilty of it too) can get bogged down by small distractions in SEO that don't really help much, when we should be creating and producing new things!
"These also seem to not be working and I am wondering if it is because I re-submitted the sitemap with only the index page on it (as it is just a single page site) could that have now stopped Google indexing the original pages to actually see the 410's?"
There was a good related response from Google employee Susan Moskwa:
“The best way to stop Googlebot from crawling URLs that it has discovered in the past is to make those URLs (such as your old Sitemaps) 404. After seeing that a URL repeatedly 404s, we stop crawling it. And after we stop crawling a Sitemap, it should drop out of your "All Sitemaps" tab.”
A bit older, but shows how Google discovers URLs through the sitemap. Take a look at the rest of that thread as well.
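Since all of this depends on what status code the old URLs actually return, it can be worth checking them directly before waiting on a recrawl. Below is a minimal Python sketch of that check, using only the standard library. The function names and the wording of the labels are my own, and the labels are a rough reading of the advice in this thread rather than anything Google documents in this form.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code for a HEAD request to url.

    urllib follows redirects by default, so a 301 is reported as the
    status of the final destination, not as 301 itself.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses are raised as HTTPError; the code is on the exception.
        return error.code


def describe_status(code):
    """Give a rough reading of a status code from a removal point of view."""
    if code in (404, 410):
        return "gone: should drop out after repeated crawls"
    if 200 <= code < 300:
        return "still served: can stay indexed"
    return "other: check manually"


# Usage (needs network access; the URL is a placeholder):
# print(describe_status(fetch_status("https://example.com/old-demo-page/")))
```

If an old URL comes back as "still served", the 410 rule is probably not matching that path and the .htaccess rule is the first thing to look at.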
-
I'd suggest adding a noindex robots meta tag to the affected pages (see how to do this here: https://support.google.com/webmasters/answer/93710?hl=en) and until Google recrawls use the remove URLs tool (see how to use this here: https://support.google.com/webmasters/answer/1663419?hl=en).
If you use the noindex robots meta tag, don't disallow the pages through your robots.txt, or Google won't even see the tag. Disallowing a page in robots.txt doesn't mean it won't be indexed, or that it will be removed from the index; it just means Google won't crawl the page.
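To make that interaction concrete, here is a small Python sketch (standard library only) that checks both conditions for a page: that the HTML carries a noindex robots meta tag, and that robots.txt still lets a crawler fetch the page so the tag can be read. The class and function names are hypothetical, and it only looks at `<meta name="robots">`, not at crawler-specific meta names or the X-Robots-Tag header.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() == "robots":
            for directive in attributes.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def has_noindex(html):
    """True if the page's robots meta tag includes noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


def is_crawlable(robots_txt, url, agent="Googlebot"):
    """True if robots_txt allows agent to fetch url."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules.can_fetch(agent, url)


def noindex_can_be_seen(html, robots_txt, url):
    """The tag only helps if the page is both tagged and crawlable."""
    return has_noindex(html) and is_crawlable(robots_txt, url)
```

For example, a page with `<meta name="robots" content="noindex">` that is also disallowed in robots.txt makes `noindex_can_be_seen` return False, which is exactly the trap described above.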
-
Couple of ideas spring to mind
- Use the robots.txt file
- Demote the site link in Google search console (see https://support.google.com/webmasters/answer/47334)
Example of robots.txt file...
User-agent: *
Disallow: /the-link/you-dont/want-to-show.html
Disallow: /the-link/you-dont/want-to-show2.html
Don't include the domain, just the path to the page. There are plenty of tutorials out there, and it's worth having a look at http://www.robotstxt.org
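If you want to sanity-check rules like these before uploading them, Python's standard-library robots.txt parser can be used as a rough stand-in for a crawler. This is only a sketch: the helper name `blocked_paths` is made up, and the `User-agent: *` line is included on the assumption that the rules should apply to every crawler. Note that this parser ignores Disallow lines that are not inside a User-agent group; real crawlers may differ in the details.

```python
from urllib.robotparser import RobotFileParser

# The example rules from above, wrapped in a group that applies to all crawlers.
ROBOTS_TXT = """\
User-agent: *
Disallow: /the-link/you-dont/want-to-show.html
Disallow: /the-link/you-dont/want-to-show2.html
"""


def blocked_paths(robots_txt, paths, agent="Googlebot"):
    """Return the paths that robots_txt forbids agent to crawl."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return [path for path in paths if not rules.can_fetch(agent, path)]
```

Calling `blocked_paths(ROBOTS_TXT, ["/", "/the-link/you-dont/want-to-show.html"])` should return only the second path, while the same rules without the `User-agent` line block nothing in this parser.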