After hack and remediation, thousands of URLs still appearing as 'Valid' in Google Search Console. How to remedy?
-
I'm working on a site that was hacked in March 2019. In the process, nearly 900,000 spam links were generated and indexed. After the hack was remediated in April 2019, the spammy URLs began dropping out of the index. Last week, Search Console still showed around 8,000 URLs as "Indexed, not submitted in sitemap", listed as "Valid" in the coverage report. Many of them are hack-related URLs that are listed as having been indexed in March 2019, even though clicking on them leads to a 404. As of this Saturday the number jumped to 18,000, and I can't find out from the Search Console reports why the jump happened or which URLs were added: the only sort mechanism is "last crawled", and the new URLs don't show up there.
How long can I expect it to take for these remaining URLs to also be removed from the index? Is there any way to expedite the process? I've submitted a 'new' sitemap several times, which (so far) has not helped.
Is there any way to see, inside the new GSC view, why or how the number of valid URLs in the index doubled over one weekend?
-
Google Search Console actually has a URL removal tool built into it. Unfortunately it's not really scalable (mostly it's one-at-a-time submissions), and on top of that the effect of using the tool is only temporary (the URLs come back again).
In your case I reckon that changing the status code of the removed URLs from 404 ("not found", which says nothing about whether the page might come back) to 410 ("gone", meaning removed on purpose and for good) might be a good idea. Google might digest that better, as it's a more explicit signal for both indexing and crawling ("go away, don't come back!").
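To make the 404-to-410 idea concrete, here is a minimal sketch in Python, assuming the hacked URLs share a recognisable path prefix. The `/cheap-pills/` pattern and the `app` name are made up for illustration, and the "200 OK" branch just stands in for the real site. On a real server you would more likely do this in the web server or CMS configuration.

```python
import re

# Hypothetical pattern for the hack-generated URLs; replace it with
# whatever the spam paths on the affected site actually look like.
SPAM_PATH = re.compile(r"^/cheap-pills/")


def app(environ, start_response):
    """Tiny WSGI app: answer 410 Gone for spam paths, 200 OK otherwise."""
    path = environ.get("PATH_INFO", "")
    headers = [("Content-Type", "text/plain; charset=utf-8")]
    if SPAM_PATH.match(path):
        # 410 tells crawlers the resource was removed on purpose.
        start_response("410 Gone", headers)
        return [b"Gone\n"]
    # Placeholder for the legitimate site content.
    start_response("200 OK", headers)
    return [b"OK\n"]
```

Any WSGI server can host `app` for a local test (for example `wsgiref.simple_server.make_server` from the standard library), after which requesting one of the spam paths should show the 410 status.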
You could also serve the meta noindex directive on those URLs. Obviously you're unlikely to have access to the HTML of non-existent pages, but did you know noindex can also be sent through the X-Robots-Tag HTTP header? So it's not impossible.
https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/404
(Ctrl+F for "X-Robots-Tag HTTP header")
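As a rough illustration of sending noindex through the HTTP header, here is a small Python sketch. It wraps a WSGI app so that responses for matching paths also carry `X-Robots-Tag: noindex`. The `/cheap-pills/` pattern and the names `add_noindex`, `not_found_app` and `site` are assumptions for the example only. Bear in mind that Google has to be able to crawl a URL to see the header, so those paths must not be blocked in robots.txt.

```python
import re

# Hypothetical pattern for the hack-generated URLs (an assumption).
SPAM_PATH = re.compile(r"^/cheap-pills/")


def add_noindex(app):
    """Wrap a WSGI app so spam paths also get an X-Robots-Tag: noindex header."""
    def wrapped(environ, start_response):
        def patched_start(status, headers, exc_info=None):
            if SPAM_PATH.match(environ.get("PATH_INFO", "")):
                # Copy the header list and append the robots directive.
                headers = list(headers) + [("X-Robots-Tag", "noindex")]
            return start_response(status, headers, exc_info)
        return app(environ, patched_start)
    return wrapped


def not_found_app(environ, start_response):
    """Stand-in for the existing site, which answers 404 on these URLs."""
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"Not Found\n"]


site = add_noindex(not_found_app)
```

Requesting a spam path through `site` should then return the original status plus the extra header, while other paths are passed through unchanged.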
Another option is this form to let Google know outdated content is gone, has been removed, and isn't coming back:
https://www.google.com/webmasters/tools/removals
... but again, URLs one at a time is going to be mega-slow. It does work pretty well though (at least in my experience)
In any case, I think you're looking at a week or two before Google starts noticing in a way that you can actually see in the reports, and then maybe a month or two until it rights itself (caveat: it's different for all sites and URLs, so it's variable).
Related Questions
-
My url disappeared from Google but Search Console shows indexed. This url has been indexed for more than a year. Please help!
Super weird problem that I haven't been able to solve for the last 5 hours. One of my URLs, https://www.dcacar.com/lax-car-service.html, has been indexed for more than a year and also has an AMP version. A few hours ago I realized that it had disappeared from the SERPs. We were ranking on page 1 for several key terms. When I perform a search for "site:dcacar.com" the URL is nowhere to be found on all 5 pages. But when I check my Google Console it shows as indexed. I requested indexing again but nothing changed. All the other 50 or so URLs are not affected at all; this is the only URL that has gone missing. Can someone solve this mystery for me, please? Thanks a lot in advance.
Intermediate & Advanced SEO | | Davit19850 -
Pages canonicalized to another appearing before the canonical in Google searches
Hello, When I do this Google search, this page (amandine roses category) appears before the one it is canonicalized to (this multi-product version of amandine roses). This happens often with this multi-product template, where the multi-product pages don't rank as well as their category versions (which are canonicalized to the multi-product version). Can someone maybe point us in the right direction on what the issue may be? What can be improved?
Intermediate & Advanced SEO | | globalrose.com0 -
What should I do after a failed request for validation (error with noindex, nofollow) in new Google Search Console?
Hi guys, We have the following situation: after an error message in the new Google Search Console for a large number of pages with a noindex, nofollow tag, a validation was requested before the problem was fixed (an incredibly stupid decision, taken before asking the SEO team for advice). Google started the validation, crawled 9 URLs and changed the status to "Failed". All other URLs are still in "pending" status. The problem has been fixed for more than 10 days, but apparently Google isn't crawling the pages and none of the URLs is back in the index. We tried pinging several pages and HTML sitemaps, but with no result. Do you think we should request re-validation or wait longer? Is there something more we could do to speed up the process?
Intermediate & Advanced SEO | | ParisChildress0 -
Google still listing old domain
Hi, We moved to a new domain back in March 2014, redirected most pages with a 301 and submitted a change of domain request through Google Webmaster Tools. A couple of pages were left as 302 redirects as they had rubbish links pointing to them and we had previously had a penalty. Google was still indexing the old domain and our rankings hadn't recovered. Last month we took away the 302 redirects and just did a blanket 301 approach from the old domain to the new one, thinking that as the penalty had been lifted from the old domain there was no harm in sending everything to the new domain. Again, we submitted the change of domain in Webmaster Tools as the option was available to us, but it's been a couple of weeks now and the old domain is still indexed. Am I missing something? I realise that the rankings may not have recovered partly due to the disavowing / disregarding of several links, but am concerned this may be contributing.
Intermediate & Advanced SEO | | Ham19790 -
Google Webmaster Remove URL Tool
Hi All, To keep this example simple. You have a home page. The home page links to 4 pages (P1, P2, P3, P4).
Home page
P1 P2 P3 P4
You now use the Google Webmaster removal tool to remove the P4 webpage and cache instance. 24 hours later you check and see P4 has completely disappeared. You now remove the link from the home page pointing to P4.
My Question
Does Google now see only pages P1, P2 & P3 and therefore allocate link juice at a rate of 33.33% each? Regards Mark
Intermediate & Advanced SEO | | Mark_Ch0 -
How do I Improve Google Local search position
Hi, I think it's called local search position. What I'm referring to is when you search on a keyword and Google lists not only the best matches but also, usually as the second match, a group of 3 businesses with telephone numbers and Google reviews, and at the bottom of the group it says something like "See results for <your keyword> on a map". In any case, my question is this: if I click on the link to see more results on a map, I'm listed as number 3. However, on the search page before, where the link I just clicked is displayed, I'm not listed; instead one business name is listed three times, and each of the listings uses the same address but a different telephone number. In addition, the business that is listed three times is also listed in the results returned above, in this case at position #1 for the keyword I searched. I assume this has something to do with them also being listed three times in the group of local businesses below. The business I'm interested in getting listed in this group of results is currently listed on page 2, position 5 for the keyword. Any suggestions would be greatly appreciated. Thanks in advance.
Intermediate & Advanced SEO | | robdob11 -
Do Q&As work for SEO
If I create a good community in my particular field on my SEO site and have a quality Q&A section like this one (ripping off Moz's idea here, sorry, I hope it's OK), will the long-term returns be worth the effort of creating and managing it? Is the user-created content of as much use as I think it will be?
Intermediate & Advanced SEO | | mark_baird0 -
Best solution to get mass URLs out of the SEs' index
Hi, I've got an issue where our web developers have made a mistake on our website by messing up some URLs. Because our site works dynamically (i.e. the URLs generated on a page are relative to the current URL), it meant the problem URL linked out to more problem URLs, effectively replicating an entire website directory under problem URLs. This has put tens of thousands of URLs in the SEs' indexes which shouldn't be there. So say for example the problem URLs are like www.mysite.com/incorrect-directory/folder1/page1/ It seems I can correct this by doing the following:
1/. Use robots.txt to disallow access to /incorrect-directory/*
2/. 301 the URLs like this:
www.mysite.com/incorrect-directory/folder1/page1/
301 to:
www.mysite.com/correct-directory/folder1/page1/
3/. 301 URLs to the root correct directory like this:
www.mysite.com/incorrect-directory/folder1/page1/
www.mysite.com/incorrect-directory/folder1/page2/
www.mysite.com/incorrect-directory/folder2/
301 to:
www.mysite.com/correct-directory/
Which method do you think is the best solution? I doubt there is any link juice benefit from 301'ing URLs, as there shouldn't be any external links pointing to the wrong URLs.
Intermediate & Advanced SEO | | James770