Website blog is hacked. What's the best practice to remove bad URLs?
-
Hello
Our site was hacked, which created a few thousand spam URLs on our domain. We fixed the issue, and all the spam URLs now return 404. The Google index still shows a couple of thousand bad URLs.
My questions are:
What's the fastest way to remove the URLs from the Google index? I created a sitemap of the bad URLs and submitted it to Google. I am hoping Google will recrawl them because they are in the sitemap and then remove them from the index, as they return 404.
Are there any tools to get a full list of the URLs in the Google index? (Search Console downloads are limited to 1,000 URLs.) A Moz site crawl gives a larger list, but it includes URLs that are not in the Google index too. I'm looking for a tool that can download the results of a site: search.
Any way to remove the URLs from the index in bulk? Removing them one by one will take forever.
Any help or insight would be much appreciated.
-
Technically, 404 just means "not found" and doesn't say whether the page is coming back, whereas 410 means "gone" for good, so you might want to consider status 410 instead of 404. You could also supplement it with a meta noindex tag. If you can't use the HTML implementation, send the noindex directive through the HTTP header using X-Robots-Tag:
https://developers.google.com/search/reference/robots_meta_tag (scroll down a little to find the relevant part)
E.g:
"HTTP/1.1 200 OK
Date: Tue, 25 May 2010 21:42:43 GMT
(…)
X-Robots-Tag: noindex
(…)"... something like that.
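If it helps, here is a rough Python sketch of the idea: a tiny WSGI app that answers 410 with an X-Robots-Tag: noindex header for known spam paths. The SPAM_PATHS set and the paths in it are made up for illustration, and your real site will have its own way of setting status codes and headers (server config, CMS plugin, etc.), so treat this as a sketch rather than something to drop in.

```python
# Hypothetical set of spam paths left behind by the hack (illustrative only).
SPAM_PATHS = {"/cheap-pills.html", "/win-big-casino.html"}


def app(environ, start_response):
    """Minimal WSGI app: 410 Gone plus a noindex header for known spam paths."""
    path = environ.get("PATH_INFO", "/")

    if path in SPAM_PATHS:
        body = b"Gone"
        start_response("410 Gone", [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
            # Header form of the noindex directive.
            ("X-Robots-Tag", "noindex"),
        ])
        return [body]

    # Everything else is served normally (placeholder body here).
    body = b"OK"
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

To try it locally, you could serve app with make_server from Python's standard wsgiref.simple_server module and request one of the spam paths.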
You can't use Search Console to remove URLs from Google permanently. The Remove URLs tool only removes URLs one at a time, and it only does so 'temporarily'; the URLs pop back again after a bit. The best thing you can do is give Google some harsher directives and hope they listen. In a month or two, most of those should be gone.
Don't block the URLs in robots.txt: if Google can't crawl them, it won't find the 410s or the noindex directives.
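A quick way to sanity-check that last point is to run your robots.txt against the spam URLs. Below is a small Python sketch using the standard library's urllib.robotparser. The robots.txt content and the example.com URLs are invented for illustration, and Python's parser does not match Google's rules exactly (wildcards, for instance), so take the result as a rough check only.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that this robots.txt text forbids user_agent to crawl."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt and spam URLs, for illustration only.
robots_txt = """\
User-agent: *
Disallow: /spam/
"""
urls = [
    "https://www.example.com/spam/cheap-pills.html",
    "https://www.example.com/blog/cheap-pills.html",
]

# Any URL listed here is hidden from the crawler, so its 410 or
# noindex would never be seen.
print(blocked_urls(robots_txt, urls))
```

Anything the function returns is a URL you would want to unblock before expecting it to drop out of the index.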