How to find all 404 deadlinks - webmaster only allows 1000 to be downloaded...
-
Hi Guys
I have a question...I am currently working on a website that was hit by a spam attack.
The website was hacked and 1000's of adult censored pages were created on the wordpress site.
The hosting company cleared all of the dubious files - but this has left 1000's of dead 404 pages.
We want to fix the dead pages but Google webmaster only shows and allows you to download 1000.
There are a lot more than 1000....does any know of any Good tools that allows you to identify all 404 pages?
Thanks, Duncan
-
The Moz crawl report will also show 404s. I sometimes find that different spiders may find different things. Between the Search Console report, Screaming Frog (great investment) and Moz, you should have a nice collection of things to fix.
-
I must second Dirk's suggestion of screaming frog, great tool and I use it daily, a license is well worth the cost. Although spider crawl of the site will only point out 404's that have are links from an existing page, so if the hosting company cleaned up the not all of these 404's will surface.
One approach I would suggest is run the current 1000 404's in GWT through Screaming frog as a manually added list, (do it in 2 batches if you have the free version), start a spreadsheet of the resulting 404's and start working through that list. Once you have the 404's mark those as fixed as GWT tools set a reminder to check back in a few days and after a few days export the new list of 1000 404's and run these through screaming frog adding the resulting list to your spreadsheet. Keep doing this until you get the 404's errors in GWT down a manageable level.
I hope that helps, good luck.
-
Probably the easiest solution is to buy a licence from Screaming Frog & to crawl your site locally. The tool can do a lot of useful stuff to audit sites and will show you not only the full list of 4xx errors but also the pages that link to them.
There is also a free version but that allows you to crawl only 500 pages - which in your case is probably not sufficient but it would allow you to see how it works.
Hope this helps,
Dirk
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How to Handle a Soft 404 error to an admin page in WordPress
I'm seeing this error on Google Webmaster Console: | URL: | http://www.awlwildlife.com/wp-admin/admin-ajax.php | | | Error details | Linked from | | |
Intermediate & Advanced SEO | | aj613
| Last crawled: 11/15/16First detected: 11/15/16 The target URL doesn't exist, but your server is not returning a 404 (file not found) error. Learn more Your server returns a code other than 404 or 410 for a non-existent page (or redirecting users to another page, such as the homepage, instead of returning a 404). This creates a poor experience for searchers and search engines. More information about "soft 404" errors | Any ideas what I should do about it? Thanks!0 -
URL Parameter Setting Recommendation - Webmaster Tools, Breadcrumbs & 404s
Hi All, We use a parameter called "breadCrumb" to drive the breadcrumbs on our ecommerce product pages that are categorized in multiple places. For example, our "Blue Widget" product may have the following URLs: http://www.oursite.com/item3332/blue-widget
Intermediate & Advanced SEO | | Doug_G
http://www.oursite.com/item3332/blue-widget_?breadCrumb=BrandTree_
http://www.oursite.com/item3332/blue-widget_?breadCrumb=CategoryTree1_
http://www.oursite.com/item3332/blue-widget_?breadCrumb=CategoryTree2_ We use a canonical tag pointing back to the base product URL. The parameter only changes the breadcrumbs. Which of the following, if any, settings would you recommend for such a parameter in GWT: Does this parameter change page content seen by the user? Options: Yes/No
How does this parameter affect page content? Options: Narrows/Specifies/Other Currently, google decided to automatically assign the parameter as "Yes/Other/Let Googlebot Decide" without notifying us. We noticed a drop in rankings around the suspected time of the assignment. Lastly, we have a consistent flow of products that are discontinued that we 404. As a result of the breadcrumb parameter, our 404s increase significantly (one for each path). Would 800 404 crawl errors out of 18k products cause a penalty on a young site? We got an "Increase in '404' pages' email from GWT, shortly after our rankings seemed to drop. Thank you for any advice or suggestions! Doug0 -
350 (Out the 750) Internal Links Listed by Webmaster Tools Dynamically Generated-Best to Remove?
Greetings MOZ Community: When visitors enter real estate search parameters in our commercial real estate web site, the parameters are somehow getting indexed as internal links in Google Webmaster Tools. About half are 700 internal links are derived from these dynamic URLs. It seems to me that these dynamic alphanumeric URL links would dilute the value of the remaining static links. Are the dynamic URLs a major issue? Are they high priority to remove? The dynamic URLs look like this: /listings/search?fsrepw-search-neighborhood%5B%5D=m_0&fsrepw-search-sq-ft%5B%5D=1&fsrepw-search-price-range%5B%5D=4&fsrepw-search-type-of-space%5B%5D=0&fsrepw-search-lease-type=1 These URLs do not show up when a SITE: URL search is done on Google!
Intermediate & Advanced SEO | | Kingalan10 -
Webmaster Tools HTML Improvements Page Blank / Site Not Ranking Well
I have an ecommerce site that is not ranking well currently. It has about 1,000 pages indexed in Google but very few appear to be ranking. I normally find issues in Webmaster Tools HTML Improvements but for some reason it does not see a problem with the site. There are problems, trust me. Moz shows many issues. Google nothing! There is a problem somewhere but I am not seeing it. Why are HTML Improvements blank and the site not ranking? Am I in the dreaded sandbox? Any ideas? Sean We didn't detect any content issues with your site. As we crawl your site, we check it to detect any potential issues with content on your pages, including duplicate, missing, or problematic title tags or meta descriptions. These issues won't prevent your site from appearing in Google search results, but paying attention to them can provide Google with more information and even help drive traffic to your site. For example, title and meta description text can appear in search results, and useful, descriptive text is more likely to be clicked on by users.
Intermediate & Advanced SEO | | optin0 -
New domain purchase 301 and 404 issues. Please help!
We recently purchased www.carwow.com and 301 redirected the site to www.carwow.co.uk (our main domain). The problem is that carwow.com had URLs indexed like www.carwow.com/a-b-c the 301 sends them to carwow.co.uk/a-b-c which obviously doesn't exist so is a 404! What should be done in this situation? Should it be ignored and not re-directed at all, or is there a way to delete/disavow these dead pages? An SEO has advised we redirect all pages to the homepage, but won't that mess up the link profile? Any advice would be great!
Intermediate & Advanced SEO | | JamesPursey0 -
Google webmaster tool (GWT) owner removal issue
Hi! I have a new client, the former agency added the client property with the agency account so we had to create a new GA account (as you can’t transfer ownership at the account level) but we also kept access to the former account to keep historical data. We were granted owner access to the GWT (which is more flexible, you can remove owners and creators) and we now want to remove former agency users. We have 3 adresses. One was verified with delegation method (no pb for removal), one with meta tag (no pb) and one with Google Analytics. Here it becomes tricky as Google says regarding GA verif method “If this account was verified using a Google Analytics tracking code, you should make sure that the user you want to unverify is no longer an administrator on the Analytics account. Otherwise, removal may not be permanent”. The thing is that this user has the same email address as the one used to create the agency GA account (no ownership transfer) so I basically can’t remove admin rights. The other possibility, as Google mentions when I try to unlink this user, is “remove the administrator status in Google Analytics or delete the Google Analytics tracking code on the website”. But we don’t want to remove the code as we still want to track data with the former account for historical analysis purposes. Has anyone ever faced this situation? Do you know how to handle this? Do you think that unlinking the GWT and the GA accounts will unverify the GA method? Many thanks in advance ! Ennick
Intermediate & Advanced SEO | | ennick0 -
Killing 404 errors on our site in Google's index
Having moved a site across to Magento, obviously re-directs were a large part of that, ensuring all the old products and categories linked up correctly with the new site structure. However, we came up against an issue where we needed to add, delete, then re-add products. This, coupled with a misunderstanding of the csv upload processing, meant that although the old urls redirected, some of the new Magento urls changed and then didn't redirect: For Example: mysite/product would get deleted re-added and become: mysite/product-1324 We now know what we did wrong to ensure it doesn't continue to happen if we weret o delete and re-add a product, but Google contains all these old URLs in its index which has caused people to search for products on Google, click through, then land on the 404 page - far from ideal. We kind of assumed, with continual updating of sitemaps and time, that Google would realise and update the URL accordingly. But this hasn't happened - we are still getting plenty of 404 errors on certain product searches (These aren't appearing in SEOmoz, there are no links to the old URL on the site, only Google, as the index contains the old URL). Aside from going through and finding the products affected (no easy task), and setting up redirects for each one, is there any way we can tell Google 'These URLs are no longer a thing, forget them and move on, let's make a fresh start and Happy New Year'?
Intermediate & Advanced SEO | | seanmccauley0 -
301 vs. 404
If a listing on a website is no longer available to display is it better to resolve to a 301 redirect or use a 404? I know from an SEO point of view a 301 will pass on the link value, but is that as valuable as saying tto the user hey that page is no lonoger available try something else?
Intermediate & Advanced SEO | | AU-SEO0