How to find all 404 dead links - Webmaster Tools only allows 1000 to be downloaded...
-
Hi Guys
I have a question. I am currently working on a website that was hit by a spam attack.
The website was hacked and thousands of adult-content pages were created on the WordPress site.
The hosting company cleared all of the dubious files, but this has left thousands of dead 404 pages.
We want to fix the dead pages, but Google Webmaster Tools only shows, and only lets you download, 1000 of them.
There are a lot more than 1000. Does anyone know of any good tools that allow you to identify all 404 pages?
Thanks, Duncan
-
The Moz crawl report will also show 404s. I sometimes find that different spiders may find different things. Between the Search Console report, Screaming Frog (great investment) and Moz, you should have a nice collection of things to fix.
-
I must second Dirk's suggestion of Screaming Frog. It is a great tool, I use it daily, and a license is well worth the cost. Note, though, that a spider crawl of the site will only point out 404s that are linked from an existing page, so if the hosting company cleaned up the spam files, not all of these 404s will surface.
One approach I would suggest is to run the current 1000 404s from GWT through Screaming Frog as a manually added list (do it in two batches if you have the free version), start a spreadsheet of the resulting 404s, and start working through that list. Once you have dealt with those 404s, mark them as fixed in GWT and set a reminder to check back in a few days. Then export the new list of 1000 404s and run these through Screaming Frog, adding the resulting list to your spreadsheet. Keep doing this until you get the 404 errors in GWT down to a manageable level.
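If you would rather script the bookkeeping side of that loop, here is a minimal Python sketch of it. It is not how Screaming Frog or GWT work internally, only an illustration under assumptions: the function names (`merge_url_lists`, `batches`, `status_of`) are hypothetical, each export is assumed to have already been read into a plain list of URL strings, and the batch size of 500 matches the free-version limit mentioned in this thread.

```python
import urllib.request
from typing import Iterable, Iterator, List, Optional


def merge_url_lists(exports: Iterable[Iterable[str]]) -> List[str]:
    """Combine several exported URL lists into one, dropping blanks and duplicates.

    The order of first appearance is kept, so earlier exports stay at the top.
    """
    seen = set()
    merged: List[str] = []
    for export in exports:
        for url in export:
            url = url.strip()
            if url and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


def batches(urls: List[str], size: int = 500) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` URLs (500 = assumed free-version limit)."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def status_of(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status for `url`, or None if the request could not be made.

    Redirects are followed, so this is the status of the final response.
    Some servers reject HEAD requests; switching to GET is a reasonable fallback.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses are raised as HTTPError; the code is what we want.
        return error.code
    except (OSError, ValueError):
        # Network failure, timeout, or a malformed URL.
        return None
```

After each new export you would call `merge_url_lists` on the old and new lists, split the result with `batches` for list-mode crawling, and optionally keep only the URLs where `status_of(url) == 404` before adding them to the spreadsheet.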
I hope that helps, good luck.
-
Probably the easiest solution is to buy a licence from Screaming Frog and crawl your site locally. The tool can do a lot of useful stuff to audit sites and will show you not only the full list of 4xx errors but also the pages that link to them.
There is also a free version, but it allows you to crawl only 500 pages. In your case that is probably not sufficient, but it would let you see how it works.
Hope this helps,
Dirk