Are 17000+ Not Found (404) Pages OK?
-
Very soon, our website will go a rapid change which would result in us removing 95% or more old pages (Right now, our site has around 18000 pages indexed).
It's changing into something different (B2B from B2C) and hence our site design, content etc would change.
Even our blog section would have more than 90% of the content removed.
What would be the ideal scenario be?
- Remove all pages and let those links be 404 pages
- Remove all pages and 301 redirect them to the home page
- Remove all unwanted pages and 301 redirect them to a separate page explaining the change (Although it wouldn't be that relevant since our audience has completely changed)- I doubt it would be ideal since at some point, we'd need ot remove this page as well and again do another redirection
-
Mohit,
Tom's advice will help you determine which pages are worth redirecting and which should just go to a 404 page (which should be customized instead of the browser/host default, and should also return a 404 response code in the http header!). My guess is that pages with links only from scraper sites aren't going to pass the tests laid out by Tom and thus would just go to a 404 page. However, any that have decent external links would fit the criteria and would be candidates for a 301 redirect.
-
Just to add a little to this great reply...
Here is how I would determine if it was worth my time to keep some of the old pages.
If the industry is the same but the end user is different, I would make EVERY attempt to keep those old pages. AuthorRank will matter in the future if you can contribute that information into a particular rel=publisher then I think it will be totally worth the time.
If, however, the information has nothing to do with the industry, then I wouldn't even consider taking the time to figure all of this out. I would have a kick ass 404 page to help people find your new stuff though.
Remember too that when you 301 redirect you do in fact loose some "link juice". (I really hate that phrase) So if the incoming links are of little to now value then a 301 will provide even less.
-
Hi Tom.. Thank you for your advice.
The thing is, we don't want to retain the users. They are not going to serve our cause anymore (We used to spend thousands of dollars every month on server costs just to keep up with teh load. now we are cutting it down- so unwanted users are not really something we want as it would result in load increase)
I'll surely follow your advice on OSE. The thing is, we have lot of link to the pages from scraper sites. I am not sure if it's worth keeping though.
-
Hi there
17,000 is quite a lot. I would look at maybe redirecting some of the URLs and I would do this based on certain criteria.
First of all, it helps to have a complete list of your current URLs. Screaming Frog is a great tool for this and is free.
Once you have your URLs, go into your analytics data and see which pages are attracting users. Take a sample size of about 2-3 months. If you're using Google analytics, click on traffic sources -> sources -> all traffic on the left-hand side.
When the dashboard loads, next to the "Primary Dimension" click other, and from the drop down menu click traffic sources, then landing page.
Any page with more than 5 or 10 visitors could be one worth redirecting. If these are pages that visitors might frequently use to get to your site, ensuring they are redirected might help to not interrupt their user journey. A 404 might put them off and go elsewhere.
Next, I'd look at what pages you might want to save to keep your SEO "strength". Put your URL into OpenSiteExplorer and then once done, click on "top pages". We're interested in the "Inbound Links" column here. Export the file into a CSV then sort the URL list in Excel by the Inbound Link total. You can filter here the pages with less links, so for instance you could remove the pages with 3 inbound links or less. It's a general way of doing things and isn't foolproof, but you will be left with a list of pages that could be getting decent PageRank/link equity. Manually check those pages and their backlinks and if you think they're acceptable, make sure you put in a 301 redirect.
Anything that doesn't match either of these criteria I would leave for a 404. You may be left with a lot, but Google knows that 404s are an accepted part of the course and won't penalise you for them. Check out this webmasters blog link.
Hope this helps with your decision making!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
What is the benefit of directory pages?
I recently started at a new job running ecommerce websites. We sell yoga equipment and on 2 of our sites we built directory pages for yoga studios to list their calendars and whatnot. They are pretty old and out of date, but my question is, is there any benefit to these types of directories? If they do, we need to look at refreshing them. But if not, then they need to go. One of them is here. http://www.everythingyoga.com/studios.aspx Like I said, it is out of date.
Intermediate & Advanced SEO | | ShockoeCommerce0 -
Google indexing only 1 page out of 2 similar pages made for different cities
We have created two category pages, in which we are showing products which could be delivered in separate cities. Both pages are related to cake delivery in that city. But out of these two category pages only 1 got indexed in google and other has not. Its been around 1 month but still only Bangalore category page got indexed. We have submitted sitemap and google is not giving any crawl error. We have also submitted for indexing from "Fetch as google" option in webmasters. www.winni.in/c/4/cakes (Indexed - Bangalore page - http://www.winni.in/sitemap/sitemap_blr_cakes.xml) 2. http://www.winni.in/hyderabad/cakes/c/4 (Not indexed - Hyderabad page - http://www.winni.in/sitemap/sitemap_hyd_cakes.xml) I tried searching for "hyderabad site:www.winni.in" in google but there also http://www.winni.in/hyderabad/cakes/c/4 this link is not coming, instead of this only www.winni.in/c/4/cakes is coming. Can anyone please let me know what could be the possible issue with this?
Intermediate & Advanced SEO | | abhihan0 -
Page Rank Worse After Optimization
For a long time, we had terrible on page SEO. No keyword targeting, no meta titles or descriptions. Just a brief 2-4 sentence product description and shipping information. Strangely, we weren't ranking too bad. For one product, we were ranking on page 1 of Google for a certain keyword. My goal to reach the top of page 1 would be easy (or so I thought). I have now optimized this page to rank better for the same keyword. I have a 276 word description with detailed specifications and shipping information. I have a strong title and meta description with keywords and modifers. I have also included a video demonstration, additional photos and an PDF of the owners manual. In my eyes, the page is 100% better than it ever was. In the eyes of MOZ, it's better also. I've got an A with the On-Page Grader. Why is this page now ranking on page 8 of Google? What have I done wrong? What can I do to correct it?
Intermediate & Advanced SEO | | dkeipper0 -
Rel=Alternate on Paginated Pages
I've a question about setting up the rel=alternate & rel=canonical tags between desktop and a dedicated mobile site in specific regards to paginated pages. On the desktop and mobile site, all paginated pages have the rel=canonical set towards a single URL as per usual. On the desktop site though, should the rel=alternate be to the relevant paginated page on the mobile site (ie a different rel=alternate on every paginated page) or to a single URL just as it is vice versa. Cheers chaps.
Intermediate & Advanced SEO | | eventurerob1 -
Google is showing 404 error. What should I do?
Dear Experts, Though few of my website pages are accessible, Google is showing 404 error. What should I do? Even moz reports gives me the same. Problems:
Intermediate & Advanced SEO | | Somanathan
1. Few of my Pages are not yet catched in Google. (Earlier all of them were catched by Google)
2. Tried to fetch the those pages, but Google says, page not found.
3. Included them in sitemap, the result is the same. Please advice: Note: I have recently changed my hosting server.0 -
Using two 404 NOT FOUND pages
Hi all, I was wondering if any of you can advise whether it's no issue to use two separate custom 404 pages. The 404 pages would be different for different parts of the site. For instance, if you're on /community/ and you enter a non-existing page on: www.sample.com/community/example/ it would give you a different 404 page than someone who runs into a non existing page at: www.sample.com/definition/example/ Does anybody have experience with this and would this be fine?
Intermediate & Advanced SEO | | RonFav0 -
Why duplicate content for same page?
Hi, My SEOMOZ crawl diagnostic warn me about duplicate content. However, to me the content is not duplicated. For instance it would give me something like: (URLs/Internal Links/External Links/Page Authority/Linking Root Domains) http://www.nuxeo.com/en/about/contact?utm_source=enews&utm_medium=email&utm_campaign=enews20110516 /1/1/31/2 http://www.nuxeo.com/en/about/contact?utm_source=enews&utm_medium=email&utm_campaign=enews20110711 0/0/1/0 http://www.nuxeo.com/en/about/contact?utm_source=enews&utm_medium=email&utm_campaign=enews20110811 0/0/1/0 http://www.nuxeo.com/en/about/contact?utm_source=enews&utm_medium=email&utm_campaign=enews20110911 0/0/1/0 Why is this seen as duplicate content when it is only URL with campaign tracking codes to the same content? Do I need to clean this?Thanks for answer
Intermediate & Advanced SEO | | nuxeo0 -
Noindex term for pages
I have a client who has a virtuemart site and all of the "ask a question on this product" had been indexed on google. I have managed to get a noindex meta tag into the ask a question page, will these be dropped from the index next time they are crawled and google sees the noindex?
Intermediate & Advanced SEO | | webseoservices0