404 page not found after site migration
-
Hi,
A question from our developer.
We have an issue in Google Webmaster Tools.
A few months ago we killed off one of our e-commerce sites and set up another to replace it. The new site uses different software on a different domain. I set up a mass 301 redirect that would redirect any URLs to the new domain, so domain-one.com/product would redirect to domain-two.com/product. As it turns out, the new site doesn’t use the same URLs for products as the old one did, so I deleted the mass 301 redirect.
We’re getting a lot of URLs showing up as 404 not found in Webmaster Tools. These URLs used to exist on the old site and were linked to from the old sitemap. Even URLs that only recently started showing up as 404 are reported as linked from the old sitemap. The old sitemap no longer exists and has been returning a 404 error for some time now. Normally I would set up 301 redirects for each one and mark them as fixed, but there are almost a quarter of a million URLs returning 404 errors, and the number is rising.
I’m sure there are some genuine problems that need sorting out in that list, but I just can’t see them under the mass of errors for pages that have been redirected from the old site. Because of this, I’m reluctant to set up a robots file that disallows all of the 404 URLs.
The old site is no longer in the index. Searching Google for site:domain-one.com returns no results.
Ideally, I’d like anything that was linked from the old sitemap to be removed from Webmaster Tools and for Google to stop attempting to crawl those pages.
Thanks in advance.
-
I agree that the 301 redirect would be your best option, as it passes both users and bots along to the right page. You may need to get a developer in to write some regular expressions that parse the incoming request and automatically find the correct new URL. I have worked on sites with a large number of pages, and using some sort of automation is the only way to go.
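To give a rough idea of what that automation could look like, here is a minimal Python sketch. The URL patterns, the /shop/ and /collections/ paths, and the https scheme are all made up for illustration; the real rules depend entirely on how the old and new sites structure their URLs.

```python
import re

# Hypothetical rules: (pattern for the old path, template for the new path).
# These are assumptions for illustration, not the real URL schemes.
REDIRECT_RULES = [
    (re.compile(r"^/product/(?P<slug>[a-z0-9-]+)/?$"), "/shop/{slug}"),
    (re.compile(r"^/category/(?P<name>[a-z0-9-]+)/?$"), "/collections/{name}"),
]

# Assumed scheme; domain-two.com is the new domain from the question.
NEW_DOMAIN = "https://domain-two.com"


def map_old_url(path):
    """Return the new absolute URL for an old path, or None if no rule matches."""
    for pattern, template in REDIRECT_RULES:
        match = pattern.match(path)
        if match:
            return NEW_DOMAIN + template.format(**match.groupdict())
    return None
```

Anything that comes back as None has no obvious new home, so it can keep returning a 404 or 410 while the matched paths get a 301 to the mapped URL.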
That said, if you simply want to kill the old URLs, you can show 404s or 410s. As you mention, you then end up with a bunch of 404 errors in GWT. I have been there too; it's damned if you do, damned if you don't. We had some tracking URLs from an old site that have been returning 410s for over a year, and they still show up in GWT as errors.
We are trying a new solution for removing these URLs from the index without getting 404 errors. We return a 200 and serve a minimal HTML page with the meta robots noindex tag.
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=93710
"When we see the noindex meta tag on a page, Google will completely drop the page from our search results, even if other pages link to it."
So, we allow Google to find the page, get a 200 (so no 404 errors), but then use the meta noindex tag to tell Google to remove it from the index and stop crawling the page.
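As a rough sketch of the 200-plus-noindex idea, here is a small handler built on Python's standard http.server module. The /old-tracking/ prefix and the is_retired_path helper are hypothetical stand-ins; a real site would check the request against its own list of retired URLs, and would most likely do this in its existing web server or CMS rather than in a standalone script.

```python
from http.server import BaseHTTPRequestHandler

# Minimal page whose only job is to carry the meta robots noindex tag.
NOINDEX_PAGE = (
    "<!DOCTYPE html>\n"
    "<html><head>\n"
    '<meta name="robots" content="noindex">\n'
    "<title>Page removed</title>\n"
    "</head><body><p>This page has been removed.</p></body></html>\n"
)


def is_retired_path(path):
    """Hypothetical check; a real site would look the path up in its old URL list."""
    return path.startswith("/old-tracking/")


class RetiredPageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if is_retired_path(self.path):
            # Retired URL: answer 200 and let the noindex tag do the removal.
            body = NOINDEX_PAGE.encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "text/html; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            # Everything else in this sketch is simply not found.
            self.send_error(404)
```

To try it locally, pass RetiredPageHandler to http.server.HTTPServer and call serve_forever().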
Remember, this is the "nuclear" option. You only want to do this to remove the pages from the Google index. Someone mentioned using GWT to remove URLs, but if I remember correctly, you only have so many pages you can do this with at a time.
If you list the files in robots.txt, Google will not spider them, but if you later remove them from the robots.txt file, Google will start trying to spider them again. I have seen Google come back a year later on URLs after I took them out of robots.txt. This is what happened to us, so we tried just showing the 410/404, but Google still kept crawling. We recently moved to the 200/noindex meta option and it seems to be working.
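For reference, this is roughly what a robots.txt block looks like and how a crawler would read it, checked here with Python's standard urllib.robotparser. The /old-tracking/ path is a made-up example, not one of our real URLs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks one retired section for all crawlers.
ROBOTS_TXT = """\
User-agent: *
Disallow: /old-tracking/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A URL under the disallowed prefix may not be fetched...
tracking_ok = parser.can_fetch("Googlebot", "https://domain-one.com/old-tracking/abc")
# ...while everything else stays crawlable.
product_ok = parser.can_fetch("Googlebot", "https://domain-one.com/product")
```

Keep in mind that a blocked URL is never fetched at all, so a crawler that obeys the file cannot see a status code or a noindex tag on it.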
Good luck!
-
You can, but the 404s should stop being crawled on their own. There's also a webmaster tool that you can use to make that happen faster:
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=64033
-
Yeah it's a 404 http://www.tester.co.uk/17th-edition-equipment/multifunction-testers/fluke-1651b-multifunction-installation-tester
With over 200,000 404s, it's a lot to go through and 301. For some reason, when the site was migrated, they just pointed the old URLs at the new root domain without creating matching URLs. Doh.
I was thinking about listing them all in robots.txt?
-
A 404 should cause Google to de-index the content. Go to one of the bad URLs and view the headers to make sure that your webserver is returning a status 404 and not just a 404 "page".
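If it helps, here is a small Python sketch for checking the status a URL really returns, using only the standard library. The wording in describe is my own shorthand, not anything Google reports.

```python
import urllib.error
import urllib.request


def fetch_status(url):
    """Return the final HTTP status code for url (redirects are followed)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; the code is still the real status.
        return error.code


def describe(status):
    """Explain what a status code means for a URL that is supposed to be gone."""
    if status in (404, 410):
        return "real 'gone' status: the server is telling crawlers the page is missing"
    if status == 200:
        return "soft 404 risk: an error page is being served with a success status"
    return "unexpected status: investigate"
```

Calling describe(fetch_status(url)) on a handful of the bad URLs should show quickly whether the server sends a true 404. Some servers reject HEAD requests, in which case dropping the method argument falls back to a normal GET.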
As hard and time-consuming as it might be, I would still pursue a 301 option. It's the cleanest way to resolve the issue. Just start nibbling at it and you can make a dent. Doing nothing just lets the problem grow.