404 page not found after site migration
-
Hi,
A question from our developer.
We have an issue in Google Webmaster Tools.
A few months ago we killed off one of our e-commerce sites and set up another to replace it. The new site uses different software on a different domain. I set up a mass 301 redirect that would redirect any URLs to the new domain, so domain-one.com/product would redirect to domain-two.com/product. As it turns out, the new site doesn’t use the same URLs for products as the old one did, so I deleted the mass 301 redirect.
We’re getting a lot of URLs showing up as 404 not found in Webmaster Tools. These URLs used to exist on the old site and were linked to from the old sitemap. Even URLs that have only recently started showing up as 404 are reported as being linked from the old sitemap. The old sitemap no longer exists and has been returning a 404 error for some time now. Normally I would set up 301 redirects for each one and mark them as fixed, but there are almost a quarter of a million URLs returning 404 errors, and the number is rising.
I’m sure there are some genuine problems that need sorting out in that list, but I just can’t see them under the mass of errors for pages that have been redirected from the old site. Because of this, I’m reluctant to set up a robots file that disallows all of the 404 URLs.
The old site is no longer in the index. Searching Google for site:domain-one.com returns no results.
Ideally, I’d like anything that was linked from the old sitemap to be removed from Webmaster Tools and for Google to stop attempting to crawl those pages.
Thanks in advance.
-
I agree that the 301 redirect would be your best option, as it passes not only users but also the bots along to the right page. You may need to get a developer in to write some regular expressions that parse the incoming request and automatically find the correct new URL. I have worked on sites with a large number of pages, and using some sort of automation is the only way to go.
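As a rough illustration of the regular-expression approach, here is a minimal Python sketch. The URL patterns, the new-site path templates, and the `map_old_url` helper are all hypothetical; the real rules depend entirely on how the old and new sites structure their URLs, which the question does not say.

```python
import re

# Hypothetical rules: each pairs a pattern for an old-site path with a
# template for the new-site path. Real patterns depend on both URL schemes.
REDIRECT_RULES = [
    (re.compile(r"^/product/(?P<slug>[a-z0-9-]+)/?$"), "/shop/{slug}"),
    (re.compile(r"^/category/(?P<name>[a-z0-9-]+)/?$"), "/collections/{name}"),
]

# Placeholder domain taken from the question.
NEW_DOMAIN = "https://domain-two.com"


def map_old_url(path):
    """Return the new absolute URL for an old path, or None if no rule matches."""
    for pattern, template in REDIRECT_RULES:
        match = pattern.match(path)
        if match:
            # Fill the template with the named groups captured from the old path.
            return NEW_DOMAIN + template.format(**match.groupdict())
    return None
```

A request handler on the old domain could answer with a 301 to the returned URL when a rule matches, and fall back to a 404 or 410 when `map_old_url` returns `None`.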
That said, if you simply want to kill the old URLs, you can show 404s or 410s. As you mention, you then end up with a bunch of 404 errors in GWT. I have been there too; it's damned if you do, damned if you don't. We had some tracking URLs from an old site, and a year later (having shown 410s on them for over a year) they still show up in GWT as errors.
We are trying a new solution for removing these URLs from the index without getting 404 errors. We return a 200 and serve a minimal HTML page with the meta robots noindex tag.
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=93710
"When we see the noindex meta tag on a page, Google will completely drop the page from our search results, even if other pages link to it."
So, we allow Google to find the page, get a 200 (so no 404 errors), but then use the meta noindex tag to tell Google to remove it from the index and stop crawling the page.
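A minimal sketch of that 200-plus-noindex setup, using Python's standard `http.server`. The `is_retired` check and its `/old/` prefix are hypothetical stand-ins for however you identify the retired URLs, and the page wording is only an example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Minimal page whose only job is to carry the meta robots noindex tag.
NOINDEX_PAGE = (
    b"<!DOCTYPE html>\n"
    b"<html><head>"
    b'<meta name="robots" content="noindex">'
    b"<title>Page removed</title>"
    b"</head><body><p>This page has been removed.</p></body></html>"
)


def is_retired(path):
    """Hypothetical rule: treat everything under /old/ as a retired URL."""
    return path.startswith("/old/")


class RetiredPageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if is_retired(self.path):
            # Answer 200 so the URL is not reported as a 404,
            # and let the noindex tag ask for removal from the index.
            self.send_response(200)
            self.send_header("Content-Type", "text/html; charset=utf-8")
            self.send_header("Content-Length", str(len(NOINDEX_PAGE)))
            self.end_headers()
            self.wfile.write(NOINDEX_PAGE)
        else:
            self.send_error(404)


if __name__ == "__main__":
    # Port 8000 is an arbitrary choice for local testing.
    HTTPServer(("127.0.0.1", 8000), RetiredPageHandler).serve_forever()
```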
Remember, this is the "nuclear" option. You only want to do this to remove the pages from the Google index. Someone mentioned using GWT to remove URLs, but if I remember correctly, you only have so many pages you can do this with at a time.
If you list the files in robots.txt, Google will not spider them, but if you later remove them from the robots.txt file, Google will start trying to spider them again. I have seen Google come back a year later on URLs after I took them out of robots.txt. This is what happened to us, so we tried just showing the 410/404, but Google still kept crawling. We recently moved to the 200/noindex meta option and it seems to be working.
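For reference, this is roughly what a robots.txt rule for a group of old URLs looks like, checked here with Python's standard `urllib.robotparser`. The `/old-tracking/` path is a made-up example, and this parser only illustrates how a Disallow rule matches; it is not a model of how Googlebot itself behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks one directory of old URLs.
ROBOTS_TXT = """\
User-agent: *
Disallow: /old-tracking/
"""


def is_crawl_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With the rules above, `is_crawl_allowed(ROBOTS_TXT, "Googlebot", "https://domain-one.com/old-tracking/abc")` is `False`, while a URL outside `/old-tracking/` is allowed.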
Good luck!
-
You can, but the 404s should stop being crawled on their own. There's also a webmaster tool that you can use to make that happen faster:
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=64033
-
Yeah, it's a 404: http://www.tester.co.uk/17th-edition-equipment/multifunction-testers/fluke-1651b-multifunction-installation-tester
With over 200,000 404s, it's a lot to go through and 301. For some reason, when the site got migrated they just pointed the old URLs at the new domain by replacing the root domain name, without creating matching URLs. Doh.
I was thinking about listing them all in robots.txt?
-
A 404 should cause Google to de-index the content. Go to one of the bad URLs and view the headers to make sure that your webserver is returning a status 404 and not just a 404 "page".
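One way to check the status code rather than the visible page is a short Python script like the sketch below. The `fetch_status` and `classify_status` helpers are illustrative names, not part of any tool mentioned in this thread. Note that `urllib` follows redirects automatically, so a 301 shows up as the status of the final destination.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code the server sends for url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx/5xx responses; the code is on the exception.
        return error.code


def classify_status(status):
    """Give a rough reading of a status code for a URL that should be gone."""
    if status in (404, 410):
        return "real 404/410"
    if status == 200:
        return "200: a 'not found' page served here would be a soft 404"
    return "other status, check manually"
```

Running `classify_status(fetch_status(url))` against a handful of the reported URLs shows whether the server sends a real 404 or only a 404-looking page with a 200 status.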
As hard and time consuming as it might be, I would still pursue a 301 option. It's the cleanest way to resolve the issue. Just start nibbling at it and you can make a dent. Doing nothing just lets the problem grow.