404 page not found after site migration
-
Hi,
A question from our developer.
We have an issue in Google Webmaster Tools.
A few months ago we killed off one of our e-commerce sites and set up another to replace it. The new site uses different software on a different domain. I set up a blanket 301 redirect that sent every URL to the same path on the new domain, so domain-one.com/product would redirect to domain-two.com/product. As it turns out, the new site doesn’t use the same URLs for products as the old one did, so I deleted the blanket redirect.
We’re getting a lot of URLs showing up as 404 Not Found in Webmaster Tools. These URLs used to exist on the old site and were linked to from the old sitemap; even URLs that have only recently started showing up as 404s are reported as being linked from it. The old sitemap no longer exists and has itself been returning a 404 for some time now. Normally I would set up a 301 redirect for each one and mark them as fixed, but there are almost a quarter of a million URLs returning 404 errors, and the number is rising.
I’m sure there are some genuine problems that need sorting out in that list, but I just can’t see them under the mass of errors for pages that were carried over from the old site. Because of this, I’m reluctant to set up a robots.txt file that disallows all of the 404 URLs.
The old site is no longer in the index. Searching Google for site:domain-one.com returns no results.
Ideally, I’d like anything that was linked from the old sitemap to be removed from Webmaster Tools, and for Google to stop attempting to crawl those pages.
Thanks in advance.
-
I agree that the 301 redirect would be your best option, as it passes not only users but also the bots along to the right page. You may need to get a developer in to write some regular expressions to parse the incoming request and automatically find the correct new URL. I have worked on sites with a large number of pages, and some sort of automation is the only way to go.
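As a rough illustration, here's a minimal sketch of that kind of automated redirect in Python/Flask. Almost everything here is an assumption: the slug_map.csv lookup file, the URL pattern, and Flask itself are just stand-ins for however the old domain is actually served; only domain-two.com comes from the thread.

```python
import csv
import re

from flask import Flask, abort, redirect

app = Flask(__name__)

# Hypothetical lookup table: old product slug -> new product slug, e.g. built
# by matching products on SKU across the two site databases.
with open("slug_map.csv") as f:
    SLUG_MAP = {old: new for old, new in csv.reader(f)}

# Hypothetical old URL pattern, e.g. domain-one.com/product/some-widget
OLD_PRODUCT_RE = re.compile(r"^product/(?P<slug>[\w-]+)/?$")

@app.route("/<path:old_path>")
def migrate(old_path):
    match = OLD_PRODUCT_RE.match(old_path)
    if match:
        new_slug = SLUG_MAP.get(match.group("slug"))
        if new_slug:
            # 301 passes users, bots, and link equity to the new page
            return redirect(f"https://domain-two.com/product/{new_slug}", code=301)
    # No mapping found: tell crawlers the page is gone for good
    abort(410)
```

The useful design choice here is the 410 fallback: anything you can't map gets an explicit "gone" signal instead of lingering as an unhandled error.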
That said, if you simply want to kill the old URLs, you can serve 404s or 410s. As you mention, you then end up with a bunch of 404 errors in GWT. I have been there too; it's damned if you do, damned if you don't. We had some tracking URLs from an old site, and now, a year later (they have been returning 410s that whole time), they still show up in GWT as errors.
We are trying a new solution for removing these URLs from the index without generating 404 errors: we return a 200 and serve a minimal HTML page with the meta robots noindex tag.
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=93710
"When we see the noindex meta tag on a page, Google will completely drop the page from our search results, even if other pages link to it. "
So we allow Google to find the page and get a 200 (so no 404 errors), but then use the meta noindex tag to tell Google to remove it from the index; in our experience Google also crawls the page less and less often afterwards.
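To make that concrete, here's a minimal sketch of the 200-plus-noindex page, again using Flask purely for illustration. The route pattern and page copy are hypothetical; the essential parts are the 200 status and the meta robots tag.

```python
from flask import Flask

app = Flask(__name__)

# Minimal page whose only real job is carrying the meta robots noindex tag
NOINDEX_PAGE = """<!DOCTYPE html>
<html>
  <head>
    <meta name="robots" content="noindex">
    <title>No longer available</title>
  </head>
  <body><p>This product is no longer available.</p></body>
</html>"""

# Hypothetical route covering the retired URLs
@app.route("/product/<path:slug>")
def retired(slug):
    # 200 keeps the URL out of GWT's crawl-error report; the noindex
    # tag tells Google to drop the page from the index
    return NOINDEX_PAGE, 200
```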
Remember, this is the "nuclear" option. You only want to do this to remove the pages from the Google index. Someone mentioned using GWT to remove URLs, but if I remember correctly, you can only do so many pages at a time that way.
If you list the URLs in robots.txt, Google will not spider them, but if you later remove them from the robots.txt file, it will start trying to spider them again. I have seen Google come back a year later on URLs once I took them out of robots.txt. That is what happened to us, so we tried just showing the 410/404, but Google still kept crawling. We recently moved to the 200/noindex-meta option and it seems to be working.
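If you do go the robots.txt route for a list this size, you'd want to generate the file rather than hand-edit it. Here's a hedged sketch, assuming you've exported the erroring URLs from GWT into a plain text file (one URL per line; the file names are hypothetical). Be aware that crawlers may truncate very large robots.txt files, so grouping the URLs under a few Disallow prefixes is safer than listing a quarter of a million individual paths.

```python
from urllib.parse import urlparse

# Hypothetical input: the GWT crawl-error export, one dead URL per line
with open("crawl_errors.txt") as src, open("robots.txt", "w") as out:
    out.write("User-agent: *\n")
    for line in src:
        path = urlparse(line.strip()).path
        if path and path != "/":
            out.write(f"Disallow: {path}\n")
```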
Good luck!
-
You can, but the 404s should stop being crawled on their own. There's a webmaster tool you can use to make that happen faster as well:
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=64033
-
Yeah, it's a 404: http://www.tester.co.uk/17th-edition-equipment/multifunction-testers/fluke-1651b-multifunction-installation-tester
With over 200,000 404s it's a lot to go through and 301. For some reason, when the site got migrated, they just pointed the old URLs at new ones by swapping the root domain name, without creating matching URLs. Doh.
I was thinking about listing them all in robots.txt?
-
A 404 should cause Google to de-index the content. Go to one of the bad URLs and view the response headers to make sure your webserver is returning an actual 404 status code, and not just a "404" page served with a 200.
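A quick way to check, sketched with Python's requests library (curl -I against the URL would show the same thing); the URL is the one from earlier in the thread:

```python
import requests

resp = requests.get(
    "http://www.tester.co.uk/17th-edition-equipment/multifunction-testers/"
    "fluke-1651b-multifunction-installation-tester",
    allow_redirects=False,  # we want this URL's status, not a redirect target's
)
print(resp.status_code)  # should be 404 (or 410), not 200
```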
As hard and time-consuming as it might be, I would still pursue the 301 option. It's the cleanest way to resolve the issue. Just start nibbling at it and you can make a dent; doing nothing just lets the problem grow.