Site architecture change - 30,000+ 404s in GWT
-
We recently changed the URL structure of our online e-commerce catalogue to make it easier to maintain in the future.
Since the change, we have seen 30K+ 404s in GWT (Google Webmaster Tools), which we partially expected. When we made the change, I was setting up 301 redirects based on our Apache server logs, but the number has escalated.
Should I be concerned about "plugging" these 404s, either by removing them via the URL removal tool or by continuing with 301 redirects? It's quite labour intensive, and most of these URLs have no incoming links, so is there any point?
Thanks,
Ben
-
Hi Ben,
The answer to your question boils down to usability and link equity:
- Usability: Did the old URLs get lots of Direct and Referring traffic? E.g., do people have them bookmarked, type them directly into the address bar, or follow links from other sites? If so, there's an argument to be made for 301 redirecting the old URLs to their equivalent, new URLs. That makes for a much more seamless user experience, and increases the odds that visitors from these traffic sources will become customers, continue to be customers, etc.
- Link equity: When you look at a Top Pages report (in Google Webmaster Tools, Open Site Explorer, or Ahrefs), how many of those most-linked and/or best-ranking pages are old product URLs? If product URLs are showing up in these reports, they definitely require a 301 redirect to an equivalent, new URL so that link equity isn't lost.
However, if (as is common with a large number of ecommerce sites) your old product URLs got virtually zero Direct or Referring traffic and had virtually zero deep links, then letting the URLs go 404 is just fine. I think I remember a link churn report from the early days of LinkScape which found that something on the order of 80% of the URLs they had discovered would be 404 within a year. URL churn is a part of the web.
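If you want a rough way to run the traffic check described above against your own logs, here is a minimal Python sketch. It assumes your Apache access logs use the standard "combined" format. The function names, the `own_host` parameter, and the `min_hits` threshold are all made up for illustration, so tune them to your own site rather than treating them as a rule.

```python
import re
from collections import Counter

# Matches the request, status, size and referer fields of an Apache
# "combined" log line (assumption: your LogFormat is the stock one).
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "(?P<referer>[^"]*)"'
)


def summarize_404s(lines, own_host):
    """Count 404 hits per path: all hits, and hits arriving from another site."""
    total = Counter()
    external = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match or match.group("status") != "404":
            continue
        path = match.group("path")
        total[path] += 1
        referer = match.group("referer")
        # "-" means no referer was sent (direct visit, bookmark, or a bot).
        if referer not in ("", "-") and own_host not in referer:
            external[path] += 1
    return total, external


def needs_redirect(path, total, external, min_hits=5):
    """Hypothetical rule of thumb: 301 anything linked externally or hit often."""
    return external[path] > 0 or total[path] >= min_hits
```

Anything that `needs_redirect` flags is a candidate for a 301; the rest can probably be left to 404. This only covers the traffic side, so you would still want to cross-check the Top Pages reports for link equity.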
If you decide not to 301 those old URLs, then you simply want to serve a really consistent signal to engines that they're gone and not coming back. JohnMu from Google recently suggested that there's a tiny difference in how Google treats 404 versus 410 response codes: 404s are often re-crawled (which leads to those 404 error reports in GWT), whereas 410 is treated as a more "permanent" indicator that the URL is gone for good, so 410s are removed from the index a tiny bit faster. Read more: http://www.seroundtable.com/google-content-removal-16851.html
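Since you are on Apache, one way to keep that signal consistent is to generate the directives from a list instead of typing them by hand. The sketch below is only an illustration: `apache_directives` is a hypothetical helper, and it assumes mod_alias is enabled and that your paths contain no spaces. `Redirect 301` and `Redirect gone` are standard mod_alias syntax, with `gone` returning a 410.

```python
def apache_directives(redirects, gone):
    """Build mod_alias lines: 301 for URLs with a new home, 410 for retired ones.

    redirects: dict mapping an old URL path to its new absolute URL
    gone:      iterable of old URL paths that have no replacement
    """
    lines = []
    for old_path, new_url in sorted(redirects.items()):
        lines.append(f"Redirect 301 {old_path} {new_url}")
    for old_path in sorted(gone):
        # "gone" takes no target URL; Apache answers with 410.
        lines.append(f"Redirect gone {old_path}")
    return lines


if __name__ == "__main__":
    # Example paths are invented for illustration.
    for line in apache_directives(
        {"/old/widget-1": "http://www.example.com/catalogue/widget-1"},
        ["/old/widget-2"],
    ):
        print(line)
```

Bear in mind that `Redirect` matches on the start of the path, so `/old/widget-1` would also catch `/old/widget-1/reviews`. If that is a problem, `RedirectMatch` with an anchored pattern is the usual alternative.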
Hope that helps!
-
Hi,
Are you sure these old URLs are not being linked from somewhere (probably internally)? Maybe the sitemap.xml was forgotten and is still pointing to all the old URLs? I think that for 404s to show in GWT there needs to be a link to them from somewhere, so as a first step go to the 404s in GWT and have a look at where they are linked from (you can do this with Moz reports also). If it is an internal page like a sitemap, or some forgotten menu/footer feature or similar that is still linking to old pages, then yes, you certainly want to clear this up! If this is the case, once you have fixed the internal linking issues you should have a significantly reduced list of 404s and can then concentrate on these on a more case-by-case basis (assuming they are being triggered by external links).
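A quick way to rule the sitemap in or out is to scan it for anything that still uses the old structure. This is a rough Python sketch, assuming a standard sitemaps.org `urlset` file and assuming the old URLs share a recognisable path prefix. The `/old/` prefix and the function name are placeholders, not your real structure.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace used by standard sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def stale_sitemap_urls(sitemap_xml, old_prefix):
    """Return every <loc> in the sitemap whose path starts with old_prefix."""
    root = ET.fromstring(sitemap_xml)
    locs = [
        el.text.strip()
        for el in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if el.text
    ]
    return [url for url in locs if urlparse(url).path.startswith(old_prefix)]
```

If this returns anything, regenerate the sitemap before worrying about individual redirects. A sitemap index file (one that lists other sitemaps) would need one extra loop over its child files, which this sketch does not handle.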
Hope that helps!