Site architecture change: +30,000 404s in GWT
-
So recently we decided to change the URL structure of our online e-commerce catalogue, to make it easier to maintain in the future.
But since the change we have (partly as expected) over 30K 404s in GWT. When we made the change I set up 301 redirects based on our Apache server logs, but the number has just escalated.
Should I be concerned with "plugging" these 404s, either by removing them via the URL removal tool or by carrying on with the 301 redirects? It's quite labour-intensive, and most of these URLs have no incoming links, so is there any point?
Thanks,
Ben
-
Hi Ben,
The answer to your question boils down to usability and link equity:
- Usability: Did the old URLs get lots of Direct and Referring traffic? E.g., do people have them bookmarked, type them directly into the address bar, or follow links from other sites? If so, there's an argument to be made for 301 redirecting the old URLs to their equivalent, new URLs. That makes for a much more seamless user experience, and increases the odds that visitors from these traffic sources will become customers, continue to be customers, etc.
- Link equity: When you look at a Top Pages report (in Google Webmaster Tools, Open Site Explorer, or Ahrefs), how many of those most-linked and/or best-ranking pages are old product URLs? If product URLs are showing up in these reports, they definitely require a 301 redirect to an equivalent new URL so that link equity isn't lost (a sketch for generating those redirects in bulk follows this list).
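Since writing 30K+ redirect rules by hand is the labour-intensive part, one option is to export an old-to-new URL mapping and generate the Apache rules from it. Here's a minimal sketch, assuming a hypothetical url_map.csv with old_path,new_path rows (the file names and paths are illustrative, not from your setup):

```python
# Minimal sketch: bulk-generate Apache mod_alias 301 rules from a CSV mapping.
# Assumes a hypothetical url_map.csv with rows like:
#   /catalogue/widget-123,/products/widget-123
import csv

with open("url_map.csv", newline="") as src, open("redirects.conf", "w") as out:
    for row in csv.reader(src):
        if len(row) != 2:
            continue  # skip blank or malformed rows
        old_path, new_path = row
        # One Redirect directive per mapped URL; Include the generated
        # file from your Apache (virtual host) configuration.
        out.write(f'Redirect 301 "{old_path}" "{new_path}"\n')
```

And if the old and new structures map to each other by a predictable pattern, a single mod_alias RedirectMatch 301 regex in the Apache config can replace thousands of generated lines.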
However, if, as is common with a large number of ecommerce sites, your old product URLs got virtually zero Direct or Referring traffic and had virtually zero deep links, then letting the URLs go 404 is just fine. I seem to remember a link churn report from the early days of Linkscape which found that something on the order of 80% of the URLs they had discovered would be 404 within a year. URL churn is a part of the web.
If you decide not to 301 those old URLs, then you simply want to serve a really consistent signal to engines that they're gone and not coming back. JohnMu from Google recently suggested that there's a tiny difference in how Google treats 404 versus 410 response codes: 404s are often re-crawled (which leads to those 404 error reports in GWT), whereas a 410 is treated as a more "permanent" indicator that the URL is gone for good, so 410s are removed from the index a tiny bit faster. Read more: http://www.seroundtable.com/google-content-removal-16851.html
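If you go the 410 route, Apache can serve it with mod_alias (Redirect gone /old-path) or mod_rewrite's [G] flag. Either way, it's worth spot-checking what the old URLs actually return, since an unintended 302 or soft 404 would muddy the signal. A minimal sketch, assuming a hypothetical old_urls.txt (one absolute URL per line) and the requests library:

```python
# Minimal sketch: verify the status codes the old URLs actually serve.
# Assumes a hypothetical old_urls.txt with one absolute URL per line.
import requests

with open("old_urls.txt") as f:
    for url in (line.strip() for line in f if line.strip()):
        r = requests.head(url, allow_redirects=False, timeout=10)
        # Expect 301 for redirected URLs and 410 (or 404) for retired ones;
        # a 200, 302, or anything else here is worth a closer look.
        print(r.status_code, url, r.headers.get("Location", ""))
```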
Hope that helps!
-
Hi,
Are you sure these old URLs are not being linked from somewhere (probably internally)? Maybe the sitemap.xml was forgotten and is still pointing to all the old URLs? For 404s to show in GWT there generally needs to be a link to them from somewhere, so as a first step, go to the 404s in GWT and look at where they are linked from (you can do this with Moz reports as well). If an internal page, like a sitemap or some forgotten menu/footer feature, is still linking to old pages, then yes, you certainly want to clear this up! If this is the case, once you have fixed the internal linking issues you should have a significantly reduced list of 404s and can then concentrate on those on a more case-by-case basis (assuming they are being triggered by external links).
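To quickly test the sitemap theory, you can diff the live sitemap against the old URL list. A minimal sketch, assuming a hypothetical sitemap location and old_urls.txt, and a single sitemap file rather than a sitemap index:

```python
# Minimal sketch: flag retired URLs that the sitemap still lists.
# The sitemap URL and old_urls.txt are hypothetical placeholders.
import re
import urllib.request

with urllib.request.urlopen("https://www.example.com/sitemap.xml") as resp:
    sitemap = resp.read().decode("utf-8")
listed = set(re.findall(r"<loc>(.*?)</loc>", sitemap))

with open("old_urls.txt") as f:
    old = {line.strip() for line in f if line.strip()}

for url in sorted(old & listed):
    print("Still in sitemap:", url)
```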
Hope that helps!