Google Cache can't keep up with my 403s
-
Hi Mozzers,
I hope everyone is well.
I'm having a problem with my website and 403 errors shown in Google Webmaster Tools. The problem comes because we "unpublish" one of the thousands of listings on the site every few days - this then creates a link that gives a 403. At the same time we also run some code that takes away any links to these pages. So far so good.
Unfortunately Google doesn't notice that we have removed these internal links and so tries to access these pages again. This results in a 403.
These errors show up in Google Webmaster Tools, and when I click on "Linked From" I can verify that there are no links to the 403 page - it's just Google's Cache being slow.
My questions are:
a) How much is this hurting me?
b) Can I fix it?
All suggestions welcome and thanks for any answers!
-
Hi Ray-pp,
Thanks for this. I think we will redirect to similar pages.
Much appreciated!
-
So... why return a 403 Forbidden? A 404 Not Found is what you should return. That sends a stronger signal than a 403. Either way, both will eventually lead to the pages being de-indexed. If you need the pages gone faster, there is a way to manually de-index a page using Webmaster Tools.
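For what it's worth, here is a rough sketch in Python of what "return a 404 instead of a 403" could look like on the application side. It is only an illustration: the `status_for_listing` function and the `published` flag are made-up names, not anything from the poster's actual site.

```python
from http import HTTPStatus


def status_for_listing(listing):
    """Pick the HTTP status for a request to a listing page.

    `listing` is a hypothetical dict with a boolean "published" key,
    or None when no such listing exists at all.
    """
    if listing is None:
        # The listing never existed (or was deleted outright).
        return HTTPStatus.NOT_FOUND
    if not listing.get("published", False):
        # Unpublished listing: answer 404 Not Found rather than 403 Forbidden,
        # so crawlers are told the page is gone, not merely access-restricted.
        return HTTPStatus.NOT_FOUND
    return HTTPStatus.OK
```

The only point is that an unpublished listing is treated the same way as a missing one, rather than as a page the visitor is not allowed to see.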
-
Hi HireSpace,
a) The negative impact depends on:
- Is there traffic landing on this page from any outside channel (organic, referral, paid marketing)?
If so, then yes, it is probably hurting your site. If a visitor sees a 403 page, a common response is to go straight back to the referring page, i.e. they leave your site.
- Did the 403'd page have external links pointing to it?
If yes, then the 403 error would cause that link authority to drop, since you are not redirecting that page to another page on your site.
- As far as SEO is concerned, no, this isn't negatively impacting your site.
When Google sees a 403 error, they pretty much handle it like any other 4xx error. They won't penalize you; however, having a lot of 4xx errors could be an indication of poor usability, and we know how Google loves to introduce new ranking factors for the SERPs.
b) Can I fix it?
Yes. For any page removed from your site, I suggest that you 301 the page to its closest related page. This tells G that the page has permanently moved to a new page, passes any authority to that page, and automatically redirects anyone landing on the old page to the new one. You'll see the 403 errors decrease as G crawls your site and recognizes the 301 redirects.
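A minimal sketch of the 301 suggestion in Python, assuming the site can keep a lookup table from each unpublished listing to its closest related page. The `REDIRECTS` table, the example paths, and the `resolve` function are all hypothetical, purely for illustration.

```python
from http import HTTPStatus

# Hypothetical table: unpublished listing path -> closest related page.
REDIRECTS = {
    "/listings/old-venue": "/listings/similar-venue",
}


def resolve(path, published_paths, redirects=REDIRECTS):
    """Return (status, location) for a request path.

    Published pages are served normally, unpublished pages with a known
    replacement get a 301 to it, and everything else falls back to a 404.
    """
    if path in published_paths:
        return HTTPStatus.OK, None
    target = redirects.get(path)
    if target is not None:
        return HTTPStatus.MOVED_PERMANENTLY, target
    return HTTPStatus.NOT_FOUND, None
```

The table would be filled in at the same moment a listing is unpublished, alongside the existing code that strips the internal links. Pages with no sensible replacement simply fall through to a 404.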
Related Questions
-
How to get into Google's Top Stories?
Hi All, I have been doing research for a few weeks and I cannot for the life of me figure out why I cannot get my website (Racenet) into the top stories in Google. We are in Google News, have "news article" schema, have AMP pages. Our news articles also perform quite well organically and we typically dominate the Google News section. We have two main competitors (Punters and Just Horse Racing) who are both in top stories and I cannot find anything that we are doing that they aren't. Apparently the AMP "news article" schema is incorrect and that could be the reason why we aren't showing up in Google Top Stories, but I can't find anything wrong with the schema and it looks the same as our competitors. For example: https://search.google.com/structured-data/testing-tool/u/0/#url=https%3A%2F%2Fwww.racenet.com.au%2Fnews%2Fblake-shinn-booked-to-ride-doncaster-handicap-favourite-alizee-20190331%3FisAmp%3D1 Does anyone have any ideas of why I cannot get my site into Google Top Stories? Any and all help would be greatly appreciated. Thanks! 🙂
Technical SEO | | Saba.Elahi.M.0 -
Geo ip filtering / Subdomain can't be crawled
My client is "load balancing" site traffic in the following way:
- domain: www.example.com
- traffic from US IPs is redirected to usa.example.com
- traffic from non-US IPs is redirected to www2.example.com
The reason for doing this is that the site content on www2 contains herbal medicine info banned by the FDA; "usa.example.com" is a "cleaned" site.
Using an HK IP, when I google an English keyword, I can see that www.example.com is indexed. When googling a Chinese keyword, nothing is indexed - neither the domain nor the www2 subdomain. Google Search Console shows a Dell SonicWall geo IP filtering alert for www2 (Connection initiated from country: United States). GSC data also confirms that www2 has never been indexed by Google.
Questions: Is geo IP filtering the very reason why www2 isn't indexed? What should I do in order to get www2 indexed? Thanks guys!
Technical SEO | | irene7890 -
Duplicate Content issue in Magento: The product pages are available through 3 URLs! How can we solve this?
Right now the product page "gedroogde goji bessen" (Dutch for: dried goji berries) is available through 3 URLs:
http://www.sportvoeding.net/gedroogde-goji-bessen =>
By clicking on the product slider on the homepage
http://www.sportvoeding.net/superfood/gedroogde-goji-bessen =>
By first going to sportvoeding.net/superfood (main category) and then clicking on "gedroogde Goji bessen"
http://www.sportvoeding.net/superfood/goji-bessen/gedroogde-goji-bessen =>
By going directly to the subcategory "Goji Bessen" through the menu and clicking on "gedroogde Goji Bessen" there
We want to have the following product URL:
http://www.sportvoeding.net/superfood/goji-bessen/gedroogde-goji-bessen
Does anyone know a good extension for this issue?
Technical SEO | | Zanox0 -
How Does Google's "index" find the location of pages in the "page directory" to return?
This is my understanding of how Google's search works, and I am unsure about one thing in particular:
- Google continuously crawls websites and stores each page it finds (let's call it the "page directory").
- Google's "page directory" is a cache, so it isn't the "live" version of the page.
- Google has separate storage called "the index", which contains all the keywords searched. These keywords in "the index" point to the pages in the "page directory" that contain the same keywords.
- When someone searches a keyword, that keyword is accessed in the "index" and returns all relevant pages in the "page directory".
- These returned pages are given ranks based on the algorithm.
The one part I'm unsure of is how Google's "index" knows the location of relevant pages in the "page directory". The keyword entries in the "index" point to the "page directory" somehow. I'm thinking each page has a URL in the "page directory", and the entries in the "index" contain these URLs. Since Google's "page directory" is a cache, would the URLs be the same as on the live website (and would the keywords in the "index" point to these URLs)? For example, if a webpage is found at www.website.com/page1, would the "page directory" store this page under that URL in Google's cache?
The reason I want to discuss this is to understand the search process better, so that I know the effects of changing a page's URL.
Technical SEO | | reidsteven750 -
I have a 404 error on my site i can't find.
I have looked everywhere. I thought it might have just shown up while I was making some changes, so in Webmaster Tools I marked it as fixed... It's still there. Even Moz Pro found it. The error is http://mydomain.com/mydomain.com. I have no idea how it even happened; I thought it might be a plugin problem. Any ideas how to fix this?
Technical SEO | | NateStewart0 -
Javascript to manipulate Google's bounce rate and time on site?
I was referred to this "awesome" solution to high bounce rates. It is supposed to "fix" bounce rates and lower them through this simple script. When the bounce rate goes way down, rankings then dramatically increase (interesting study, but not my question). I don't know JavaScript, but simply adding a script to the footer and watching everything fall into place seems a bit iffy to me. Can someone with experience in JS help me by explaining what this script does? I think it manipulates what gets reported to GA, but I'm not sure. It was supposed to be placed in the footer of the page, and then you sit back and watch the dollars fly in. 🙂
Technical SEO | | BenRWoodard1 -
Can Google read text in Javascript?
We have just completed the redesign of our product page, which you can see here: http://www.uksoccershop.com/p-19045/2011-12-Chelsea-Adidas-Away-Football-Shirt.html Because we want the select size / add to basket section to appear prominently, you can see we are showing only a snippet of the product description in this section and then user has to click "more" to see it. My question is, can Google read the product description here since it's in Javascript? The code is as follows: 2011-12 Chelsea Adidas Away Football Shirt £44.99 Item Code:379606 Brand new, official Chelsea away shirt for the 2011/12 Premiership season, available to buy in adult sizes S, M, L, XL, XXL, XXXL. This football shirt is manufactured by Adidas and is black in colour.[ More...](javascript:void(0);) Brand new, official Chelsea away shirt for the 2011/12 Premiership season, available to buy in adult sizes S, M, L, XL, XXL, XXXL. This football shirt is manufactured by Adidas and is black in colour. Cheer on the Blues in style in the new adidas Chelsea Away Shirt, featuring a striking blue blocked design on an imposing black background complete with the club crest and adidas logo embroidery across the chest for a great style on or off the pitch. The new Chelsea Away Shirt is designed with adidas' ClimaCool technology to bring moisture away from your skin, keeping you cool, comfortable and performing at your best as you emulate the skills of Frank Lampard, Fernando Torres and John Terry on the pitch. Customise your shirt with Premiership shirt printing for your favourite Chelsea stars or choose your own custom name and number. Adult Football Shirt
Technical SEO | | ukss1984
Short sleeves soccer jersey
Chelsea club crest to left chest
adidas logo and stripes
Print sponsor to centre
ClimaCool technology
Machine washable Product code: 379606 The 2011/12 Chelsea away football kit is released on 7th July 2011. <form name="currenychange" action="http://www.uksoccershop.com/p-19045/2011-12-Chelsea-Adidas-Away-Football-Shirt.html" method="get">
<select class="topselectbox" onchange="this.form.submit();" name="currency" style="float:right;"> <option value="USD">US Dollars</option> <option value="EUR">Euro</option> <option value="GBP" selected="selected">UK Sterling</option> <option value="AUD">Australian Dollars</option> </select>
</form> Available Now [Be the first to ask a question](javascript:void(0); "Ask a Question")
[Be the first to review this product](javascript://) Rating: 5 out of 5 stars <form name="cart_quantity" action="http://www.uksoccershop.com/p-19045/2011-12-Chelsea-Adidas-Away-Football-Shirt.html?number_of_uploads=0&action=add_product" method="post" enctype="multipart/form-data"> Which parts of this is Google going to be able to read? Should we make the product title our H1 header for this page and can it currently read that within the code above? </form>0 -
How Best to Handle 'Site Jacking' (Unauthorized Use of Someone else's Dedicated IP Address)
Anyone can point their domain to any IP address they want. I've found at least two domains (same owner), totally unrelated to each other and to us, that are currently pointing to our IP address. The IP address is on our dedicated server (we control the entire physical server) and is exclusive to only that one domain (so it isn't a virtual hosting misconfiguration issue).
This has caused Google to index their two domains with duplicate content from our site (found by searching for site:www.theirdomain.com). Their site does not come up in the first 50 results for any of the keywords we come up for, though, so Google obviously knows THEY are the dupe content, not us (our site has been around for 12 years - much longer than theirs). Their registration is private and we have not been able to contact these people. I'm not sure if this is just a mistake in the DNS for the two domains or someone doing this intentionally to try to harm our ranking. It has been going on for a while, so it is most likely not a mistake: with two live sites, they would have noticed long ago that they were pointing to the wrong IP.
I can think of a variety of actions to take, but I can find no information anywhere regarding what Google officially recommends doing in this situation, assuming you can't get a response. Here are my ideas:
a) Approach it as a Digital Copyright Violation and go through the lengthy process of having their site taken down. Pro: Eliminates the issue. Con: Sort of a pain, and we could possibly be leaving some link juice on the table?
b) Modify .htaccess to do a 301 redirect from any URL not using our domain to our domain. This means Google is going to see several domains all pointing to the same IP, and all except our domain 301 redirecting to our domain. Not sure if THAT will harm (or help) us? Would we not then receive link juice from any site out there that was linking to these other domains? Con: Google will see the context of the backlinks, and their link text will not be related at all to our site. In addition, if any of these other domains pointing to our IP have backlinks from 'bad neighborhoods', I assume it could hurt us?
c) Modify .htaccess to return a 404 File Not Found or 403 Forbidden error?
I posted in other forums and have gotten suggestions that are all over the map. In many cases the posters don't even understand what I'm talking about - thinking they are just normal backlinks. Argh! So I'm taking this to "The Experts" on SEOMoz.
Technical SEO | | jcrist1