Why is Google Webmaster Tools showing 404 Page Not Found Errors for web pages that don't have anything to do with my site?
-
I am currently working on a small site with approx. 50 web pages. In the crawl errors section of WMT, Google has highlighted over 10,000 page-not-found errors for pages that have nothing to do with my site. Has anyone come across this before?
-
These extensions look like they are attachments. Go into GWT and click on the 404 link; a box will pop up. Click on the "Linked From" tab, go to the listed page, and press Ctrl+U to view the source code. Press Ctrl+F and search for the broken link. When you find it in your source code, you should be able to figure out what's triggering that response. If you can't find the URLs in your source code, mark them as fixed and that should take care of the problem, especially if they are older. The examples look like they could be a shipment status, a product out-of-stock message, and a PDF of train schedules.
I would check the linked from pages and make sure that there isn't some erroneous code that is creating a page when it doesn't need to.
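The manual check described above (open the "Linked From" page, view the source, search for the broken link) can also be scripted. Here is a minimal Python sketch, assuming the page's HTML has already been saved as a string. `LinkCollector`, `find_links_containing`, and the sample markup are hypothetical names and data made up for illustration; nothing here comes from GWT itself.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects every href and src attribute value found in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def find_links_containing(html, fragment):
    """Return all link targets in `html` that contain `fragment`."""
    collector = LinkCollector()
    collector.feed(html)
    return [link for link in collector.links if fragment in link]


# Hypothetical page source, standing in for a "Linked From" page.
sample_html = """
<html><body>
<a href="/about.html">About</a>
<a href="fs/201410/a_Food_stalls_at_train_stations_in_London_.html">stray</a>
<img src="/images/logo.png">
</body></html>
"""

matches = find_links_containing(sample_html, "fs/201410/")
print(matches)
```

If the list comes back empty for every "Linked From" page, the links are probably not generated by your own templates, which points toward external or stale links instead.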
-
I think you could have the following problem:
You have a domain which earned links in the past, and those links are still out there. Google can see them, so you will see them in WMT again and again.
Because of the backlinks, you can't just mark them as solved and expect them to be gone. They will come back.
Check that in WMT. I don't know what it is called in the English version (I only see the German one), but you can click on the 404 and then take a look at "what is linking to that page".
An easy solution in that case may be to disallow "/fs" for bots (if you don't use fs in your URL structure).
-
The 404s start with the correct URL for the site but are then suffixed, after the forward slash, with paths such as:
fs/201410/a_Royal_Mail_Item_Is_Currently_Being_Processed_For_Delivery_.html
/fs/201410/a_When_will_John_lewis_restock_the_Dr_Dre_beats_solo_hd_.html
fs/201410/a_Food_stalls_at_train_stations_in_London_.html
There are 10,000+ of these in my WMT account. I have never seen this before - any ideas?
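For paths like these, the robots.txt rule suggested in the reply above can be tried out locally before it is deployed. The following is only a sketch under assumptions: `https://www.example.com` is a placeholder (the real domain is not given in this thread), and `Disallow: /fs/` is a slightly narrower variant of the suggested "/fs". It uses Python's standard `urllib.robotparser`, which is not guaranteed to interpret every rule exactly as Googlebot does.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: block every crawler from anything under /fs/.
robots_txt = """\
User-agent: *
Disallow: /fs/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Placeholder domain; the asker's real site is not named in the thread.
site = "https://www.example.com/"

# Paths taken from the examples above, plus one ordinary page for contrast.
paths = [
    "fs/201410/a_Royal_Mail_Item_Is_Currently_Being_Processed_For_Delivery_.html",
    "fs/201410/a_Food_stalls_at_train_stations_in_London_.html",
    "about.html",
]

# Map each path to whether a crawler identifying as Googlebot may fetch it.
results = {path: parser.can_fetch("Googlebot", site + path) for path in paths}

for path, allowed in results.items():
    print("allowed" if allowed else "blocked", path)
```

Note that a disallow rule only stops compliant bots from crawling those URLs. It does not remove the external links that point at them.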
-
Need more data to help, eh!
-
What do you mean by "pages that have nothing to do with my site"? Are these not on your domain, or are they on your domain but you are not familiar with them?
Related Questions
-
Any SEO-wizards out there who can tell me why Google isn't following the canonicals on some pages?
Hi, I am banging my head against the wall regarding the website of a customer. Under "duplicate title tags" in GSC I can see that Google is indexing a whole bunch of parameters for many of the URLs on the site. When I check the rel=canonical tag, everything seems correct. My customer is the biggest sports retailer in Norway. Their webshop has approximately 20,000 products, yet they have more than 400,000 pages indexed by Google. So why is Google indexing pages like this? What is missing in this canonical? https://www.gsport.no/herre/klaer/bukse-shorts?type-bukser-334=regnbukser&order=price&dir=desc Why isn't Google just cutting off the ?type-bukser-334=regnbukser&order=price&dir=desc part of the URL? Can it be the canonical tag itself, or could the problem be somewhere in the CMS? Looking forward to your answers. Sigurd
Technical SEO | | Inevo0 -
Received A Notice Regarding Spammy Structured Data. But we don't have any structured data or do we?
Got a message via webmaster tools that we have spammy structured data on our site, and we have no idea what they are referring to. We do not use any structured data with schema.org markup. Could they be referring to something else? The message was: To: Webmaster of http://www.lulus.com/, Google has detected structured markup on some of your pages that violates our structured data quality guidelines. In order to ensure quality search results for users, we display rich search results only for content that uses markup that conforms to our quality guidelines. This manual action has been applied to lulus.com/ . We suggest that you fix your markup and file a reconsideration request. Once we determine that the markup on the pages is compliant with our guidelines, we will remove this manual action. What could we be showing them that would be interpreted as structured data, and/or spammy structured data?
Technical SEO | | KentH0 -
Webmaster message - increase in 404 pages
I've had a message in webmaster tools: Increase in “404” pages on http://www.ethicaredental.co.uk/ There are 1000 pages in the crawl error list, but all of them direct to a 404 page, i.e. http://www.ethicaredental.co.uk/search?searchword=toothwhich, which as far as I can tell has all the necessary features of a good 404 page (a clear message saying the page doesn't exist anymore, navigation, and business details). The website was built in Joomla prior to a redesign in WordPress. Is this a Joomla issue? How can I satisfy webmaster tools and Google's crawler that these are decent 404 pages? Or do all 1000 pages need to be 301 redirected? Any thoughts appreciated.
Technical SEO | | dentaldesign1 -
My site's "pages indexed by Google" have gone up more than ten-fold.
Prior to doing a little work cleaning up broken links and keyword stuffing, Google only indexed 23/333 pages. I realize it may not be because of the work, but now we have around 300/333. My question is: is this a big deal? Cheers,
Technical SEO | | Billboard20120 -
How to handle pages I can't delete?
Hello Mozzers, I am using WordPress and I have a small problem. I have two pages I don't want, but the developer of the theme told me I can't delete them: /portfolio-items/ and /faq-items/. The dev said he can't find a way to delete them because these pages just list FAQ/portfolio posts. I don't have any of these posts, so basically what I have are two pages with just the titles "Portfolio items" and "FAQ Items". Furthermore, the dev said these pages are auto-generated, so he can't find a way to remove them. I don't believe that it's impossible, but if it is, how should I handle them? They are indexed by search engines. Should I remove them from the index and block them in robots.txt? Thanks in advance.
Technical SEO | | grobro0 -
Can't get Google to Index .pdf in wp-content folder
We created an in-depth case study/survey for a legal client and can't get Google to crawl the PDF, which is hosted on WordPress in the wp-content folder. It is linked to heavily from nearly all pages of the site by a global sidebar. Am I missing something obvious as to why Google won't crawl this PDF? We can't get much value from it unless it gets indexed. Any help is greatly appreciated. Thanks!
Here is the PDF itself: http://www.billbonebikelaw.com/wp-content/uploads/2013/11/Whitepaper-Drivers-vs-cyclists-Floridas-Struggle-to-share-the-road.pdf
Here is the page it is linked from: http://www.billbonebikelaw.com/resources/drivers-vs-cyclists-study/
Technical SEO | | inboundauthority0 -
Access denied in google webmaster tools
Hi, I have just checked my Google Webmaster Tools and it is showing 11 URLs that are coming back as access denied. The URLs are working, and they have been redirected using a 301 redirect, so I have done everything right, but for some reason Google is not able to crawl them. Does anyone know what I have done wrong for them to come back as access denied, and how I can solve this problem? The site is www.in2town.co.uk. Many thanks.
Technical SEO | | ClaireH-184886
| 2 | Gardening/Gardening-Advice-What-is-Hydroponic-Gardening/menu-id-4991 | 403 | 4/11/13 |
| 3 | Top-Showbiz-News/Super-Injunctions-Are-Right-Says-Hugh-Grant | 403 | 4/29/13 |
| 4 | Entertainment-Tonight/Cheryl-Cole-wants-to-spice-up-The-X-Factor | 403 | 4/11/13 |
| 5 | Tiger-Woods-paid-10000-a-time-for-sex | 403 | 4/20/13 |
| 6 | Thousands-of-children-hurt-trying-to-stop-arguments-between-adults/Thousands-of-children-hurt-trying-to-stop-arguments-between-adults/menu-id-4448 | 403 | 4/24/13 |
| 7 | News-Showbiz/Doctor-Who-changed-my-life-says-Matt-Smith | 403 | 4/29/13 |
| 8 | The-Latest-Health-News/Hypnosis-Hypnotherapy-for-Relationships/menu-id-4744 | 403 | 4/11/13 |
| 9 | Soap-Gossip-Latest-News/Emmerdale-Marks-bit-on-the-side-comes-to-home-farm/menu-id-4615 | 403 | 4/11/13 |
| 10 | news/eastenders/ | 403 | 4/11/13 |
| 11 | entertainment-news/Prince-William-Stag-Do-To-Be-Held-in-Cape-Town | 403 | 3/24/13 |
0 -
Error in webmaster tools
Hi, I just got an error (12 pages specifically) from webmaster tools when consulting "indexing problems". Something like: The URL doesn't exist, but the server doesn't return a 404 error. What should I do? Many thanks.
Technical SEO | | juanmiguelcr0