Very weird pages. 2900 403 errors in page crawl for a site that only has 140 pages.
-
Hi there,
I just crawled the website of one of my clients with the crawl tool from Moz.
I have 2900 403 errors, yet there are only 140 pages on the website.
Here is an example of what the crawl report gives me:
http://www.mysite.com/en/www.mysite.com/en/en/index.html#?lang=en
http://www.mysite.com/en/www.mysite.com/en/en/en/index.html#?lang=en
http://www.mysite.com/en/www.mysite.com/en/en/en/en/index.html#?lang=en
http://www.mysite.com/en/www.mysite.com/en/en/en/en/en/index.html#?lang=en
http://www.mysite.com/en/www.mysite.com/en/en/en/en/en/en/index.html#?lang=en
http://www.mysite.com/en/www.mysite.com/en/en/en/en/en/en/index.html#?lang=en
http://www.mysite.com/en/www.mysite.com/en/en/en/en/en/en/en/en/en/en/en/en/index.html#?lang=en
http://www.mysite.com/en/www.mysite.com/en/en/en/en/en/en/en/en/en/en/en/en/en/index.html#?lang=en
There are 2900 pages like this.
I have tried visiting the pages and they load, but only as plain HTML pages without CSS.
Can you help me see what the problem is? We have experienced huge drops in traffic since September.
-
Thank you so much for your response!
Yes. Could you please email me at [email protected]? I will be able to give you the URL via email.
-
Almost right, but not quite: the 403 error is only served once a URL is actually accessed. The content may not be accessible (as it's forbidden), but the URL itself still is. Whilst it's unlikely that these URLs would ever be indexed, there's still an infinite loop in the link architecture, which could impact crawl allowance and site health metrics.
I'd get it sorted out!
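As a rough way to gauge how much of a crawl export is caught in a loop like this, here is a minimal Python sketch that flags URLs whose path repeats the same segment back to back. The function names and the `max_run` threshold are my own assumptions for illustration, not anything from Moz's tooling, and a site that legitimately repeats a folder name would need a higher threshold.

```python
from urllib.parse import urlsplit


def longest_segment_run(url):
    """Return the longest run of identical consecutive path segments."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    longest = 0
    run = 0
    previous = None
    for segment in segments:
        # Extend the current run if this segment repeats the previous one.
        run = run + 1 if segment == previous else 1
        longest = max(longest, run)
        previous = segment
    return longest


def looks_like_crawl_trap(url, max_run=1):
    """Flag URLs where a segment repeats more than max_run times in a row.

    max_run=1 is an assumed default: any immediate repeat is suspicious.
    """
    return longest_segment_run(url) > max_run


if __name__ == "__main__":
    sample = [
        "http://www.mysite.com/en/index.html",
        "http://www.mysite.com/en/www.mysite.com/en/en/en/index.html#?lang=en",
    ]
    for url in sample:
        print(looks_like_crawl_trap(url), url)
```

Running it over the URL column of the crawl report should separate the 140 or so real pages from the looped ones fairly quickly.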
-
But 403 is a forbidden error, so those pages wouldn't be accessed by Google. Google can't access them, which in this case is a good thing, right?
-
This is almost assuredly a link-based architectural error. It will be something similar to this:
- You load a page on EN
- You click the EN flag or language icon
- Instead of just reloading the page you are already on (since you're already on EN), the link is coded wrong and adds another /EN/ layer to the URL
- Once the new URL loads, the problem can be repeated
- This creates an infinite number of URLs on your site
- Bad for Google, and Moz's crawler
Bet you it's something like that. If you give me the exact URL, I might even be able to find the flaw and detail it for you via email or something.
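To illustrate the steps above, here is a small Python sketch of how a relative link can stack path layers. It assumes the language link is written as `en/index.html` with no leading slash; that href, and the `follow_link` helper, are hypothetical, since I haven't seen the real markup. Browsers and crawlers resolve such a link against the current page's folder, which is what `urljoin` from the standard library does here.

```python
from urllib.parse import urljoin


def follow_link(start_url, href, hops):
    """Resolve the same href repeatedly, as a crawler following it would."""
    urls = []
    current = start_url
    for _ in range(hops):
        # Relative hrefs are resolved against the folder of the current page.
        current = urljoin(current, href)
        urls.append(current)
    return urls


START = "http://www.mysite.com/en/index.html"

# Hypothetical broken link: no leading slash, so each hop adds another /en/.
broken = follow_link(START, "en/index.html", 3)

# Root-relative link: always resolves to the same page, so no loop.
fixed = follow_link(START, "/en/index.html", 3)

if __name__ == "__main__":
    print("\n".join(broken))
    print("\n".join(fixed))
```

If that is the cause, making the link root-relative (or a full absolute URL including `http://`) should stop the loop. The host name showing up as a folder in the sample URLs would also fit an href written without its scheme, but that is a guess until the actual page source is checked.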
-
Hi there,
Thanks so much for reaching out - Sam from Moz's Help Team here!
I'm just going to be reaching out to you directly from [email protected] about this, after taking a look into your campaign and crawl. I'll be in touch soon!