Cannot crawl website with redirect installed on subdomain URL
-
Hi!
I want to crawl this website: http://www.car-moderne.ch.
I tried and got back a crawl for just that one URL (not for all the pages of the website). This single-line CSV says that the status of http://www.car-moderne.ch is 200, but in fact it is a 301 redirect to http://www.car-moderne.ch/fr, where the live home page is (the Moz bar sees the 301, not the 200 that the single-line crawl reports).
How can I proceed in this case (a 301 redirect being installed on the subdomain URL) to still get a full-fledged, juicy CSV with all the broken links, duplicate content, etc.?
Thank you for your help!
Pascal Hämmerli
-
So glad to help, Pascal!
-
Dear Chiaryn,
Thank you for your very helpful reply.
This website is hosted by a partner agency that created it, and I only act as an SEO consultant for them. What you say is very helpful because it means their home-made CMS should be corrected to provide better 301 redirection.
I wish you a good day,
Pascal
-
Hey Pascal,
Sorry for the confusion here! It looks like the subdomain, www.car-moderne.ch, returns a 200 HTTP status to our crawler and to other crawlers, such as the hurl.it tool. In the response body shown in the screenshot I attached from the hurl.it tool, the only content is the number 404, so the site is basically serving a page with no crawlable data. The page isn't redirecting and it doesn't return any real source code, so there is no data for us to include in the crawl. I would recommend working with your webmaster to resolve this issue so that the page correctly serves a 301 redirect to the /fr version of the site to all crawlers.
I can see that the site is correctly responding with a 301 redirect for some crawlers, such as in this test I ran as googlebot, but the response doesn't seem to be consistent. One thing you will want your webmaster to check is how the site responds to requests from user-agents hosted on Amazon Web Services, as some of our crawlers and the hurl.it tool are both hosted on AWS.
Once the issue of the HTTP response is resolved, you should be able to get much better data from the crawl test tool.
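If it is useful for your webmaster, here is a rough Python sketch of the kind of check described above: it requests the page once per user-agent without following redirects and reports the status code, the Location header, and the body size. It uses only the standard library. The function names and the user-agent list are my own assumptions for illustration (the "rogerbot" string in particular is a placeholder, not the exact header our crawler sends), and it cannot reproduce differences that depend on the requesting IP address, so the AWS case would need to be run from an AWS-hosted machine.

```python
import http.client
from urllib.parse import urlsplit

# Illustrative user-agent strings (assumptions). The "rogerbot" value is a
# placeholder, not the exact header Moz's crawler sends.
USER_AGENTS = {
    "browser": "Mozilla/5.0 (X11; Linux x86_64)",
    "googlebot": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "rogerbot": "rogerbot",
}


def fetch_status(url, user_agent, timeout=10):
    """Return (status, Location header, body length) without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    try:
        # http.client never follows redirects, so a 301 is reported as a 301.
        conn.request("GET", target, headers={"User-Agent": user_agent})
        response = conn.getresponse()
        body = response.read()
        return response.status, response.getheader("Location"), len(body)
    finally:
        conn.close()


def is_consistent(results):
    """True when every user-agent received the same status and redirect target."""
    return len({(status, location) for status, location, _ in results.values()}) <= 1


def compare_user_agents(url, user_agents=USER_AGENTS):
    """Fetch the URL once per user-agent, print one line each, and return the results."""
    results = {name: fetch_status(url, ua) for name, ua in user_agents.items()}
    for name, (status, location, size) in results.items():
        print(f"{name:10} status={status} location={location} body_bytes={size}")
    return results
```

Calling `compare_user_agents("http://www.car-moderne.ch")` and passing the result to `is_consistent` would show whether every user-agent gets the same answer. A 200 status with a body of only a few bytes would match the "page with no crawlable data" case described above.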
I hope this helps! Please let me know if I can help you with anything else.
Chiaryn