Crawling password-protected sites, such as dev or staging areas, to check them before going live?
-
Hi,
I've instructed clients to password-protect their dev areas so they don't get crawled and indexed, but how do we set up the Moz crawl software so we can crawl these sites for a final check of any issues before going live?
Is there an option I haven't seen to add logins/passwords for the crawl software to use?
Cheers,
Dan
-
OK, thanks Chiaryn.
Is the actual name of the Moz crawler (to allow in robots.txt) simply rogerbot, or are there any other characters, etc.?
Also, is it not the case that even when a site is blocked by robots.txt, Google can still crawl/index it once the password is removed? I think I read a few comments somewhere on Moz saying that can still happen somehow.
Please advise ASAP.
Many thanks,
Dan
-
Hey Dan,
Unfortunately, our crawler is not able to access password-protected content on your site. If you create a staging subdomain that is not password protected, you could use the robots.txt file to allow rogerbot and block other crawlers. However, our crawler will not crawl anything that a normal search engine crawler could not reach, so we cannot crawl password-protected pages.
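As a rough illustration of the robots.txt approach described above, here is a minimal sketch that checks a set of rules with Python's standard-library parser before they are deployed. The rules themselves, the `allowed` helper, and the staging.example.com URL are assumptions made for the example; the user-agent token "rogerbot" is simply the name used in this thread. Bear in mind that robots.txt is advisory: it only affects crawlers that choose to obey it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an unprotected staging subdomain:
# let rogerbot crawl everything and ask every other crawler to stay out.
ROBOTS_TXT = """\
User-agent: rogerbot
Disallow:

User-agent: *
Disallow: /
"""


def allowed(user_agent: str, url: str) -> bool:
    """Return True if ROBOTS_TXT permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://staging.example.com/some-page"  # placeholder URL
    for agent in ("rogerbot", "Googlebot", "bingbot"):
        print(agent, "allowed" if allowed(agent, page) else "blocked")
```

Running a check like this before publishing the file helps catch a rule that accidentally blocks rogerbot as well.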
I hope this helps.
Chiaryn
-
I don't suppose either of you could help with this related question:
http://moz.com/community/q/site-crawl-errors-download-list-of-all-urls
-
Hi Andy,
For your information, Screaming Frog does have a password access feature; I have just tried it.
All the best,
Dan
-
Thanks Matt,
I have Screaming Frog and can confirm that it has a password access feature, but I really want Moz to be able to access the site too; I would have thought they would have this option somewhere. Are you saying Moz crawls provide more information than SF (re: 'Moz level' analysis)?
A dev site is better protected by a password than by robots.txt, isn't it?
Cheers,
Dan
-
Hi Dan
I was about to ask the exact same question, so will keep an eye out for an answer.
I hope it is possible, but I couldn't work it out.
-
I don't know if there's a way to do this in Moz, but you could always get Screaming Frog and tell it to ignore robots.txt; that will definitely crawl it. You can check titles, descriptions, canonicals, H1s, etc. that way. It doesn't give the Moz-level analysis, but it's a start that definitely works. You can also see if you have parameter issues that way.
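For a sense of what such an on-page check involves, here is a minimal sketch using only Python's standard library. It is not how Screaming Frog or Moz work internally; the `OnPageChecker` class, the `audit` helper, and the sample HTML are all made up for illustration. It pulls out the title, meta description, canonical URL, and H1s from a single page's HTML.

```python
from html.parser import HTMLParser


class OnPageChecker(HTMLParser):
    """Collects title, meta description, canonical URL and H1 text from one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self.canonical = None
        self.h1s = []
        self._collecting = None  # "title" or "h1" while inside that element

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "title":
            self._collecting = "title"
        elif tag == "h1":
            self._collecting = "h1"
            self.h1s.append("")
        elif tag == "meta" and (attr.get("name") or "").lower() == "description":
            self.description = attr.get("content")
        elif tag == "link" and (attr.get("rel") or "").lower() == "canonical":
            self.canonical = attr.get("href")

    def handle_endtag(self, tag):
        if tag == self._collecting:
            self._collecting = None

    def handle_data(self, data):
        if self._collecting == "title":
            self.title += data
        elif self._collecting == "h1":
            self.h1s[-1] += data


def audit(html: str) -> dict:
    """Return the basic on-page elements found in an HTML string."""
    checker = OnPageChecker()
    checker.feed(html)
    return {
        "title": checker.title.strip(),
        "description": checker.description,
        "canonical": checker.canonical,
        "h1s": [h.strip() for h in checker.h1s],
    }


if __name__ == "__main__":
    # Made-up sample page standing in for a staging URL's HTML.
    sample = (
        "<html><head><title>Staging home</title>"
        '<meta name="description" content="Test page">'
        '<link rel="canonical" href="https://staging.example.com/">'
        "</head><body><h1>Welcome</h1></body></html>"
    )
    print(audit(sample))
```

A missing description or canonical shows up as `None`, and a page with no H1 gives an empty list, which is the kind of issue worth catching before a site goes live.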