Possible Crawling Problem with Screaming Frog and Moz Crawlers
-
So I'm not sure if what I'm seeing is a problem or not.
As of about two weeks ago the Moz crawler has only been able to see www.mysite.com, and none of the links, content, title, ect associated with the page. Essentially the report has one line, what should be the homepage, but it's not able to pull any information from the page but does show a 200 http status code. The report shows nothing blocked by robots or any errors.
When I use screaming frog to crawl the site about 75% of the time it just reports one line www.mysite.com with a 200 status code, but again the crawler is not able to actually see the html. The other 25% of the time it works perfectly fine, crawls all pages and sees all meta info and content.
There are no errors in Google WMT and everything looks ok there. We have seen a traffic drop the last two weeks but I don't know if this is the reason for it.
I can't publicly post the page but if someone has an idea of what might be going on I'd be happy to PM them.
Thanks
-
Thank you for the response.
I've ran two MOZ crawl reports today, one with mysite.com and one www.mysite.com. Both returned 1 result for mysite.com and www.mysite.com respectively, with a 200 status code, but no meta data. I know that I've successfully crawled www.mysite.com about a month ago with no problems. I have made small changes here and there but nothing is jumping out at me as wrong.
Screaming Frog is currently crawling my site successfully about 1/10 tries. The successful tries it sees 163 Total URL Encountered (its a small site) and the other 9/10 times it shows exactly 1 URL (the one i entered) and no meta data. There doesn't seem to be any pattern when it successfully crawls and when it doesn't make it past the first page.
Google WMT is currently showing No Data Available for both internal links and links to your site which is a little concerning. Everything else in WMT looks ok.
-
Two possible simple key-in items to consider: make sure the URL is inputted w/ the full url (not just mysite.com) and/or ensure to click any options for including root or sub-domains so its not just looking at a single page.
-
If you PM me the domain I can take a look myself.
Does the robots.txt have anything funny in there?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Why did Moz crawl our development site?
In our Moz Pro account we have one campaign set up to track our main domain. This week Moz threw up around 400 new crawl errors, 99% of which were meta noindex issues. What happened was that somehow Moz found the development/staging site and decided to crawl that. I have no idea how it was able to do this - the robots.txt is set to disallow all and there is password protection on the site. It looks like Moz ignored the robots.txt, but I still don't have any idea how it was able to do a crawl - it should have received a 401 Forbidden and not gone any further. How do I a) clean this up without going through and manually ignoring each issue, and b) stop this from happening again? Thanks!
Moz Pro | | MultiTimeMachine0 -
I got an 803 error yesterday on the Moz crawl for most of my pages. The page loads normally in the browser. We are hosted on shopify
I got an 803 error yesterday on the Moz crawl for most of my pages. The page loads normally in the browser. We are hosted on shopify, the url is www.solester.com please help us out
Moz Pro | | vasishta0 -
[Moz Help] Re: Trying to add a valid URL into MOZ account
See below and pls let us know what we have to do solve this : | | Joel Day (Moz Help) Mar 07 05:03 PM Hey Tracy, It looks like there's a redirect loop on your site. greatwesternflooring.com redirects to www.greatwesternflooring.com/ which in turn 302 redirects back into itself. You'll likely need to fix the redirect before you can continue configuring the campaign. 🙂 Thanks!
Moz Pro | | Britewave
Joel. Moz
t: @HelpWizard | | | Tracy Mar 07 03:14 PM I sent an email, and this is the response I got. The help forum sent me here, so here I am 🙂 An answer was posted to this question:
Question I have a valid URL greatwesternflooring.com, but when I try to add this campaign I get an "opps" message telling me it's not a valid URL. Can you help me? Answer
This looks like a bug. Please reach out to us via support so that we can forward this along to our Developers for review. Thanks!(https://mza.seotoolninja.com/help/contact)
See where this question was originally asked. |0 -
1 page crawled ... and other errors
1. Why is only one (1) page crawled every second time you crawl my site? 2. Why do your bot not obey the rules specified in the robots.txt? 3. Why does your site constantly loose connection to my facebook account/page? This means that when ever i want to compare performance i need to re-authorize, and therefor can not see any data until next time. Next time i also need to re-authorize ... 4. Why cant i add a competitor twitter account? What ever i type i get an "uh oh account cannot be tracked" - and if i randomly succeed, the account added never shows up with any data. It has been like this for ages. If have reported these issues over and over again. We are part of a large scandinavian company represented by Denmark, Sweden, Norway and Finland. The companies are also part of a larger worldwide company spreading across England, Ireland, Continental Europe and Northern Europe. I count at least 10 accounts on Seomoz.org We, the Northern Europe (4 accounts) are now reconsidering our membership at seomoz.org. We have recently expanded our efforts and established a SEO-community in the larger scale businees spanning all our countries. Also in this community we are now discussing the quality of your services. We'll be meeting next time at 27-28th of june in London. I hope i can bring some answers that clarify the problem we have seen here on seomoz.org. As i have written before: I love your setup and you tools - when they work. Regretebly, that is only occasionally the case!
Moz Pro | | alsvik1 -
No Crawl data in dashboard
For the second straight week, I have had no crawl data in my dashboard. It seems like the crawler erased all my results in the pro dashboard. Is there a way to manually recrawl my site, since I will have to wait another week to see if it comes back to earth? Thanks
Moz Pro | | bedwards0 -
Can we add sites to the crawl queue for OSE?
Is it possible to request that Open Site Explorer crawls a new URL on its next run? This tool is the first place I go to when working on a new site, and when there is "No Data Available" this is a little frustrating. I fully appreciate that this lack of data is usually a signal that the website is either very new or of low quality, however that if often the reason that I am brought in and would very much like to benchmark and provide initial analysis using this tool. It would make sense that OSE crawls the sites that Moz members are working on wouldnt it? Scott.
Moz Pro | | eseyo0 -
Where does the crawler find the urls?
The SEO Moz crawler has found a number of 500 error pages, and 404s etc which is very useful 🙂 however some of the urls are weird/broken formats we don't recognise and nobody remembers ever using - not weird enough to imply hacking, but something broken in the CMS Is there anyway to find out where the crawler found these urls? I can patch up and redirect the end result as best I can but I would prefer to fix plug the leak thanks 🙂
Moz Pro | | Fammy1 -
Problems with OSE downloads
Ordered 5 reports last 24 hours, none received. Anyone else with this problem ? I do expect better from an expensive subscription. C'mon Moz, fix this new OSE report system please.
Moz Pro | | blocker04082