How to crawl the whole domain?
-
Hi,
I have an e-commerce website with more than 4,600 products. I expected SEOmoz to crawl all of its URLs, but I don't know why that doesn't happen.
The campaign name is Artigos para festa and it should scan the whole domain festaexpress.com, but it crawls only 100 pages.
I even tried creating a new campaign named Festa Express - Root Domain to check whether it would scan the whole domain, but I had the same problem: it crawled only 199 pages.
Hope to have a solution.
Thanks,
Eduardo
-
Hi Keri, thanks, I just sent it to them.
Regards,
Eduardo.
-
Hi Eduardo,
I'm sorry you're still having problems. At this point, it'd be best for you to send an email to [email protected] and have our help team look at it for you. They'd be the ones who could give you the most advice for diagnosing this.
Keri
-
Still have the same problem. Isn't that an issue with SEOmoz?
The domain www.festaexpress.com has no Flash and is crawled by Google with no issues.
Regards,
Eduardo.
-
Hi Eduardo.
The way crawlers work is that they begin on your home page and "crawl": they look at all the links on your home page and follow each one to the next page, then the next, until your whole site has been captured.
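To make that concrete, here is a rough sketch of a link-following crawl in Python. It is not how the SEOmoz crawler is actually implemented; the `crawl` function, the `fetch` callable, the `max_pages` limit, and the tiny in-memory `PAGES` site on example.com are all made up for illustration. It only shows why a page that nothing links to never gets reached.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl from start_url, staying on the same host.

    fetch is a hypothetical callable: URL -> HTML string, or None on failure.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    crawled = []
    while queue and len(crawled) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        crawled.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment.
            link = urljoin(url, href).split("#")[0]
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return crawled


# Tiny in-memory "site" (hypothetical): /orphan exists but nothing links to it.
PAGES = {
    "http://example.com/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.com/a": '<a href="/b">B</a>',
    "http://example.com/b": '<a href="/">Home</a>',
    "http://example.com/orphan": "<p>No inbound links</p>",
}

found = crawl("http://example.com/", PAGES.get)
print(found)
```

The crawl finds the home page, /a and /b, but never /orphan, because no crawled page links to it. A product catalogue that is only reachable through a search box or script-generated navigation could end up in the same situation.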
Why are only 100 pages being crawled?
Most likely either because your site is not very well linked, or because you don't have a good navigation system, or because your navigation and links are presented in a format such as Flash, which the crawler cannot read.
Another possibility would be if the crawler is being blocked or hindered by your robots.txt file.
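If you want to rule out robots.txt yourself, something like the following might help. It is only a sketch using Python's standard `urllib.robotparser`; the robots.txt content, the example.com URL, and the `allowed` helper are invented for illustration, and I'm assuming the SEOmoz crawler identifies itself as `rogerbot`, so check the user agent in your own server logs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks one crawler entirely, everyone else
# only from /checkout/. Replace with the contents of your own file.
ROBOTS_TXT = """\
User-agent: rogerbot
Disallow: /

User-agent: *
Disallow: /checkout/
"""


def allowed(robots_txt, user_agent, url):
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "rogerbot" is an assumed user agent name for the SEOmoz crawler.
for agent in ("rogerbot", "Googlebot"):
    print(agent, allowed(ROBOTS_TXT, agent, "http://example.com/products/1"))
```

With a file like this, Google would crawl the product pages without complaint while the other crawler is turned away, which would match the "Google has no issues" symptom.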
-
Not sure, but you could try Microsoft's IIS tool to spider your site. It is possible that your site has issues that make it difficult to spider, hence why SEOMoz's bot isn't working. You could also try something like Xenu Link Sleuth or HTTrack.