Website pages missing from seomoz crawl
-
Hi!
I just added a website and the crawl results show only 42 pages, but my website has about 75 pages. What am I missing?
Thanks!
-
Is there any chance you could email me your sitemap as produced by WordPress? My address is info[at]pathfindermedia[dot]co[dot]uk and I'll take a closer look at what's being excluded.
-
I'll do that Keri. Thanks!
-
I don't think so:
User-agent: *
Disallow: /cgi-bin
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content/plugins
Disallow: /wp-content/cache
Disallow: /wp-content/themes
Disallow: /trackback
Disallow: /feed
Disallow: /comments
Disallow: /category/*/*
Disallow: */trackback
Disallow: */feed
Disallow: */comments
Disallow: /*?*
Disallow: /*?
Allow: /wp-content/uploads

# Goole Bot
User-agent: Googlebot
Disallow: /*/feed/$
Disallow: /*/feed/rss/$
Disallow: /*/trackback/$

# Google Image
User-agent: Googlebot-Image
Disallow:
Allow: /*
Regards.
-
Another possibility could be your robots.txt file. Is it blocking some directories?
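One rough way to test this is to run a few page paths against the rules by hand. The sketch below is only an illustration: it copies the "User-agent: *" group from the robots.txt quoted in this thread, and it assumes the crawler uses wildcard matching ("*" for any run of characters, a trailing "$" for end of URL) with the longest matching rule winning. Not every crawler behaves this way. The helper names and the sample paths are made up for the example.

```python
import re

# Rules copied from the "User-agent: *" group quoted in this thread.
RULES = [
    ("Disallow", "/cgi-bin"), ("Disallow", "/wp-admin"),
    ("Disallow", "/wp-includes"), ("Disallow", "/wp-content/plugins"),
    ("Disallow", "/wp-content/cache"), ("Disallow", "/wp-content/themes"),
    ("Disallow", "/trackback"), ("Disallow", "/feed"),
    ("Disallow", "/comments"), ("Disallow", "/category/*/*"),
    ("Disallow", "*/trackback"), ("Disallow", "*/feed"),
    ("Disallow", "*/comments"), ("Disallow", "/*?*"),
    ("Disallow", "/*?"), ("Allow", "/wp-content/uploads"),
]

def rule_to_regex(pattern: str) -> re.Pattern:
    """Turn a robots.txt path pattern into a regex ('*' = any run, trailing '$' = end)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))

def is_blocked(path: str, rules: list[tuple[str, str]]) -> bool:
    """Assumed semantics: longest matching pattern wins, Allow wins a tie."""
    best = None  # (pattern length, is_allow)
    for directive, pattern in rules:
        if not pattern:
            continue  # an empty Disallow blocks nothing
        if rule_to_regex(pattern).match(path):
            candidate = (len(pattern), directive.lower() == "allow")
            if best is None or candidate > best:
                best = candidate
    return best is not None and not best[1]

# Hypothetical paths, not taken from the site in question.
for path in ["/2012/10/some-post/", "/?p=123", "/category/news/local/"]:
    print(path, "blocked" if is_blocked(path, RULES) else "allowed")
```

Under these assumptions, any URL containing "?" is blocked, as is anything two levels deep under /category/. Plain prefix rules such as "Disallow: /feed" also catch every path that merely starts with those characters, so a page like /feedback would be blocked too.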
-
Hi! It's probably best to email [email protected] about this. You can give them your full URL and they can help figure out why Roger isn't crawling everything. Thanks!
-
Anyone?
-
Hi!
One thing I figured out is that crawling with both SEOmoz and xml-sitemaps.com returns the same 42 pages. Here's my website homepage URL - http://bit.ly/TGjpVx
And a couple of the missing pages, out of 36 in total - http://bit.ly/WM3Rwe and http://bit.ly/VpHJ9H.
Regards,
OV
-
Hi!
My website has an XML sitemap, generated by the Google XML Sitemaps WordPress plugin, with all 75 pages. The crawler at http://www.xml-sitemaps.com/ also outputs just 42 pages. I think it has something to do with the blog posts being archived (?).
I need to solve this and don't know how.
Thanks for your help, though.
Regards.
-
Actually, the SEOmoz crawler should be crawling all of the pages -- it's OSE that doesn't crawl everything, but the crawler from your campaign should show all that it could find. If you email [email protected] they'd be happy to help you figure it out, or if you want to share your URL here along with some pages that are missing, the Q&A people could help diagnose things too.
-
Hi Ovieira,
This is not necessarily a sign that pages are hidden from crawlers. The missing pages could simply be low priority for the moment, or they could have been created after the initial crawl took place.
The best way to check is to run a crawler like http://www.xml-sitemaps.com/, which will give you a better idea. If the sitemap it generates contains a full complement of your pages, then it's probably just a case of waiting until the next Moz crawl.
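To see exactly which pages are missing, the sitemap can be diffed against the list of crawled URLs. This is a minimal sketch, assuming the sitemap follows the standard sitemaps.org format and that the crawl results can be exported as a plain list of URLs. The function names and the example.com URLs are placeholders, not data from the site discussed here.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org XML sitemaps.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sitemap_urls(xml_text: str) -> set[str]:
    """Collect every <loc> URL from a sitemap document."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.findall(".//sm:loc", SITEMAP_NS) if loc.text}

def missing_from_crawl(xml_text: str, crawled: list[str]) -> list[str]:
    """Return sitemap URLs the crawler did not report, ignoring trailing slashes."""
    crawled_set = {url.rstrip("/") for url in crawled}
    return sorted(u for u in sitemap_urls(xml_text) if u.rstrip("/") not in crawled_set)

# Placeholder data for illustration only.
SAMPLE_SITEMAP = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.com/</loc></url>
  <url><loc>http://example.com/about/</loc></url>
  <url><loc>http://example.com/archive/old-post/</loc></url>
</urlset>
"""
CRAWLED = ["http://example.com", "http://example.com/about/"]

for url in missing_from_crawl(SAMPLE_SITEMAP, CRAWLED):
    print("not crawled:", url)
```

If the missing URLs share a pattern (all under one directory, or all with query strings), that usually points to a blocking rule or to pages that nothing links to, rather than to crawl timing.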
Mulith
Related Questions
-
Updated page not ranking.
Hi guys. Bit flummoxed by this. I've recently updated our mid year diaries page to show this year's mid year products, i.e. diaries that run 2015-16 rather than 2014-15. Last year we ranked really well for the search term 'mid year diaries 14-15'. All I've done is update the page to focus on 2015-16 diaries, but when I type in 'mid year diaries 15-16' it's nowhere to be seen in the SERP. Even our home page is ranking higher! I'm really puzzled about this; nothing's changed apart from the year! The only reason I can think of is that Google is reading the file names of the images, which relate to last year's products. For example, a file name might say mid year diary 2014-15. Do you think this is what's affecting us? Very puzzling 😕 I've submitted it through Webmaster Tools, by the way 🙂 Isaac.
On-Page Optimization | | isaac6630 -
Noindex child pages (whose content is included on parent pages)?
I'm sorry if there have been questions close to this before... I've been using WordPress less like a blogging platform and more like a CMS for years now... For content management purposes we organize a lot of content around Parent/Child page (and custom-post-type) relationships; the Child pages are included as tabbed content on the Parent page. Should I be noindexing these Child pages, since their content is already on the site, in full, on their Parent pages (i.e. duplicate content)? Or does it not matter, since the crawlers may not go to all of the tabbed content? None of the pages have shown up in Moz's "High Priority Issues" as duplicate content, but it still seems like I'm making the Parent pages suffer needlessly... Anything obvious I'm not taking into consideration? By the by, this is my first post here @ Moz, which I'm loving; this site and the forums are such a great resource! Anyways, thanks in advance!
On-Page Optimization | | rsigg0 -
Home Page Keywords not Ranking and Assigned to Inside Pages
Hi, thank you for taking the time to read this. We have a few websites with the same problem. I will use http://www.prepared-meals.com as an example: The home page was ranking on page one for the keyword "Prepared Meals". The site is about 6 months old. We use the Moz page optimizer on all pages of our websites to score an A rating. Recently we found the home page is no longer showing up in search results, and the keyword "prepared meals" now points to an inside page that is not relevant: http://www.prepared-meals.com/Senior-Meals/Moms-Meals-Reviews.html. This page shows up for Prepared Meals around page 15 in Google results. We have read that keywords in the URL might be the issue, even though the page optimizer in Moz says to do that. We are wondering if this is the issue or if there is some other problem we are not aware of. Again, thank you for your time. -Craig
On-Page Optimization | | CraigSWD0 -
Duplicate Page Content on Empty Manufacturer Pages
I work for an internet retailer that specializes in pet supplies and medications. I was going through the Crawl Diagnostics for our website, and I saw in the Duplicate Page Content section that some of our manufacturer pages were getting flagged. The way our site is set up is that when products are discontinued we mark them as discontinued and use 301 redirects to redirect their URLs to other relevant products, brands, or our homepage. We do the same thing with brand and manufacturer pages if all of their products are discontinued. 90% of the time, this is a manual process. However, the other 10% of the time certain products come and go automatically as part of our inventory system with one of our fulfillment partners. This can sometimes create empty manufacturer pages. I can't redirect these empty pages because there's a chance that products will be brought back in stock and the page will be populated again. What can we do so that these pages won't get marked as duplicates while they're empty? Write unique short descriptions about the companies? Would the placement of these short descriptions matter (top of the page under the category name vs. bottom of the page underneath where the products would go)? The links in the left sidebar, top, and footer are part of our site architecture, so those are always going to be the same. For contrast, here's what a manufacturer page with products looks like: http://www.vetdepot.com/littermaid-manufacturer.html Thanks!
On-Page Optimization | | ElDude0 -
SeoMOZ meta descriptions
Anyone notice that seoMOZ doesn't use meta descriptions? Any idea why not?
On-Page Optimization | | SoulSurfer80 -
On-page: Over optimized images?
Hello guys. I have a small question about on-page optimization for images. What I have: a good title tag, a good URL structure, good content (NOT keyword-stuffed; it's real content, for real people), and images / a gallery uploaded to a folder named the same as the article. For example: Great tips for bloggers [article name], great-tips-for-bloggers [folder name]. So my question is: will Google penalize me for these "too good" image paths and article-related image filenames, with a mask like [gtips-img01], if all the images also have titles / alt tags? Thank you guys.
On-Page Optimization | | infoo130 -
Web Page Refresh
Hi there, we redesigned our website, changing it to a jQuery-based version. The new design is much more usable and pleasant for our users; however, the average page views per user decreased a lot. Basically this is because once users are logged in, they spend most of their time in the same web form, which is updated through jQuery without refreshing the page. We were thinking about adding a meta refresh tag, or adding some JavaScript to do this, in order to increase the page views per visitor. Do you think refreshing the page every 4 minutes could be penalized by Google (or other search engines)? What should the interval between refreshes be? Would it be better to make it very explicit (i.e. adding a meta refresh tag) or to use some kind of hidden JavaScript? We want to increase the page views but, of course, we don't want to get penalized.
On-Page Optimization | | martincad0 -
Pages crawled
I noticed there is a limit on the number of pages crawled on galena.org. Will this number increase over time?
On-Page Optimization | | nskislak240