XML sitemap generator only crawling 20% of my site
-
Hi guys,
I am trying to submit the most recent XML sitemap but the sitemap generator tools are only crawling about 20% of my site. The site carries around 150 pages and only 37 show up on tools like xml-sitemaps.com. My goal is to get all the important URLs we care about into the XML sitemap.
How should I go about this?
Thanks
-
I believe it's not a significant issue if the sitemap covers the core structure of your website. As long as the sitemap is well organized, omitting a few internal pages is acceptable, because Googlebot uses the sitemap as a starting point and can still discover other pages by following internal links. Take a look at this example page, https://convowear.in, which also leaves some pages out of its sitemap without any apparent effect on crawling.
-
Yes, Yoast on WordPress works fine for sitemap generation. I would also recommend it. I use it on all of my blog sites.
-
If you are using WordPress, I would recommend the Yoast plugin. It generates the sitemap automatically and keeps it updated. I am also using it on my blog.
-
I'm using the Yoast SEO plugin for my website. It generates the sitemap automatically.
-
My new waterproof tent reviews blog is facing the same crawling problem. How can I fix that?
-
Use Yoast or Rank Math to fix it.
SEO training in Isfahan https://faneseo.com/seo-training-in-isfahan/
-
Patrick wrote a list of reasons why Screaming Frog might not be crawling certain pages here: https://mza.bundledseo.com/community/q/screamingfrog-won-t-crawl-my-site#reply_300029.
Hopefully that list can help you figure out your site's specific issue.
-
This doesn't really answer my question of why I am not able to get all links into the XML sitemap when using xml sitemap generators.
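
One way to pin down which URLs a generator is skipping is to compare its output against a list of pages you know exist. The following is a minimal sketch, not a definitive tool: the function names and the sample data are hypothetical, and it assumes you have the generated sitemap as text and your own list of expected URLs.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def urls_in_sitemap(xml_text):
    """Return the set of <loc> values found in a sitemap document."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text}


def missing_from_sitemap(expected_urls, xml_text):
    """Return the expected URLs that the sitemap does not list, sorted."""
    return sorted(set(expected_urls) - urls_in_sitemap(xml_text))


# Hypothetical sample data, for illustration only.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>https://www.example.com/about</loc></url>
</urlset>"""

EXPECTED = [
    "https://www.example.com/",
    "https://www.example.com/about",
    "https://www.example.com/pricing",
]

print(missing_from_sitemap(EXPECTED, SAMPLE_SITEMAP))
```

The URLs it reports are the ones worth checking for missing internal links or crawl blocks.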
-
I think it's not a big deal if the sitemap covers the main structure of your site. If your sitemap is well structured, then missing some internal pages is acceptable, because Googlebot uses the sitemap as a starting point and can still find your other pages by following internal links. You can see the following page, which also doesn't cover all of its pages, with no visible effect on how the site is crawled.
-
Thanks, Boyd, but unfortunately I am still missing a good chunk of URLs and I am wondering why. Do these generators rely on internal links to find pages?
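
Crawler-based generators typically do start at one page and follow the links they find, so a page that nothing links to never gets discovered. Whether xml-sitemaps.com works exactly this way is an assumption, not something confirmed in this thread. A minimal sketch of the idea, using a hypothetical in-memory link map instead of real HTTP requests:

```python
from collections import deque


def discover(start, links):
    """Breadth-first walk from `start`, following only the links in `links`."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical site: each key is a page, each value the pages it links to.
SITE_LINKS = {
    "/": ["/about", "/products"],
    "/about": ["/"],
    "/products": ["/products/tent"],
    "/products/tent": [],
    "/landing/old-promo": ["/"],  # exists, but no other page links to it
}

found = discover("/", SITE_LINKS)
orphans = sorted(set(SITE_LINKS) - found)
print(orphans)
```

Pages that end up in `orphans` are the kind a link-following generator would leave out of the sitemap until something links to them.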
-
Use Screaming Frog to crawl your site. It is free to download the software and you can use the free version to crawl up to 500 URLs.
After it crawls your site, you can open the Sitemaps menu and generate an XML sitemap file to use.
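
If a crawler still misses pages, another option is to write the sitemap from a URL list you already have, for example an export from your CMS. This is a minimal sketch under that assumption; `build_sitemap` and the sample URLs are hypothetical, and the output uses only the required `<loc>` element from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_XMLNS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML text listing each URL once, in the order given."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_XMLNS)
    for url in dict.fromkeys(urls):  # drops duplicates, keeps order
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = url  # special characters are escaped
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical URLs, for illustration only.
sitemap_xml = build_sitemap([
    "https://www.example.com/",
    "https://www.example.com/search?x=1&y=2",
    "https://www.example.com/",
])
print(sitemap_xml)
```

Save the result as sitemap.xml at the site root and submit it as you would a generated file.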