XML sitemap generator only crawling 20% of my site
-
Hi guys,
I am trying to submit the most recent XML sitemap but the sitemap generator tools are only crawling about 20% of my site. The site carries around 150 pages and only 37 show up on tools like xml-sitemaps.com. My goal is to get all the important URLs we care about into the XML sitemap.
How should I go about this?
Thanks
-
I believe it's not a significant issue if the sitemap encompasses the core framework of your website. As long as the sitemap is well-organized, omitting a few internal pages is acceptable since Googlebot will crawl all pages based on the sitemap. Take a look at the <a href="https://convowear.in">example page</a> that also excludes some pages, yet it doesn't impact the site crawler's functionality.
-
Yes Yoast on WordPress works fine for sitemap generation. I would also recommend that. Using on all of my blog sites.
-
If you are using WordPress then I would recommend to use Yoast plugin. It generates sitemap automatically regularly. I am also using it on my blog.
-
I'm using Yoast SEO plugin for my website. It generates the Sitemap automatically.
-
My new waterproof tent reviews blog facing the crawling problem. How can I fix that?
-
use Yoast or rankmath ot fix it
آموزش سئو در اصفهان https://faneseo.com/seo-training-in-isfahan/
-
Patrick wrote a list of reasons why Screaming Frog might not be crawling certain pages here: https://mza.bundledseo.com/community/q/screamingfrog-won-t-crawl-my-site#reply_300029.
Hopefully that list can help you figure out your site's specific issue.
-
This doesn't really answer my question of why I am not able to get all links into the XML sitemap when using xml sitemap generators.
-
I think it's not a big deal if the sitemap covers the main structure of your site. If your sitemap is constructed in a really decent structure, then missing some internal pages are acceptable because Googlebot will crawl all of your pages based on your site map. You can see the following page which also doesn't cover all of its pages, but there's no influence in terms of site crawler.
-
Thanks Boyd but unfortunately I am still missing a good chunk of URLs here and I am wondering why? Do those check on internal links in order to find these pages?
-
Use Screaming Frog to crawl your site. It is free to download the software and you can use the free version to crawl up to 500 URLs.
After it crawls your site you can click on the Sitemaps tab and generate an XML sitemap file to use.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Is possible to submit a XML sitemap to Google without using Google Search Console?
We have a client that will not grant us access to their Google Search Console (don't ask us why). Is there anyway possible to submit a XML sitemap to Google without using GSC? Thanks
Intermediate & Advanced SEO | | RosemaryB0 -
XML Sitemaps settings for simple websites
We have numerous simple websites that are not updated very often (maybe once every 12-18 months). We are trying to create the perfect XML sitemap for them. Which brings me to my 2 questions:- 1. Should we include the "lastmod" tag in the XML sitemap? 2. What would you set the "changefreq" at? Once a month? Remember these sites are never updated as they are simple sites for industry sectors such as building and property maintenance. Please justify your answer and explain why as we are trying to understand.
Intermediate & Advanced SEO | | JohnW-UK0 -
My Site is built for a U.S. audience, but my xml:lang in view source is en-gb?
Hi Ya'll, I have a U.S. based site with english content, but my xml lang in my view source ode is in en-gb. Should i change it to "en-US"? Or should i just change it to just "en", that way i can target all english speaking countries..... And if i do make the switch, does it make a difference in my traffic and SEO for the U.S.? Thank you!
Intermediate & Advanced SEO | | Shawn1240 -
Do image sitemaps provide value for non e-commerce sites?
Is it worth putting together an image sitemap to submit to Google if you're not an e-commerce site? Also, if you're using a CDN like Amazon Web Services (cloudfront), can you even submit an image sitemap? According to Google you need to verify your CDN in webmaster tools if you're going to do so. https://support.google.com/webmasters/answer/178636?hl=en
Intermediate & Advanced SEO | | kking41201 -
I have a general site for my insurance agency. Should I create niche sites too?
I work with several insurance agencies and I get this questions several times each month. Most agencies offer personal and business insurance and in a certain geographic location. I recommend creating a quality general agency site but would they have more success creating other nice sites as well? For example, a niche site about home insurance and one about auto insurance. What would your recommendation be?
Intermediate & Advanced SEO | | lagunaitech1 -
Canonical URLs and Sitemaps
We are using canonical link tags for product pages in a scenario where the URLs on the site contain category names, and the canonical URL points to a URL which does not contain the category names. So, the product page on the site is like www.example.com/clothes/skirts/skater-skirt-12345, and also like www.example.com/sale/clearance/skater-skirt-12345 in another category. And on both of these pages, the canonical link tag references a 3rd URL like www.example.com/skater-skirt-12345. This 3rd URL, used in the canonical link tag is a valid page, and displays the same content as the other two versions, but there are no actual links to this generic version anywhere on the site (nor external). Questions: 1. Does the generic URL referenced in the canonical link also need to be included as on-page links somewhere in the crawled navigation of the site, or is it okay to be just a valid URL not linked anywhere except for the canonical tags? 2. In our sitemap, is it okay to reference the non-canonical URLs, or does the sitemap have to reference only the canonical URL? In our case, the sitemap points to yet a 3rd variation of the URL, like www.example.com/product.jsp?productID=12345. This page retrieves the same content as the others, and includes a canonical link tag back to www.example.com/skater-skirt-12345. Is this a valid approach, or should we revise the sitemap to point to either the category-specific links or the canonical links?
Intermediate & Advanced SEO | | 379seo0 -
When I try creating a sitemap, it doesnt crawl my entire site.
We just launched a new Ruby app at (used to be a wordpress blog) - http://www.thesquarefoot.com We have not had time to create an auto-generated sitemap, so I went to a few different websites with free sitemap generation tools. Most of them index up to 100 or 500 URLS. Our site has over 1,000 individual listings and 3 landing pages, so when I put our URL into a sitemap creator, it should be finding all of these pages. However, that is not happening, only 4 pages seem to be seen by the crawlers. TheSquareFoothttp://www.thesquarefoot.com/http://www.thesquarefoot.com/users/sign_inhttp://www.thesquarefoot.com/searchhttp://www.thesquarefoot.com/renters/sign_upThis worries me that when Google comes to crawl our site, these are the only pages it will see as well. Our robots.txt is blank, so there should be nothing stopping the crawlers from going through the entire site. Here is an example of one of the 1,000s of pages not being crawled****http://www.thesquarefoot.com/listings/Houston/TX/77098/Central_Houston/3910_Kirby_Dr/Suite_204Any help would be much appreciated!
Intermediate & Advanced SEO | | TheSquareFoot0 -
What on-page/site optimization techniques can I utilize to improve this site (http://www.paradisus.com/)?
I use a Search Engine Spider Simulator to analyze the homepage and I think my client is using black hat tactics such as cloaking. Am I right? Any recommendations on to improve the top navigation under Resorts pull down. Each of the 6 resorts listed are all part of the Paradisus brand, but each resort has their own sub domain.
Intermediate & Advanced SEO | | Melia0