Add selective URLs to an XML Sitemap
-
Hi!
Our website has a very large number of pages. I am looking to create an XML Sitemap that contains only the most important pages (category pages etc.). However, when I crawl the website with a tool like Xenu (the others have a 500-page limit), I am unable to control which pages get added to the XML Sitemap and which ones get excluded.
Essentially, I only want pages that are up to 4 clicks away from my homepage to show up in the XML Sitemap.
How should I create an XML sitemap and, at the same time, control which pages of my site I add to it (category pages) and which ones I leave out (product pages etc.)?
Thanks in advance!
Apurv
-
Thanks a lot for sharing, Travis. This is really helpful!
Appreciate your help here.
-
Hey Intermediate,
Here's my setup (screenshot: http://screencast.com/t/qThC401hQVUp). Be careful with the line breaks if you want your sitemap to be pretty (I'm not sure whether it also works if everything is on a single line).
Column A: <url><loc>
Column B: the URL
Column C: </loc><lastmod>2013-08-27</lastmod>
Column D: <changefreq>always</changefreq>
Column E: <priority>1</priority></url>
Column F: =CONCATENATE(A2,B2,C2,D2,E2)
You will need to add these as the first 2 lines of your sitemap:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
and add </urlset> to the end, but you should be good to go!
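For anyone who would rather script this than use a spreadsheet, here is a minimal Python sketch of the same concatenation idea. It is only an illustration, not the template from the screenshot: the function name build_sitemap and the example.com URLs are made up, the tag layout is assumed from the sitemaps.org protocol, and the lastmod/changefreq/priority defaults simply mirror the values in the columns above.

```python
from xml.sax.saxutils import escape


def build_sitemap(urls, lastmod="2013-08-27", changefreq="always", priority="1"):
    """Return sitemap XML as a string, with one <url> entry per URL.

    Hypothetical helper: the same lastmod, changefreq and priority are
    applied to every URL, just like filling a spreadsheet column down.
    """
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for url in urls:
        # escape() turns &, < and > into XML entities so the file stays valid.
        lines.append(
            "<url>"
            f"<loc>{escape(url)}</loc>"
            f"<lastmod>{lastmod}</lastmod>"
            f"<changefreq>{changefreq}</changefreq>"
            f"<priority>{priority}</priority>"
            "</url>"
        )
    lines.append("</urlset>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder URLs, for illustration only.
    print(build_sitemap([
        "http://www.example.com/",
        "http://www.example.com/category/shoes?color=red&size=9",
    ]))
```

Writing the returned string to a file named sitemap.xml gives you something you can upload. Keep in mind that the sitemap protocol caps a single file at 50,000 URLs, so a bigger list would need to be split across several files.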
I hope that helps!
-
Thanks Schwaab!
-
Hi Travis
That sounds like a smart way to go about this. Could you please guide me on how to add parameters like lastmod, priority, and changefreq to the XML sitemap, using the URLs that I have in the Excel sheet?
Thanks!
-
If you have a list of all the URLs on your site, it is easy to create a sitemap using Excel. I have a template that I use, and I can crank out a 50k-URL sitemap in 5 minutes.
-
I would recommend purchasing Screaming Frog. You can crawl the site and sort the URLs by level. Remove the URLs that are too deep from the crawl and export the rest as an XML sitemap. Screaming Frog is definitely worth the price to unlock all of its features and remove the crawl limit.
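If you export the crawl to CSV first, the "remove the URLs that are too deep" step can also be scripted. The sketch below is a rough illustration and not a Screaming Frog feature: the column names "Address" and "Crawl Depth" are assumptions about the export and may need adjusting, the function name is made up, and the cutoff of 4 comes from the original question.

```python
import csv
import io

MAX_DEPTH = 4  # from the question: pages up to 4 clicks from the homepage


def urls_within_depth(csv_file, max_depth=MAX_DEPTH,
                      url_col="Address", depth_col="Crawl Depth"):
    """Return the URLs whose crawl depth is at most max_depth.

    Hypothetical helper: url_col and depth_col are assumed column names,
    so change them to match the header row of your own export.
    """
    kept = []
    for row in csv.DictReader(csv_file):
        try:
            url = row[url_col]
            depth = int(row[depth_col])
        except (KeyError, TypeError, ValueError):
            # Skip rows with a missing or non-numeric depth.
            continue
        if depth <= max_depth:
            kept.append(url)
    return kept


if __name__ == "__main__":
    # Tiny made-up export, for illustration only.
    sample = io.StringIO(
        "Address,Crawl Depth\n"
        "http://www.example.com/,0\n"
        "http://www.example.com/category/shoes,1\n"
        "http://www.example.com/product/12345,6\n"
    )
    for url in urls_within_depth(sample):
        print(url)
```

With a real export you would pass an open file instead of the sample, for example open("crawl.csv", newline="", encoding="utf-8"), and then feed the surviving URLs into whatever you use to build the sitemap.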