How ro write a robots txt file to point to your site map
-
Good afternoon from still wet & humid wetherby UK...
I want to write a robots text file that instruct the bots to index everything and give a specific location to the sitemap. The sitemap url is:http://business.leedscityregion.gov.uk/CMSPages/GoogleSiteMap.aspx
Is this correct:
User-agent: *
Disallow:
SITEMAP: http://business.leedscityregion.gov.uk/CMSPages/GoogleSiteMap.aspxAny insight welcome
-
Thank you so much for all your replies
[CASE CLOSED] -
Ryan's answer is correct. I just wanted to jump in to say that I know from first hand experience that Google and Bing are both able to read the sitemap file even if it is a different extension and even if you can't name it sitemap.xml.
-
Yes, your example is correct.
A great page for learning about robots.txt is: http://en.wikipedia.org/wiki/Robots_exclusion_standard#Sitemap
I will share the official method of declaring your sitemap location involves only the first letter being capitalized (i.e. Sitemap not SITEMAP) but I am almost certain it does not make a difference.
A few other suggestions which are best practices but do not have to be followed:
-
use all lowercase letters in URLs
-
name the sitemap file "sitemap" not "GoogleSiteMap"
-
submit XML sitemaps when possible. I am again almost certain Google can read other versions so if all you care about is Google then it's fine but otherwise I would suggest just using xml files.
example: business.leedscityregion.gov.uk/cmspages/sitemap.xml
Some other helpful links:
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=183668
-
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Robots and Canonicals on Moz
We noticed that Moz does not use a robots "index" or "follow" tags on the entire site, is this best practice? Also, for pagination we noticed that the rel = next/prev is not on the actual "button" rather in the header Is this best practice? Does it make a difference if it's added to the header rather than the actual next/previous buttons within the body?
Technical SEO | | PMPLawMarketing0 -
Robots.txt blocking Addon Domains
I have this site as my primary domain: http://www.libertyresourcedirectory.com/ I don't want to give spiders access to the site at all so I tried to do a simple Disallow: / in the robots.txt. As a test I tried to crawl it with Screaming Frog afterwards and it didn't do anything. (Excellent.) However, there's a problem. In GWT, I got an alert that Google couldn't crawl ANY of my sites because of robots.txt issues. Changing the robots.txt on my primary domain, changed it for ALL my addon domains. (Ex. http://ethanglover.biz/ ) From a directory point of view, this makes sense, from a spider point of view, it doesn't. As a solution, I changed the robots.txt file back and added a robots meta tag to the primary domain. (noindex, nofollow). But this doesn't seem to be having any effect. As I understand it, the robots.txt takes priority. How can I separate all this out to allow domains to have different rules? I've tried uploading a separate robots.txt to the addon domain folders, but it's completely ignored. Even going to ethanglover.biz/robots.txt gave me the primary domain version of the file. (SERIOUSLY! I've tested this 100 times in many ways.) Has anyone experienced this? Am I in the twilight zone? Any known fixes? Thanks. Proof I'm not crazy in attached video. robotstxt_addon_domain.mp4
Technical SEO | | eglove0 -
What's wrong with this robots.txt
Hi. really struggling with the robots.txt file
Technical SEO | | Leonie-Kramer
this is it: User-agent: *
Disallow: /product/ #old sitemap
Disallow: /media/name.xml When testing in w3c.org everything looks good, testing is okay, but when uploading it to the server, Google webmaster tools gives 3 errors. Checked it with my collegue we both don't know what's wrong. Can someone take a look at this and give me the solution.
Thanx in advance! Leonie1 -
IIS 7.5 - Duplicate Content and Totally Wrong robot.txt
Well here goes! My very first post to SEOmoz. I have two clients that are hosted by the same hosting company. Both sites have major duplicate content issues and appear to have no internal links. I have checked this both here with our awesome SEOmoz Tools and with the IIS SEO Tool Kit. After much waiting I have heard back from the hosting company and they say that they have "implemented redirects in IIS7.5 to avoid duplicate content" based on the following article: http://blog.whitesites.com/How-to-setup-301-Redirects-in-IIS-7-for-good-SEO__634569104292703828_blog.htm. In my mind this article covers things better: www.seomoz.org/blog/what-every-seo-should-know-about-iis. What do you guys think? Next issue, both clients (as well as other sites hosted by this company) have a robot.txt file that is not their own. It appears that they have taken one client's robot.txt file and used it as a template for other client sites. I could be wrong but I believe this is causing the internal links to not be indexed. There is also a site map, again not for each client, but rather for the client that the original robot.txt file was created for. Again any input on this would be great. I have asked that the files just be deleted but that has not occurred yet. Sorry for the messy post...I'm at the hospital waiting to pick up my bro and could be called to get him any minute. Thanks so much, Tiff
Technical SEO | | TiffenyPapuc0 -
Need Help writing 301 redirects in .htaccess file
SEOmoz tool shows me 2 errors for duplicate content pages (www.abc.com and www.abc.com/index.html). I believe, the solution to this is writing 301 redirects I need two 301 redirects 1. abc.com to www.abc.com 2. /index.html to / (which is www.abc.com/index.html to www.abc.com) The code that I currently have is ................................................... RewriteEngine On
Technical SEO | | WebsiteEditor
RewriteCond %{HTTP_HOST} ^abc.com
RewriteRule (.*) http://www.abc.com/$1 [R=301,L] Redirect 301 http://www.abc.com/index.html http://www.abc.com ...................................................... but this does not redirect /index.html to abc.com. What is wrong here? Please help.0 -
I am able to point an old domain name to part of my site
hi, i have a domain name to a site that i have but i am thinking of closing that site down and transfer it to a site that is popular. I want to know if i am allowed to point the domain name to a section of my site and if so would it benefit my site or should i just continue running the other site and make it better. would the links to that site be transfered to my site any advice would be great.
Technical SEO | | ClaireH-1848860 -
Robots.txt and canonical tag
In the SEOmoz post - http://www.seomoz.org/blog/robot-access-indexation-restriction-techniques-avoiding-conflicts, it's being said - If you have a robots.txt disallow in place for a page, the canonical tag will never be seen. Does it so happen that if a page is disallowed by robots.txt, spiders DO NOT read the html code ?
Technical SEO | | seoug_20050 -
Which is more accurate? site: or GWT?
when viewing urls in google's index, is it more accurate to refer to site:www.domain.com or google webmaster tools (urls in web index)?
Technical SEO | | nicole.healthline0