Ruby on Rails sitemap.xml structure
-
Is there a recommended way/best practice to implement sitemap.xml files on a site built with Ruby on Rails?
-
The XML sitemap format is well defined here:
http://www.sitemaps.org/protocol.html
But I can quickly summarize:
- a sitemap is limited to 50,000 URLs and 50MB (uncompressed) per file. If you need more, split the URLs across several sitemaps and list those in a sitemap index.
- a sitemap index can list up to 50,000 sitemaps and, under the current sitemaps.org protocol, is also limited to 50MB per file.
- lastmod, priority and change frequency no longer play a HUGE role: https://www.seroundtable.com/google-lastmod-xml-sitemap-20579.html https://www.seroundtable.com/google-priority-change-frequency-xml-sitemap-20273.html but keep them anyway so the file is fully formed.
- sitemaps can be compressed (gzip)
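- a sitemap must be UTF-8 encoded, but beware of five special characters: ampersand, single quote, double quote, greater than, less than. You must replace them with XML entity escape codes (&amp; &apos; &quot; &gt; &lt;). Percent-encoding (% char codes) is a separate step that applies to characters not allowed in the URL itself.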
- you can put the sitemap location in robots.txt (a line of the form "Sitemap: <full URL>"). You can list several sitemaps there. Sitemaps can be located on 3rd party servers too.
I think these are the most important points about XML sitemaps.
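To make the points above concrete, here is a minimal sketch in plain Ruby (standard library only) of splitting URLs into sitemap files, escaping the five special characters, gzipping each file and listing the files in a sitemap index. It is only an illustration under assumptions, not a recommended Rails implementation: the helper names (xml_escape, build_sitemap, build_sitemap_index, sitemap_files), the sitemap-N.xml.gz naming scheme and the example.com URLs are all made up for this example. In a Rails app the entries would come from your models, and the results would be written to disk or served by a controller.

```ruby
require "zlib"

# Per-file URL limit from the sitemaps.org protocol.
MAX_URLS_PER_SITEMAP = 50_000
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# The five characters that must be entity-escaped inside sitemap XML.
XML_ENTITIES = {
  "&" => "&amp;", "'" => "&apos;", '"' => "&quot;", "<" => "&lt;", ">" => "&gt;"
}.freeze

# Hypothetical helper: replace the special characters with XML entities.
def xml_escape(text)
  text.gsub(/[&'"<>]/, XML_ENTITIES)
end

# Hypothetical helper: build one <urlset> document from hashes
# that carry :loc (String) and :lastmod (Time).
def build_sitemap(entries)
  urls = entries.map do |entry|
    "  <url>\n" \
    "    <loc>#{xml_escape(entry[:loc])}</loc>\n" \
    "    <lastmod>#{entry[:lastmod].strftime('%Y-%m-%d')}</lastmod>\n" \
    "  </url>\n"
  end
  %(<?xml version="1.0" encoding="UTF-8"?>\n<urlset xmlns="#{SITEMAP_NS}">\n#{urls.join}</urlset>\n)
end

# Hypothetical helper: build a <sitemapindex> that points at the sitemap files.
def build_sitemap_index(sitemap_urls)
  items = sitemap_urls.map { |url| "  <sitemap><loc>#{xml_escape(url)}</loc></sitemap>\n" }
  %(<?xml version="1.0" encoding="UTF-8"?>\n<sitemapindex xmlns="#{SITEMAP_NS}">\n#{items.join}</sitemapindex>\n)
end

# Split entries into chunks of at most per_file URLs, gzip each sitemap,
# and return [index_xml, { file_name => gzipped_bytes }].
# The file naming scheme is an assumption for this sketch.
def sitemap_files(entries, base_url:, per_file: MAX_URLS_PER_SITEMAP)
  files = {}
  entries.each_slice(per_file).with_index(1) do |chunk, number|
    files["sitemap-#{number}.xml.gz"] = Zlib.gzip(build_sitemap(chunk))
  end
  index = build_sitemap_index(files.keys.map { |name| "#{base_url}/#{name}" })
  [index, files]
end

# Usage with made-up URLs; per_file is lowered to 2 only to force a split.
entries = [
  { loc: "http://example.com/search?q=a&page=2", lastmod: Time.utc(2015, 7, 10) },
  { loc: "http://example.com/about", lastmod: Time.utc(2015, 7, 11) },
  { loc: "http://example.com/contact", lastmod: Time.utc(2015, 7, 12) }
]
index, files = sitemap_files(entries, base_url: "http://example.com", per_file: 2)
puts index
puts files.keys.inspect
```

The index produced this way is the URL you would then reference from robots.txt or submit to the search engines. This sketch only enforces the URL count; the 50MB size limit would need a separate check.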