How to create site map for large site (ecommerce type) that has 1000's if not 100,000 of pages.
-
I know this is kind of a newbie question but I am having an amazing amount of trouble creating a sitemap for our site Bestride.com. We just did a complete redesign (look and feel, functionality, the works) and now I am trying to create a site map. Most of the generators I have used "break" after reaching some number of pages. I am at a loss as to how to create the sitemap. Any help would be greatly appreciated!
Thanks
-
I agree with Chris. With such large websites it would be advisable having a sitemap index and then splitting the index into various individual indexes such as Pages, Products, Categories, images, media, tags etc.
-
The easiest thing i can think of is to write a script that works with your dispatcher to create a site map. The format I would use is add the page and all of the "product images" on the page to the map and move to the next. At the same time I would use an auto increment variable to keep track of how many lines you have written. When you get around 50k, write out the name of the next site map file that the program will create and have them chained together this way.
-
That's a great help Chris, thank you! And thanks to all for your help!
-
Typically, a sitemap is going to include every page on the site. As Francesca said, each sitemap can be up to 50K urls and if you need multiple sitemaps then you create a sitemap index that points to the rest of the sitemaps.
-
Thanks for the feedback!
I will look into screamingfrog for sure.
@Lesley - we are using a custom platform (in house) so we don't have that functionality. The issue is that we have a lot of inventory (millions) of cars. We have built (and are releasing new functionality today) to provide internal links so that Google can crawl all the inventory easily (users can too :). My question about sitemaps has boiled down to this: Do we need to build the sitemap to include every single page (all the inventory) or do we provide a "map" so that google can find the top pages and then crawl the inventory from there. Again the site is bestride.com. If anyone wants to take a look at the site, that would be fantastic!
Thanks
-
Are you using a custom platform or an off the shelf e-commerce package? Most off the shelf packages actually have a module that can create a site map and a lot have it where you can cron it too.
-
Of course, you can also use the moz's crawl test report at http://pro.moz.com/tools/crawl-test
-
Hi Kristin,
Each sitemap.xml can support maximum 50.000 URLs. So, If you have a site with more than 100K, It'd be better to create 2 or 3 o 4 etc sitemaps.xml in order to contain all URLs. Hope it is useful.
Kind regards!
Francesca
-
You can use screamingfrog to create your sitemap. You just need to license it for crawl more than 500 URI.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
disavow link more than 100,000 lines
I recieved a huge amount of spamy link (most of them has spam score 100) Currently my disavow link is arround 85.000 lines but at least i have 100.000 more domain which i should add them. All of them are domains and i don't have any backlink in my file. My Problem is that google dosen't accept disavow link which are more than 2MB and showes this message : File too big: Maximum file size is 100,000 lines and 2MB What should i do now?
Technical SEO | | sforoughi0 -
Old pages not mobile friendly - new pages in process but don't want to upset current traffic.
Working with a new client. They have what I would describe as two virtual websites. Same domain but different coding, navigation and structure. Old virtual website pages fail mobile friendly, they were not designed to be responsive ( there really is no way to fix them) but they are ranking and getting traffic. New virtual website pages pass mobile friendly but are not SEO optimized yet and are not ranking and not getting organic traffic. My understanding is NOT mobile friendly is a "site" designation and although the offending pages are listed it is not a "page" designation. Is this correct? If my understanding is true what would be the best way to hold onto the rankings and traffic generated by old virtual website pages and resolve the "NOT mobile friendly" problem until the new virtual website pages have surpassed the old pages in ranking and traffic? A proposal was made to redirect any mobile traffic on the old virtual website pages to mobile friendly pages. What will happen to SEO if this is done? The pages would pass mobile friendly because they would go to mobile friendly pages, I assume, but what about link equity? Would they see a drop in traffic ? Any thoughts? Thanks, Toni
Technical SEO | | Toni70 -
Do I have to create a separate sitemap for my multilingual site?
Hi, I was wondering how should I implement a sitemap for my multilingual site. Currently we have two languanges separated by subdirectories in our site /en (english) and /fr (french) however based on the the articles that I have read there are no clear explanation on the implementation of the sitemap with different languanges. Here are the cases I think is possible for the implementation: Case 1: One sitemap with all the en and fr pages together with hreflang attribution for each pages Case 2: One sitemap with only en pages with hreflang attribution for both languages (en and fr) Case 3: Separate sitemap for en and fr pages with hreflang attribution for both languanges and connect both through sitemapindex creation. If any of my proposed cases are not possible please let me know the best approach in creating a multilingual sitemap for my site. Appreciate your thoughts regarding this. Thank you!
Technical SEO | | ReneAnton0 -
URL Structure On Site - Currently it's domain/product-name NOT domain/category/product name is this bad?
I have a eCommerce site and the site structure is domain/product-name rather than domain/product-category/product-name Do you think this will have a negative impact SEO Wise? I have seen that some of my individual product pages do get better rankings than my categories.
Technical SEO | | the-gate-films0 -
The importance of url's - are they that important?
Hi Guys I'm reading some very contrasting and confusing reviews regarding urls and the impact they have on a sites ability to rank. My client has a number of flooring products, 71 to be exact - categorised under three sub categories 1. Gallery Wood - 2. Prefinshed Wood - 3. Parquet & Reclaimed. All of the 71 products are branded products (names that are completely unrelated to specific keyword search terms. This is having a major impact regarding how we optimise the site. FOR EXAMPLE: A product of the floor called "White Grain" - the "Key Word" we would like to rank this page for is Brown Engineered Flooring. I'm interested to know, should the name of the branded product match the url? What would you change to help this page rank better for the keyword - Brown Engineered Flooring. Title page: White Grain Url: thecompanyname.com/gallery-wood/white-grain (white grain is the name of the product) Key Word: Brown Engineered Flooring **Seo Title: **White Grain, Brown Engineered Flooring by X Meta Description: BLAH BLAH Brown Engineered Flooring BLAH BLAH Any feedback to help get my head around this would be really appreciated. Thank you.
Technical SEO | | GaryVictory0 -
Do you need an on page site map as well as an XML Sitemap?
Do on page site maps help with SEO or are they more for user experience? We submit and update our XML Sitemaps for the search engines but wondering if /sitemap for users is necessary?
Technical SEO | | bonnierSEO0 -
Site not passing page authority....
Hi, This site powertoolworld.co.uk is not passing page authority. In fact every page shows no links unless it has a link from an external source. Originally this site blocked Roger from crawling it but that block was lifted over 6 months ago. I also ran a crawl test last night and it shows the same thing. PA of 1 and no links. I would like to point out that the problem seems to be the same for all sites on the same platform. Which points me in the direction of code. for example there is a display: none tag in the ccs which is used to style where the side bar links are. It's a Blue Park platform. What could be causing the problem? Thanks in advance. EDIT Turns out that blocking the ezooms crawler stopped it from being included.
Technical SEO | | PowerToolWorld0 -
We're working on a site that is a beer company. Because it is required to have an age verification page, how should we best redirect the bots (useragents) to the actual homepage (thus skipping ahead of the age verification without allowing all browsers)?
This question is about useragents and alcohol sites that have an age verification screen upon landing on the site.
Technical SEO | | OveritMedia0