Using Sitemaps Correctly
-
Hello
I'm looking to submit a sitemap for a post-driven site with over 5,000 pages.
The site doesn't have a sitemap yet, but it is already indexed by Google - will submitting a sitemap make a difference at this stage?
Also, most free sitemap tools only go up to 5,000 pages, and I'm thinking I would try a free version of a tool before I buy one. If my site is 5,500 pages but I only submit a sitemap for 5,000 (I have no control over which pages get included in the sitemap), would this have a negative effect on the pages that didn't get included?
Thanks
-
Submitting a sitemap in Search Console is always a good idea at any stage. If your website's URLs are already crawled and indexed in search engines, there will be no negative impact, and in the longer run, as you add more pages, a sitemap will definitely help.
If you are using a CMS like WordPress, Joomla, Zen Cart, or any other, they all have extensions and plugins in their directories that will generate a sitemap of your current site and add new links as soon as you add more pages.
For the rest, Peter explains almost everything in detail, such as what happens if you have URL issues or problems with crawling and indexing.
If you have a custom CMS, I think you should seriously consider Peter's idea, as this is something you need on a regular basis anyway!
Hope this helps!
-
It's hard to tell without seeing your URL architecture.
First, there are two specific terms you should never, ever forget: crawling and indexing. Once you prepare a sitemap and submit it (or reference it in robots.txt), bots get a map of your site and start crawling pages based on the crawl budget assigned to your site. During crawling they MAY find new pages that aren't included in this map and will crawl them too. Again, this depends on your crawl budget.
So when you submit a sitemap, the bot gets the list of 5,000 "non-crawled" pages within seconds and starts crawling them. It can then find the missing 500 pages and crawl them too. The tricky part is that when you update the sitemap, the bot can quickly detect the changes there and start recrawling those pages; for the missing 500 pages, it has to revisit your site to check them for changes, and that also comes out of your crawl budget. But if those pages don't change often, it isn't a big deal.
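For context, a sitemap is just an XML file listing URLs, optionally with a <lastmod> hint that lets bots spot changed pages cheaply instead of recrawling everything. A minimal entry looks like this (the URL and date below are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/posts/my-first-post</loc>
    <lastmod>2016-05-20</lastmod>
    <changefreq>monthly</changefreq>
  </url>
  <!-- one <url> entry per page; the protocol allows up to 50,000 per file -->
</urlset>
```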
So you shouldn't hesitate over a negative impact there. The only negative impact can happen if you have serious URL architecture issues and messy URLs. In that case, submitting a partial sitemap can obscure those issues and leave some of your URLs non-crawled.
Technically, in Search Console you can see sitemap statistics such as submitted versus indexed pages. In a perfect world the numbers should be almost equal, with little difference. But if you see a huge gap between them, you're in trouble. For example, on one site I have a sitemap with 44,950 pages submitted and only 29,643 of them indexed. That is a clear example of site crawling trouble or sitemap trouble, because about a third of all pages (1 - 29,643/44,950 ≈ 34%) aren't indexed at all.
PS: I forgot. You should use your CMS's own plugin to generate the sitemap internally. Even if your CMS is custom made, you should write (or hire someone to write) such a plugin. It's around 20-30 lines of your-favorite-language-here (PHP/Python/Perl/Ruby) and isn't a big deal. This plugin eliminates the crawling time a third-party sitemap generator tool needs, because the CMS already has all the information inside; it just needs to be exported to XML.
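To illustrate how little code this takes, here is a minimal sketch in Python. It is an assumption-laden example, not a drop-in plugin: the database path, the posts table, and the slug/updated_at columns are all hypothetical stand-ins for wherever your CMS actually stores its pages.

```python
import sqlite3
from xml.sax.saxutils import escape

BASE_URL = "https://www.example.com"  # assumption: your site's root URL
DB_PATH = "cms.db"                    # assumption: wherever your CMS stores posts

def generate_sitemap(db_path=DB_PATH, out_path="sitemap.xml"):
    """Export every published post from the CMS database into sitemap.xml."""
    conn = sqlite3.connect(db_path)
    # Assumed schema: a 'posts' table with a URL slug, a published flag,
    # and a last-modified date already formatted as YYYY-MM-DD.
    rows = conn.execute(
        "SELECT slug, updated_at FROM posts WHERE published = 1"
    ).fetchall()
    conn.close()

    with open(out_path, "w", encoding="utf-8") as f:
        f.write('<?xml version="1.0" encoding="UTF-8"?>\n')
        f.write('<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n')
        for slug, updated_at in rows:
            f.write("  <url>\n")
            f.write(f"    <loc>{escape(BASE_URL + '/' + slug)}</loc>\n")
            f.write(f"    <lastmod>{updated_at}</lastmod>\n")
            f.write("  </url>\n")
        f.write("</urlset>\n")

if __name__ == "__main__":
    generate_sitemap()
```

Run it from a cron job or hook it into your publish action and the sitemap stays current automatically, with no 5,000-page tool limit to work around.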
-
It would definitely be better to submit a complete sitemap. If your site is built on WordPress, Joomla, Magento, or many other standard CMSes, it should have the ability to generate a full sitemap. Plugins like Yoast or Google XML Sitemaps help. It just depends on the site.
Otherwise, you can probably get any pro SEO or agency to create a full 5,500+ page sitemap for you for $100 or so. PM me if you need more help.