Exclude Noindex, Followed pages from sitemap?
-
Hello Everyone!
This is a question about my site, which is running on WordPress. Currently, I have category page to have the noindex, follow attributes, as they have little unique content. I do have them currently in my sitemap.xml file, however. Should I remove them from the sitemap since Google technically shouldn't index them?
Thanks for your help!
-
then that makes things easier then
-
It's easy to remove. I currently have the default setup in Yoast where I have (currently) a sitemap_index.xml file with a page_sitemap, post_sitemap, category_sitemap, and a portfolio_sitemap (my custom post type for portfolio items).
-
There's no need to have them in the sitemap. If you have them there then its not really a big deal but just try to remove it when you can. If you have a page sitemap, then you can add them in if you wanted to.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Rel Canonical, Follow/No Follow in htaccess?
Very quick question, are rel canonical, follow/no follow tags, etc. written in the htaccess file?
Technical SEO | | moon-boots0 -
404 Errors for Form Generated Pages - No index, no follow or 301 redirect
Hi there I wonder if someone can help me out and provide the best solution for a problem with form generated pages. I have blocked the search results pages from being indexed by using the 'no index' tag, and I wondered if I should take this approach for the following pages. I have seen a huge increase in 404 errors since the new site structure and forms being filled in. This is because every time a form is filled in, this generates a new page, which only Google Search Console is reporting as a 404. Whilst some 404's can be explained and resolved, I wondered what is best to prevent Google from crawling these pages, like this: mydomain.com/webapp/wcs/stores/servlet/TopCategoriesDisplay?langId=-1&storeId=90&catalogId=1008&homePage=Y Implement 301 redirect using rules, which will mean that all these pages will redirect to the homepage. Whilst in theory this will protect any linked to pages, it does not resolve this issue of why GSC is recording as 404's in the first place. Also could come across to Google as 100,000+ redirected links, which might look spammy. Place No index tag on these pages too, so they will not get picked up, in the same way the search result pages are not being indexed. Block in robots - this will prevent any 'result' pages being crawled, which will improve the crawl time currently being taken up. However, I'm not entirely sure if the block will be possible? I would need to block anything after the domain/webapp/wcs/stores/servlet/TopCategoriesDisplay?. Hopefully this is possible? The no index tag will take time to set up, as needs to be scheduled in with development team, but the robots.txt will be an quicker fix as this can be done in GSC. I really appreciate any feedback on this one. Many thanks
Technical SEO | | Ric_McHale0 -
Page not cached
Hi there, we uploaded a page but unfortunately didn't realise it had noindex,nofollow in the meta tags. Google had cached it then decached it (i guess thats possible) it seems? now it will not cache even though the correct meta tags have been put in and we have sent links to it internally and externally. Anyone know why this page isn't being cached, the internal link to it is on the homepage and that gets cached almost every day. I even submitted it to webmaster tools to index.
Technical SEO | | pauledwards0 -
"INDEX,FOLLOW" then later in the code "NOINDEX,NOFOLLOW" which does google follow?
background info: we have an established closed E-commerce system which the company has been using for years. I have only just started and reviewing the system, I don't have direct access to the code, but can request changes, but it could take months before the changes are in effect (or done at all), and we won't can't change to a new E-commerce system for the short to mid term. While reviewing the site (with help of seomoz crawl diagnostics) I noticed that some of the existing "landing pages" have in the code: <meta name="<a class="attribute-value">robots</a>" content="<a class="attribute-value">INDEX,FOLLOW</a>" /> then a few lines later <meta name="<a class="attribute-value">robots</a>" content="<a class="attribute-value">NOINDEX,NOFOLLOW</a>" /> Which the crawl diagnostics flagged up, but in the webmaster tools says
Technical SEO | | PaddyDisplays
"We didn't detect any issues with non-indexable content on your site." so the question is which instructions does google follow? the first or 2nd? note: clearly this is need fixed, but I have a big list of changes for the system so I need to know how important this is tthanks0 -
Why use noindex, follow vs rel next/prev
Look at what www.shutterstock.com/cat-26p3-Abstract.html does with their search results page 3 for 'Abstract' - same for page 2-N in the paginated series. | name="robots" content="NOINDEX, FOLLOW"> |
Technical SEO | | jrjames83
| | Why is this a better alternative then using the next/prev, per Google's official statement on pagination? http://support.google.com/webmasters/bin/answer.py?hl=en&answer=1663744 Which doesn't even mention this as an option. Any ideas? Does this improve the odds of the first page in the paginated series ranking for the target term? There can't be a 'view all page' because there are simply too many items. Jeff0 -
Ranked on Page 1, now between page 40-50... Please help!
My site, http://goo.gl/h0igI was ranking on page one for many of our biggest keywords. All of a sudden, we completely fell off. I believe I'm down somewhere between page 40-50. I have no warning or error messages in webmaster tools. Can anyone please help me identify what the problem is? This is completely unexpected and I don't know how to fix it... Thanks in advance
Technical SEO | | Prime850 -
Why does SEOMos Pro include noindex pages?
I'm new to SEOMoz. Been digesting the crawl data and have a tonne of action items that we'll be executing on fairly soon. Love it! One thing I noticed is in some of crawl warnings include pages that expressly have the ROBOTS meta tag with the "noindex" value. Example: many of my noindex pages don't include meta descriptions. Therefore, is it safe to ignore warnings of this nature for these pages?
Technical SEO | | ChatterBlock0 -
Does page speed affect what pages are in the index?
We have around 1.3m total pages, Google currently crawls on average 87k a day and our average page load is 1.7 seconds. Out of those 1.3m pages(1.2m being "spun up") google has only indexed around 368k and our SEO person is telling us that if we speed up the pages they will crawl the pages more and thus will index more of them. I personally don't believe this. At 87k pages a day Google has crawled our entire site in 2 weeks so they should have all of our pages in their DB by now and I think they are not index because they are poorly generated pages and it has nothing to do with the speed of the pages. Am I correct? Would speeding up the pages make Google crawl them faster and thus get more pages indexed?
Technical SEO | | upper2bits0