Google Search Console says 'sitemap is blocked by robots.txt'?
-
Google Search Console is telling me "Sitemap contains URLs which are blocked by robots.txt."
I don't understand why my sitemap is being blocked. My robots.txt looks like this:
User-Agent: *
Disallow:
It's a WordPress site with Yoast SEO installed. Is anyone else having this issue with Google Search Console? Does anyone know how I can fix it?
-
Nice, happy to hear that. Do you work with Greg Reindel? He is a good friend. I looked at your IP, which is why I ask.
Tom
-
I agree with David
Hey, is your dev Greg Reindel? If so, you can call me for help. PM me here for my info.
Thomas Zickell
-
Hey guys, I ended up disabling the sitemap option in Yoast SEO, then installing the 'Google XML Sitemaps' plug-in. I re-submitted the sitemap to Google last night, and it came back with no issues. I'm glad to finally have this sorted out.
Thanks for all the help!
-
Hi Christian,
The current robots.txt shouldn't be blocking those URLs.
Did you or someone else recently change the robots.txt file? If so, give Google a few days to re-crawl your site.
Also, can you check what happens when you do a fetch and render on one of the blocked posts in Search Console? Do you have issues there?
Cheers,
David
-
I think you need to make an HTTPS robots.txt file if you are running HTTPS.
https://mza.bundledseo.com/blog/xml-sitemaps
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Sitemap: https://domain.com/index-sitemap.xml
(That is an HTTPS sitemap.)
Can you send the sitemap URL, or run it through DeepCrawl?
Hope this helps.
Did you make a new robots.txt file?
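If you want to double-check from your end, here is a minimal Python sketch (standard library only; the domain and test URLs are placeholders to swap for your own) that fetches whatever robots.txt your site is actually serving and tests a couple of URLs against it:

```python
# Rough sanity check: does the live robots.txt actually block these URLs?
# Placeholder domain and URLs below; substitute your own.
from urllib import robotparser

SITE = "https://www.example.com"          # placeholder domain
TEST_URLS = [
    f"{SITE}/sitemap_index.xml",          # the sitemap itself
    f"{SITE}/some-blog-post/",            # a post Search Console flags
]

rp = robotparser.RobotFileParser()
rp.set_url(f"{SITE}/robots.txt")
rp.read()                                 # downloads and parses the live file

for url in TEST_URLS:
    # robotparser does simple prefix matching and ignores Google's wildcard
    # extensions, so treat this as a quick sanity check, not a full emulation.
    verdict = "allowed" if rp.can_fetch("Googlebot", url) else "BLOCKED"
    print(url, "->", verdict)
```

If it prints "allowed" for the posts Search Console is flagging, the robots.txt being served probably isn't the culprit, and noindex tags are the next thing to look at.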
-
Thanks for the response. Do you think this is a robots.txt issue? Or could this be caused by the Yoast SEO plugin?
Do you know if this plug-in works together with Yoast SEO, or will it cause issues?
-
Thank you for the response.
I just scanned the site using Screaming Frog. Under Internal > Directives there were zero 'noindex' links. I also checked for 404 errors, 5xx server errors, and anything 'blocked by robots.txt'.
Google Search Console is still showing me that URLs in my sitemap are being blocked. (I added a screenshot of this.) When I click through, it tells me that the post sitemap has over 300 warnings.
I have just deleted the Yoast SEO plugin and am now re-installing it. Hopefully this fixes the issue.
-
No, you do not need to change anything or add a plug-in. What is happening is that Webmaster Tools is telling you that you have a noindex or nofollow meta tag, or an X-Robots-Tag header, somewhere on the URLs inside your sitemap.
Run your site through Moz, Screaming Frog SEO Spider, or DeepCrawl and look for noindexed URLs.
Webmaster Tools / Search Console is telling you that you have noindexed URLs inside your XML sitemap, not that your robots.txt is blocking them. This would be set in the Yoast plugin. One way to correct it is to look for noindex URLs and exclude them inside Yoast so they are not being presented to the crawlers.
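If you would rather check this with a quick script than a crawler, here is a rough sketch, assuming the requests library is installed and using a placeholder sitemap URL, that pulls every URL out of the sitemap and flags any that answer with a noindex signal in the X-Robots-Tag header or a meta robots tag:

```python
# Rough sketch: flag sitemap URLs that answer with a noindex signal.
# Assumes `requests` is installed; the sitemap URL is a placeholder.
# Note: this reads a single sitemap file, not a sitemap index.
import re
import requests
import xml.etree.ElementTree as ET

SITEMAP_URL = "https://www.example.com/post-sitemap.xml"   # placeholder
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

sitemap_xml = requests.get(SITEMAP_URL, timeout=10).text
urls = [loc.text.strip() for loc in ET.fromstring(sitemap_xml).findall(".//sm:loc", NS)]

for url in urls:
    resp = requests.get(url, timeout=10)
    header = resp.headers.get("X-Robots-Tag", "")
    meta = re.search(r'<meta[^>]+name=["\']robots["\'][^>]*>', resp.text, re.IGNORECASE)
    meta_tag = meta.group(0) if meta else ""
    if "noindex" in header.lower() or "noindex" in meta_tag.lower():
        print(f"noindex found: {url}  (header: {header!r}, tag: {meta_tag!r})")
```

Any URL that script flags is being listed in the sitemap while also telling Google not to index it, which is the kind of mismatch that produces those sitemap warnings.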
If you would like, you can turn off the sitemap in Yoast and turn it back on. If that does not work, I recommend completely removing the plug-in and reinstalling it:
- https://kb.yoast.com/kb/how-can-i-uninstall-my-plugin/
- https://kinsta.com/blog/uninstall-wordpress-plugin/
Can you send a screenshot of what you're seeing?
When you see it in Google Webmaster Tools, are you talking about the XML sitemap itself being noindexed? Because all XML sitemaps are noindexed.
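If you want to see that for yourself, a quick one-off check (assuming the requests library; the URL is a placeholder) will print the status code and the X-Robots-Tag header the sitemap is served with:

```python
# Quick look at the headers the sitemap file itself is served with.
# Assumes `requests` is installed; the URL is a placeholder.
import requests

resp = requests.get("https://www.example.com/sitemap_index.xml", timeout=10)
print(resp.status_code, resp.headers.get("X-Robots-Tag", "(no X-Robots-Tag header)"))
```

A noindex there on the sitemap file itself is expected and is not the same thing as the URLs inside it being noindexed.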
Please add this to your robots.txt
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Sitemap: http://www.website.com/sitemap_index.xml
I hope this is of help,
Tom
-
Hi,
Use this plugin:
https://wordpress.org/plugins/wp-robots-txt/
It will remove the previous robots.txt and set a simple default WordPress robots.txt. Wait for a day and the problem should be solved.
Also watch this video on the same topic: https://www.youtube.com/watch?v=DZiyN07bbBM
Thanks
Related Questions
-
Google Search Console 'Change of Address' Just 301s on source domain?
Hi all. New here, so please be gentle. 🙂 I've developed a new site, where my client also wanted to rebrand from .co.nz to .nz. On the source (co.nz) domain, I've set up a load of 301 redirects to the relevant new page on the new domain (the URL structure is changing as well).
E.g. on the old domain: https://www.mysite.co.nz/myonlinestore/t-shirt.html
In the HTACCESS on the old/source domain, I've set up 301s (using RewriteRule), so that when https://www.mysite.co.nz/myonlinestore/t-shirt.html is accessed, it does a 301 to https://mysite.nz/shop/clothes/t-shirt.
All these 301s are working fine. I've checked in dev tools and a 301 is being returned. My question is: is having the 301s just on the source domain enough to start a 'Change of Address' in Google's Search Console? Their wording indicates it's enough, but I'm concerned that maybe I also need redirects on the target domain as well. I.e., does the Search Console Change of Address process work this way: it looks at the source domain URL (that's already in Google's index), sees the 301, then updates the index (and hopefully passes the link juice) to the new URL? Also, I've set up both source and target Search Console properties as Domain Properties. Does that mean I no longer need to specify whether the source and target properties are HTTP or HTTPS? I couldn't see that option when I created the properties. Thanks!
Technical SEO | WebGuyNZ
-
Search Console has found over 18k 404 errors in my site, should I redirect?
Most of them were old URLs pointed from a really old domain that we have just shut down. If the pages didn't receive any traffic, should we redirect? If I follow this https://mza.bundledseo.com/learn/seo/http-status-codes, we shouldn't.
Technical SEO | pablo_carrara
-
Google has deindexed 40% of my site because it's having problems crawling it
Hi. Last week I got my fifth email saying 'Google can't access your site'. The first one I got in early November. Since then my site has gone from almost 80k pages indexed to less than 45k pages, and the number keeps lowering even though we post about 100 new articles daily (it's an online newspaper). The site I'm talking about is http://www.gazetaexpress.com/. We have to deal with DDoS attacks most of the time, so our server guy has implemented a firewall to protect the site from these attacks. We suspect that it's the firewall that is blocking Google bots from crawling and indexing our site. But then things get more interesting: some parts of the site are being crawled regularly and some others not at all. If the firewall were stopping Google bots from crawling the site, why are some parts of the site being crawled with no problems while others aren't? In the screenshot attached to this post you will see how Google Webmasters is reporting these errors. In this link, it says that if the 'Error' status happens again you should contact Google Webmaster support because something is preventing Google from fetching the site. I used the Feedback form in Google Webmasters to report this error about two months ago but haven't heard from them. Did I use the wrong form to contact them, and if so, how can I reach them and tell them about my problem? If you need more details feel free to ask. I will appreciate any help. Thank you in advance.
Technical SEO | Bajram.Kurtishaj
-
Can't get Google to Index .pdf in wp-content folder
We created an in-depth case study/survey for a legal client and can't get Google to crawl the PDF, which is hosted on WordPress in the wp-content folder. It is linked to heavily from nearly all pages of the site by a global sidebar. Am I missing something obvious as to why Google won't crawl this PDF? We can't get much value from it unless it gets indexed. Any help is greatly appreciated. Thanks!
Here is the PDF itself:
http://www.billbonebikelaw.com/wp-content/uploads/2013/11/Whitepaper-Drivers-vs-cyclists-Floridas-Struggle-to-share-the-road.pdf
Here is the page it is linked from:
http://www.billbonebikelaw.com/resources/drivers-vs-cyclists-study/
Technical SEO | inboundauthority
-
Google Webmaster tools: Sitemap.xml not processed everyday
Hi, We have multiple sites under our Google Webmaster Tools account, each with a sitemap.xml submitted. Each site's sitemap.xml status (attached below) shows it is processed every day, e.g. "Sitemap: /sitemap.xml. This Sitemap was submitted Jan 10, 2012, and processed Oct 14, 2013." The exception is one site (coed.com), for which the sitemap.xml was processed only on the day it was submitted, and we have to manually resubmit it every day to get it processed. Any idea why that might be? Thank you
Technical SEO | COEDMediaGroup
-
Help - we're blocking SEOmoz crawlers
We have a fairly stringent blacklist, and by the looks of our crawl reports we've begun unintentionally blocking the SEOmoz crawler. Can you guys let me know the user-agent string and anything else I need to enable to make sure your crawlers are whitelisted? Cheers!
Technical SEO | linklater
-
Javascript to manipulate Google's bounce rate and time on site?
I was referred to this "awesome" solution to high bounce rates. It is supposed to "fix" bounce rates and lower them through this simple script. When the bounce rate goes way down, rankings dramatically increase (interesting study, but not my question). I don't know JavaScript, but simply adding a script to the footer and watching everything fall into place seems a bit iffy to me. Can someone with experience in JS help me by explaining what this script does? I think it manipulates the reporting it does to GA, but I'm not sure. It was supposed to be placed in the footer of the page, and then you sit back and watch the dollars fly in. 🙂
Technical SEO | BenRWoodard
-
Best blocking solution for Google
Posting this for Dave Sottimano. Here's the scenario: you've got a set of URLs indexed by Google, and you want them out quickly. Once you've managed to remove them, you want to block Googlebot from crawling them again, for whatever reason. Below is a sample of the URLs you want blocked, but you only want to block /beerbottles/ and anything past it:
www.example.com/beers/brandofbeer/beerbottles/1
www.example.com/beers/brandofbeer/beerbottles/2
www.example.com/beers/brandofbeer/beerbottles/3
etc.
To remove the pages from the index, should you:
- Add the meta noindex,follow tag to each URL you want de-indexed
- Use GWT to help remove the pages
- Wait for Google to crawl again
If that's successful, to block Googlebot from crawling again, should you add this line to robots.txt: `DISALLOW */beerbottles/` or this line: `DISALLOW: /beerbottles/`? "To add the * or not to add the *, that is the question." Thanks! Dave
Technical SEO | goodnewscowboy