Robots.txt Allowed
-
Hello all,
We want to block something that has the following at the end:
http://www.domain.com/category/product/some+demo+-text-+example--writing+here
So I was wondering if doing:
/*example--writing+here
would work?
-
Yes, that should work just fine. As Logan mentioned, I recommend you test it in the robots.txt testing tool in Google Search Console.
-
Yes, that would work. I'm sure everyone already knows that if in case you have a product that has the word example at the end of URL, it would block that too. A little off tangent here but blocking in robots.txt does not mean that every single spiders out there is going to honor this rule. The major ones like Google Spiders does honor this. Also, it doesn't mean that the URL won't be indexed. Sorry for the long winded answer but just make sure that if this is truly an example or demo page that you don't want search engines to index to make sure that you include "noindex, nofollow" in the metainfo.
I agree with Logan Ray. In case you want the "Robots TXT" Tester, you can google it "Robots Txt Tester" and the first one should be from support.google.com
-
Hi Thomas,
That should work. You can confirm this by modifying your robots.txt file in Search Console and testing a handful of URLs to ensure they're blocked the way you want.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
If my website do not have a robot.txt file, does it hurt my website ranking?
After a site audit, I find out that my website don't have a robot.txt. Does it hurt my website rankings? One more thing, when I type mywebsite.com/robot.txt, it automatically redirect to the homepage. Please help!
Intermediate & Advanced SEO | | binhlai0 -
Question about Syntax in Robots.txt
So if I want to block any URL from being indexed that contains a particular parameter what is the best way to put this in the robots.txt file? Currently I have-
Intermediate & Advanced SEO | | DRSearchEngOpt
Disallow: /attachment_id Where "attachment_id" is the parameter. Problem is I still see these URL's indexed and this has been in the robots now for over a month. I am wondering if I should just do Disallow: attachment_id or Disallow: attachment_id= but figured I would ask you guys first. Thanks!0 -
Can URLs blocked with robots.txt hurt your site?
We have about 20 testing environments blocked by robots.txt, and these environments contain duplicates of our indexed content. These environments are all blocked by robots.txt, and appearing in google's index as blocked by robots.txt--can they still count against us or hurt us? I know the best practice to permanently remove these would be to use the noindex tag, but I'm wondering if we leave them they way they are if they can still hurt us.
Intermediate & Advanced SEO | | nicole.healthline0 -
MOZ crawl report says category pages blocked by meta robots but theyr'e not?
I've just run a SEOMOZ crawl report and it tells me that the category pages on my site such as http://www.top-10-dating-reviews.com/category/online-dating/ are blocked by meta robots and have the meta robots tag noindex,follow. This was the case a couple of days ago as I run wordpress and am using the SEO Category updater plugin. By default it appears it makes categories noindex, follow. Therefore I edited the plugin so that the default was index, follow as I want google to index the category pages so that I can build links to them. When I open the page in a browser and view source the tags show as index, follow which adds up. Why then is the SEOMOZ report telling me they are still noindex,follow? Presumably the crawl is in real time and should pick up the new follow tag or is it perhaps because its using data from an old crawl? As yet these pages aren't indexed by google. Any help is much appreciated! Thanks Sam.
Intermediate & Advanced SEO | | SamCUK0 -
Best practices for robotx.txt -- allow one page but not the others?
So, we have a page, like domain.com/searchhere, but results are being crawled (and shouldn't be), results look like domain.com/searchhere?query1. If I block /searchhere? will it block users from crawling the single page /searchere (because I still want that page to be indexed). What is the recommended best practice for this?
Intermediate & Advanced SEO | | nicole.healthline0 -
Robots.txt: Can you put a /* wildcard in the middle of a URL?
We have noticed that Google is indexing the language/country directory versions of directories we have disallowed in our robots.txt. For example: Disallow: /images/ is blocked just fine However, once you add our /en/uk/ directory in front of it, there are dozens of pages indexed. The question is: Can I put a wildcard in the middle of the string, ex. /en/*/images/, or do I need to list out every single country for every language in the robots file. Anyone know of any workarounds?
Intermediate & Advanced SEO | | IHSwebsite0 -
301 redirect or Robots.txt on an interstatial page
Hey guys, I have an affiliate tracking system that works like this : an affiliate puts up a certain code on his site, for example : www.domain.com/track/aff_id This url leads to a page where the hit is counted, analysed and then 302 redirects to my sales page with the affiliates ID in the url : www.mysalespage.com/?=aff_id. However, we've noticed recently that one affiliate seems to be ranking for our own name and the url google indexed was his tracking url (domain.com/track/aff_id). Which is strange because there is absolutely nothing on that page, its just an interstatial page so that our stats tracking software can properly filter hits. To remove the affiliate's url from showing up in the serps, I've come up with 2 solutions : 1 - Change the redirect to a 301 redirect on his track page. 2 - Change our robots.txt page to block all domain.com/track/ pages from being indexed. My question is : if I 301 redirect instead of 302, will I keep the affiliates from outranking me for my own name AND pass on link juice or should I simply block google from crawling the interstatial tracking pages?
Intermediate & Advanced SEO | | CrakJason0 -
Why is noindex more effective than robots.txt?
In this post, http://www.seomoz.org/blog/restricting-robot-access-for-improved-seo, it mentions that the noindex tag is more effective than using robots.txt for keeping URLs out of the index. Why is this?
Intermediate & Advanced SEO | | nicole.healthline0