Robots.txt
-
Hi All
Having a robots.txt looking like the below will this stop Google crawling the site
User-agent: *
-
If that is the only line in your robots.txt file then it really shouldn't accomplish anything. It's like saying, "Hey...all search engines...take note of this....oh forget it, there's nothing to see here."
I agree with Dave...try to fetch the page in Webmaster tools (Google Search Console). You can also use the Webmaster Tools robots.txt tester which often will tell you if there are issues.
- Hi this is what we thought but Google has not indexed any pages
How old is the site? It can take weeks for a new site to get indexed and then to get ranked as well. Do you see any pages on a site: search for your domain? (i.e. site:example.com). This might sound silly, but are you sure that there is no noindex tag on the page?
-
Via Search Console try to "Fetch As Google" and assuming that works without errors use the submit function. You'll know very quickly whether you've got technical issues and get the page into the index very quickly.
-
Hey, throw us a link to your robots.txt file and we can take a look, probably tell you pretty quickly. Without seeing it, we're all pretty much just taking guesses.
-
David's spot on. The User-agent: * mean this section applies to all robots. If you want Google (or any robot) to index your whole site, no need for a robots.txt file.
-
From your original post I presumed you had not wanted Google to index your pages?
If you want Google to index your pages it can take some time to happen naturally. You might want to submit a sitemap and ask Google to crawl your site within Webmaster Tools.
Robots.txt is normally only used to block crawlers, so you will not need to put any code in there for it to allow Google to crawl.
-
Hi this is what we thought but Google has not indexed any pages
-
Wouldn't have thought so, you'd need to include this line of code as well:
Disallow: /
That will stop anything from crawling the site.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Role of Robots.txt and Search Console parameters settings
Hi, wondering if anyone can point me to resources or explain the difference between these two. If a site has url parameters disallowed in Robots.txt is it redundant to edit settings in Search Console parameters to anything other than "Let Googlebot Decide"?
Technical SEO | | LivDetrick0 -
Adding directories to robots nofollow cause pages to have Blocked Resources
In order to eliminate duplicate/missing title tag errors for a directory (and sub-directories) under www that contain our third-party chat scripts, I added the parent directory to the robots disallow list. We are now receiving a blocked resource error (in Webmaster Tools) on all of the pages that have a link to a javascript (for live chat) in the parent directory. My host is suggesting that the warning is only a notice and we can leave things as is without worrying about the page being de-ranked/penalized. I am wondering if this is true or if we should remove the one directory that contains the js from the robots file and find another way to resolve the duplicate title tags?
Technical SEO | | miamiman1000 -
Log in, sign up, user registration and robots
Hi all, We have an accommodation site that asks users only to register when they want to book a room, in the last step. Though this is the ideal situation when you have tons of users, nowadays we are having around 1500 - 2000 per day and making tests we found out that if we ask for a registration (simple, 1 click FB) we mail them all and through a good customer service we are increasing our sales. That is why, we would like to ask users to register right after the home page ie Home/accommodation or and all the rest. I am not sure how can I make to make that content still visible to robots.
Technical SEO | | Eurasmus.com
Will the authentication process block google crawling it? Maybe something we can do? We are not completely sure how to proceed so any tip would be appreciated. Thank you all for answering.3 -
IIS 7.5 - Duplicate Content and Totally Wrong robot.txt
Well here goes! My very first post to SEOmoz. I have two clients that are hosted by the same hosting company. Both sites have major duplicate content issues and appear to have no internal links. I have checked this both here with our awesome SEOmoz Tools and with the IIS SEO Tool Kit. After much waiting I have heard back from the hosting company and they say that they have "implemented redirects in IIS7.5 to avoid duplicate content" based on the following article: http://blog.whitesites.com/How-to-setup-301-Redirects-in-IIS-7-for-good-SEO__634569104292703828_blog.htm. In my mind this article covers things better: www.seomoz.org/blog/what-every-seo-should-know-about-iis. What do you guys think? Next issue, both clients (as well as other sites hosted by this company) have a robot.txt file that is not their own. It appears that they have taken one client's robot.txt file and used it as a template for other client sites. I could be wrong but I believe this is causing the internal links to not be indexed. There is also a site map, again not for each client, but rather for the client that the original robot.txt file was created for. Again any input on this would be great. I have asked that the files just be deleted but that has not occurred yet. Sorry for the messy post...I'm at the hospital waiting to pick up my bro and could be called to get him any minute. Thanks so much, Tiff
Technical SEO | | TiffenyPapuc0 -
Google insists robots.txt is blocking... but it isn't.
I recently launched a new website. During development, I'd enabled the option in WordPress to prevent search engines from indexing the site. When the site went public (over 24 hours ago), I cleared that option. At that point, I added a specific robots.txt file that only disallowed a couple directories of files. You can view the robots.txt at http://photogeardeals.com/robots.txt Google (via Webmaster tools) is insisting that my robots.txt file contains a "Disallow: /" on line 2 and that it's preventing Google from indexing the site and preventing me from submitting a sitemap. These errors are showing both in the sitemap section of Webmaster tools as well as the Blocked URLs section. Bing's webmaster tools are able to read the site and sitemap just fine. Any idea why Google insists I'm disallowing everything even after telling it to re-fetch?
Technical SEO | | ahockley0 -
Robots.txt issue - site resubmission needed?
We recently had an issue when a load of new files were transferred from our dev server to the live site, which unfortunately included the dev site's robots.txt file which had a disallow:/ instruction. Bad! Luckily I spotted it quickly and the file has been replaced. The extent of the damage seems to be that some descriptions aren't displaying and we're getting a message about robots.txt in the SERPs for a few keywords. I've done a site: search and generally it seems to be OK for 99% of our pages. Our positions don't seem to be affected right now but obviously it's not great for the CTRs on those keywords affected. My question is whether there is anything I can do to bring the updated robots.txt file to Google's attention? Or should we just wait and sit it out? Thanks in advance for your answers!
Technical SEO | | GBC0 -
Robots.txt
should I add anything else besides User-Agent: * to my robots.txt file? http://melo4.melotec.com:4010/
Technical SEO | | Romancing0 -
How do I use the Robots.txt "disallow" command properly for folders I don't want indexed?
Today's sitemap webinar made me think about the disallow feature, seems opposite of sitemaps, but it also seems both are kind of ignored in varying ways by the engines. I don't need help semantically, I got that part. I just can't seem to find a contemporary answer about what should be blocked using the robots.txt file. For example, I have folders containing site comps for clients that I really don't want showing up in the SERPS. Is it better to not have these folders on the domain at all? There are also security issues I've heard of that make sense, simply look at a site's robots file to see what they are hiding. It makes it easier to hunt for files when they know the directory the files are contained in. Do I concern myself with this? Another example is a folder I have for my xml sitemap generator. I imagine google isn't going to try to index this or count it as content, so do I need to add folders like this to the disallow list?
Technical SEO | | SpringMountain0