Robots.txt
-
Google Webmaster Tools say our website's have low-quality pages, so we have created a robots.txt file and listed all URL’s that we want to remove from Google index.
Is this enough for the solve problem?
-
Ah, it's difficult to see anything on the page because i can't read Turkish.
The only thing you should know is that every single page in a website should have unique content. So if two pages are exactly or almost exactly the same then Google will think it's duplicate content.
-
Yeah that's definitely a duplicate content issue you're facing.
However, did you know that each of your pages have this little tag right at the top of them? name="robots" content="noindex" />
...Seems like it's already done.
-
Thank You Wesley,
Here our pages but language is Turkish,
http://www.enakliyat.com.tr/detaylar/besiktas-basaksehir-ev-esyasi-tasinma-6495
http://www.enakliyat.com.tr/detaylar/ev-tasima-6503
http://www.enakliyat.com.tr/detaylar/evden-eve-nakliyat-6471
Our site is a home to home moving listing portal. Consumers who wants to move his home fills a form so that moving companies can cote prices. We were generating listing page URL’s by using the title submitted by customer. Unfortunately we have understood by now that many customers have entered same content.
-
Well now I'm confused on the problem.. If the issue is duplicate content then the answer is definitely to block them with robots and/or use a rel=canonical tag on each.
However, the Google notice you are referencing has nothing to do with duplicate content notices to my knowledge.
There is always a way to improve your content. Filling out a form auto-generates a page, per my understanding. Great. Have it auto-generate a better looking page!
-my 2 cents. hope it's helpful.
-
I agree with Jesse and Allen.
Of course the problems in Google Webmaster Tools will disappear by no-indexing it.
Low quality pages isn't a good thing for visitors either.It's difficult to give you any other advice then the very broad advise: Improve the quality of the pages.
If you could give us some links to let us know which website and which pages we're talking about then we could give you a better advice on how exactly you can improve those pages. -
Our site is a home to home moving listing portal. Consumers who wants to move his home fills a form so that moving companies can cote prices. We were generating listing page URL’s by using the title submitted by customer. Unfortunately we have understood by now that many customers have entered same content.
-
Iskender.
Our experience has been YES. Google does follow your Robots.txt file and will ignore indexing those pages. If they have a problem, the problem will disappear.
My concern is, what is causing the "Low-quality" error message? In the long run, wouldn't it be better to correct the page to improve the quality? I look at each page as a way to qualify for a greater number of keywords, hence attracting more attention for your website.
We have had several pages flagged as duplicate content, when we never wanted the duplicate page indexed anyway. Once we included the page in the Robots.txt file the flagged error disappeared.
-
Why not improve the pages, instead?
If Google says they are low quality, what makes you think any viewer will stick around? Bet the bounce rate is exceptionally high on those pages, maybe even site-wide.
Always remember to design pages for readers and not Google. If Google tells you your pages suck, they are probably just trying to help you and give you a hint that it's time to improve your site.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Pages being flagged in Search Console as having a "no-index" tag, do not have a meta robots tag??
Hi, I am running a technical audit on a site which is causing me a few issues. The site is small and awkwardly built using lots of JS, animations and dynamic URL extensions (bit of a nightmare). I can see that it has only 5 pages being indexed in Google despite having over 25 pages submitted to Google via the sitemap in Search Console. The beta Search Console is telling me that there are 23 Urls marked with a 'noindex' tag, however when i go to view the page source and check the code of these pages, there are no meta robots tags at all - I have also checked the robots.txt file. Also, both Screaming Frog and Deep Crawl tools are failing to pick up these urls so i am a bit of a loss about how to find out whats going on. Inevitably i believe the creative agency who built the site had no idea about general website best practice, and that the dynamic url extensions may have something to do with the no-indexing. Any advice on this would be really appreciated. Are there any other ways of no-indexing pages which the dev / creative team might have implemented by accident? - What am i missing here? Thanks,
Technical SEO | | NickG-1230 -
Robot.txt : How to block a specific file type in several subdirectories ?
Hello everyone ! I need help setting up a robot.txt. I'm trying to block all pdf files in particular directories so I'm using this command. In the example below the line is blocking all .gif in the entire site. Block files of a specific file type (for example, .gif) | Disallow: /*.gif$ 2 questions : Can I use this command to specify one particular directory in which I want to block pdf files ? Will this line be recognized by googlebots ? Disallow: /fileadmin/xxxxxxx/xxx/xxxxxxx/*.pdf$ Then I realized that I would have to write as many lines as many directories there are in which I want to block pdf files. Let's say I want to block pdf files in all these 3 directories /fileadmin/directory1 /fileadmin/directory1/sub1 /fileadmin/directory1/sub1/pdf Is there a pattern-matching rule I could use to blocks access to pdf files in all subdirectories instead of writing 3x the above line for each subdirectory ? For exemple : Disallow: /fileadmin/directory1*/ Many thanks in advance for any insight you may have.
Technical SEO | | LabeliumUSA0 -
2 sitemaps on my robots.txt?
Hi, I thought that I just could link one sitemap from my site's robots.txt but... I may be wrong. So, I need to confirm if this kind of implementation is right or wrong: robots.txt for Magento Community and Enterprise ...
Technical SEO | | Webicultors
Sitemap: http://www.mysite.es/media/sitemap/es.xml
Sitemap: http://www.mysite.pt/media/sitemap/pt.xml Thanks in advance,0 -
Are robots.txt wildcards still valid? If so, what is the proper syntax for setting this up?
I've got several URL's that I need to disallow in my robots.txt file. For example, I've got several documents that I don't want indexed and filters that are getting flagged as duplicate content. Rather than typing in thousands of URL's I was hoping that wildcards were still valid.
Technical SEO | | mkhGT0 -
Will an XML sitemap override a robots.txt
I have a client that has a robots.txt file that is blocking an entire subdomain, entirely by accident. Their original solution, not realizing the robots.txt error, was to submit an xml sitemap to get their pages indexed. I did not think this tactic would work, as the robots.txt would take precedent over the xmls sitemap. But it worked... I have no explanation as to how or why. Does anyone have an answer to this? or any experience with a website that has had a clear Disallow: / for months , that somehow has pages in the index?
Technical SEO | | KCBackofen0 -
What can I do if Google Webmaster Tools doesn't recognize the robots.txt file?
I'm working on a recently hacked site for a client and and in trying to identify how exactly the hack is running I need to use the fetch as Google bot feature in GWT. I'd love to use this but it thinks the robots.txt is blocking it's acces but the only thing in the robots.txt file is a link to the sitemap. Unde the Blocked URLs section of the GWT it shows that the robots.txt was last downloaded yesterday but it's incorrect information. Is there a way to force Google to look again?
Technical SEO | | DotCar0 -
Should I add my blog posts to my sitemap.txt file?
This seems like it should be an obvious no, just because of the amount of work that would entail, and then remembering to do it every time I make a post, but since I couldn't find anything on Google about it and have never heard anyone mention it, I figured I'd ask.
Technical SEO | | UnderRugSwept0 -
How do I use the Robots.txt "disallow" command properly for folders I don't want indexed?
Today's sitemap webinar made me think about the disallow feature, seems opposite of sitemaps, but it also seems both are kind of ignored in varying ways by the engines. I don't need help semantically, I got that part. I just can't seem to find a contemporary answer about what should be blocked using the robots.txt file. For example, I have folders containing site comps for clients that I really don't want showing up in the SERPS. Is it better to not have these folders on the domain at all? There are also security issues I've heard of that make sense, simply look at a site's robots file to see what they are hiding. It makes it easier to hunt for files when they know the directory the files are contained in. Do I concern myself with this? Another example is a folder I have for my xml sitemap generator. I imagine google isn't going to try to index this or count it as content, so do I need to add folders like this to the disallow list?
Technical SEO | | SpringMountain0