Application & understanding of robots.txt
-
Hello Moz World!
I have been reading up on robots.txt files, and I understand the basics. I am looking for a deeper understanding of when to deploy particular tags, and when a page should be disallowed because it will affect SEO. I have been working with a software company that has a News & Events page which I don't think should be indexed. It changes every week, and is only relevant to potential customers who want to book a demo or attend an event, not so much to search engines. My initial thinking was that I should use a noindex/follow tag on that page, so the page would not be indexed but all of its links would still be crawled.
I decided to look at some of our competitors' robots.txt files: Smartbear (https://smartbear.com/robots.txt), b2wsoftware (http://www.b2wsoftware.com/robots.txt) and labtech (http://www.labtechsoftware.com/robots.txt).
I am still confused about what type of tags I should use, and how to gauge which set of tags is best for certain pages. I figured a static page is pretty much always good to index and follow, as long as it's public. And I should always include a sitemap file. But what about a dynamic page? What about pages that are out of date? Will this help with soft 404s?
This is a long one, but I appreciate all of the expert insight. Thanks ahead of time for all of the awesome responses.
Best Regards,
Will H.
-
Yup. Also, don't forget that robots.txt is just a "recommendation" for robots; they are not obliged to obey it.
Basically, Google does whatever it wants to.
Also, if you block a folder so that its content won't be accessed, a page inside it can still be indexed if any link points to it, even a link from outside your domain. Its content won't be shown in the search results, but the URL will show up with a notice stating that the content is blocked by the site's robots.txt. Best of luck!
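To make the crawl-versus-index distinction above concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `/private/` folder and the example.com URLs are made up for illustration and are not taken from any site mentioned in this thread. Note that `can_fetch` only answers "may a well-behaved crawler request this URL?"; it says nothing about whether the URL can end up in the index through external links.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the folder and sitemap URL are assumptions.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Sitemap: https://www.example.com/sitemap.xml
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text directly, without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A URL inside the disallowed folder: a compliant crawler should not request it.
blocked_ok = parser.can_fetch("*", "https://www.example.com/private/page.html")

# A URL outside the disallowed folder: crawling is permitted.
news_ok = parser.can_fetch("*", "https://www.example.com/news-events/")

print("may crawl /private/page.html:", blocked_ok)
print("may crawl /news-events/:", news_ok)
```

A Disallow rule like this stops compliant crawlers from reading the page, which is also why they cannot see any noindex tag placed on it.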
-
Great Advice Yossi & Chris. Thanks for taking the time to reply. I will have to dig into the Google Guidelines for additional information, but both of your points are valid. I think I was looking at robots.txt the wrong way. Thanks Again Guys!
-
I completely agree with Yossi here; no need to go blocking that page at all.
I can't really add any further value to the points he has covered, but one other part of your question suggested that perhaps you're looking at this the wrong way (and it's very common, don't worry!). Rather than having your site stay as-is and just obscuring the bad parts of it from search engines, the thought process should really be around creating a great website instead.
If you're ever considering blocking a page from search engines, the first step should always be "why am I blocking this page(s); could I just fix the issue instead?".
For example, you asked if this might help with soft 404s. Rather than trying to find a way to hide these soft 404s, spend that time fixing them instead!
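As a rough illustration of what "finding" a soft 404 can look like before fixing it: a soft 404 is a page that tells the visitor the content is missing but still returns a success status code. The sketch below is only a heuristic under stated assumptions; the phrase list and the function name `looks_like_soft_404` are hypothetical and would need tuning for a real site.

```python
# Hypothetical phrases that suggest an error page; adjust for the site in question.
NOT_FOUND_PHRASES = ("page not found", "no longer available")


def looks_like_soft_404(status_code: int, body: str) -> bool:
    """Flag responses that report success but whose text reads like an error page."""
    if status_code != 200:
        # A real 404 or 410 is already the correct signal, so it is not "soft".
        return False
    text = body.lower()
    return any(phrase in text for phrase in NOT_FOUND_PHRASES)


soft = looks_like_soft_404(200, "<h1>Sorry, page not found</h1>")
real = looks_like_soft_404(404, "<h1>Sorry, page not found</h1>")
fine = looks_like_soft_404(200, "<h1>Upcoming events</h1>")
print(soft, real, fine)
```

The fix for anything this flags is usually to return a genuine 404 or 410 status, or to redirect to a relevant live page, rather than to hide the URL from crawlers.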
-
Hi Will
There are some concerns that you have which I do not understand.
Why do you want to block the News & Events page? If it has unique content and, on top of that, is updated regularly, you have no reason to block access to the page. If it is "relevant to potential customers who want to book a demo", it's great. I would definitely keep it indexed and followed. Google explicitly states that you should not block access to a page if you simply want to de-index or remove it. If the page should not be indexed publicly, you should remove it or password protect it (a Google suggestion).
About tags, I assume you are talking about meta tags, correct?
There is no need to use any kind of meta tag to signal to search engines that they should index or follow a page; you use one only when you want to stop them from taking certain actions.
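The point about meta tags can be sketched in code. The example below uses Python's standard `html.parser` to read a page's robots meta tag and treats a page with no such tag as indexable, which matches the default behaviour described above. The two sample pages and the helper names (`robots_directives`, `is_indexable`) are assumptions made up for this illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        for directive in content.split(","):
            if directive.strip():
                self.directives.add(directive.strip().lower())


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives found in the HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def is_indexable(html: str) -> bool:
    """No tag means indexable; only an explicit noindex (or none) restricts it."""
    found = robots_directives(html)
    return "noindex" not in found and "none" not in found


# Hypothetical sample pages.
PLAIN_PAGE = "<html><head><title>About us</title></head><body></body></html>"
NOINDEX_PAGE = (
    '<html><head><meta name="robots" content="noindex, follow"></head>'
    "<body></body></html>"
)

print(is_indexable(PLAIN_PAGE), is_indexable(NOINDEX_PAGE))
```

Keep in mind that a crawler can only read this tag if it is allowed to fetch the page, so a noindex tag and a robots.txt Disallow rule on the same URL work against each other.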
Also, there is no difference between a static and a dynamic page when it comes to tag usage. There are no rules for that. A page can perfectly well be static for years and still get indexed and rank very well (but, well, we all know that updating the site is a ranking signal).
If you believe that a certain page should be tagged "noindex", it should not be because it has not been updated within the last month or year. Just as an example: contact us pages, about us pages and terms of use pages. These are super static pages that in many cases probably won't be changed for years.
Best,
Yossi