Robot.txt help
-
Hi,
We have a blog that is killing our SEO.
We need to
Disallow
Disallow: /Blog/?tag*
Disallow: /Blog/?page*
Disallow: /Blog/category/*
Disallow: /Blog/author/*
Disallow: /Blog/archive/*
Disallow: /Blog/Account/.
Disallow: /Blog/search*
Disallow: /Blog/search.aspx
Disallow: /Blog/error404.aspx
Disallow: /Blog/archive*
Disallow: /Blog/archive.aspx
Disallow: /Blog/sitemap.axd
Disallow: /Blog/post.aspxBut Allow everything below /Blog/Post
The disallow list seems to keep growing as we find issues. So rather than adding in to our Robot.txt all the areas to disallow. Is there a way to easily just say Allow /Blog/Post and ignore the rest. How do we do that in Robot.txt
Thanks
-
These: http://screencast.com/t/p120RbUhCT
They appear on every page I looked at, and take up the entire area "above the fold" and the content is "below the fold"
-Dan
-
Thanks Dan, but what grey areas, what url are you looking at?
-
Ahh. I see. You just need to "noindex" the pages you don't want in the index. As far as how to do that with blogengine, I am not sure, as I have never used it before.
But I think a bigger issue is like the giant box areas at the top of every page. They are pushing your content way down. That's definitely hurting UX and making the site a little confusing. I'd suggest improving that as well
-Dan
-
Hi Dan, Yes sorry that's the one!
-
Hi There... that address does not seem to work for me. Should it be .net? http://www.dotnetblogengine.net/
-Dan
-
Hi
The blog is www.dotnetblogengine.com
The content is only on the blog once it is just it can be accessed lots of different ways
-
Andrew
I doubt that one thing made your rankings drop so much. Also, what type of CMS are you on? Duplicate content like that should be controlled through indexation for the most part, but I am not recognizing that type of URL structure as any particular CMS?
Are just the title tags duplicate or the entire page content? Essentially, I would either change the content of the pages so they are not duplicate, or if that doesn't make sense I would just "noindex" them.
-Dan
-
Hi Dan,
I am getting duplicate content errors in WMT like
This is because tag=ABC and page=1 are both different ways to get to www.mysite.com/Blog/Post/My-Blog-Post.aspx
To fix this I have remove the URL's www.mysite.com/Blog/?tag=ABC and www.mysite.com/Blog/?Page=1from GWMT and by setting robot.txt up like
User-agent: *
Disallow: /Blog/
Allow: /Blog/post
Allow: /Blog/PostI hope to solve the duplicate content issue to stop it happening again.
Since doing this my SERP's have dropped massively. Is what I have done wrong or bad? How would I fix?
Hope this makes sense thanks for you help on this its appreciated.
Andrew
-
Hi There
Where are they appearing in WMT? In crawl errors?
You can also control crawling of parameters within webmaster tools - but I am still not quite sure if you are trying to remove these from the index or just prevent crawling (and if preventing crawling, for what reason?) or both?
-Dan
-
Hi Dan,
The issue is my blog had tagging switched on, it cause canonicalization mayhem.
I switched it off, but the tags still appears in Google Webmaster Tools (GWMT). I Remove URL via GWMT but they are still appearing. This has also caused me to plummet down the SERPs! I am hoping this is why my SERPs had dropped anyway! I am now trying to get to a point where google just sees my blog posts and not the ?Tag or ?Author or any other parameter that is going to cause me canoncilization pain. In the meantime I am sat waiting for google to bring me back up the SERPs when things settle down but it has been 2 weeks now so maybe something else is up?
-
I'm wondering why you want to block crawling of these URLs - I think what you're going for is to not index them, yes? If you block them from being crawled, they'll remain in the index. I would suggest considering robots meta noindex tags - unless you can describe in a little more detail what the issue is?
-Dan
-
Ok then you should be all set if your tests on GWMT did not indicate any errors.
-
Thanks it goes straight to www.mysite.com/Blog
-
Yup, I understand that you want to see your main site. This is why I recommended blocking only /Blog and not / (your root domain).
However, many blogs have a landing page. Does yours? In other words, when you click on your blog link, does it take you straight to Blog/posts or is there another page in between, eg /Blog/welcome?
If it does not go straight into Blog/posts you would want to also allow the landing page.
Does that make sense?
-
The structure is:
www.mysite.com - want to see everything at this level and below it
www.mysite.com/Blog - want to BLOCK everything at this level
www.mysite.com/Blog/posts - want to see everything at this level and below it
-
Well what Martijn (sorry, I spelled his name wrong before) and I were saying was not to forget to allow the landing page of your blog - otherwise this will not be indexed as you are disallowing the main blog directory.
Do you have a specific landing page for your blog or does it go straight into the /posts directory?
I'd say there's nothing wrong with allowing both Blog/Post and Blog/post just to be on the safe side...honestly not sure about case sensitivity in this instance.
-
"We're getting closer David, but after reading the question again I think we both miss an essential point ;-)" What was the essential point you missed. sorry I don't understand. I don;t want to make a mistake in my Robot.txt so would like to be 100% sure on what you are saying
-
Thanks guys so I have
User-agent: *
Disallow: /Blog/
Allow: /Blog/post
Allow: /Blog/Postthat works. My Home page also works. I there anything wrong with including both uppercase "Post" and lowercase "post". It is lowercase on the site but want uppercase "P" just incase. Is there a way to make the entry non case sensitive?
Thanks
-
Correct, Martijin. Good catch!
-
There was a reason that I said he should test this!
We're getting closer David, but after reading the question again I think we both miss an essential point ;-). As we know also exclude the robots from crawling the 'homepage' of the blog. If you have this homepage don't forget to also Allow it.
-
Well, no point in a blog that hurts your seo
I respectfully disagree with Martijin; I believe what you would want to do is disallow the Blog directory itself, not the whole site. It would seem if you Disallow: / and _Allow:/Blog/Post _ that you are telling SEs not to index anything on your site except for /Blog/Post.
I'd recommend:
User-agent: *
Disallow: /Blog/
Allow: /Blog/PostThis should block off the entire Blog directory except for your post subdirectory. As Maritijin stated; always test before you make real changes to your robots.txt.
-
That would be something like this, please check this or test this within Google Webmaster Tools if it works because I don't want to screw up your whole site. What this does is disallowing your complete site and just allows the /Blog/Post urls.
User-agent: *
Disallow: /
Allow: /Blog/Post
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Teaser Content Help!!
I'm in the process of a redesign and upgrade to Drupal 8 and have used Drupal's taxonomy feature to add a fairly large database of Points of Interest, Services etc. initially this was just for a Map/Filter for site users. The developer also wants to use teasers from these content types (such as a scenic vista description) as a way to display the content on relevant pages (such as the scenic vistas page, as well as other relevant pages). Along with the content it shows GPS coordinates and icons related to the description. In short, it looks cool, can be used in multiple relevant locations and creates a great UX. However, many of these teasers would basically be pieces of content from pages with a lot of SEO value, like descriptive paragraphs about scenic viewpoints from the scenic viewpoints page. Below is an example of how the descriptions of the scenic viewpoints would be displayed on the scenic viewpoints pages, as well as other potential relevant pages. HOW WILL THIS AFFECT THE SEO VALUE OF THE CONTENT?? Thanks in advance for any help, I can't find an answer anywhere. About 250 words worth of content about a scenic vista. There’s about 8 scenic vista descriptions like this from the scenic vistas page, so a good chunk of valuable content. There are numerous long form content pages like this that have descriptions and information about sites and points of interest that don't warrant having their own page. For more specific content with a dedicated page, I can just the the intro paragraph as a teaser and link to that specific page of content. Not sure what to do here.
Intermediate & Advanced SEO | | talltrees0 -
Need Help - Lost 75% Of Traffic Since May 2018
Sorry to go in-depth here, but want to give all available information. We went live late April 2018 with our two websites in Shopify (moved from Magento, same admin, different storeviews...which we find later to cause some issues). Both of these websites sell close to the same products (we purchased a competitor about 5 years ago, which is why we have two). The nice thing is that they do almost identical amounts in sales. They have done very well for years, especially in the last two years. Well, the core algo update around May 22nd-24th 2018 happened and wiped out about 65% of our Google traffic for one website (MySupplementStore.com). And this latest update, wiped out another 20%. I couldn't figure out why this would have happened, because we were very cautious about keeping things separate, unique descriptions etc. So I did some digging and this is what I found: The reviews we migrated over from Magento somehow were combined and added to both websites. This is something I didn't notice. I had this resolved a month ago so that each site's reviews are now only on that website. Our blog section was duplicated across both websites during the migration. Again, something I didn't notice, as we have close to over 1,000 blog posts per site. This was resolved two weeks ago. As I was looking more, I found that the last 6 months, a person working for us (for 3 years), started writing descriptions and pasting them on both websites, instead of making them unique to each website. I trusted her for years, but I think she just got lazy. She quit about a month before the migration as well. We are currently working on this, but its been taking awhile because we have over 5,000 products on each site and have no idea which ones are duplicates. I did also notice: Site very slow when checking site speed tools. Working on that this week. When I take snippets of text or do searches, many times it shows up in omitted results. No messages in Google Webmaster Tools So the question is... Do you think it is the duplicate content issues that caused the drop? Our other site is Best Price Nutrition, which didn't see a big drop at all during that update. If not, any other ideas why?
Intermediate & Advanced SEO | | vetofunk0 -
Robots txt is case senstive? Pls suggest
Hi i have seen few urls in the html improvements duplicate titles Can i disable one of the below url in the robots.txt? /store/Solar-Home-UPS-1KV-System/75652
Intermediate & Advanced SEO | | Rahim119
/store/solar-home-ups-1kv-system/75652 if i disable this Disallow: /store/Solar-Home-UPS-1KV-System/75652 will the Search engines scan this /store/solar-home-ups-1kv-system/75652 im little confused with case senstive.. Pls suggest go ahead or not in the robots.txt0 -
Video SERP Help
Hello Friends,
Intermediate & Advanced SEO | | KINQDOM
I try to appear on search results of property related search terms with my property videos. Here is a sample property video
http://www.antalyahomes.com/videositemap.asp May you please check it and tell me what I do wrong? Thanks in advance for your time.0 -
Meta Robot Tag:Index, Follow, Noodp, Noydir
When should "Noodp" and "Noydir" meta robot tag be used? I have hundreds or URLs for real estate listings on my site that simply use "Index", Follow" without using Noodp and Noydir. Should the listing pages use these Noodp and Noydr also? All major landing pages use Index, Follow, Noodp, Noydir. Is this the best setting in terms of ranking and SEO. Thanks, Alan
Intermediate & Advanced SEO | | Kingalan10 -
Should I be using meta robots tags on thank you pages with little content?
I'm working on a website with hundreds of thank you pages, does it make sense to no follow, no index these pages since there's little content on them? I'm thinking this should save me some crawl budget overall but is there any risk in cutting out the internal links found on the thank you pages? (These are only standard site-wide footer and navigation links.) Thanks!
Intermediate & Advanced SEO | | GSO0 -
Help!!! Am I being Attacked???
Hello, I do not believe so much in spammy links attacks and I definitely do not believe my site is worth attacking. However, I'm seeing new links pointing to my site that I have no idea where they come from. I just spotted three articles on a poor crappy article site with exact match keywords point to me. The articles are completely unique (copyscaped them) and they were posted according to the site time stamp during Oct and Nov 2012. (And they Appear in the WMT recently discovered links from more or less the same time). What to do (besides for disavowing this domain)? Thanks
Intermediate & Advanced SEO | | BeytzNet0 -
Help diagnosing a complex SEO issue
Good evening SEOMoz. A series events, in close succession are making it somewhat difficult for me to diagnose a cause of fluctuations in traffic. Please excuse some of the stupid moves I made, but desperation got the better of me. One of my most beloved websites was hit by Panda on January 18th. Pretty sure it was due to a CMS bug that is now fixed. The website site started to show great signs of recovery from April 19th - Panda 3.5. I'm going to be as explicit as possible with the traffic for the days that follow. Traffic was stable previously. April 20th +10%. April 21st +5%. April 22nd +5%. (half way recovered, also the first real fluctuation since the site was hit in Jan). Due to the looming over-optimisation penalty, on the 22nd I changed the titles to unoptimise them a little. (fear is a dangerous thing at times). April 23rd -10%. April 24th -10% April 25th onwards, pretty much levelled out. The websites I've seen hit by Penguin, lost around 40% of their traffic, very steeply on 24th and 25th April. So the drops aren't in keeping with my experience of Penguin. But they do coincide perfectly with the massive site-wide title change. I've haven't read anything definitive about a penalty for changing titles too often, but for obvious reasons, it makes sense. The drop seems terribly soon after changing titles, but the site is very heavily indexed. It's also worth mentioning that I did changed the titles BACK, incase it was purely the fact the titles had been slightly de-optimised, that caused the drop. I waited until May 5th. This had no positive nor negative effect. It's a lot to take in but I'd love to hear your thoughts. I'm feeling a little bamboozled looking at all the figures. There was of course the above the fold update on the 19th Jan, but lets ignore that as we've only ever had a max 1 ad per page, most pages have none.
Intermediate & Advanced SEO | | seo-wanna-bs0