Robots.txt question
-
I noticed something odd in Google's robots.txt tester.
I have this line
Disallow: display=
in my robots.txt, but whatever URL I test, the tool reports it as blocked and highlights this line in robots.txt.
For example, the line is meant to block pages like
http://www.abc.com/lamps/floorlamps?display=table
but if I test
http://www.abc.com/lamps/floorlamps (or any other page)
it shows as blocked due to Disallow: display=
Am I doing something wrong, or is Google just acting strangely? I don't think pages without display= are actually being blocked.
-
Yes, there is a bug in your robots.txt. A Disallow path needs to start with a / (or a wildcard pattern), so Disallow: display= is malformed and the tester flags it against every URL you check. You should write it as something like:
Disallow: /*?display=
or, to block only that specific view:
Disallow: /*?display=table
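For reference, a minimal sketch of how the corrected file could look, assuming the goal is to keep crawlers out of every display= variant while leaving the plain category pages crawlable:

User-agent: *
# blocks http://www.abc.com/lamps/floorlamps?display=table and other ?display= variants
Disallow: /*?display=

With a rule like this, the tester should only flag URLs that actually contain ?display=; http://www.abc.com/lamps/floorlamps itself stays crawlable.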
Related Questions
-
Robots.txt blocking internal resources in WordPress
Intermediate & Advanced SEO | Mat_C
Hi all, we've recently migrated a WordPress website from staging to live, but the robots.txt was deleted. I've created the following new one:
User-agent: *
Allow: /
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-content/cache/
Disallow: /wp-content/themes/
Allow: /wp-admin/admin-ajax.php
However, in the site audit on SEMrush I now get the mention that a lot of pages have issues with blocked internal resources in the robots.txt file. These blocked internal resources are all cached and minified CSS elements: links, images and scripts. Does this mean that Google won't crawl some parts of these pages with blocked resources correctly, and thus won't be able to follow these links and index the images? In other words, is this any cause for concern regarding SEO? Of course I can change the robots.txt again, but will URLs like https://example.com/wp-content/cache/minify/df983.js end up in the index? Thanks for your thoughts!
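If it helps, a hedged sketch of one way to quiet the SEMrush warning while keeping the cache directory otherwise blocked: explicitly allow the minified CSS and JS so Google can render the pages. This assumes the assets live under /wp-content/cache/ as in the audit, and the lines below would sit alongside the existing rules (Google applies the longest, most specific rule, so the Allow lines win for those file types):

Disallow: /wp-content/cache/
Allow: /wp-content/cache/*.css
Allow: /wp-content/cache/*.js

Blocked CSS and JS mainly stops Google from rendering the pages properly; it doesn't usually push files like the minified .js into the index, and resources that are crawlable but only referenced as assets rarely get indexed as standalone URLs anyway.
-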
Robots.txt & Disallow: /*? Question!
Hi, I have a site where they have: Disallow: /*? Problem is we need the following indexed: ?utm_source=google_shopping What would the best solution be? I have read: User-agent: *
Intermediate & Advanced SEO | | vetofunk
Allow: ?utm_source=google_shopping
Disallow: /*? Any ideas?0 -
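A hedged sketch of a combination that should work under Google's longest-match rule, where the more specific Allow overrides the broad Disallow; it assumes utm_source=google_shopping is the first parameter on the URLs that need to stay crawlable:

User-agent: *
Allow: /*?utm_source=google_shopping
Disallow: /*?

Note the Allow path needs the leading /* so it can match the parameter on any page, and keep in mind robots.txt controls crawling rather than indexing, so tracking-parameter duplicates are often better consolidated with canonical tags anyway.
-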
HTTPS 301 Redirect Question
Hi, I've just migrated our previous site (siteA) to our new URL (siteB), and I've set up 301 redirects from the old URL (siteA) to the new one (siteB). However, the old URL operated on https, and users who try to go to the old URL with https (https://siteA.com) receive a message that the server cannot be reached, while users who go to http://siteA.com are redirected to siteB. Is there a way to 301 redirect https traffic? Also, from an SEO perspective, if the site and all the references in Google search are https://siteA.com, does a 301 redirect of http pass the domain authority, etc., or is https required? Thanks.
Intermediate & Advanced SEO | opstart
Questions on Google Penguin Clean-up Strategy
Intermediate & Advanced SEO | DavidC.
Hello Moz Community! I was hit with a REALLY bad penalty in May 2013, and the date corresponds to Penguin #4. I never received a manual spam action, but the 50% drop in traffic was very apparent. Since then, I've had a slow reduction in traffic, to where I am today, which is almost baseline. Increases in traffic have not occurred regardless of efforts. In researching a little more, I see that my old SEO companies built my links with exact keyterm matches, many of them repeated over and over, verbatim, on different sites. I've heard two pieces of advice that I don't like: 1) scrap the site, or 2) disavow all the links. I would rather see if I can get the webmasters to change the link to something generic, or my brand name, before I do either of these. To scrap my site and start new will be damn near impossible because I'm in an extremely competitive niche, and my site has age (since 2007), so I'd rather work with what I have. A couple of questions, for folks who are in the know about this penalty, if I may:
1) This Penguin update, #4, on May 22nd: was it ONLY because of the link text? Or was it also because of the link quality? None of the updates before it harmed me, and I believe those were because of quality.
2) Could it be for links linking from my blog to my site? My blog (ex. www.mysite.com/blog) has close to 1,000 blog posts, and back in the day I would write these really long, keyword-stuffed links leading to www.mysite.com. I've been in the process of cleaning these up, shortening them, and changing them to more generic "click here"s, but it is a LONG and painstaking process.
3) If I get webmasters to change the text to just the URL or brand name, that's better than disavowing, correct? As long as the linking site has a decent spam score and PA/DA on OSE?
4) Is having SOME exact anchor text okay on these links? Is it just the abuse that's the problem? If so, how many should I leave (like 5 max per keyword)? Or should I just change to the URL, or disavow altogether, any and all links that have exact keyword matches?
5) I've downloaded my link profile from OSE and Majestic, and will do so from Ahrefs (I believe it is). Does Webmaster Tools have any section that can help give me insights into the issue? If so, can you point me in the right direction?
6) Can I get partial credit for some work done? For instance, say a major update, or crawl, happens, and I've only fixed/disavowed 25% of the links by then: is there a possibility that I get a small boost in traffic? Or am I in the doghouse till they are all fixed?
7) Say I clean/disavow everything up: will my improvement be seen in the next crawl? Or the next Penguin update? There may be a substantial difference in time there.
8) I see Ahrefs has some information on anchor text; any rules of thumb as to percentages of use of a certain anchor text, to see if I'm abusing or not, before I start undertaking all of this? Thanks!
9) Could the penalty have "passed" altogether, and this is just where I rank?
Thanks guys, but the last thing I want to do is ditch my site. I will work hard on this, but need some guidance. Much appreciated! David
Yoast SEO title question
I was referred to this plugin and have found it to be the most irritating and poorly designed plugin in the world. I want to be able to set my titles without it changing my page headers as well. For instance, if I set my title to be "This is my article name | site name", it will make my H1 tag read the same. I do not want or desire this nonsense. Why would they think this is something wise? Why would I want my site name on every single H1 tag on my site? How can I fix this? I only want my title to be my title. I want my H1 tag to remain the post/page name that I define in WordPress.
Intermediate & Advanced SEO | Atomicx
Another E-commerce Canonical Question
Intermediate & Advanced SEO | elcrazyhorse
Hi guys, quick question: one of our clients has an e-commerce site with a very poor canonical tag setup and thousands of pages of duplicate content. Let's use this as an example: BRAND > Category > Type > Color
Four separate pages/URLs. The BRAND page lists all products.
The Category page lists all BRAND products for that category.
The Type page lists all BRAND products of a specific type in that category.
The Color page lists all BRAND products of a specific type in that category of a specific color.
Anyway, these generate four separate URLs:
/BRAND
/BRAND/Category
/BRAND/Category-Type
/BRAND/Category-Type-Color
Avoiding duplicate content and product listings, I would appreciate your proposed canonicalization strategy/feedback.
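One possible pattern, sketched as an assumption about this kind of faceted catalogue rather than a definitive recommendation: keep the BRAND, Category and Type pages indexable with self-referencing canonicals, and point the thin Color variants at their parent Type page. On /BRAND/Category-Type-Color that would be a tag like:

<link rel="canonical" href="https://www.example.com/BRAND/Category-Type" />

(example.com stands in for the client's domain). Whether the Color pages deserve self-referencing canonicals instead depends on whether colour-specific queries have real search demand and whether those pages list meaningfully different products.
-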
Site changes lead to big questions
Intermediate & Advanced SEO | sdennison
I'm making some changes to my business that will cause me to move my blog to a new domain. The existing site will serve as a sales campaign for our full-service programs, and I want to keep visitors focused on that campaign. The old site will serve much like a mini site with a sales letter and video sales letter. In moving the blog content to another page, I found a post from Rand from a few years ago: http://www.seomoz.org/blog/expectations-and-best-practices-for-moving-to-or-launching-a-new-domain. The way I wanted to approach this was to remove the content from the old site and then resubmit the sitemap to Google for indexing. Of course they'll notice that the blog pages are gone (probably a load of 404's). After perhaps a week, I'd repost the content (about 50 posts) on the new domain, which will be little more than a blog. I'd like some input on the way to approach this. Should I...
a) Follow Rand's formula?
b) Go with my idea (sort of the brute force model)?
c) Consider an alternative method?
It's probably worth mentioning that none of these posts have high search engine rankings. I appreciate your input, Mozzers!
202 error page blocked in robots.txt versus using a crawlable 404 error
We currently have our error page set up as a 202 page that is unreachable by the search engines, as it is currently in our robots.txt file. Should the current error page be a 404 error page and reachable by the search engines? Is there more value, or is it better practice, to use a 404 over a 202? We noticed in our Google Webmaster account that we have a number of broken links pointing to the site, but the 404 error page was not accessible. If you have any insight that would be great; if you have any questions please let me know. Thanks, VPSEO
Intermediate & Advanced SEO | VPSEO