Robots.txt

teleman

Hello everyone

I have the following link:

http://mywebshop.dk/index.php?option=com_redshop&view=send_friend&pid=39&tmpl=component&Itemid=167

I want to prevent Google from indexing everything related to "view=send_friend".

The problem is that it's giving me duplicate content, and the content behind these links has no SEO value of any sort.

My problem is how to disallow it correctly via robots.txt.

I tried this syntax:

Disallow: /view=send_friend/

However, after running a crawl on request, the 200+ duplicate links that contain view=send_friend are still present in the CSV crawl report.

What is the correct syntax if I want to prevent Google from indexing everything related to this kind of link?

teleman

I added your suggestion to robots.txt and requested a crawl again.

I only have 3 pages with duplicate page content now.

So your suggestion seems to have worked.

Thanks for your reply, it worked!

JarnoNijzing

You are right. I misinterpreted the explanation. Apologies.

Martijn_Scheijbeler

Jarno,

The $ would suggest this parameter is always at the end of a URL. In Henrik's example it already sits somewhere in the middle of the URL, so a rule ending in $ would not match it.
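
To make the point about $ concrete, here is a rough Python sketch of Google-style pattern matching, where * matches any run of characters and a trailing $ anchors the end of the URL. The helper names (robots_pattern_to_regex, is_blocked) are made up for illustration, and this is a simplification, not Google's actual parser. It checks the three patterns discussed in this thread against the path and query string of Henrik's link.

```python
import re


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex (simplified sketch).

    '*' matches any run of characters; a trailing '$' anchors the end of the URL.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape everything literally, then join the pieces with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(pattern: str, url_path: str) -> bool:
    # Rules are compared from the start of the path, query string included.
    return robots_pattern_to_regex(pattern).match(url_path) is not None


# Path and query string of the link from the original question.
URL = "/index.php?option=com_redshop&view=send_friend&pid=39&tmpl=component&Itemid=167"

results = {
    pattern: is_blocked(pattern, URL)
    for pattern in [
        "/view=send_friend/",   # original attempt: treated as a literal folder
        "/*view=send_friend$",  # requires the URL to END with the parameter
        "/*view=send_friend",   # wildcard, no end anchor
    ]
}

for pattern, blocked in results.items():
    print(f"{pattern!r}: {'blocked' if blocked else 'allowed'}")
```

Under these assumptions only the last pattern matches the link: the first is read as a literal path starting with /view=send_friend/, and the second fails because the URL continues with &pid=39 and further parameters after view=send_friend.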

JarnoNijzing

Henrik,

I think you should be looking into something like this:

User-agent: Googlebot
Disallow: /*view=send_friend$

Hope this helps.

Kind regards

Jarno

Martijn_Scheijbeler

Hi Henrik,

I would suggest trying: Disallow: &view=send_friend
Optionally, you could try this without the &, as I'm not sure the & is always at the start of this parameter.

Hope this helps!


Robots.txt - What is the correct syntax?
