Blocking Certain Site Parameters from Google's Index - Please Help
-
Hello,
We recently used Google Webmaster Tools in an attempt to block certain parameters on our site from showing up in Google's index. One of our site parameters is essentially for user location and accounts for over 500,000 URLs. This parameter does not change page content in any way, and there is no need for Google to index it. We edited the parameter in GWT to tell Google that it does not change site content and should not be indexed. However, after two weeks, all of these URLs are still getting indexed. Why? Maybe there's something we're missing here. Perhaps there is another way to do this more effectively. Has anyone else run into this problem?
The path we used to implement this action:
Google Webmaster Tools > Crawl > URL Parameters
Thank you in advance for your help!
-
Thanks! We will probably test this solution.
-
Continuing from EGOL's comment #3: if you do need the parameters for on-site search or categories, then another option (admittedly it relies on Google obeying it) is to use robots.txt and disallow the parameters, for example:
Disallow: /*categoryFilter=*
Disallow: /*?utm_
As with any change that could affect the visibility of your site to the search engines, always test first.
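One rough way to test patterns like the two above before deploying them is to check a handful of sample URLs locally. The sketch below is a simplified matcher, assuming Google-style wildcards (`*` matches any run of characters, a trailing `$` anchors the end of the URL). The function names and the sample paths are made up for illustration, and this is not how Google itself evaluates the file.

```python
import re

# The two Disallow patterns from the post above.
DISALLOW = ["/*categoryFilter=*", "/*?utm_"]

def robots_pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: '*' matches any run of characters and a trailing '$'
    anchors the end of the URL; everything else is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_blocked(path_and_query, patterns=DISALLOW):
    """Return True if any pattern matches from the start of the path."""
    return any(robots_pattern_to_regex(p).match(path_and_query) for p in patterns)

if __name__ == "__main__":
    for sample in ["/shoes?categoryFilter=red",
                   "/post.html?utm_source=feedburner",
                   "/shoes?page=2&utm_source=x"]:
        print(sample, is_blocked(sample))
```

Note that the last sample is not caught: `/*?utm_` only matches when `utm_` is the first parameter after the `?`, which is the kind of gap worth finding in a test rather than in the index.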
-
Thanks, we have a few thousand parent pages that relate to these 500,000 URLs that have the parameters. Is there a quick way to canonicalise thousands of pages at once? It may not be scalable...
-
I recently posted about this problem here.
In summary, I have three points...
1. The parameters control in Google Webmaster Tools is unreliable. It did not work for me. And, it does not work for any other search engine. Find a different solution, is what I recommend.
2. Using rel=canonical relies on Google to obey it. From my experience it works well at present time. But we know that Google says how they are going to do things and then changes their mind without tellin' anybody. I would not rely on this.
3. If you really want to control these parameters, use htaccess to strip them off at the server level. That is doing it where you control it and not relying on what anybody says that they are going to do. Take control.
The only reservation about #3 is that you might need parameters for on-site search or category page sorting on your own site. These can be excluded from being stripped in your htaccess file.
Don't allow search engines to do anything for you that you can do for yourself. They can screw it up or quit doing it at any time and not say anything about it.
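To make the "strip them at the server level, but keep the ones you need" idea concrete, here is a minimal sketch of the decision logic. It is not an htaccess rule, just the same logic written in Python. The allowlist (`q`, `sort`), the `loc` parameter, and the example.com URLs are all hypothetical stand-ins.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical allowlist: parameters the site itself needs,
# e.g. on-site search and category sorting.
KEEP_PARAMS = {"q", "sort"}

def stripped_redirect_target(url, keep=KEEP_PARAMS):
    """Return the clean URL to 301-redirect to, or None if nothing needs stripping."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = [(k, v) for k, v in pairs if k in keep]
    if len(kept) == len(pairs):
        return None  # every parameter is allowlisted, serve the page as-is
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), parts.fragment))

if __name__ == "__main__":
    print(stripped_redirect_target("http://example.com/shoes?loc=boston&sort=price"))
    print(stripped_redirect_target("http://example.com/shoes?sort=price"))
```

An allowlist is used here rather than a blocklist so that any new, unknown parameter gets stripped by default. Whether that is the right default depends on the site.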
-
-
That was the link I was going to suggest, simply from the title you set this up with.
Have you also canonicalised the pages in question, so that Google treats the parent page as the main source? It may help.
More details on setting it up here - Use Canonical URLs
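On the earlier question about canonicalising thousands of pages at once: one possible approach is to generate the tag in the shared page template from the requested URL, rather than editing pages one by one. A minimal sketch, assuming the location parameter is called `loc` (a made-up name, as are the function name and the example.com URL):

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical name for the user-location parameter that does not change content.
IGNORED_PARAMS = {"loc"}

def canonical_link_tag(requested_url, ignored=IGNORED_PARAMS):
    """Build a rel=canonical tag pointing at the URL without the ignored parameters."""
    parts = urlsplit(requested_url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k not in ignored]
    # The fragment is dropped because it never reaches the server anyway.
    canonical = urlunsplit((parts.scheme, parts.netloc, parts.path,
                            urlencode(kept), ""))
    return '<link rel="canonical" href="{}" />'.format(escape(canonical, quote=True))

if __name__ == "__main__":
    print(canonical_link_tag("http://example.com/shoes?loc=boston"))
```

Because the tag is computed per request, all 500,000 parameter URLs would point back at their few thousand parent pages without any per-page work.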