URL Parameter Handling In GWT to Treat Overindexation - how aggressive?
-
Hi,
My client recently launched a new site and their index went from about 20K up to about 80K - which is a severe over indexation.
I believe this was caused by parameter handling as some category pages now have 700 pages in the results for "site:domain.com/category1" - and apart from the top result, they are all parameters being indexed.
My question is how active/aggressive should I be in blocking these parameters in Google Webmaster Tools? Currently, everything is set to 'let googlebot decide'.
-
Hi! Did these answers take care of your question, or do you still have some questions?
-
Hey There
I would use a robots meta noindex on them (except for the top page of course) and use rel = prev/next to show they are paginated.
I would prefer to do that than use WMT. Also, WMT crawl settings will stop the crawling, but not remove them from the index. Plus, WMT will only handle Google, not other engines like Bing etc. Not that Bing matters, but always better to have a universal solution.
-Dan
-
Hello Search Guys,
Here is some food for thought taken from: http://www.quora.com/Does-Google-limit-the-number-of-pages-it-indexes-for-a-particular-site
Summary:
"Google says they crawl the web in "roughly decreasing PageRank order" and thus, pages that have not achieved widespread link popularity, particularly on large, deep sites, may not be crawled or indexed."
"Indexation
There is no limit to the number of pages Google may index (meaning available to be served in search results) for a site. But just because your site is crawled doesn't mean it will be indexed.Crawl
The ability, speed and depth for which Google crawls your site and retrieves pages can be dependent on a number of factors: PageRank, XML sitemaps, robots.txt, site architecture, status codes and speed.""For a zero-backlink domain with 80.000+ pages, in conjunction with rel=canonical and an xml-sitemap (You do submit a sitemap, don't you?), after submitting the domain to Google for a crawl, a little less than 10k pages remained in index. A few crawls later this was reduced to a mere 250 (very good job on Google's side).
This leads me to believe the indexation cap for a newer site with low to zero pagerank/authority is around 10k."
Another interesting article: http://searchenginewatch.com/article/2062851/Google-Upping-101K-Page-Index-Limit
Hope this helps, and easy response is to limit crawling to the most needed pages as aggressive as possible to remove the unneeded links leaving only needed ones
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Competing URLs
Hi We have a number of blogs that compete with our homepage for some keywords/phrases. The URLs of the blogs contain the keywords/phrases. I would like to re-work the blogs so that they target different keywords that don't compete and are more relevant. Should I change the URLs as I think this is what is mainly causing the issue? If so, should I 301 old URL's to the homepage? For example, say we we're a site that specialised in selling plastic cups. Currently there is a blog with the URL www.mysite.com/plastic-cups that outranks the homepage for _plastic cups. _The blog isn't particularly relevant to plastic cups and the homepage should rank for this term. How should I let Google know that it is the homepage that is most relevant for this term? Thanks
Intermediate & Advanced SEO | | Buffalo_71 -
Need help in de-indexing URL parameters in my website.
Hi, Need some help.
Intermediate & Advanced SEO | | ImranZafar
So this is my website _https://www.memeraki.com/ _
If you hover over any of the products, there's a quick view option..that opens up a popup window of that product
That popup is triggered by this URL. _https://www.memeraki.com/products/never-alone?view=quick _
In the URL you can see the parameters "view=quick" which is infact responsible for the pop-up. The problem is that the google and even your Moz crawler is picking up this URL as a separate webpage, hence, resulting in crawl issues, like missing tags.
I've already used the webmaster tools to block the "view" parameter URLs in my website from indexing but it's not fixing the issue
Can someone please provide some insights as to how I can fix this?0 -
Duplicate URL Parameters for Blog Articles
Hi there, I'm working on a site which is using parameter URLs for category pages that list blog articles. The content on these pages constantly change as new posts are frequently added, the category maybe for 'Heath Articles' and list 10 blog posts (snippets from the blog). The URL could appear like so with filtering: www.domain.com/blog/articles/?taxonomy=health-articles&taxon=general www.domain.com/blog/articles/?taxonomy=health-articles&taxon=general&year=2016 www.domain.com/blog/articles/?taxonomy=health-articles&taxon=general&year=2016&page=1 All pages currently have the same Meta title and descriptions due to limitations with the CMS, they are also not in our xml sitemap I don't believe we should be focusing on ranking for these pages as the content on here are from blog posts (which we do want to rank for on the individual post) but there are 3000 duplicates and they need to be fixed. Below are the options we have so far: Canonical URLs Have all parameter pages within the category canonicalize to www.domain.com/blog/articles/?taxonomy=health-articles&taxon=general and generate dynamic page titles (I know its a good idea to use parameter pages in canonical URLs). WMT Parameter tool Tell Google all extra parameter tags belong to the main pages (e.g. www.domain.com/blog/articles/?taxonomy=health-articles&taxon=general&year=2016&page=3 belongs to www.domain.com/blog/articles/?taxonomy=health-articles&taxon=general). Noindex Remove all the blog category pages, I don't know how Google would react if we were to remove 3000 pages from our index (we have roughly 1700 unique pages) We are very limited with what we can do to these pages, if anyone has any feedback suggestions it would be much appreciated. Thanks!
Intermediate & Advanced SEO | | Xtend-Life0 -
Htaccess Issue: URL not resolving properly
I am merging a niche site, tshirts.com to another site mainsite.com. I am using an htaccess file on a linux server, and the homepage of the niche site is being directed to the corresponding category page on the main site (i.e nichesite.com to mainsite.com/niche.html). Everything else is also a page to page redirect. I have something like this in the htaccess file: Redirect 301 http://tshirts.com/ http://www.mainsite.com/tshirts.html
Intermediate & Advanced SEO | | inhouseseo
Redirect 301 http://tshirts.com/blue.html http://www.lampclick.com/blue-t-shirts.html
Redirect 301 http://tshirts.com/white.html http://www.mainsite.com/white-t-shirts.html
Redirect 301 http://tshirts.com/black-tshirts.html http://www.mainsite.com/bk-t-shirts.html When I check 301 for lets say http://tshirts.com/blue.html, I get: http://tshirts.com/blue.html -** 301 Moved Permanently** http://www.mainsite.com/tshirts.htmlblue.html -** 302 Found** http:www.mainsite.com/ How do I fix this? Why is everything being appended to minsite/tshirts.html? I appreciate your help.0 -
New-york-city vs. broadway as a URL parameter
We're a content publisher that writes news and reviews about the theater community, both in New York City (broadway mainly) and beyond. Presently, we display the term 'new-york-city' in news articles about Broadway / New York City theater (see http://screencast.com/t/XlifMdT9QP). Would it be better for us to replace that term with simply 'Broadway' to improve its searchability? I was doing some google trends keyword research and it looks like the search term "Broadway" in various permutations is substantially more popular than "New York City Theater."
Intermediate & Advanced SEO | | TheaterMania0 -
Massive URL blockage by robots.txt
Hello people, In May there has been a dramatic increase in blocked URLs by robots.txt, even though we don't have so many URLs or crawl errors. You can view the attachment to see how it went up. The thing is the company hasn't touched the text file since 2012. What might be causing the problem? Can this result any penalties? Can indexation be lowered because of this? ?di=1113766463681
Intermediate & Advanced SEO | | moneywise_test0 -
Need Perfect URLs
I'm redesigning a site's structure from the ground up, and am having issues with the URLs. I'd love to have them be perfect, but kept finding conflicting advice online. 1. For my services blog, is it best to have it set up like www.example.com/services/keyword or
Intermediate & Advanced SEO | | Stryde
www.example.com/keyword There seems to be conflicting advice as to keep it short and keep the keyword as far to the left as possible, but also that including the word services would help with long tail phrases and site organization. 2. For my blog section, is it best to have it set up like
www.example.com/blog/keyword or
www.example.com/keyword or
www.example.com/blog-post-title-with**-keyword**-in-it It's similar to the first question, but also adds the question of including the entire post title in the URL or just the keyword. Your help would be greatly appreciated!1 -
We are changing ?page= dynamic url's to /page/ static urls. Will this hurt the progress we have made with the pages using dynamic addresses?
Question about changing url from dynamic to static to improve SEO but concern about hurting progress made so far.
Intermediate & Advanced SEO | | h3counsel0