How to resolve - Googlebot found an extremely high number of URLs
-
Hi,
We got this message from Google Webmaster Tools: "Googlebot found an extremely high number of URLs on your site". The sample URLs provided by Google are all either noindexed or have a canonical tag.
- http://www.myntra.com/nike-stylish-show-caps-sweaters
- http://www.myntra.com/backpacks/f-gear/f-gear-unisex-black-&-purple-calvin-backpack/162453/buy?src=tn&nav_id=541
- http://www.myntra.com/kurtas/alma/alma-women-blue-floral-printed-kurta/85178/buy?nav_id=625
We have also set the parameters on these URLs to "Representative URL" under URL Parameters in Google Webmaster Tools.
Your comments on how to resolve this issue will be appreciated.
Thank You
Kaushal Thakkar
-
Hi Kaushal,
Thanks for the question.
Google recommends a few ways to deal with this problem. In summary, you can:
- Use parameter handling as you have done
- Add the nofollow attribute to problematic URLs
- Block problematic URLs in robots.txt
There is also a thread in the Google webmaster forums which may be useful to you:
Overall, it comes down to having a good site architecture and cutting down / removing / blocking URLs that you don't care about from a search perspective.
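To make the "cutting down URLs" idea a bit more concrete, here is a minimal Python sketch that collapses parameterised URLs to one representative URL and counts how many variants map to each. It assumes that `src` and `nav_id` only track navigation and never change page content; that is a guess based on the sample URLs above, so check it against your own site. The names `TRACKING_PARAMS`, `representative_url` and `variant_counts` are made up for illustration.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these parameters are tracking-only and do not change page content.
TRACKING_PARAMS = {"src", "nav_id"}


def representative_url(url: str) -> str:
    """Return the URL with tracking-only query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS
    ]
    # Rebuild the URL without the dropped parameters and without a fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def variant_counts(urls):
    """Count how many crawled URLs collapse to each representative URL."""
    return Counter(representative_url(url) for url in urls)
```

Running `variant_counts` over a sample of crawled URLs (from server logs, for example) should show which pages generate the most parameter variants, which is a reasonable place to start with nofollow or robots.txt rules.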
I hope that helps a bit!
Paddy
-
Thank you David. It's been more than 10 months since these parameters were specified in Webmaster Tools. This, along with other measures such as noindex and canonicals, helped us reduce the indexed URL count from 32 million to 1.2 million. As the indexed URL count dropped, this warning from Google stopped for 4 months. However, we started receiving the message again in February 2014.
Thanks
Kaushal
-
"we have specified the parameters on these URLs as representative URL in Google Webmaster - URL parameters."
How long ago was this done? Since there are so many URLs, it may take a while for Google to recrawl them and index the representative URLs per your request.