URLs appear in Google Webmaster Tools that I can't find on my own site?!?
-
Hi,
I have a Magento e-commerce site (clothing) and when I had a look through some of the sections in Google Webmaster Tools I found URLs that I can't find on my site.
For example, a product URL may be http://www.example.co.uk/product-url/, which is fine. That product may come in three sizes (Small, Medium, Large), and for some reason Googlebot is sometimes finding a URL like:
http://www.example.co.uk/product-url/1202/
When clicked, this is a live URL (status code: 200) showing one of the sizes (Medium). However, I have run a site crawl in Screaming Frog, along with other crawl tests, and can't find where Googlebot is picking up these URLs.
I think I need to:
1. Find out how Googlebot is finding these URLs.
2. Find out how to keep them out of the index (e.g. robots.txt, canonical tags, etc.).
Any help would be much appreciated. I'm happy to share the site with members who think they can take a look and help, and I can share specific URLs if that would make the issue clearer. Just let me know.
Thanks,
Darrell
-
No problem, glad it resolved the problem.
There are a number of possibilities. Googlebot probably found the URLs through one of the following:
- XML sitemap
- Faceted navigation
- Magento pinged Google when the page was created
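The first possibility is the easiest to rule in or out. Here is a minimal Python sketch, assuming you have saved a copy of your XML sitemap and it follows the standard sitemaps.org format. The function names and the sample sitemap are made up for illustration; they are not part of Magento or Webmaster Tools.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org <urlset> documents.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(xml_text):
    """Return the set of <loc> values found in a sitemap document."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text}


def urls_in_sitemap(xml_text, suspect_urls):
    """Return the suspect URLs that are listed in the sitemap, in input order."""
    listed = sitemap_urls(xml_text)
    return [url for url in suspect_urls if url in listed]


# Hypothetical sitemap content, built from the example URLs in the question.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.co.uk/product-url/</loc></url>
  <url><loc>http://www.example.co.uk/product-url/1202/</loc></url>
</urlset>"""

found = urls_in_sitemap(sample, ["http://www.example.co.uk/product-url/1202/"])
print(found)
```

If the unexpected URLs show up here, the sitemap generator is the likely source. If they don't, move on to the other possibilities.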
-
Cheers John, sorted the issue! Appreciate your expertise.
-
Thanks John, your reply was really helpful. I've now done that for the 4,000 simple products, and those URLs are now returning 404 pages, which is great. I'm just going to see if I can find a mass-import 301 redirect extension for Magento so I can redirect these URLs to the homepage rather than leave them as 404 pages.
How do you think Googlebot found those pages, as there are no links to them? Maybe through a link when the simple products were added to the cart?
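For anyone preparing a similar bulk redirect, here is a rough Python sketch of how the list could be built from a URL export. It assumes the unwanted URLs all look like the example in the question (a product slug followed by a numeric segment); the pattern, the function name, and the row format are assumptions for illustration, and any import extension will have its own expected format.

```python
import re
from urllib.parse import urlparse

# Assumed pattern: /<product-slug>/<numeric id>/, as in /product-url/1202/.
SIMPLE_PRODUCT_PATH = re.compile(r"^/[^/]+/\d+/?$")


def redirect_rows(urls, target="/"):
    """Return (old_path, target) pairs for URLs matching the assumed pattern."""
    rows = []
    for url in urls:
        path = urlparse(url).path
        if SIMPLE_PRODUCT_PATH.match(path):
            rows.append((path, target))
    return rows


# Example input using the URLs from the question.
crawled = [
    "http://www.example.co.uk/product-url/",
    "http://www.example.co.uk/product-url/1202/",
]

for old_path, new_path in redirect_rows(crawled):
    print(old_path, "->", new_path)
```

The `target` parameter defaults to the homepage, as described above, but it can be set to any other path.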
-
What is the visibility set to on the simple products for different sizes? If it's set to "Catalog" it will still be crawlable but not appear in your website's internal search results.
Setting the visibility to "Not Visible Individually" should resolve this issue.
-
I had a similar issue (not Magento). It turned out the URLs were in the sitemap that was submitted to Webmaster Tools. Did you check there?
Check the URL in Open Site Explorer too; it might tell you if any URLs are linking to it.