URL Parameters
-
Hi Moz Community,
I'm working on a website that has URL parameters. After crawling the site, I added canonical tags to all of these URLs to prevent them from getting indexed by Google. However, today I found out that Google has indexed plenty of the parameterized URLs.
1- Some of these URLs have canonical tags, yet they are still indexed and live.
2- Some can't be discovered through site crawling, and they result in a 5xx server error.
Is there anything else that I can do (other than adding canonical tags), and how can I discover parameterized URLs that are indexed but not visible through site crawling?
Thanks in advance!
-
I'm also facing the same problem with my website pages. My Blackpods pro website pages don't show the exact permalink URLs.
-
Hi there,
Thanks very much for your response. I checked the sitemap and there are no parameterized URLs listed; only the canonical URLs are in the sitemap.
If you have any other suggestions, they would be much appreciated.
Thank you!
-
Hi Rajesh,
Thank you for your response. I cannot share the website due to client confidentiality, but basically, when I search to find a stockist for {brand name}, Google lists URLs similar to the ones below on the first page. The pages show a list of stockists depending on product availability:
1-website.com/find-stockist?model=10 (5xx status code)
2-website.com/find-stockist?model=11 (200 status code)
3-website.com/find-stockist?model=10 (5xx status code)
4-website.com/find-stockist?model=11 (200 status code)
Thank you!
-
Hi Gaston,
Thanks very much for your time. The canonicals were implemented around a month ago, and the pages are almost identical. I discovered all of the parameterized URLs without performing an advanced search.
Also, I came across the 5xx errors when I clicked the indexed parameterized URLs in the Google SERP, and I cannot discover them when I crawl the site with Screaming Frog.
I'd appreciate it if you have any other suggestions based on your experience!
Many thanks
-
Just so you know, if a URL returns a 5xx server error, it usually won't serve your canonical tag in the first place! You might also want to check your sitemap XML to make sure it isn't 'undoing' your canonical tags by feeding these URLs to Google. Indexation tags must be perfectly aligned with your sitemap XML, or you are sending Google mixed messages (e.g. a URL is in the sitemap XML, so Google should index it, but when it is crawled it contains a canonical tag marking it as non-canonical, which is the opposite signal).
Everything Gaston said is right on the money.
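The sitemap check suggested above can be scripted. This is only a minimal sketch, assuming the sitemap follows the standard sitemaps.org format; the function name `parameterized_sitemap_urls` and the sample XML are made up for illustration and are not taken from the site in question. It flags any `<loc>` entry that carries a query string, since those are the entries that would contradict the canonical tags.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Namespace used by standard sitemaps.org sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parameterized_sitemap_urls(sitemap_xml: str) -> list:
    """Return every <loc> URL in the sitemap that carries a query string."""
    root = ET.fromstring(sitemap_xml)
    flagged = []
    for loc in root.findall(".//sm:loc", SITEMAP_NS):
        url = (loc.text or "").strip()
        if urlsplit(url).query:  # non-empty when the URL has ?key=value
            flagged.append(url)
    return flagged


# Hypothetical sitemap content, for illustration only.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://website.com/find-stockist</loc></url>
  <url><loc>https://website.com/find-stockist?model=11</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in parameterized_sitemap_urls(SAMPLE):
        print("Parameterized URL in sitemap:", url)
```

If the function returns an empty list, the sitemap is not the source of the mixed signal, and the cause has to be looked for elsewhere.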
-
I think you need to show some examples.
-
Hi there,
It's important to note that canonicals are a signal, not a directive. Google may obey them if its algorithm considers those pages to be genuine duplicates of one another.
In my experience, this does not happen immediately; it usually takes Google some time to figure out whether the canonicalization is correct. Keep in mind that pages being canonicalized HAVE TO be nearly identical and refer to the same topic.
As for indexation, pages can be indexed yet shown only when you search for that specific URL or use an advanced search operator (such as site:).
More information about canonicals:
- Consolidate duplicate URLs - Google Search support
Regarding the second issue, if by "site crawling" you mean crawling with an external tool such as Screaming Frog or Moz, you are likely getting 5xx errors because the tool is making too many requests. Try lowering its crawl rate; I know for a fact that Screaming Frog allows you to do that.
Unfortunately, I don't know of any way to discover URL parameters in bulk other than using an external tool.
Hope it helps,
Best of luck.
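For a list of indexed URLs collected by hand, the points in this thread can be combined into a small audit. This is a rough sketch, not a full crawler: the names `CanonicalFinder`, `canonical_of`, and `audit` are hypothetical, and the status code and HTML are assumed to have been fetched already by whatever tool you use (at a slow request rate, so the crawl itself does not trigger 5xx errors).

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel can hold several space-separated values; compare case-insensitively.
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr.get("href")


def canonical_of(html: str):
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


def audit(url: str, status: int, html: str) -> str:
    """Classify one URL by its status code and canonical target."""
    if status >= 500:
        # An error page normally does not include the canonical tag at all.
        return "server error: canonical tag not served"
    canonical = canonical_of(html)
    if canonical is None:
        return "no canonical tag"
    if canonical == url:
        return "self-canonical"
    return "canonicalized to " + canonical
```

A URL reported as "server error" gives Google no canonical signal at all, so the 5xx response needs fixing before the canonical tag can have any effect on that URL.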