Caps in URL creating duplicate content
-
I'm getting a bunch of duplicate content errors where the crawl says
www.url.com/abc has a duplicate at www.url.com/ABC
The content is in Magento and the URL settings are lowercase, and I can't figure out why it thinks there is duplicate content. These are pages with a decent number of inbound links.
-
I checked and it is a Magento feature to rewrite caps to lower case.
I added this to .htaccess anyway:
<code>
# Built-in map function that lowercases its argument
RewriteMap lc int:tolower
# Apply only to requests whose path contains an uppercase letter
RewriteCond %{REQUEST_URI} [A-Z]
# Permanently redirect to the lowercased path
RewriteRule (.*) ${lc:$1} [R=301,L]
</code>
One last question before I take this to a Magento forum: how can I look at a page with a caps URL and a lowercase URL and see whether they are really different pages or resolve to the same address?
When you change random letters to caps on our site it sends you to the right page, but my browser still shows the mixed-caps URL instead of replacing it with the all-lowercase URL. Is that really a different page, or is the browser just not updating the displayed URL when it is really getting the lowercase page?
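One way to answer this is to look at the HTTP status code the server returns for each spelling, without letting the client follow redirects. The sketch below is only an illustration, not something posted in the thread: `fetch_status` and `describe` are hypothetical helper names, and it relies on nothing beyond Python's standard `http.client`. If both spellings come back as 200, the server is serving the content at two addresses and the browser is not hiding a redirect; a 301 with a `Location` header means the mixed-case URL really is being redirected.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url, timeout=10):
    """Request url once, without following redirects; return (status, Location)."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    # Rebuild the request target from the path and (if present) the query string.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    try:
        conn.request("GET", target)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def describe(status):
    """Translate a status code into what it means for the duplicate check."""
    if status == 200:
        return "served directly at this URL"
    if status in (301, 308):
        return "permanently redirected"
    if status in (302, 303, 307):
        return "temporarily redirected"
    if status == 404:
        return "not found"
    return "other"
```

Running `fetch_status` on both the mixed-case and the lowercase spelling of a page and passing each status to `describe` gives a side-by-side answer; two "served directly" results are what a crawler reports as duplicate content.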
-
Hi John,
I checked the URL you sent me. You do have duplicate pages:
http://www.madebysurvivors.com/destiny
http://www.madebysurvivors.com/DESTINY
Both work and return the same page.
I also tried clicking on other links on your site and then changing a few letters to upper case, something like this:
http://www.madebysurvivors.com/LEArn-human-trafficking-slavery
and it returns the same page.
From what I can tell, it's one of the features in Magento that is making this possible. I would go into the settings and disable the setting that forces Magento to use lower case.
Then test to make sure that you DO get a 404 page if you change the letter case on any of your links.
I'm not familiar with Magento, so I'm not sure whether it has that option, but many CMS and ecommerce platforms have a field where you can specify the URL for a page. I would change that field to all lower case.
Test it again. If it works, there is one more step if you want to keep the link juice from the pages that had the uppercase URL.
You need to duplicate your pages, making sure the URL of each copy is the same as it was before (in all caps), and then 301 redirect each copy to the new lowercase page.
Hope this helps and makes sense.
-
This is intended functionality in Magento. It's supposed to help the user experience, as a user can navigate to a page even if they aren't sure of the casing of the words.
Of course, that's bad for SEO. You'll need to implement canonicalization. Here's a free extension by Yoast:
http://www.magentocommerce.com/magento-connect/canonical-url-for-magento.html
Cheers.
Update: seeing your response, your solution of putting in redirects by hand wouldn't be practical. You'd have to cover all combinations of caps/non-caps, and, well, that's more work than you should want :). As for why this happens: the uppercase characters are lowercased when Magento checks whether something in the database matches the URL. Again, this is intended functionality.
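To put a number on "all combinations": a path with n letters has 2^n upper/lower-case spellings, so listing them one by one does not scale, whereas canonicalization only needs the single lowercase form. The following is a minimal sketch, assuming the all-lowercase URL is the preferred version. The function names are hypothetical, and this is not a description of how the Yoast extension works internally.

```python
import html
import string
from urllib.parse import urlsplit, urlunsplit


def case_variants(path):
    """Count the distinct upper/lower-case spellings of an ASCII path."""
    letters = sum(1 for ch in path if ch in string.ascii_letters)
    return 2 ** letters


def canonical_url(url):
    """Return url with its path lowercased (assumed to be the preferred form)."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path.lower(),
                       parts.query, parts.fragment))


def canonical_link_tag(url):
    """Build the <link rel="canonical"> tag every case variant would carry."""
    href = html.escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}" />'
```

For example, `case_variants("/destiny")` is 128, yet every one of those spellings would emit the same tag pointing at `/destiny`.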
-
Looks like I do need some more help.
I get a redirect loop if I enter a redirect from
http://www.madebysurvivors.com/DESTINY
to
http://www.madebysurvivors.com/destiny
but I checked and there is no redirect the other way in our database or .htaccess.
If I leave the redirect off I get duplicate content, but in the CMS parts of Magento there is only one table for this page.
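One possible explanation for the loop, offered as a guess rather than a confirmed diagnosis: if the redirect lookup compares paths case-insensitively (as described earlier in the thread for Magento's URL matching), then a rule from `/DESTINY` to `/destiny` also matches `/destiny` itself, so the target redirects to itself forever. The sketch below models that with a hypothetical in-memory redirect table; `find_redirect` and `follow` are illustrative names, not Magento functions.

```python
def find_redirect(path, table, case_sensitive):
    """Return the redirect target for path, or None if no rule matches."""
    for source, target in table.items():
        if case_sensitive:
            matched = path == source
        else:
            matched = path.lower() == source.lower()
        if matched:
            return target
    return None


def follow(path, table, case_sensitive, max_hops=5):
    """Follow redirects from path; return (hops, looped)."""
    hops = [path]
    for _ in range(max_hops):
        target = find_redirect(hops[-1], table, case_sensitive)
        if target is None:
            return hops, False
        hops.append(target)
    # Still being redirected after max_hops: treat it as a loop.
    return hops, True
```

With the table `{"/DESTINY": "/destiny"}`, case-sensitive matching stops after one hop, while case-insensitive matching never stops. If that is what is happening, the redirect would need to be applied somewhere that compares the path exactly.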
-
I actually moved all the content from a Drupal install, so I don't have that many URLs with the problem. It looks like the faster way to do this is just to redirect the caps to lower case, as that's what we use elsewhere.
I dug into the underlying database and can't find any duplicate entries for these pages or odd redirects, so I have no idea of the cause.
For some of the pages I think you are right that Magento is moving caps down to lower, but there are a few others where it is lower to caps, though those were caps on the Drupal site.
Anyway, good to know Google sees them differently, so I'll put in redirects. It's only about 20 pages.
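For a short, known list like this, the redirect lines can be generated rather than typed. This is a rough sketch under stated assumptions: the output follows the `Redirect 301 <path> <URL>` form of Apache's mod_alias, the paths contain no spaces, and `redirect_lines` is a hypothetical helper. Whether server-level redirects behave well alongside Magento's own URL handling would still need testing on the actual site.

```python
def redirect_lines(paths, base_url):
    """Build one 'Redirect 301' line for each path that contains capitals."""
    base = base_url.rstrip("/")
    lines = []
    for path in paths:
        lowered = path.lower()
        # Paths that are already lowercase need no redirect.
        if lowered != path:
            lines.append(f"Redirect 301 {path} {base}{lowered}")
    return lines
```

Passing in the list of affected paths and the site's base URL returns the lines to review before adding them to the server configuration.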
-
Hello John,
If you can provide us with a URL, we might be able to dig in and see what is going on. Without it, it's almost impossible to tell. Also, it doesn't matter whether you have a decent number of inbound links; duplicate content only refers to pages with similar content. I'm not familiar with the Magento platform, so this is just a guess: when you originally created (or imported) pages or categories in Magento, were they lowercased? If not, it's possible that Magento added them in all caps and is now forcing them to lower case, which would give you duplicates. Once again, this is just a guess, and without a URL to your site I doubt that anyone will be able to help you further.
-
www.url.com/abc and www.url.com/ABC are two completely different pages according to Google.
I would redirect any and all pages with capitals to the corresponding lowercase URLs.
Don't worry about the link juice, as it will pass over via the redirect. It will also be much better than having two identical pages competing with each other (according to Google).
Greg