Domain.com and domain.com/index.html duplicate content in reports even with rewrite on
-
I have a site that was recently hit by the Google Penguin update and dropped back a page in the results. When I run the site through the SEOmoz tools, the reports keep flagging domain.com and domain.com/index.html as duplicate content, even though I have a 301 rewrite rule in place. When I test the site, domain.com/index.html redirects to domain.com for the root and for all directories. I don't understand how my index page can still get flagged as duplicate content.
I also have a redirect from domain.com to www.domain.com.
Is there anything else I need to do or add to my .htaccess file?
Appreciate any clarification on this.
-
Hello Anthony,
Saw this still open.
If your index.html "Rewrite" code is accurate, could the issue be WWW, i.e. http://www.domain.com vs. http://domain.com?
# Send requests for the bare domain to the www host with a permanent (301) redirect
RewriteCond %{HTTP_HOST} ^domain\.com$ [NC]
RewriteRule ^(.*)$ http://www.domain.com/$1 [R=permanent,L]
-
I checked one of your campaigns, and it does seem like the 301 redirect is working properly. I'm also not seeing any evidence of links to the "index.htm" version or other issues, and I don't see evidence of both versions in Google's index. I'm not sure exactly what's going on here, but I'll run it by the support team. I don't think you have cause for concern.
-
Thank you for the feedback and help.
I have looked up URL removal in Webmaster Tools, and it states that the page must be removed from the site. If I remove index.html I won't have a home page. Am I understanding you correctly? Here's what Google states on URL removal.
To remove a page or image, you must do one of the following:
- Make sure the content is no longer live on the web. Requests for the page must return an HTTP 404 (not found) or 410 status code.
- Block the content using a robots.txt file.
- Block the content using a meta noindex tag.
Please clarify when you get a moment.
I would have thought the htaccess 301 redirects from www.domain.com/index.html to www.domain.com would be enough.
Thank you in advance.
-
a) Request removal of the /index.html URL in Webmaster Tools and it will drop out of Google's index quickly.
b) Make sure that when you link to your homepage on your site you are not linking to the /index.html URL. I bet you are somewhere. Do a sitewide search in Dreamweaver to find all instances and do a global replace.
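If Dreamweaver isn't handy, the sitewide search in (b) can be roughed out with a short script. This is only a sketch in Python: the function names `find_index_links` and `scan_site` are made up for illustration, it assumes the site is a folder of static .html files, and the pattern only catches plain href attributes (not links built by JavaScript).

```python
import re
from pathlib import Path

# Matches href values whose path ends in index.html, optionally followed by a
# query string or fragment. Requiring a "/" (or the start of the value) right
# before "index.html" avoids false hits such as "myindex.html".
INDEX_LINK = re.compile(
    r'href\s*=\s*["\']((?:[^"\']*/)?index\.html)(?=["\'?#])',
    re.IGNORECASE,
)


def find_index_links(html):
    """Return every href value in `html` that points at an index.html URL."""
    return INDEX_LINK.findall(html)


def scan_site(root):
    """Map each .html file under `root` to the index.html links it contains."""
    hits = {}
    for path in Path(root).rglob("*.html"):
        text = path.read_text(encoding="utf-8", errors="ignore")
        links = find_index_links(text)
        if links:
            hits[str(path)] = links
    return hits
```

Calling `scan_site(".")` from the site's root folder returns a dictionary of files and the offending links in each, which can then be fixed by hand or with a global replace.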
-
It could take a little time. I did some redirects myself earlier this year, but the old pages are still in Google's index.
Maybe someone else can confirm that it can take a little time before the old pages are dropped from Google's index?
-
HTTP/1.1 301 Moved Permanently
Date: Tue, 08 May 2012 13:44:26 GMT
Server: Apache/2.0.52 (CentOS)
Location: http://www.domain.com/
Content-Length: 330
Connection: close
Content-Type: text/html; charset=iso-8859-1
-
Did you verify with a tool like http://www.webconfs.com/http-header-check.php that you get a 301 redirect?
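The same check can also be done locally instead of through an online tool. Below is a minimal sketch using Python's standard `http.client`, which does not follow redirects on its own, so the first response is what you see. The helper names `fetch_status` and `is_clean_301` are hypothetical, and www.domain.com is just the placeholder host used in this thread.

```python
import http.client
from typing import Optional, Tuple
from urllib.parse import urlsplit


def fetch_status(url: str, timeout: float = 10.0) -> Tuple[int, Optional[str]]:
    """Send one HEAD request and return (status code, Location header or None)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def is_clean_301(status: int, location: Optional[str], expected: str) -> bool:
    """True only for a permanent redirect that points exactly at `expected`."""
    return status == 301 and location == expected
```

For example, `status, location = fetch_status("http://www.domain.com/index.html")` followed by `is_clean_301(status, location, "http://www.domain.com/")` should come back True if the redirect behaves as described in this thread. A 302, or a 301 pointing somewhere else, comes back False.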