Why does my crawl diagnostics show duplicate content
-
My crawl diagnostics show duplicate content at mysite.com and mysite.com/index.html which are essentially the same file.
-
Michel is right - Google doesn't care that they're one template - if both URLs are being crawled, then they'll see that as two "pages". Every unique, crawlable URL can become an indexed page. That's why duplicate content problems are so common.
The good news is that you can put a canonical tag on just the one template/file and it will cover all of the paths/URLs that land on that file. The tag goes in your section and looks like:
I'd check the internal links, though, and see if you're linking to both versions. It's best to use one, consistent URL in your internal links for any given page.
-
mysite.com is a domain not a file with mysite.com/index.html being the home page. Not sure how I would do what you suggest.
-
If the crawl report found those two URLs, then your website has at least one link to each of those URLs (otherwise Rogerbot wouldn't have found them).
You should follow Collin's advice to define the canonical page.
It also won't hurt to figure out where those links are being used in your content, and then make sure you only use one to point to your page.
Cheers
Michel
-
"Essentially" the same file isn't the same as "the same file." Your best bet is probably to mark one of them (probably mysite.com) with rel=canonical.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Pages with Duplicate Content Error
Hello, the result of renewed content appeared in the scan results in my Shopify Store. But these products are unique. Why am I getting this error? Can anyone please help to explain why? screenshot-mz.bundledseo.com-2021.10.28-19_53_09.png
Moz Pro | | gokimedia0 -
Moz and HubSpot SSL - crawl error?
I'm getting an error message when Moz tries to crawl my site, however when I check in Google Search Console, they return no errors. Our site is hosted on HubSpot. Is Moz still having trouble crawling HubSpot sites that have enabled their SSL? I read an article that this should have been corrected in early 2017, but I'm getting an error.
Moz Pro | | jennygriffin0 -
How are you using Moz Content?
Hey people, I just subscribed for Moz Content and I am wondering how are other professionals using it as a strategy tool. Today I just released a blog post talking about how larger content impacts PA. Not a big deal. I would appreciate some ideas and insights.
Moz Pro | | amirfariabr0 -
404 Crawl Diagnostics with void(0) appended to URL
Hello I am getting loads of 404 reported in my Crawl report, all appended with void(0) at the end. For example: http://lfs.org.uk/films-and-filmmakers/watch-our-films/1289/void(0)
Moz Pro | | moshen
The site is running on Drupal 7, Has anyone come across this before? Kind Regards Moshe | http://lfs.org.uk/films-and-filmmakers/watch-our-films/1289/void(0) |0 -
Duplicate Page Content, Indexing and Rel Canonical Just DOUBLED! Need Advice to Fix
Last Friday (Penguin 5/2.1) my website shot way off the grid and I noticed in my MOZ PRO Campaign dashboard that all of the following just doubled in numbers on my website: duplicate page content, Google indexing, and rel canonicals. I also noticed that some of my pages, images, tags and categories now added a /page/2/ or a -2. I just changed noindex for tags, but indexing for media, pages, posts, and categories. I'm currently using All In One SEO for a plugin. Any advice would be much appreciated as I'm stuck on the issue. relconical.png Duplicate-Page-Content.png [Duplicate Content II](Duplicate Content II) index1.png
Moz Pro | | CelebrityPersonalTrainer0 -
Why do I see a duplicate content errors when rel="canonical" tag is present
I was reviewing my first Moz crawler report and noticed the crawler returned a bunch of duplicate page content errors. The recommendations to correct this issue are to either put a 301 redirect on the duplicate URL or use the rel="canonical" tag so Google knows which URL I view as the most important and the one that should appear in the search results. However, after poking around the source code I noticed all of the pages that are returning duplicate content in the eyes of the Moz crawler already have the rel="canonical" tag. Does the Moz crawler simply not catch whether that tag is being used? If I have that tag in place, is there anything else I need to do in order to get that error to stop showing up in the Moz crawler report?
Moz Pro | | shinolamoz0 -
Dot Net Nuke generating long URL showing up as crawl errors!
Since early July a DotNetNuke site is generating long urls that are showing in campaigns as crawl errors: long url, duplicate content, duplicate page title. URL: http://www.wakefieldpetvet.com/Home/tabid/223/ctl/SendPassword/Default.aspx?returnurl=%2F Is this a problem with DNN or a nuance to be ignored? Can it be controlled? Google webmaster tools shows no crawl errors like this.
Moz Pro | | EricSchmidt0 -
"Issue: Duplicate Page Content " in Crawl Diagnostics - but these pages are noindex
Hello guys, our site is nearly perfect - according to SEOmoz campaign overview. But, it shows me 5200 Errors, more then 2500 Pages with Duplicate Content plus more then 2500 Duplicated Page Titles. All these pages are sites to edit profiles. So I set them "noindex, follow" with meta robots. It works pretty good, these pages aren't indexed in the search engines. But why the SEOmoz tools list them as errors? Is there a good reason for it? Or is this just a little bug with the toolset? The URLs which are listet as duplicated are http://www.rimondo.com/horse-edit/?id=1007 (edit the IDs to see more...) http://www.rimondo.com/movie-edit/?id=10653 (edit the IDs to see more...) The crawling picture is still running, so maybe the errors will be gone away in some time...? Kind regards
Moz Pro | | mdoegel0