The Bible and Duplicate Content
-
We have our complete set of scriptures online, including the Bible, at http://lds.org/scriptures. Users can browse to any of the volumes of scripture. We've improved the user experience by allowing users to link to specific verses in context: the page scrolls to and highlights the linked verse. However, this creates a significant amount of duplicate content. For example, these links:
http://lds.org/scriptures/nt/james/1.5
http://lds.org/scriptures/nt/james/1.5-10
http://lds.org/scriptures/nt/james/1
All of those link to the same chapter in the book of James, yet the first two highlight verse 5 and verses 5-10 respectively. This is a good user experience because, in other sections of our site and on blogs throughout the world, webmasters link to specific verses so the reader can see the verse in the context of the rest of the chapter.
Another Bible site has a separate HTML page for each individual verse and tends to outrank us because of this (and possibly some other reasons) for long-tail chapter/verse queries. However, our tests indicated that users prefer our current version.
We have a sitemap ready to publish which includes a URL for every chapter/verse. We hope this will improve indexing of some of the more popular verses. However, Googlebot is going to see some duplicate content as it crawls that sitemap!
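To make the sitemap question concrete, here is a minimal sketch of how a sitemap like that could be generated. It is only an illustration: the function name `build_sitemap` is made up for this example, and the only thing taken from the sitemaps.org protocol is the `urlset`/`url`/`loc` structure and its namespace.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# The three example URLs from this post all serve the same chapter text,
# which is exactly the duplication Googlebot would see in the sitemap.
sitemap_xml = build_sitemap([
    "http://lds.org/scriptures/nt/james/1.5",
    "http://lds.org/scriptures/nt/james/1.5-10",
    "http://lds.org/scriptures/nt/james/1",
])
print(sitemap_xml)
```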
So the question is: is the sitemap a good idea, given that we can't go back to putting each chapter/verse on its own unique page? We are also going to recommend that we create unique titles for each of the verses and pass a portion of the verse text into the meta description. Will this perhaps be enough to satisfy Googlebot that the pages are in fact unique? They certainly are from a user perspective.
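A rough sketch of what the per-verse titles and meta descriptions might look like is below. The function names, the "Book Chapter:Verse" title format, and the 155-character cutoff are all assumptions made for illustration, not how the site actually builds its pages.

```python
from html import escape


def meta_snippet(verse_text, max_len=155):
    """Collapse whitespace and trim the verse text to at most max_len characters."""
    snippet = " ".join(verse_text.split())
    if len(snippet) > max_len:
        # Cut at a word boundary and leave room for the trailing "...".
        snippet = snippet[:max_len - 3].rsplit(" ", 1)[0] + "..."
    return snippet


def verse_head_tags(book, chapter, verses, verse_text):
    """Return (title_tag, meta_description_tag) for a verse or verse range."""
    reference = f"{book} {chapter}:{verses}"
    title_tag = f"<title>{escape(reference)}</title>"
    meta_tag = (
        '<meta name="description" '
        f'content="{escape(meta_snippet(verse_text), quote=True)}">'
    )
    return title_tag, meta_tag


title_tag, meta_tag = verse_head_tags(
    "James", 1, "5", "If any of you lack wisdom, let him ask of God"
)
print(title_tag)
print(meta_tag)
```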
Thanks all for taking the time!
-
Dave,
Thanks for the clarification. You're definitely in a rare circumstance as compared to most web sites.
In reality, since it's the Bible, there is going to be a duplicate content issue regardless, given how many sites publish the same content now and how many more most likely will in the future. Eternalministries.org, KingJamesBibleOnline.org, concordance.biblebrowser.com, and many other sites all offer this content.
If you can find a way to offer your content in a unique way, and within your own site, offer different versions of it (individual verses compared to entire chapters), then ideally yes, you'd want it all indexed.
How you do that without adding your own unique text above or below each page's direct biblical content is the issue though.
Given this challenge, I offered the concept of not indexing variations. Even if you weren't hit by the Panda update, any time Google has to evaluate multiple pages across sites where the content is either identical or "mostly" identical, someone's content is going to suffer to one degree or another. Any time it's a conflict within a single site, some versions are going to be given less ranking value than others.
So unfortunately it's not a simple, straightforward situation where avoiding duplication is guaranteed to provide the maximum reach, nor is there a simple way to boost multiple versions so that they're all guaranteed to be found, let alone show up above "competitor" sites.
This is why I initially offered what are essentially SEO best practices for addressing duplicate content.
If you don't want to lose the traffic you have now that comes in by multiple means, the only other way to bolster what you've already got is to focus on high quality, long-term link building and social media.
The link building would need to focus on obtaining high quality links pointing to deep content (specific chapter pages and specific verse pages), where the anchor text used in those links varies between chapter- or verse-specific words, broader Bible-related phrases, and the LDS brand.
On the other hand, by implementing canonical tags, you will definitely lose at least some of the visits that currently come in through variation URLs. Will that be compensated for by an equal or greater number of visits to the new "preferred" URL? In this rather unique situation there's no way to truly know. It is a risk.
Which brings me back to the concept that you'd potentially be better off finding ways to add truly unique content around the biblical entries. It's the only on-site method I can think of that would allow you to continue to have multiple paths indexed. Combined with unique page Titles, chapter/verse targeted links and social media, it could very well make the difference.
With, what, over 1,100 chapters and 31,000 verses, that's a lot of footwork. Then again, it's a labor of love, and every journey is made up of thousands of steps.
-
So you're saying it would not be a good idea to try to get every verse URL listed in Google? Perhaps we could try adding a canonical tag that points to the chapter only? For example, when browsing the site you can't actually navigate to http://lds.org/scriptures/nt/james/1.5?lang=eng. You can only navigate to /james/1?lang=eng. However, the other URLs work when someone links externally to a specific chapter and verse. The code on the page will highlight the desired verse. In our example the entire chapter exists on its own URL and the content is unique.
Your suggestion may work if we just canonicalize all those "verse" URLs like /james/1.5?lang=eng and /james/1.5-10?lang=eng to /james/1?lang=eng. Some of the more popular verses with great page authority could actually help prop up the rest of the content on the page.
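As a sketch of that canonicalization, something like the following would map a verse or verse-range URL to its chapter URL and emit the matching tag. It assumes the verse part is always a trailing ".N" or ".N-M" on the path, as in the examples above; the function names are hypothetical.

```python
import re
from html import escape
from urllib.parse import urlsplit, urlunsplit

# Trailing verse or verse range on the path, e.g. ".5" or ".5-10".
VERSE_SUFFIX = re.compile(r"\.\d+(?:-\d+)?$")


def chapter_canonical(url):
    """Strip the verse suffix from the path, keeping the query string as-is."""
    parts = urlsplit(url)
    path = VERSE_SUFFIX.sub("", parts.path)
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))


def canonical_link_tag(url):
    """Build the <link rel="canonical"> tag for the page served at url."""
    href = escape(chapter_canonical(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


print(canonical_link_tag("http://lds.org/scriptures/nt/james/1.5-10?lang=eng"))
```

Chapter URLs pass through unchanged, so the same function could be called on every scripture page.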
My concern though is that MUCH of the scripture related traffic comes through queries of the exact chapter/verse reference. So I can see where having individual pages for each passage could be valuable for rankings. But that user experience is poor when someone wants to see a range of passages like ch 5 vs 1-4 or similar. So we are looking for the best way to get our URLs indexed and ranked as individual passages or ranges of passages that are popular on search engines.
I can tell you that this section was not hit by the Panda update. The content is not "thin" as could be the case if we put each verse on a single page.
The ?lang=eng parameter is how we handle language versions. We have the scriptures online in several languages. I'm sure there are better ways to handle that as well. Due to the size of the organization we're certainly trying to get the low-hanging fruit out of the way first.
-
Dave,
You're facing a difficult challenge - satisfy the needs of SEO, or user experience. In light of all that Google has done going back to their May Day update last year and right through the Panda/Farmer update, duplicate content, as well as "thin" content, is more of a concern than ever.
Just having unique titles on each page is not enough. It's the entire weight of uniqueness.
Since you're not intending to go to individual pages for each verse, and you've got multiple methods of getting to the same content, only one method should be designated as the primary, search-engine-preferred method. All others should be blocked from being indexed.
From there, users can choose to explore other methods of finding content as they bookmark your site if they find it of help to their goals.
Unfortunately, this does, of course, mean that you're going to end up with far fewer pages indexed. However, every page that is indexed will become stronger in its individual rankings, and that in turn will boost all of the pages above it, and the entire site over time.
And here's another issue: when I go to any of the URLs you posted above, your site automatically tacks on "?lang=eng" using 301 redirects. This means any inbound links you have pointing to the non-appended URLs are not providing maximum value to your site, since they point to pages designated as permanently moved.
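For the links you do control (internal links, sitemap entries), one way to avoid that extra redirect hop is to emit the final URL up front. A minimal sketch, assuming "lang" is the only parameter involved and "eng" is the default; the helper name `with_lang` is made up for this example:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_lang(url, lang="eng"):
    """Return url with a lang parameter added, unless one is already present."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    if not any(key == "lang" for key, _ in query):
        query.append(("lang", lang))
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(query), parts.fragment)
    )


print(with_lang("http://lds.org/scriptures/nt/james/1.5"))
```

This does nothing for links on other people's sites, which will still go through the 301.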