PDFs - Dupe Content

Dan-Lawrence

Hi

I have some PDFs linked to from a page with little content, so I'm thinking it's best to extract the copy from the PDF and put it on the page as body text, with the PDF still linked to. Will this count as dupe content?

Or is it best to use a PDF plugin so the page opens the PDF automatically and gets its content that way?

Cheers

Dan

CleverPhD

Should be different, but you would have to look at them to make sure.

Dan-Lawrence

PS - is a PDF-to-HTML converter different from a plugin that loads the PDF as an open page when you click it? Or is it the same thing?

CleverPhD

That is what I was going to suggest: setting up a canonical in the HTTP header of the PDF that points back to the article.

https://support.google.com/webmasters/answer/139394?hl=en
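For anyone who wants to see what that header looks like, here is a minimal Python sketch. The function name and the example.com URLs are made up for illustration; only the header format (`Link: <url>; rel="canonical"`) comes from the Google help page above. How you attach it to the PDF response depends on your server or CMS, so treat this as a sketch rather than a drop-in solution.

```python
def canonical_link_header(html_url: str) -> tuple[str, str]:
    """Build the HTTP header that points a PDF at its HTML equivalent.

    html_url is the page carrying the same copy as the PDF
    (the URL used below is a hypothetical example).
    """
    if not html_url.startswith(("http://", "https://")):
        raise ValueError("canonical target should be an absolute URL")
    return ("Link", f'<{html_url}>; rel="canonical"')


# Header to send along with the response for a PDF such as
# /downloads/white-paper.pdf (hypothetical path)
name, value = canonical_link_header("https://www.example.com/white-paper/")
print(f"{name}: {value}")
```

The header goes on the PDF's response, not on the HTML page, and the target should be the page that holds the extracted copy.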

As another option, you can just block access to the PDFs to keep them out of the index as well.
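One way to do this, and it is an assumption that this is the kind of blocking meant here, is an `X-Robots-Tag: noindex` response header, since a PDF cannot carry a meta robots tag. The sketch below only decides which extra headers a given path should get; the function name and the rule that PDFs are identified by a `.pdf` extension are illustrative assumptions.

```python
def extra_headers_for(path: str) -> list[tuple[str, str]]:
    """Return extra response headers for a requested path.

    Assumption: PDFs are identified purely by a .pdf extension.
    """
    headers: list[tuple[str, str]] = []
    if path.lower().endswith(".pdf"):
        # X-Robots-Tag is the HTTP-header counterpart of a robots meta tag
        headers.append(("X-Robots-Tag", "noindex"))
    return headers


print(extra_headers_for("/downloads/white-paper.pdf"))
print(extra_headers_for("/white-paper/"))
```

The crawler has to be able to fetch the PDF to see this header, so the file should not also be disallowed in robots.txt.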

Dan-Lawrence

Thanks Chris

Yes, you can canonicalise the PDF to the HTML (according to the comments on that article I just linked to, anyway).

GPainter

Hi Dan,

Yes, PDFs are crawlable (sorry for the confusion!). If you were to put it into, say, a .zip or .rar (or similar) it wouldn't be crawled, or you could noindex the link, I guess. You would need to put the PDF (download) behind something that couldn't be crawled. You could try rel=canonical, but I've never tried it with a PDF so I'm not sure how that would go.

Hope that enlightens you a bit.

Dan-Lawrence

Thanks Chris, although I thought PDFs were crawlable? http://www.lunametrics.com/blog/2013/01/10/seo-pdfs/

Hence why I'm worried about dupe content if I use the content of the PDF as body text too. Or are you saying I should nofollow the link to the PDF if I use its content as body text, because it is considered dupe content in that scenario?

Ideally I want both: the copy used as body text on the page and the PDF as a linkable download, or the page as an embed of the open PDF via a plugin.

GPainter

What would give the user the best experience is the real question. I would say put it on the page, then if the user is lacking a plugin they can still read it. If you have it as a downloadable PDF it shouldn't be able to get crawled, thus avoiding the problem.

Hope that helps.
