PDFs - Dupe Content
-
Hi
I have some PDFs linked to from a page with little content, hence I'm thinking it's best to extract the copy from the PDFs and have it on-page as body text; the PDFs will still be linked to. Will this count as duplicate content?
Or is it best to use a PDF plugin so the page opens the PDF automatically and hence provides the page content that way?
Cheers
Dan
-
Should be different, but you would have to look at them to make sure.
-
PS - is a PDF-to-HTML converter different from a plugin that loads the PDF as an open page when you click it? Or is it the same thing?
-
That is what I was going to suggest: setting up a canonical in the HTTP header of the PDF, pointing back to the article.
https://support.google.com/webmasters/answer/139394?hl=en
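For anyone wondering what that looks like in practice, the canonical can be added as a Link HTTP header at the server level. Here's a minimal sketch for Apache (requires mod_headers to be enabled); the filename and page URL are hypothetical placeholders, not Dan's actual paths:

```apache
# Point the PDF's canonical at the HTML page carrying the same copy.
# Requires mod_headers; filename and URL below are example placeholders.
<Files "white-paper.pdf">
  Header add Link "<https://www.example.com/white-paper-page/>; rel=\"canonical\""
</Files>
```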
As another option, you can just block access to the PDFs to keep them out of the index as well.
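If you go the blocking route, an X-Robots-Tag response header is one way to keep crawlable PDFs out of the index without hiding the files from users. A minimal Apache sketch, assuming mod_headers is enabled and that every PDF on the site should be excluded:

```apache
# Ask search engines not to index any PDF on the site.
# Requires mod_headers; narrow the pattern if only some PDFs apply.
<FilesMatch "\.pdf$">
  Header set X-Robots-Tag "noindex"
</FilesMatch>
```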
-
Thanks Chris.
Yes, you can canonicalise the PDF to the HTML (according to the comments on that article I just linked to, anyway).
-
Hi Dan,
Yes, PDFs are crawlable (sorry for the confusion!). If you were to put one into, say, a .zip or .rar (or similar) it wouldn't be crawled, or you could noindex the link, I guess. You would need to put the PDF (download) behind something that can't be crawled. You could try rel=canonical, but I've never tried it with a PDF so I'm not sure how that would go.
Hope that enlightens you a bit.
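On the "behind something that can't be crawled" point above, the simplest version is a robots.txt rule; the /downloads/ folder here is a hypothetical location, not one from the thread:

```
# Hypothetical: stop crawlers fetching anything under /downloads/
User-agent: *
Disallow: /downloads/
```

One caveat: robots.txt only blocks crawling, so a blocked PDF can still appear in results if other pages link to it. A noindex X-Robots-Tag header (which requires the file to remain crawlable) is generally the more reliable way to keep it out of the index.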
-
Thanks Chris, although I thought PDFs were crawlable? See: http://www.lunametrics.com/blog/2013/01/10/seo-pdfs/
Hence why I'm worried about duplicate content if I use the content of the PDF as body text too. Or are you saying I should nofollow the link to the PDF if I use its content as body text, because it would be considered duplicate content in that scenario?
Ideally I want both: the copy used as body text on the page, and the PDF as a linkable download (or the page as an embed of the open PDF via a plugin).
-
What would give the user the best experience is the real question. I would say put it on the page; then, if the user is lacking a plugin, they can still read it. If you have it as a downloadable PDF that can't get crawled, you avoid the problem entirely.
Hope that helps.