I have a page where you can download a PDF of the material - should I exclude the PDF from the search engines?
-
In my niche, there is a controversial research article that is very popular. I am writing a rebuttal to this article and giving another point of view.
My article has the potential to be really good link bait for my site.
The original article is often printed out to be shown to professionals in my niche. My hope is that people will do the same with mine. So, I plan to have a PDF version of my article available on my page. The article that is visible on my site (i.e. the non-PDF version) will be a graphic-rich article that is easy for the reader to go through. The PDF will have all of the same text, but it won't have as many graphics - it will look more like a scientific research article.
So, should I exclude the PDF from search engines so that it isn't duplicate content? Or does that even matter, seeing as it is a duplicate of my own content? I want people to link to the main article, not the PDF.
Any tips would be greatly appreciated!
-
Thank you! This is exactly the kind of information I needed!
I was thinking of contacting the webmasters who published the original article to tell them about mine. But now, perhaps what I will do is not just contact them but also attach a copy of the PDF for them to use.
-
Do not exclude.
People will link to it.
PDF documents can rank in the SERPs if you complete the properties portion of the document. The title in the properties will serve as a title tag for Google SERPs.
PDF documents can accumulate PageRank and pass that PageRank through any links in the PDF document. (Be sure to place a few links to your website in the PDF, because .pdf, .ppt, .xls and many other file types show up as backlinks in my Google Webmaster Tools.)
Encourage other webmasters to download your pdf and post it on their server and link to it from their website. That will give you backlinks from their domain. You can get a kickass number of backlinks from this. (I usually don't advocate giving content away but I have seen success from "whitepapers" like this. You might consider offering them a "branded" copy of the document to post on their own site - you would add their branding for them.)
It's a good idea to lock the .pdf document so that others can't change it. They can always make their own document from your content, but don't make it too easy for them.
I have used .pdfs and have not seen a duplicate content problem from them. However, the content of the pdf is not exactly the same as what is on an .html page of my site. It sounds like you are planning to have richer content on your site than in the .pdf so I would not worry about dupe content. Just be sure that there is a significant difference.
-
I don't think there's a problem with hosting the PDF. Just make sure you've got strong branding in the PDF and links back to your online article. People will most likely pass your PDF around to others and you want them to come visit the source --> YOU.