PDFs and webpages
-
If a website provides PDF versions of the page as a download option, should the PDF be no-indexed in your opinion?
We have to offer PDF versions of each webpage because our customers want them; they are a group who will download and print the PDFs. I had thought of leaving the PDFs alone as they sit on a subdomain, but the more I think about it, the more I think I should noindex them. My reasons:
- They sit on a subdomain, so if users have linked to them, my main domain isn't getting the rank juice
- Duplication issues: they might be affecting the rank of the existing webpages
- I can't track the PDFs as they are on a subdomain, though I can see event clicks to them from the main site
On the flip side:
- I could lose the traffic the PDFs bring when a user opens one from organic search, along with any links in the PDF
What are your experiences?
-
Cool. It's advisable to add canonical HTTP headers to the PDFs too, if you can.
-
Thanks Alex,
I do have canonical tags on the webpages to ensure they are seen as the main one. I'll look into tracking subdomains.
-
Google now treats subdomains as pretty much part of your main domain: http://www.youtube.com/watch?v=_MswMYk05tk - so you will be getting some of that rank juice.
I'd think that the major search engines wouldn't have a problem knowing that an HTML version of a page is preferred over a PDF. However, you can use canonical HTTP headers to make sure there are no problems with duplicate content: http://moz.com/blog/how-to-advanced-relcanonical-http-headers
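As a rough illustration of the canonical HTTP header mentioned above, here is a minimal Python sketch that builds the Link header a server could send with each PDF response. The hostnames, the path layout, and the assumption that each PDF URL maps to its HTML page by dropping the ".pdf" extension are all hypothetical; adjust them to however your own PDFs are named.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: hypothetical main-site hostname; replace with your own.
MAIN_HOST = "www.example.com"


def html_url_for_pdf(pdf_url: str) -> str:
    """Map a PDF URL on the subdomain to its HTML page on the main domain.

    Assumes (hypothetically) that the HTML page has the same path as the
    PDF, minus the ".pdf" extension.
    """
    parts = urlsplit(pdf_url)
    path = parts.path
    if path.lower().endswith(".pdf"):
        path = path[:-4]  # strip the ".pdf" suffix
    return urlunsplit((parts.scheme, MAIN_HOST, path, "", ""))


def canonical_link_header(pdf_url: str) -> tuple[str, str]:
    """Return the (name, value) pair for a rel="canonical" Link header."""
    return ("Link", f'<{html_url_for_pdf(pdf_url)}>; rel="canonical"')


if __name__ == "__main__":
    name, value = canonical_link_header(
        "https://pdf.example.com/products/widget.pdf"
    )
    print(f"{name}: {value}")
```

The header pair would be attached to the PDF response by whatever serves the files (web server config or application code); the sketch only shows the shape of the header value.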
If you use Google Analytics you will be able to track the subdomain. You can do it as part of your existing profile or by setting up a separate one: https://developers.google.com/analytics/devguides/collection/gajs/gaTrackingSite (ensure this is the version of Analytics you have installed).
There's a short guide here on getting more data about PDFs through Google Analytics: http://moz.com/ugc/how-to-track-pdf-traffic-links-in-google-analytics-open-site-explorer
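On the noindex option raised in the original question: a PDF has no HTML head to hold a robots meta tag, so the directive is normally sent as an X-Robots-Tag response header instead. A minimal Python sketch, assuming a hypothetical helper that your server code calls to pick extra headers for a requested path:

```python
def robots_headers_for(path: str) -> dict[str, str]:
    """Return extra response headers for a request path.

    Hypothetical helper: PDFs get a noindex directive via X-Robots-Tag,
    everything else gets no extra headers.
    """
    if path.lower().endswith(".pdf"):
        return {"X-Robots-Tag": "noindex"}
    return {}


if __name__ == "__main__":
    print(robots_headers_for("/products/widget.pdf"))
    print(robots_headers_for("/products/widget"))
```

Two caveats: the crawler has to be able to fetch the PDF to see the header, so the files must not be blocked in robots.txt, and noindex and a canonical header are generally treated as alternatives, so it is safer to pick one approach rather than send both.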