PDF best practices: to get them indexed or not? Do they pass SEO value to the site?
-
All our PDFs have landing pages, and those pages are already indexed. If we allow the PDFs to get indexed, they'd be downloadable directly from Google's results page and we would not get GA events.
The PDFs' info would somewhat overlap with the landing pages' info. Also, if we ever need to move content, we'd then have to redirect the links to the PDFs.
What are best practices in this area? To index or not?
What do you / your clients do and why?
Would a PDF indexed by Google and downloaded directly via a link in the SERP pass SEO juice to the domain? What if it's on a subdomain, like when hosted by Pardot (www1.example.com)?
-
I have repeatedly noticed that Google indexes PDF files, but only their headers, without the contents of the file itself.
To format the file description correctly, you can use the PDF Architect (http://pdf-architect.ideaprog.download/) program, or any other that is convenient for you.
-
PDFs can be canonicalized using .htaccess. Google is usually very slow to discover and obey this but it can be done. However, if your PDF is not close to being an exact copy of the target page, Google will probably not honor the canonicalization and they will index the PDF and the html page separately.
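For reference, canonicalizing a PDF from the server side comes down to sending an HTTP `Link` header with `rel="canonical"` alongside the file. The sketch below only builds that header in Python so the format is visible; it does not configure Apache or .htaccess, and the function name and the example.com URL are made up for illustration.

```python
from typing import Tuple


def canonical_link_header(canonical_url: str) -> Tuple[str, str]:
    """Build the (name, value) pair of an HTTP rel="canonical" Link header.

    Hypothetical helper: the server serving the PDF would attach this
    header to the PDF response, pointing at the HTML landing page.
    """
    if not canonical_url.startswith(("http://", "https://")):
        raise ValueError("canonical URL should be absolute")
    return ("Link", f'<{canonical_url}>; rel="canonical"')


# Example with an assumed landing page URL
name, value = canonical_link_header("https://www.example.com/whitepaper/")
print(f"{name}: {value}")
```

Whatever mechanism sets the header (.htaccess, application code, a CDN rule), the value sent with the PDF would need to look like the string built here.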
PDFs can be optimized (given a title tag) by editing the properties of the document. Most PDF-making software has the ability to do this.
You can insert "buy buttons" and advertising in PDFs. Just make an image, paste it into the document and link it to your shopping cart or to your target document.
PDFs accumulate link juice and pass it to other documents.
Use the same strategies with PDFs as you would with an HTML page for directing visitors where you want them to go and getting them to do what you want them to do.
Some people will link to your PDF; others will grab your PDF and place it on their website. In that situation you lose the canonical but still get juice from any embedded links, and you benefit from any ads and buttons that are included. Lock the PDF with your PDF-creating software to prevent people from editing it (but they can always copy/paste to get around that).
Other types of documents such as Excel spreadsheets, PowerPoint documents, Google images, etc. can have embedded text, embedded links and other features that are close to equivalent to an HTML document.
-
PDF documents aren't written in HTML, so you can't put canonical tags into PDFs. So that won't help or work. In fact, if you are considering tags of any kind for your PDFs, stop, because PDF files cannot have HTML tags embedded within them.
If your PDF files have landing pages, just let those rank and let people download the actual PDF files from there if they choose to do so. In reality, it's best to convert all your PDFs to HTML and then give a download link to the PDF file in case people need it (in this day and age though, PDF is a backwards format. It's not even responsive for people's phones. It sucks!)
The only canonical tags you could apply would be on the landing pages (which do support HTML), pointing to the PDF files. Don't do that though, it's silly. Just convert the PDFs to HTML, then leave a download button for the old PDFs in case anyone absolutely needs them. If the PDF and the HTML page contain similar info, it won't affect you very much.
What will affect you is putting canonical tags on the landing pages, thus making them non-canonical (and stopping the landing pages from ranking properly). You're in a situation where a perfect outcome isn't possible, but that's no reason to pick the worst outcome by 'over-adhering' to Google's guidelines. Sometimes people use Google's guidelines in ways Google didn't anticipate.
PDF documents don't usually pass PageRank at all, as far as I know.
If you want to optimise the PDF documents themselves, the document title which you save them with is used in place of a <title> tag (since PDFs aren't in HTML, they can't use <title>). You can kind of optimise PDF documents by editing their document titles, but it's not super effective and in the end HTML conversions usually perform much better. As stated, for the old fossils who still like / need PDF, you can give them a download link.
In the case of downloadable PDF files with similar content to their connected landing pages, Google honestly doesn't care too much at all. Don't go nutty with canonical tags, and don't stop your landing pages from ranking by making them non-canonical.
-
Yes, the PDFs would help increase your domain rank as they are practically considered as pages by Google, as explained in their QnA here.
Regarding hosting the PDFs on a subdomain, Google has stated that it's almost the same as having them in a subfolder, but that is widely contested, since in practice it's much harder to rank a subdomain than a subfolder.
Regarding the canonical tags, they are created for "Similar or Duplicate Pages", so the content doesn't have to be identical, and you'll be good so long as most of the content is the same. Otherwise, you can safely have them both be indexed and link from the PDF to the main content to transfer "link juice", as those are considered valid links.
I hope my response was beneficial to you and that the included proof was substantial.
Daniel Rika
-
Thank you.
Could you address my question about what's best practice? What do most companies do?
I am not sure what the best choice would be for us -- to expose PDFs which compete with their own landing pages or not.
Also, do you know if PDFs pass SEO "juice" to the main domain? Even if they are hosted at www2.maindomain.com?
Where can I see some proof that this is the case?
If the PDFs have a canonical tag pointing to the parent page, wouldn't this be confusing for the search engines as these are two separate files with differing content? Canonical tags are usually used to eliminate duplicates for differing URLs with identical content.
-
Whether you want the PDF indexed directly or not will mostly depend on the content of the PDF:
- If you are using the PDF as a way to gather e-mails for your newsletter, or if you are offering the PDF as a way to get users to your site, then it would be best not to have it indexed directly, but instead have the users go to your site first.
- If the PDF in itself is a way for you to promote your website or content, then you can let it be indexed so that it can be accessed directly, which may help you get a bit more rank or clicks.
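A rough sketch of how that choice could be applied per file. Since a PDF cannot carry a robots meta tag, the usual way to keep one out of the index is the `X-Robots-Tag: noindex` response header, which is not mentioned above and is added here as an assumption about how the site would implement the "not indexed" option. The function name and the boolean flag are hypothetical.

```python
from typing import Dict


def pdf_response_headers(index_pdf: bool) -> Dict[str, str]:
    """Return the response headers to send with a PDF download.

    index_pdf=False models a gated PDF (e.g. a newsletter sign-up
    incentive) that should only be reached through its landing page.
    """
    headers = {"Content-Type": "application/pdf"}
    if not index_pdf:
        # Ask search engines not to index this file directly
        headers["X-Robots-Tag"] = "noindex"
    return headers


print(pdf_response_headers(index_pdf=False))
print(pdf_response_headers(index_pdf=True))
```

Note that the crawler has to be able to fetch the file to see this header, so the PDF should not also be blocked in robots.txt.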
If you are looking to track PDF views, there are options to connect GA and track them, such as this plugin.
If the content is similar to the web page, then you can add a canonical to transfer the ranking. Since a PDF has no HTML head, you add it to the HTTP header using the .htaccess file, as explained here.
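Once the header is in place, it is worth checking that the PDF response really points at the landing page. Below is a minimal sketch that extracts the canonical target from a `Link` header value; it assumes you have already copied the header from the response (for example from your browser's network tab), the function name is made up, and the parsing is deliberately simple.

```python
import re
from typing import Optional


def canonical_from_link_header(value: str) -> Optional[str]:
    """Return the URL marked rel="canonical" in a Link header value, if any.

    Simplified parsing: entries are split on commas, so URLs that
    themselves contain commas are not handled.
    """
    for part in value.split(","):
        match = re.match(r"\s*<([^>]*)>\s*;(.*)", part)
        if match and re.search(r'rel\s*=\s*"?canonical"?', match.group(2), re.I):
            return match.group(1)
    return None


header = '<https://www.example.com/whitepaper/>; rel="canonical"'
print(canonical_from_link_header(header))
```

If this returns nothing for your PDF URL, the .htaccess rule is probably not being applied to that file.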