Will blocking urls in robots.txt void out any backlink benefits? - I'll explain...
-
Ok...
So I add tracking parameters to some of my social media campaigns but block those parameters via robots.txt. This helps avoid duplicate content issues (Yes, I do also have correct canonical tags added)... but my question is -- Does this cause me to miss out on any backlink magic coming my way from these articles, posts or links?
Example url: www.mysite.com/subject/?tracking-info-goes-here-1234
- Canonical tag is: www.mysite.com/subject/
- I'm blocking anything with "?tracking-info-goes-here" via robots.txt
- The url with the tracking info of course IS NOT indexed in Google but IT IS indexed without the tracking parameters.
What are your thoughts?
- Should I nix the robots.txt stuff since I already have the canonical tag in place?
- Do you think I'm getting the backlink "juice" from all the links with the tracking parameter?
What would you do?
Why?
Are you sure?
-
Thanks Guys...
Yeah, I figure that's the right path to take based on what we know... But I love to hear others chime in so I can blame it all on you if something goes wrong - ha!
Another Note: Do you think this will cause some kind of unnatural anomaly when the robots.txt file is edited? All of a sudden these links will now be counted (we assume).
It's likely the answer is no because Google still knows about the links.. they just don't count them - but still thought I'd throw that thought out there.
-
I agree with what Andrea wrote above - just one additional point - blocking a file via robots.txt doesn't prevent the search engine from not indexing the page. It just prevents the search engine from crawling the page and seeing the content on the page. The page may very well still show up in the index - you'll just see a snippet that your robots.txt file is preventing google from crawling the site and caching it and providing a snippet or preview. If you have canonical tags put in place properly, remove the block on the parameters in your robots.txt and let the engines do things the right way and not have to worry about this question.
-
If you block with robots.txt link juice can't get passed along. If your canonicals are good, then ideally you wouldn't need the robots. Also, it really removes value of the social media postings.
So, to your question, if you have the tracking parameter blocked via robots, then no, I don't think you are getting the link juice.
http://www.rickrduncan.com/robots-txt-file-explained
When I want link juice passed on but want to avoid duplicate content, I'm more a fan of the no index, follow tags and using canonicals where it makes sense, too. But since you say your URLs with the parameters aren't being indexed then you must be using tags anyway to make that happen and not just relying on robots.
To your point of "are you sure":
http://www.evergreensearch.com/minimum-viable-seo-8-ways-to-get-startup-seo-right/
(I do like to cite sources - there's so many great articles out there!)
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Same URL-Structure & the same number of URLs indexed on two different websites - can it lead to a Google penalty?
Hey guys. I've got a question about the url structure on two different websites with a similar topic (bith are job search websites). Although we are going to publish different content (texts) on these two websites and they will differ visually, the url structure (except for the domain name) remains exactly the same, as does the number of indexed landingpages on both pages. For example, www.yyy.com/jobs/mobile-developer & www.zzz.com/jobs/mobile-developer. In your opinion, can this lead to a Google penalty? Thanks in advance!
Intermediate & Advanced SEO | | vde130 -
Google has discovered a URL but won't index it?
Hey all, have a really strange situation I've never encountered before. I launched a new website about 2 months ago. It took an awfully long time to get index, probably 3 weeks. When it did, only the homepage was indexed. I completed the site, all it's pages, made and submitted a sitemap...all about a month ago. The coverage report shows that Google has discovered the URL's but not indexed them. Weirdly, 3 of the pages ARE indexed, but the rest are not. So I have 42 URL's in the coverage report listed as "Excluded" and 39 say "Discovered- currently not indexed." When I inspect any of these URL's, it says "this page is not in the index, but not because of an error." They are listed as crawled - currently not indexed or discovered - currently not indexed. But 3 of them are, and I updated those pages, and now those changes are reflected in Google's index. I have no idea how those 3 made it in while others didn't, or why the crawler came back and indexed the changes but continues to leave the others out. Has anyone seen this before and know what to do?
Intermediate & Advanced SEO | | DanDeceuster0 -
URL Parameters
Hi Moz Community, I'm working on a website that has URL parameters. After crawling the site, I've implemented canonical tags to all these URLs to prevent them from getting indexed by Google. However, today I've found out that Google has indexed plenty of URL parameters.. 1-Some of these URLs has canonical tags yet they are still indexed and live. 2- Some can't be discovered through site crawling and they are result in 5xx server error. Is there anything else that I can do (other than adding canonical tags) + how can I discover URL parameters indexed but not visible through site crawling? Thanks in advance!
Intermediate & Advanced SEO | | bbop330 -
Content From One Domain Mysteriously Indexing Under a Different Domain's URL
I've pulled out all the stops and so far this seems like a very technical issue with either Googlebot or our servers. I highly encourage and appreciate responses from those with knowledge of technical SEO/website problems. First some background info: Three websites, http://www.americanmuscle.com, m.americanmuscle.com and http://www.extremeterrain.com as well as all of their sub-domains could potentially be involved. AmericanMuscle sells Mustang parts, Extremeterrain is Jeep-only. Sometime recently, Google has been crawling our americanmuscle.com pages and serving them in the SERPs under an extremeterrain sub-domain, services.extremeterrain.com. You can see for yourself below. Total # of services.extremeterrain.com pages in Google's index: http://screencast.com/t/Dvqhk1TqBtoK When you click the cached version of there supposed pages, you see an americanmuscle page (some desktop, some mobile, none of which exist on extremeterrain.com😞 http://screencast.com/t/FkUgz8NGfFe All of these links give you a 404 when clicked... Many of these pages I've checked have cached multiple times while still being a 404 link--googlebot apparently has re-crawled many times so this is not a one-time fluke. The services. sub-domain serves both AM and XT and lives on the same server as our m.americanmuscle website, but answer to different ports. services.extremeterrain is never used to feed AM data, so why Google is associating the two is a mystery to me. the mobile americanmuscle website is set to only respond on a different port than services. and only responds to AM mobile sub-domains, not googlebot or any other user-agent. Any ideas? As one could imagine this is not an ideal scenario for either website.
Intermediate & Advanced SEO | | andrewv0 -
Massive URL blockage by robots.txt
Hello people, In May there has been a dramatic increase in blocked URLs by robots.txt, even though we don't have so many URLs or crawl errors. You can view the attachment to see how it went up. The thing is the company hasn't touched the text file since 2012. What might be causing the problem? Can this result any penalties? Can indexation be lowered because of this? ?di=1113766463681
Intermediate & Advanced SEO | | moneywise_test0 -
Blocked from google
Hi, i used to get a lot of trafic from google but sudantly there was a problem with the website and it seams to be blocked. We are also in the middle of changing the root domain because we are making a new webpage, i have looked at the webmaster tools and corrected al the errors but the page is still not visible in google. I have also orderd a new crawl. Anyone have any trics? do i loose a lot when i move the domainname, or is this a good thing in this mater? The old one is smakenavitalia.no The new one is Marthecarrara.no Best regards Svein Økland
Intermediate & Advanced SEO | | sveinokl0 -
Can you see the 'indexing rules' that are in place for your own site?
By 'index rules' I mean the stipulations that constitute whether or not a given page will be indexed. If you can see them - how?
Intermediate & Advanced SEO | | Visually0 -
Export list of urls in google's index?
Is there a way to export an exact list of urls found in Google's index?
Intermediate & Advanced SEO | | nicole.healthline0