How can I tell Google not to index a portion of a webpage?
-
I'm working with an ecommerce site that has many product descriptions for various brands. They are important to have, but they are all straight duplicates. I'm looking for some type of tag that can be implemented to prevent Google from seeing these as duplicates while still allowing the page to rank in the index. I thought I had found it with the googleoff/googleon tags, but it appears that these only work with the Google Search Appliance hardware.
-
Correct, you should make sure it is not used elsewhere.
But I can't refrain from stressing again that hiding the content is unlikely to be the best strategy.
-
So what should it look like in the code?
If the area I wanted to block was a product description, it might say:
"Product Description
bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla"
Secondly, if I disallow /iframes/ in robots.txt, would I then need to make sure we are not using iframes anywhere else?
-
Just as Chris Painter pointed out, you shouldn't worry too much about duplicate content if your site is legit (not an autoblog, for example). If you really want to hide it from Google, iframes are the way to go.
-
Matt Cutts has said (source: http://youtu.be/Vi-wkEeOKxM) not to worry too much about duplicate content, especially given the sheer volume of it on the internet. By hiding it, you may end up looking like you are trying to cheat Google, which could cause you a bigger headache, not to mention that you may slow your webpage down. Duplicate content isn't the worst enemy for SEO. If you are worried, put the effort you would spend on hiding stuff from Google into making the product descriptions unique.
-
Hello Brad,
Just to get your question clear:
Am I correct that you want a method that lets Google (and other search engines) know that a portion of your pages is duplicated, while you want both the duplicated pages and the original pages to rank in the SERPs?
If you could provide us with an example (link) that would help a great deal as well.
-
Hi Brad,
You can prevent Google from seeing portions of the page by putting those portions in iframes that are blocked by robots.txt:
Disallow: /iframes/
Thanks
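To make that concrete, here is a rough sketch rather than a tested production setup. It assumes the product description is moved to its own file under /iframes/ and pulled into the product page with an iframe; the domain www.example.com, the file name product-description-123.html, and the helper is_crawlable are made up for illustration. Python's standard urllib.robotparser is used only to check which URLs the rule covers.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the thread's Disallow rule under a catch-all user agent.
ROBOTS_TXT = """\
User-agent: *
Disallow: /iframes/
"""

# Hypothetical markup for the product page: the duplicate description
# lives in its own file under /iframes/ and is embedded with an iframe.
IFRAME_SNIPPET = (
    '<iframe src="/iframes/product-description-123.html" '
    'title="Product Description"></iframe>'
)


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The iframe source is blocked, the product page itself is not.
print(is_crawlable(ROBOTS_TXT, "https://www.example.com/iframes/product-description-123.html"))  # False
print(is_crawlable(ROBOTS_TXT, "https://www.example.com/products/widget"))  # True
```

Note that the rule matches the /iframes/ URL path, not iframe elements in general, so what matters is that nothing you want crawled is served from that directory.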
-
You can iframe those chunks of content and block the iframe source with robots.txt, or just add a noindex meta tag to the iframe source.
But I would not. If you can't build a plan to make the content unique, just canonicalize, or let Google choose which page to pick among the bunch of duplicates.
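If it helps, here is a small sketch for checking which of those signals a page actually carries. It is only an illustration: the two HTML samples, the example.com URL, and the names HeadSignals and head_signals are hypothetical, and it relies on nothing beyond Python's standard html.parser.

```python
from html.parser import HTMLParser

# Hypothetical iframe source carrying a noindex robots meta tag.
IFRAME_SOURCE = (
    '<html><head><meta name="robots" content="noindex"></head>'
    "<body>bla bla bla</body></html>"
)

# Hypothetical duplicate page pointing at its preferred version.
DUPLICATE_PAGE = (
    '<html><head><link rel="canonical" '
    'href="https://www.example.com/products/widget"></head>'
    "<body>bla bla bla</body></html>"
)


class HeadSignals(HTMLParser):
    """Collects the robots noindex flag and the canonical URL, if present."""

    def __init__(self):
        super().__init__()
        self.noindex = False
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            content = (attributes.get("content") or "").lower()
            self.noindex = self.noindex or "noindex" in content
        elif tag == "link" and (attributes.get("rel") or "").lower() == "canonical":
            self.canonical = attributes.get("href")


def head_signals(html: str):
    """Return (noindex, canonical_url) for an HTML document."""
    parser = HeadSignals()
    parser.feed(html)
    return parser.noindex, parser.canonical


print(head_signals(IFRAME_SOURCE))   # (True, None)
print(head_signals(DUPLICATE_PAGE))  # (False, 'https://www.example.com/products/widget')
```

Treat the robots.txt block and the noindex tag as alternatives: a crawler that is blocked from fetching the iframe source never gets to read a noindex tag inside it.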