Medium-sized forum with 1000s of thin-content gallery pages. Disallow or noindex?
-
I have a forum at http://www.onedirection.net/forums/ which contains a gallery with thousands of very thin-content pages. We've currently got these photo pages disallowed for the main Googlebot via robots.txt, but we do allow the Google Images crawler access.
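A setup like this can be sanity-checked offline. The sketch below uses Python's standard urllib.robotparser against a hypothetical robots.txt: the gallery path /forums/gallery/ and both test URLs are assumptions for illustration, not the forum's real structure. The Googlebot-Image group is listed first so that a parser which picks the first matching group (the token "googlebot" is a substring of "googlebot-image") still resolves the image crawler to its own group.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; "/forums/gallery/" is an assumed path, not the real one.
ROBOTS_TXT = """\
User-agent: Googlebot-Image
Allow: /

User-agent: Googlebot
Disallow: /forums/gallery/
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Both URLs are made up for the example.
    photo = "http://www.onedirection.net/forums/gallery/photo-1"
    thread = "http://www.onedirection.net/forums/threads/1"
    for agent in ("Googlebot", "Googlebot-Image"):
        for url in (photo, thread):
            print(agent, url, is_allowed(agent, url))
```

This only tests how the rules parse. It says nothing about what Google has already indexed.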
Now I've been reading that we shouldn't really use disallow, and instead should add a noindex tag on the page itself.
It's a little awkward to edit the source of the gallery pages (and to keep any amendments in place the next time the forum software gets updated).
What's the best way of handling this?
Chris.
-
Hey Chris,
I agree that your current implementation, while not ideal, is perfectly adequate for the purposes of ensuring you don't have duplicate content or cannibalisation problems, while still allowing Google to index the UGC images.
You're also preventing Googlebot from seeing the user profile pages, which is a good idea, since many of them are very thin and mostly duplicate.
So, from a pure SEO perspective, I think you've done a good job.
However... I think you should also consider the ethical implications of potentially blocking the image Googlebot as well. By preventing Google from indexing all those images of young girls fawning over the vacuous runners-up of a televised talent show, you would undoubtedly be doing the world a great service.
-
Hi Chris, I second Jarno's opinion in this regard. If adding page-level blocking is going to be a huge overhead, you can rely on your current robots.txt setup. There is a small catch here, though. Even if you block content using the robots.txt file, Google can still index the blocked URLs if it finds a reference to them elsewhere on the Internet. In situations like this, page-level blocking is the way forward. So to fully prevent Googlebot from indexing your content, you should ideally be using the page-level robots meta tag or the X-Robots-Tag HTTP header.
Here you go for more: https://support.google.com/webmasters/answer/156449?hl=en
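Because the X-Robots-Tag is an HTTP header, it can be added outside the gallery templates, so a forum software update would not overwrite it. The sketch below is only an illustration under stated assumptions: it presumes a Python WSGI application (the forum's actual stack is not mentioned in this thread), and the add_noindex_header wrapper and the /forums/gallery/ prefix are hypothetical names. On other stacks the same idea would usually live in the web server configuration.

```python
# Hypothetical URL prefix for the gallery pages; the real path is not given here.
GALLERY_PREFIX = "/forums/gallery/"


def add_noindex_header(app, prefix=GALLERY_PREFIX):
    """Wrap a WSGI app so responses under `prefix` carry X-Robots-Tag: noindex."""

    def wrapped(environ, start_response):
        path = environ.get("PATH_INFO", "")

        def patched_start_response(status, headers, exc_info=None):
            # Only gallery responses get the extra header; others pass through.
            if path.startswith(prefix):
                headers = list(headers) + [("X-Robots-Tag", "noindex")]
            return start_response(status, headers, exc_info)

        return app(environ, patched_start_response)

    return wrapped
```

One caveat: a crawler only sees a noindex directive on pages it is allowed to fetch, so those URLs would have to stop being disallowed in robots.txt for the header to take effect.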
Hope it helps.
Best,
Devanur Rafi.
-
Chris,
If the page-level meta tag update is too complicated for you to add due to software issues etc., then I feel that your current method is the right way to go. Normally you would be absolutely right, for the simple reason that page level overrules the robots.txt. But if a software update overwrites the rules placed in your code, then you have to manually re-add them after each and every update, and I'm not sure you want to do that.
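If the tag is added by hand, a small check run after each forum update could at least confirm it has not been dropped. The sketch below is one possible way to do that with Python's standard html.parser; the RobotsMetaFinder class and has_noindex function are hypothetical helpers, and fetching the page HTML is left out.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            # content may hold several comma-separated directives
            content = attr_map.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with noindex."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives
```

For example, has_noindex could be called on the saved HTML of one gallery page after every upgrade, with the tag re-added only when it returns False.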
regards
Jarno