Getting rid of low quality pages
-
If I wanted to get rid of a batch of low quality pages from the index, is it best practice to let them 404 and remove them from the sitemap files?
Thanks
-
Thanks, Wayne. I never thought about link juice flowing to those pages; I'll have to check that out before making a decision. All the pages I want to remove are in the same directory, so would adding the text below to robots.txt remove all the pages in that directory from the index?
User-agent: *
Disallow: /directory/
-
Hi Peter,
Great question, considering the latest Panda update. A lot of people have been scrambling to remove content that Google might deem "shallow" or of no value to users. We tried a couple of approaches to see which worked best for removing or relocating this kind of content:
A: We simply added a robots.txt rule. This tells Google not to crawl the content (one caveat on this below).
B: If you have the luxury of moving it to an entirely different domain, that's another option. We found this to be the better of the two in terms of aesthetics; we simply didn't want to gunk up our site with a lot of "shallow" content. It also seemed that the engines responded better to this approach.
Your 404 is another option if you simply want to remove the pages from the index. However, I'd be sure to check that no link juice is flowing through those pages first. If it is, then a 301 redirect might be more appropriate. Depending on your intentions, each of the three could serve your purpose!
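For what it's worth, a 301 in an Apache .htaccess file looks something like this (the paths and domain here are just placeholders, not anything from your site):
Redirect 301 /directory/old-page.html http://www.example.com/better-page.html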
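And one caveat on the robots.txt route in option A: blocking crawling doesn't necessarily remove pages that are already in the index. If deindexing is the goal, a meta robots tag on each page is generally the more reliable route, something like:
<meta name="robots" content="noindex">
Just remember the pages have to stay crawlable for Google to see that tag, so don't combine it with the robots.txt block.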
Let me know if I've confused you, or if you need additional opinion!
Best of luck
W
Related Questions
-
When rogerbot tries to crawl my site it gets a 404. Why?
When rogerbot tries to crawl my site it tries http://website.com. My website then tries to redirect to http://www.website.com, throws a 404, and ends up not getting crawled. It also throws a 404 when trying to read my robots.txt file for some reason. We allow the rogerbot user agent, so I'm unsure what's happening here. Is there something weird going on when trying to access my site without the 'www' that is causing the 404? Any insight is helpful here. Thanks,
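For reference, a typical non-www to www 301 in Apache looks something like the sketch below (website.com stands in for the real domain, and this assumes mod_rewrite is available; the actual fix depends on how the current redirect is configured):
RewriteEngine On
RewriteCond %{HTTP_HOST} ^website\.com$ [NC]
RewriteRule ^(.*)$ http://www.website.com/$1 [R=301,L]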
Technical SEO | BlakeBooth
-
Will getting backlinks to a landing page from low quality sites negatively affect SEO?
I've recently started an initiative at my company to get our customers to publish a blog post about our company and to include a link to a landing page which sits on a subdomain attached to our main domain. The reason for directing visitors from the post to a landing page is to help with conversion. I've recently been wondering: couldn't the backlinks to this landing page from our customers' blogs (generally small sites) have a negative impact on the overall SEO of my company's domain? Thanks in advance.
Technical SEO | JustinButlion
-
Do sitemap URLs get cleared when they return a 404?
Hi, do sitemap URLs get cleared when they return a 404? We have a Drupal site and a sitemap that has 60K links. Over these 4 years we've deleted hundreds of links, and I want to know whether they get cleared from the sitemap automatically or whether we need to rebuild the sitemap. Thanks
Technical SEO | mtthompsons
-
Can't get my head around this duplicate content dilemma!
Hi, let's say you have a cleaning company with a services page covering window cleaning, carpet cleaning, etc., and the content on this page adds up to around 750 words. Now let's say you would like to create new pages targeting location-specific keywords in your area. The easiest way would be to copy the services page and just change all the tags to the location-specific term, but now you have duplicate content. If I wanted to target 10 locations, does this mean I need to generate 750 words of unique content for each page, which is basically the services page rewritten? Cheers
Technical SEO | activitysuper
-
If a page isn't linked to or directly submitted to a search engine, can it get indexed?
Hey guys, I'm curious if there are ways a page can get indexed even if the page isn't linked to and hasn't been submitted to a search engine. To my knowledge the following page on our website is not linked to, and we definitely didn't submit it to Google, but it's currently indexed: takelessons.com/admin.php/adminJobPosition/corp Anyone have any ideas as to why or how this could have happened? Hopefully I'm missing something obvious 🙂 Thanks, Jon
Technical SEO | TakeLessons
-
How can I get the author's photo to show in the Google search result?
I added the rel="author" tags to the blog posts last week and updated the authors page with a link to the Google+ account, but I have yet to see the authors photo surface in the Google Results. Example URL: http://spotlight.vitals.com/2011/10/dr-richelle-cooper-testifies-against-dr-conrad-murray-in-trial/ Can anyone identify what else needs to be done?
Technical SEO | irvingw
-
Why am I still getting duplicate page title warnings after implementing canonical URLs?
Hi there, I'm having some trouble understanding why I'm still getting duplicate page title warnings on pages that have the rel=canonical attribute. For example, http://www.resnet.us/directory/auditor/az/89/home-energy-raters-hers-raters/1 is the first page of a paginated list, and http://www.resnet.us/directory/auditor/az/89/home-energy-raters-hers-raters/2 is the second page, which links back to the first page using rel=canonical. I have over 300 pages like this!! What should I do, SEOmoz gurus? How do I remedy this problem? Is it a problem?
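For reference, the canonical tag on the second page would presumably look something like this (built from the URLs in the question):
<link rel="canonical" href="http://www.resnet.us/directory/auditor/az/89/home-energy-raters-hers-raters/1" />
Some crawl tools can still flag duplicate titles for a while even when engines honor the canonical.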
Technical SEO | fourthdimensioninc
-
How to use overlays without getting a Google penalty
One of my clients is an email subscriber-led business offering deals that are time sensitive and which expire after a limited, but varied, time period. Each deal is published on its own URL, and in order to drive subscriptions to the email, an overlay was implemented that would appear over the individual deal page so that the user was forced to subscribe if they wished to view the details of the deal. Needless to say, this led to the threat of a Google penalty, which appears (fingers crossed) to have been narrowly avoided as a result of a quick response on our part to remove the offending overlay. What I would like to ask you is whether you have any safe and approved methods for capturing email subscribers without revealing the premium content to users before they subscribe? We are considering the following approaches:
First Click Free for Web Search - This is an opt-in service by Google which is widely used for this sort of approach and which stipulates that you have to let the user see the first item they click on from the listings, but can put up the subscriber-only overlay afterwards.
Noindex, nofollow - If we simply noindex, nofollow the individual deal pages where the overlay is situated, will this remove the "cloaking offense" and therefore the risk of a penalty?
Partial view - If we show one or two paragraphs of text from the deal page with the rest being covered up by the subscribe-now lockup, will this still be cloaking?
I will write up my first SEOmoz post on this once we have decided on the way forward and monitored the effects, but in the meantime, I welcome any input from you guys.
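For reference, the noindex, nofollow approach (the second option above) would amount to putting something like this in the head of each deal page:
<meta name="robots" content="noindex, nofollow">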
Technical SEO | Red_Mud_Rookie