Best way to address duplicate news sections within site
-
A client has a news section at www.clientsite.com/news and also at subdomain.clientsite.com/news. The stories within each section are identical:
www.clientsite.com/news/story-11-5-2011
subdomain.clientsite.com/news/story-11-5-2011
What's the best way to avoid a duplicate content issue within the site? A 301 redirect doesn't seem appropriate from the user experience point of view.
Is applying a rel=canonical pointing to www.clientsite.com/news/story-a-b-c to each story within the subdomain news section the best option? They have hundreds of stories, so I'm wondering if there might be an easier way.
Also, the news pages list the story headline and the first 3 lines of copy. Do these summaries present duplicate content issues with the full story page?
Thank you!
-
Alan, I appreciate your effort here. These are the sources I already shared
A complete summary of everything shared in those articles you quote:
1. It doesn't make a difference to Google which method is used. When I examine all the information and analysis, it seems to indicate Google will index the content either way. How well that content will rank in Google is a different topic. There are reasons to keep content separate, such as when discussing topics unrelated to the main site, in which case a subdomain would be best.
2. Matt uses the directory approach, and he recommends for others to do the same.
AT BEST, that information suggests it is close to even, with a slight preference towards subfolders.
Rand offers outstanding analysis as to why subfolders are the superior choice. Rand's analysis is from 2009, 2 years after the original articles quoted from Matt. http://www.seomoz.org/blog/understanding-root-domains-subdomains-vs-subfolders-microsites
The bottom line: it's up to you how much you care about your site and its performance. Personally, I am a fighter. I also micro-manage website architecture because in many aspects it is a one-time, set-it-and-forget-it type of thing. Whether to use subdomains vs subfolders, whether to use underscores vs dashes in URLs, etc. are things you decide one time and then it is automated forever.
A detailed list of reasons supporting the subfolder approach has been offered. The DA, time, costs, etc. all support subfolders. If you wish to ignore all those strong, positive benefits and go with a subdomain then that is your choice.
Good luck.
-
The originals
http://googlewebmastercentral.blogspot.com/2008_01_01_archive.html
http://www.mattcutts.com/blog/subdomains-and-subdirectories/
here is a better example from Matt
Deb December 11, 2007 at 1:01 am
Matt thanks for your reply, just a query (if you don’t mind) if I add content in mattcutts.com/blog – it effect in seo because I add directly content in the domain mattcutts.com but if I add content in blog.mattcutts.com is the effect is same? I don’t think so – because this is a subdomain not directly related with the domain?
If I disturb you please don’t mind
Thanks
Deb
Matt Cutts December 10, 2007 at 10:55 am
Deb, it really is a pretty personal choice. For something small like a blog, it probably won’t matter terribly much. I used a subdirectory because it’s easier to manage everything in one file storage space for me. However, if you think that someday you might want to use a hosted blog service to power your blog, then you might want to go with blog.example.com just because you could set up a CNAME or DNS alias so that blog.example.com pointed to your hosted blog service.
I was trying to find a video Matt made where he makes a similar claim, but I have to get back to work.
-
Alan,
We will have to agree to disagree on this one.
There is a ton of what can only be referred to as "SEO bullshit" published. When I quote a source it will usually be Matt Cutts directly, or Google, or a highly respected SEO who shares an opinion on a topic AND who offers very solid research to back up that opinion. In short, credibility is everything when quoting a source to support a given position.
You are quoting a site I have never heard of, alexander.holbreich.org. Is it just me? Do others know and recognize this site as a reputable source of SEO information?
The author's About page is a total of 4 lines of text. Line 1 = his name, Lines 3 & 4 are where he lives. Line 2 = he has a degree in "Business Information" but doesn't even state where or when he received this degree. This web page is a solid example of a page that has absolutely zero trust as an SEO source.
I think it is great that you read various sources of SEO for ideas, but that is a big difference from depending on those sources as credible information.
If you want to quote, try the main source article. Doing so would add credibility to your position. I can agree there is a lot of confusion on this topic, but it is propagated mostly by pages like the one you linked, which should probably never be read.
Using the source you quoted and some common ground I would share the following:
-
Matt Cutts stated he uses folders: "My personal preference on subdomains vs. subdirectories is that I usually prefer the convenience of subdirectories for most of my content. A subdomain can be useful to separate out content that is completely different."
-
Matt Cutts recommended for others to use folders: "If you’re a newer webmaster or SEO, I’d recommend using subdirectories until you start to feel pretty confident with the architecture of your site."
-
Matt shared a specific example of when a subdomain would be appropriate, and it is an example I had shared as well in response to the original question: "A subdomain can be useful to separate out content that is completely different. Google uses subdomains for distinct products such news.google.com or maps.google.com, for example."
The above aside, one site is easier to maintain than two. There are lower costs all around (software, trust badges, SSL, etc). There is less time involved as well. All that time and money can be put into other aspects of SEO such as link building and creating great content.
Further, by combining your content into one site, all your content benefits from the higher DA of your site.
I hope you take the information I am sharing the right way, Alan. My professional experience leads me to almost always use a folder unless there is a clear and specific reason to use a subdomain, such as trying to separate out content which is not related to the main site. The difference is strong enough that I would recommend most clients who have a subdomain delete it and move to the subfolder structure.
If you find a differing opinion, I would love to hear it. All I ask is for it to be from a highly credible SEO source who preferably shares detailed examples or logic to support the position.
Best Regards,
-
-
"With respect to the general subfolder vs domain discussion, as far as I have seen most of the "debate" ended with subfolders being the winner."
For what reasons is it the winner? I use subdomains a lot, that's why I have looked for evidence, and Matt Cutts has stated it makes no difference.
Rand states it is his personal belief, but Google and Matt Cutts have stated many times it makes no difference to rankings:
http://alexander.holbreich.org/2008/01/subdomains-vs-subdirectories/
"otherwise irrelevant change during this discussion only serves to confuse an otherwise muddy topic"
I don't think it's confusion; it is information clearly stated (not to do with rankings) for one to consider. It is an indication of Google's thinking. It is stated correctly, and all information should be considered. One could say that stating Rand's personal belief is confusing.
-
I take a different view on this topic than Alan.
As Alan mentioned, the recent Google change's sole effect is on how links to sub-domains from the root domain visually appear in Google WMT. They have absolutely no ranking weight difference. Bringing up that otherwise irrelevant change during this discussion only serves to confuse an otherwise muddy topic.
With respect to the general subfolder vs domain discussion, as far as I have seen most of the "debate" ended with subfolders being the winner.
There are a couple situations where a subdomain would be preferable to a folder. One example is when a different, unrelated topic or product is being offered. Keith, you brought up the example of Google Maps. A few comments I would share:
-
Google Maps is a different product than Google search. Really the main thing they have in common is that they are offered by the same company. The idea of providing satellite images and driving directions is quite different than providing the best search results. If you think about it, they are very distinct products. It would be the same idea if Ford created their own version of Sirius radio. Yes, the radios would be offered in Ford cars, but the product is truly distinct from the cars and can stand completely alone.
-
Google's site was set up years ago before this topic was analyzed to this depth. Many changes have been made over the years.
A couple great discussions on this topic:
http://www.seomoz.org/blog/understanding-root-domains-subdomains-vs-subfolders-microsites
A quote Rand shared in a different article "99.9% of the time, if a subfolder will work, it's the best choice for all parties." I agree for the overwhelming majority of cases, a subfolder is preferred. There are some corner cases but normally speaking the subfolder is the preferred approach.
-
-
Subdomains or folders is an old debating point, but Matt Cutts has said it makes no difference.
I have also noticed that Google includes subdomain links in its site links, and Google WMT now shows subdomain links as internal (I know this is separate from ranking, but together with the other evidence it gives weight to what Matt Cutts stated).
-
Good catch on the subdomains! That is a separate issue, and I am recommending they move everything to a clientsite.com/folder setup. The sub-domains do have unique content (except for the news) and they set it up that way because they've seen other sites, like Google, set up sub-domains for maps and their other products.
What's a good explanation to the client for why other large sites like Google set up different content sections as subdomains vs. the folder approach I am recommending?
-
the news pages list the story headline and the first 3 lines of copy. Do these summaries present duplicate content issues with the full story page?
No
With respect to the subdomain, what is the purpose of having the subdomain? It seems likely the best course of action would be to merge any unique content from the subdomain into the main site, then remove the subdomain. Your articles would benefit from the (presumably) stronger DA on the main site. Also your efforts would be reduced by allowing you to fully focus on one site rather than maintain two sites.
How does this subdomain benefit anyone?
If you insisted on keeping the subdomain, then yes the canonical link tag would work.
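If the subdomain is merged into the main site as suggested above, every old subdomain URL needs a known new location. Here is a minimal Python sketch of that mapping. The two hostnames come from the question; the folder name `subdomain` and the rule that news stories keep their existing path on the main site are assumptions for illustration only.

```python
from urllib.parse import urlsplit, urlunsplit

MAIN_HOST = "www.clientsite.com"              # main site from the question
SUBDOMAIN_HOST = "subdomain.clientsite.com"   # subdomain from the question
FOLDER = "subdomain"                          # hypothetical folder name on the main site


def merged_url(old_url: str) -> str:
    """Map an absolute subdomain URL to its assumed new home on the main site."""
    parts = urlsplit(old_url)
    if parts.netloc.lower() != SUBDOMAIN_HOST:
        return old_url  # not a subdomain URL, leave it untouched
    path = parts.path or "/"
    if path == "/news" or path.startswith("/news/"):
        # Assumption: identical news stories already live at the same path
        # on the main site, so only the host changes.
        new_path = path
    else:
        # Assumption: unique subdomain content moves under one folder.
        new_path = f"/{FOLDER}{path}"
    return urlunsplit((parts.scheme, MAIN_HOST, new_path, parts.query, parts.fragment))
```

A list of old/new pairs produced this way could feed whatever redirect mechanism the server uses; that part is server specific and not shown here.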
-
Canonical would be best here, but you would want to do it with code, or use outbound rewrite rules on the server.
I would not worry about the summary problem.
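As a rough idea of what "doing it with code" could look like, here is a minimal Python sketch that builds the canonical tag from the requested URL, so hundreds of stories do not need to be tagged by hand. The host names are the ones from the question; the function names, and the choice to drop the query string and fragment, are assumptions. How the tag gets into the page template depends on the CMS.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.clientsite.com"  # main site from the question


def canonical_url(request_url: str) -> str:
    """Return the same path on the canonical host."""
    parts = urlsplit(request_url)
    # Assumption: query strings and fragments do not change story content,
    # so they are dropped from the canonical URL.
    return urlunsplit((parts.scheme, CANONICAL_HOST, parts.path, "", ""))


def canonical_link_tag(request_url: str) -> str:
    """Build the <link rel="canonical"> element for the page <head>."""
    href = escape(canonical_url(request_url), quote=True)
    return f'<link rel="canonical" href="{href}">'
```

Called with `http://subdomain.clientsite.com/news/story-11-5-2011`, `canonical_link_tag` returns a tag pointing at the same story on www.clientsite.com. The shared page template on the subdomain would emit that tag once, and every story picks it up.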