Best posts made by AlanBleiweiss
-
RE: Can I use canonical tags to merge property map pages and availability pages to their counterpart overview pages?
I'd just add that if the solution chosen is noindex, use the noindex, follow method, just to give the extra cue if there are links on those pages.
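For reference, here's a minimal sketch of that tag in the page head (it would go on each map or availability page being handled this way):

```html
<!-- keep the page out of the index, but let crawlers follow its links -->
<meta name="robots" content="noindex, follow">
```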
-
RE: Getting a link from a blog on every page querying what OSE is suggesting
More important than Moz data is what the impact would be of getting thousands of links from one domain from an SEO perspective.
When you get a link to a site, the "natural" pattern is one link, or a couple, unless all the links come from a site where you are an author, with those links appearing sometimes within content but always within author bio boxes. (Even then, those should only be links to the author's own site, NOT using SEO keywords unrelated to the author, if they are going to have real value.)
Getting hundreds, or worse, thousands of links from one source is typically not otherwise "natural". It instantly gives the appearance of being bought/paid for.
While some sites can get away with that under some circumstances, if you look at it from an SEO best practices perspective, it's dangerous, now more than ever. So I would SERIOUSLY consider recommending that your client turn such an offer down. The risk is probably not worth the total value even if you did get such high weight based on Moz reports.
-
RE: Recently revamped site structure - now not even ranking for brand name, but lots of content - what happened? (Yup, the site has been crawled a few times since) Any ideas? Did I make a classic mistake? Any advice appreciated :)
Once you fix the noindex, here's some other stuff. It's a "quick hit - what stands out" kind of an audit to see if there are any really obvious red flags.
1. Odd Links
Looking at the source of your POS Nightclub and Bar page I found some odd things going on related to links. Specific examples:
A) You've got a link off to the right side of the page just under the main navigation bar (under the "News" link). This block is titled "News" and it rotates different links to different news items. One of them goes to a site called Spoke.com and the rest go to other Dinerware.com pages.
Each of these links has a "title" attribute that appears to contain the intro text of whatever it points to. The problem here is that this injects a significant amount of content at the source level that's totally irrelevant to the page. If this is happening across the entire site, there's a lot of topical dilution.
While this issue alone shouldn't be big enough to be a major concern, I do believe it's harming your site's quality from a topical perspective. And since this is content only search engines see, or that's only visible when hovering over an individual article in the rotator, it's not good to have so much of it there. Not good at all.
Also, the Spoke directory isn't exactly a high quality directory. So linking there isn't helping your site's perceived trust aspects.
2. Apparent mass repetition of video content
Am I correct that you've got some videos posted to multiple pages of the site, causing serious duplicate content problems? Many pages seem to have almost no unique text while having several videos. If those videos are then shown on more than one page, not only do you lose out from a lack of HTML text-based content (a significant factor), but you also get hammered by the duplication.
3. Links to PDFs
I see links to PDFs in the right column of the Product Training page, and none of them have the PHP query string at the end. Yet within the page itself, PDF links do have it: http://www.dinerware.com/pdf/DinerwareCFGManual.pdf?phpMyAdmin=6e28a551fa44f2aa65e57201d6164da9
What's that about?
4. Markup Language Fail - The biggest problem
When I run your site through the W3C Markup Validation check, it fails and cannot be processed. This alone means you've got a site coding problem that's most likely causing serious problems with search engines.
Go to http://validator.w3.org and enter in http://www.dinerware.com/pos-product/training/
I doubt the complete breakdown that I got when submitting that URL is temporary - and if you see it too, that's a critical issue.
-
RE: Old product pages - eComm Site
Andrew's got one path to consider. I've got another. My own most recent example is with a real estate site that has 100,000 property pages that all currently result in a 404 not found. Yes, that's 100k dead pages. So I too feel your pain.
What I recommend to clients is to 301 based on category-level criteria. For example, whatever the highest-level category a product had been in, that old page should 301 to the current category page, if one exists. The 301 should append a unique identifier to the new URL - something like #NLC (for "no longer carried"). The # sign is the key: since it survives the redirect, the destination page can detect #NLC in the URL and show visitors a box communicating that the product is no longer carried, inviting them to browse your current inventory in that category.
Doing this would also require having a canonical URL tag on each category page - just to cover the bases. While anything after the # sign should be ignored as far as causing duplicate content conflicts, it's still best practice to have the canonical URL there in the header.
When no current category exists, I'd send visitors by 301 to a uniform page (either a product search page or otherwise), with the same #NLC string and message.
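As a minimal sketch of the redirect side, here's what it could look like in Apache's .htaccess - the old product path and the category path are hypothetical placeholders, and the NE (noescape) flag keeps Apache from encoding the # so it survives as a fragment:

```apache
RewriteEngine On
# Discontinued product -> its current category page, tagged with #NLC
RewriteRule ^products/old-widget-123/?$ /categories/widgets/#NLC [R=301,NE,L]
```

Since fragments never reach the server, the category page template would use a small client-side check of the URL fragment to decide whether to display the "no longer carried" message box.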
Of course, getting either Andrew's suggestion or mine implemented will be up to the skills of the programmers doing the implementation. That's a lot of coding that has to be done accurately and thoroughly tested.
-
RE: How good are PR for getting authority backlinks ?
Do not rely on press releases as a source of high quality backlinks. I recently wrote an article at SearchEngineJournal.com breaking down the issues, explaining why releases are good for SEO, how they offer value, and how they do not.
Above all else, press releases are a natural and valid aspect of an overall marketing campaign, and bring many valuable opportunities. High quality / high value links from them, however, are rare indeed, and come only when those releases end up on highly trusted news sites. Even then the overall value is diluted because they're a form of syndicated content - just about the only legitimate form of syndicated content that exists, except in rare situations.
The many other values they offer, when they're very well written and when the original intent is for original old-school reasons, far outweigh the lack of high quality links though.
Check out my article and then if you have any follow-up questions, let me know. I'll be happy to answer them.
-
RE: What to do about "blocked by meta-robots"?
Meta robots refers to the <meta name="robots"> tag in the page head. This commonly shows up when a blog is set up with an SEO plugin like All In One SEO, for example, where you can manually set which content is blocked. It's common to block archives, tags, and other sections, on the theory that allowing these to be crawled could either cause duplicate content issues or drain link value from the primary category navigation.
-
RE: Www.thousandreasons.org - is this an article submitting site?
Digging a little, it looks like a totally fake site by some hack SEO.
Any time you see a link in a suspicious site's footer that says "SEO xxxxxx xxxx" (in this case "SEO Company Perth"), that's a massive red flag.
To the left of that link is a link "LEF" - that points to a Link Exchange site. How obvious is it then that this is a bogus site?
Other clues:
No other navigation links of any value that communicate "this is a legitimate web site".
The "Remarkable US Presidents main navigation link has links in it that just point to another hack site "MTI-USA.com" and THAT site, even though it's got "USA" in the domain, links to a Perth "digital agency"...
Bottom line, it's a mess. And the lesson here is don't get caught up examining competitor links assuming you see something that could be helping them unless you know what you are doing. All it will result in is your site being slapped with penalties.
-
RE: What Are The Page Linking Options?
Link equity is not equal across a page. The two most important types of links are main site navigation and in-content. Sidebar navigation is close behind. Footer links are not what they once were.
Think about it from a user experience perspective - how many sites do you go to where you primarily navigate by scrolling to the bottom of pages to find the links you want? Even high ranking sites that fill their footers with lots of links also have those links higher up on the page. The footer links not only don't help; with so many of them, they just cause topical relationship confusion.
Couple options:
1. Change the main nav links to images and use alt text. Alt text does carry as much value as anchor text when main site navigation uses images - how else would search engines know the anchor information for those links? (See the sketch after this list.)
2. See if you can get links to some of those internal pages from within the content area of high level pages on the site - within or directly near descriptive text that talks about the focus of those pages you're linking to.
3. Something that hasn't been mentioned so far is also off-site factors. Without inbound links pointing to some of those internal pages, you're not going to get as much ranking value as you probably need and are trying to get from internal linking.
With inbound links you have freer rein to get the anchor text you prefer, though inbound links should be a mix of keywords, brand, and generic words like "for more info".
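For option 1, a minimal sketch with hypothetical file names and anchor wording - the alt text does the job anchor text would otherwise do:

```html
<a href="/services/"><img src="/img/nav-services.png" alt="Web Design Services"></a>
```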
-
RE: Should I remove backlink from templated page?
What is the relationship between site A and site B? Is it a "legitimate" "owned by or managed by the same company" relationship that would exist even if SEO didn't exist?
If so, it is valid to link from one site to another. If not, those links become more questionable - the "why are these here?" question comes up.
Even if there is a legitimate reason outside of SEO, what is the overall signal? If they're keyword-based links, rather than anchor text showing the name of the destination site, they're more vulnerable to suspicion.
Even in those situations where the links legitimately exist as a "we're just providing links to another property we own" case, and they're not using keyword anchor text, there is still some "slight" vulnerability. However, it's minor compared to keyword anchors or "not here for reasons outside SEO" issues. Because of that, the best practice is to nofollow them just to be "safe" in the age of Penguin and manual reviewers who might mistake a legitimate linking reason for an attempt to over-optimize inbound links.
-
RE: WordPress Pretty Permalinks vs Site Speed
400 posts and 400 pages is not very many pages or posts, by any stretch of the imagination. How much lag are you seeing in tests with the permalink URLs? If it's significant, it's not a WordPress issue - more likely database corruption or a server problem causing the slowdown between the front end and the database.
-
RE: Page Authority or Domain Authority? Which one is more important when having an external link pointing to your page?
I'm not going to go into the DA vs. PA issue. What I am going to do is focus on the issue Umar brings up. That's much more of a concern than the "science" of DA vs. PA, since those are really only a tiny consideration across a much broader set of ranking, authority, quality, relevance and trust factors for links from one site to another - and not something you would even be able to isolate if you thought it mattered.
The risk of cross-linking the way you described on the other hand, is something that could risk both sites being penalized. How much is the deal worth when considering what the harm would be if either, or both sites were penalized?
Gamble if you want. Just recognize the price paid for gambling that gets snagged in Google's spam trap.
-
RE: Old product pages - eComm Site
If we're talking about thousands of pages falling off, then yes, to me that's a high priority. If you go the 301 route, each should go to the highest page in the chain that the product would be associated with, based on topical intent and relative closeness of match.
So if it's a laser mouse, I wouldn't redirect to the top-level "desktop computers" category, or even to the exact "laser mouse" sub-category, but I would 301 it to the parent mouse optical/trackball category page.
The reason for this is two-fold - it's low enough in the food chain to be highly related, but not so deep that, if the current laser mouse sub-category disappears altogether, you'd end up in a bad loop of redirects.
That then maintains at least some of the original page authority and boosts the parent category.
-
RE: Getting links on old blog posts
EGOL,
As always, you infuse wisdom into this discussion. I have always been an advocate of "content first, content last". Yet in 2015, search engines are only one piece of the puzzle, and until and unless other efforts for brand visibility / authority / trust are made, the overwhelming majority of sites on the web will leave way too much money on the table.
I happen to believe links need to be generated through our own efforts yet it's not the "traditional" link building. Instead, it's more about advocacy of brand, community service, and participation in the community in which our prospective/existing clients/customers live.
If we are not active in those ways, we build a house on sand.
Just my take on it.
-
RE: Does having the local area name in a domain effect your results when branching out?
Ryan's point about localized search being drastically different is spot on. So the real question is whether you offer products or services that require localized identification. If so, having your initial local area in the domain will definitely not help your effort.
As for the example of the New York times, they can get away with showing up when not searching for local specifics because they're one of the biggest sites with some of the highest SEO authority from 3rd party sites on earth. So of course they can get away with it. If you want to achieve the same (for non-local search phrases), you'll need to go to extreme lengths to build your site's SEO authority as well.
Personally I'd say that if your site depends on local related search, you'd be better off with a domain that doesn't have the local aspect in the name. Build out content in a locations funnel - starting with the geographic areas you determine to be a mix of the most important and some that are semi-important (and thus easier to rank for over time).
That way, you can create individual pages (or ideally sections) that have each geographic location in the URLs. This is much less challenging to get ranking for over time than the root domain being about just one location, because the root domain placement of a keyword is much stronger than a sub-folder.
High quality SEO will be key in the geographic funnel. Citations from other sites in each of those locations will be really helpful as well.
-
RE: Differences between Lynx Viewer, Fetch as Googlebot and SEOMoz Googlebot Rendering
Each tool processes pages differently, attempting to emulate the actual Googlebot crawler. You may want to jump over to SEOmoz's Help Desk to get specific info on the Moz version, however the only way to know that you'll always be able to see what Googlebot actually sees, even when the Googlebot might change over time, is to use Google Webmaster Tools.
Sign into GWT, then click to "Diagnostics" and then "Fetch as Googlebot". There you'll be able to enter a URL. It may take a few minutes to get the results, but you'll see what they see.
-
RE: Best usage of rel canonical in case of pagination for content list ?
If you have different recipes on every page in the pagination, it's not a duplicate content issue. So unless there's a valid reason NOT to allow them all to be indexed, I recommend to clients that they skip the canonical and let every page be indexed - as long as you include "page X" in the Title, Description, URL, and H1 of each page.
The alternate reasoning I usually see is that you don't want to dilute the link value the first page gets. Personally I prefer to show search engines "look - ALL of these pages are about this topic".
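A minimal sketch of the head elements for one paginated page, with placeholder URLs and wording:

```html
<title>Chicken Recipes - Page 3 | Example Recipes</title>
<meta name="description" content="Browse page 3 of our chicken recipe collection.">
<!-- self-referencing canonical, so each paginated page stands on its own -->
<link rel="canonical" href="http://www.example.com/recipes/chicken/page/3/">
```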
-
RE: Sub-Page Ranks for a Main Keyword
301 Redirecting an internal page that you want visitors to discover is never advised as a way to drive the home page up instead.
Is the main keyword actually the most important keyword for your entire site?
Long term best practices would have it so that if it's one of the top few phrases your offering revolves around, then individual pages should be laser focused on variations of it - to the point where you have enough content across X pages that search engines decide "enough of the pages across this site are on this specific topic that the home page of the site deserves to be ranked for it." And one top level page (a main navigation link) should be highly optimized for that specific phrase as well.
Ideally it's that page that should come up first in the SERPs for that phrase, followed by your home page coming up in the SERPs for it as well, but below that entry.
Alternatively, your home page should come up first, with Google SiteLinks coming up just below that. But only after you've got a number of very strong pages built around the topic.
-
RE: Optimizing the cannibalization
It's a balancing act that requires as much art as science.
It essentially requires building simultaneous strength to both home page and internal page, and there's no quick fix way to do it.
I like to think of it as running right up against the edge of "too much" focus on that internal page, without falling into the "oops too much" hole.
And sometimes that does happen, so then you just focus a bit more on the home page for a while.
Then there's the long-term reality - competitors taking actions you can't anticipate, algorithm changes...
Given that long-term reality, I prefer not to get too stuck on that one task or goal.
-
Panda 2.2 Full Recovery In Action
I have had several new clients come to me after Panda and Panda 2. Lots of audits. The client who had the worst problems, and who has since corrected the worst issues based on my audit, just bounced back in an epic way. While it could be a short-term thing, I don't believe that's the case - it's just too big of a jump back. Full recovery.
I'm curious to find out if anyone sees a similar recovery on your sites.
FYI the biggest problems (most of which have been resolved now) include:
- Content organization - it was a mess of a site
- Extreme over-use of ads on the page and in the content
- Topical focus - there was so much going on across every page of the site that confused Google
- Major site speed issues
-
RE: How do I add meta descriptions to Archives in Wordpress?
There should always be one way search bots can crawl your site to get to the content - even the deepest content. If you've got that one path in place, any other path can be noindexed, or left as index,follow.
-
RE: Crawl Errors
Can you use those individually to get to a real page on the site? Or, when you follow them manually, do they lead to a 404 not found error? Links in Google's system come from somewhere - either within a site's architecture that site owners weren't aware of, or from 3rd party sources that got something wrong in how they found or scraped content. Many of the URLs Google reports show a "links" column off to the right that you can click on to see where those URLs are located on the web, either on your own site or another...
-
RE: Redirecting blog.mydomain.com to www.mydomain.com/blog
Hi Mark
You are going to need to rely on WordPress' own 301 redirect solution. 301 redirects have to happen on the server where the original content resides, and you can't set up a 301 redirect on your own site's server, since the original files and subdomain aren't hosted there.
Here's the official solution http://en.support.wordpress.com/site-redirect/
-
RE: Include pagination in sitemap.xml?
How's the site doing in Bing? Bing tends to not have such an easy time discovering pages, and I've often found a sitemap.xml file will help - and pagination pages would be included in that file. If you combine this with the new rel=next / rel=prev markup Google has been asking for, you have all your big bases covered.
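A minimal sketch of that rel markup on a middle page of a paginated series, with placeholder URLs:

```html
<link rel="prev" href="http://www.example.com/widgets/page/2/">
<link rel="next" href="http://www.example.com/widgets/page/4/">
```

The first page of the series carries only rel=next, and the last page only rel=prev.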
-
RE: I am trying to block robots from indexing parts of my site..
This is the correct approach when using the robots.txt method of blocking. Be aware, however, that the only secure way to ensure 100% that such locations are not indexed is to put them behind a password protected gateway. I always recommend to design agencies that there be a simple single log-in screen between the front end and design folders. This can be as complex as unique UIDs and passwords for every client, or a single shared login if all you want to do is bar search engines from seeing the content.
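The robots.txt side of that is simple - a sketch with hypothetical folder names (remember this only asks compliant bots to stay out; the password gateway is the real lock):

```
User-agent: *
Disallow: /design-demos/
Disallow: /client-mockups/
```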
-
RE: Copying my content
You're facing one of the most challenging issues e-commerce sites face - and here's what I recommend to manufacturers -
Using canonical references might be beneficial to your site, however it leaves retailers unable to rank for those products. So every retail site that carries your products should have its own unique version of content - completely unique descriptions. This can be a challenge when there are a lot of products in the database, and of course would not apply to technical specifications within product detail pages. However, it's vital for everyone's long term success to get the descriptive text to be truly unique - to the point where the unique portion of content on each product detail page outweighs the copied portion (product name, technical specifications, category assigned, etc.).
Whether YOU provide that unique content (and thus control the message), or require that your retailers do the heavy lifting is up to you to decide.
-
RE: Can changing a host provider impact search rankings?
If you leave everything else the same and simply migrate the site to a different hosting provider, the only way you can cause a problem (or alternately improve your situation) from an SEO perspective is related to bad neighborhoods.
So if your site currently resides on an IP or C block that's known to Google as having too many suspicious sites, moving it off and onto a "clean" IP or C block can help. And moving it from a clean IP or C block to one that's bad can lead to your site being labeled and eventually you may suffer.
Second tier considerations are overall site performance and speed. This applies not just to changing hosts, but even to moving server to server within a host. If you get more reliable up-time and better site speed on server calls, that can help as well.
-
RE: RSS Footer
That's a great question, Peter - not one I think anyone has addressed in a blog post yet. Without evidence one way or another, I'm going with my intuition and saying it's probably not a good thing. Unless Google formally comes out and says they can detect the combination of scraped content and footer links, and sort out that those are not intentionally malicious SEO links, I wouldn't trust their algorithm to figure it out.
Just an opinion at this point.
-
RE: 404 page for webshop vs 302 redirect
Christian, definitely read the article Keri linked to. Note in the article the section on "Where should you 301 redirect your pages to?"
Whether you 301 them to one of the suggested "most relevant" pages, or 302 them to a custom, dynamically generated close-matching page higher up in the hierarchy, that's the best practice. 301s would be for products you aren't going to carry again. 302s would be for out-of-stock items, as long as the 302 generates a custom message at the top of the destination page informing visitors the product is out of stock.
A final consideration would be to create a form below that "sorry, out of stock" message, that invites visitors to be notified by email when a product is back in stock. I've seen that extra functionality save a lot of otherwise lost business.
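A bare-bones sketch of such a form, with a hypothetical endpoint and field names:

```html
<p>Sorry - this item is currently out of stock.</p>
<form action="/notify-restock" method="post">
  <input type="hidden" name="sku" value="ABC-123">
  <label>Email me when it's back:
    <input type="email" name="email" required>
  </label>
  <button type="submit">Notify me</button>
</form>
```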
-
RE: 404 page for webshop vs 302 redirect
Oh, and one additional concept - as in Stephanie's article, the third alternative is keeping the page, informing visitors it's out of stock, and showing them other similar products right on that page that you may not normally include when something is in stock. Couple that with the "notify me when it's in stock" form on the page as well, and that could very well be the best solution.
-
RE: Should I be using use rel=author in this case?
Check out Google's Source Attribution tag - that's the way I'd go in this situation rather than author, unless the "scraped" copy changes author info, which is a whole different issue.
-
RE: Proper structure for site with multiple categories of same products
I get this same issue a lot - just about every time I'm hired to perform a forensic audit on an ecommerce site...
Here's how I responded in one of my recent audits to this question:
Search engines struggle to determine "which of these two nearly identical pages is the original source, which is more authoritative, and which is merely an attempt to own two positions in search results for the same company."
Sometimes search engines overcome that struggle in a positive way, other times their automated systems fail miserably. More often than not, on an initial look, you don’t even realize how much of a problem it is if you think you’re doing well in your organic search based visits.
In reality, every page that competes with every other page results in a cannibalization effect. Every page suffers, at least a little, and cumulatively, entire sites suffer way more than you might even comprehend.
Solutions for consideration:
1. Keep all copies of each product but make them unique. If they are kept, every version or instance of a product needs to have its content completely re-written so that it is truly unique compared to every other instance.
2. Keep all copies of each product but decide which ones you want the search engines to find and rank - every other version should be blocked from indexing. Do not rely on Google to figure out which to keep, which to rank, and which to ignore.
3. Eliminate as many copies as possible by consolidating product detail pages while maintaining access to them from multiple categories. 301 redirect every copy of those product detail pages except the primary one you intend to keep indexed and ranked in search engines.
There's a lot more to consider, such as canonical implementation. However, in addition to the issue with canonical you already described, the fact is that canonical tags are only signals. They are NOT directives, so using them is relying on Google to figure it out.
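For the third option, a minimal Apache sketch with hypothetical paths - two category-specific copies of a product detail page both 301'd to the single version kept for indexing:

```apache
Redirect 301 /kitchen/blenders/acme-blender/ http://www.example.com/products/acme-blender/
Redirect 301 /appliances/blenders/acme-blender/ http://www.example.com/products/acme-blender/
```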
-
RE: How long does it take for Google to index a new site and has anyone experienced serious fluctuations in SERP within 2 weeks after launch?
I have to agree with GIGS20 on the manual submission (obviously only to engines you care about). Why? Because in my own tests, I have consistently been able to get new sites ranked faster via direct sitemap submission than by waiting for crawlers.
As far as the whole fluctuation thing - think about it this way: there are multiple algorithms in the Google system, not just one. Every site gets an evaluation of its on-site merits alone, then that evaluation needs to be held up in relation to other signals (off-site links and mentions, off-site social signals, etc.), and then all of that has to be weighed against every other site that their system determines might be a topical focus match.
When a site is new, there's not a lot to go on so every time Google churns another update, every time they run another algorithm, things will likely change for some sites.
Then throw in the fact that many other sites in that topical focus are also being changed, worked on, and further optimized (or hurt) every day, and the end result is an even more unstable ranking situation for new sites, especially in highly competitive markets.
-
RE: Proper structure for site with multiple categories of same products
A 301 redirect is a server level or site level instruction telling any client - browser or search crawler - to jump to the page the redirect is pointing to. A canonical tag within a page is only a signal asking a search engine not to count/index that page, and to count/index the page named in the canonical tag instead.
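Side by side, with placeholder URLs - first the server-level redirect (Apache):

```apache
Redirect 301 /duplicate-page/ http://www.example.com/preferred-page/
```

versus the in-page hint, which the engine may or may not honor:

```html
<link rel="canonical" href="http://www.example.com/preferred-page/">
```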
-
RE: Www or not www base url
Ouch. If I understand what you just communicated, you've got a site that had previously worked with the www version, but now, due to technical changes, the www version doesn't function? And won't for a couple months?
That's a scary scenario and having worked with many different developers and systems administrators over many years, I've never allowed one to tell me "we can't fix that for a couple months" and get away with that claim.
One way or another, this can either be addressed at the site level or the server level, or it can't. Regardless of development framework, you should be able to set the non-www version to redirect to the www version at the server level, and it should work correctly. If there's a massive bug in the Magento implementation that prevents this, that sounds like a very serious flaw in the developer's skill set as far as I can tell.
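On Apache, that server-level redirect is a few lines in .htaccess - a sketch with a placeholder domain:

```apache
RewriteEngine On
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```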
So - IF you're stuck, you're going to have a major SEO problem for longer than a couple months.
By all rights, the only solution in that scenario is to scrap the www version altogether and NOT revert back to it in a couple months. Change all the 301 redirect settings, and the settings within GWT, to point to the non-www version. Then work to build up more links to the non-www version over time.
Because that's the only short-term solution you can do now from a best practices approach if the failure can't be quickly addressed.
And down the road, if you do this, you'd have to once again reverse everything, just causing you more problems.
So either get a developer / IT specialist who can fix it immediately, or scrap the www version altogether.
-
RE: IP ranges and matching WHOIS
The question is this - why would you want to keep Site A given the current insurmountable challenge you describe?
Do you still hope there's some value in it being kept alive?
Do you still hope there's some SEO value or that the site will or does continue to bring some traffic you believe to be valuable?
Because (and this is just my opinion) if you are convinced you cannot or will not (for whatever reason) work to clean the mess up, you'd be better off completely killing off Site A.
If you don't, then even if you migrate Site B to a different server, the links still exist. The footprint remains.
-
RE: IP ranges and matching WHOIS
If Site A is a cash cow that does not need SEO, then I would block the entire site from search engines via the robots.txt file. Even on separate hosts, all the links pointing to Site B are a big negative due to the sheer volume, given that there's likely a "bad rap" label associated with the SEO on Site A.
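Blocking the entire site takes just two lines in robots.txt:

```
User-agent: *
Disallow: /
```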
Duplicate content does not need a "same server" relationship to be a big problem either. All duplicate content is a problem regardless of location.
If a client I represent is doing things that I believe are impeding their success, I personally believe it's important to communicate my concern. However, if they choose to ignore that communication, that's their right to do so.
-
RE: How do I eliminate indexed products?
This is a sad reality that many business owners face. SEO is a very complex process and unfortunately it's a case of two-fold barriers to success.
On the one hand, not all SEOs really know the long-term ramifications of what they might be recommending - or, even if they do, they don't consider them. Many of us in the industry think and act otherwise, however it is a problem nonetheless.
On the other hand, Google constantly changes their rules to a certain extent - as more people look for ways to game the system, what may have been acceptable previously can become unacceptable as Google tries to clean up the mess. It's a vicious circle.
So...
You can get "rid" of indexed pages by blocking them through a robots.txt file - if there are patterns to their URLs. if it's an entire site, you can block the whole site in one line in the robots.txt file. If it's multiple sections of a site, you can block entire sections while leaving other sections open for search indexing. A professional should be tapped to help you with that.
Its important to consider whether pages should be blocked, or instead, redirected to other pages that you want indexed or are indexed that are similar in nature.
Bad link evaluation is a professional process and should not be undertaken lightly. In many cases, site owners will ignore your requests, so it's important to at least get the request process right and to document the process. Again, a professional is needed for that.
And yes, you can submit a request after that's done.
Unfortunately, anyone you task to do the work that would be able to help you will both charge you for their time and cannot guarantee that what is done will be enough. It's another reality of the world we operate in because Google cannot reveal trade secrets to help you know what exactly needs to be done.
-
RE: How do I eliminate indexed products?
A site:trophycentral.com -www search shows all indexed content not within the www subdomain.
-
RE: Trackback/Syndication
I recommend implementing Google's rel=publisher markup to designate the original publisher. There's at least one WordPress plug-in for this, however I can't speak to whether it works or not - only that the Google markup is currently best practice...
Combined with rel=author, it's how Google determines content ownership.
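For reference, a sketch of the publisher markup as Google documents it, with a placeholder Google+ page ID:

```html
<link rel="publisher" href="https://plus.google.com/110123456789012345678">
```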
-
RE: What is wrong with my site?
Yes, inbound links are those coming from other sites - and quality is important there, as well as diversity, both in the types of links and in keeping the majority of links coming from as many independently separate domains as possible.
-
RE: How can you get the right site links for your site?
It's correct that you can't directly control or dictate which pages Google includes in sitelinks. You can, however, help influence this by better optimizing the pages you want included through more emphasis on your brand within the content of those pages - integrate brand references within on-page content. Then, work to get a mix of brand-centric anchor text into links coming from other sites that point to those pages.
It's not a guarantee, however I've seen some success in this method for various clients.
-
RE: How can you get the right site links for your site?
The more emphasis, signals, and depth of content, supported by a stronger inbound link effort focused on individual pages, the more likely the pages you care about will end up in sitelinks.
There isn't one formula unique to sitelinks that Google specifies, so I've only ever applied best-practices SEO concepts toward my desired goal - and have seen those pages sometimes become sitelinks.
-
RE: Why does everyone use bitly?
bitly is one of the most well established shortener services available. They provide click-through data across several data points. This helps site owners track the relative success or shortcomings of content they wish to promote, and can be used to determine what works and what doesn't from a marketing and promotion perspective.
-
RE: Why does everyone use bitly?
There are many ways to track traffic. However, if you use a shortener service like bitly, you get a shorter URL for easier / cleaner distribution, you can rapidly evaluate a specific URL's performance tied directly to a specific marketing campaign method, and when other people spread that shortened link, it's just as effortless to track click-through for that specific initiative regardless of how many times it's shared. Much easier than digging into analytics and setting up unique campaigns.
It is, of course, far from perfect, and there are many ways to get the data - bitly is just one, and it's handy, especially with the ability to generate the unique shortened URL on the fly, which is much faster than setting up custom tracking campaigns in GA.
-
RE: Client error 404 pages!
If you know that they were valid pages, and if the relevance of the no-longer-existing pages matches relevance on specific pages on the new site, set up 301 (not 302) redirects at the server level to regain some of their previous value.
If you are unsure of their value, or there is no relevance match, you can use Google Webmaster Tools to have the pages removed from Google's index.
-
RE: Rel=canonical overkill on duplicate content?
Rel-Canonical is a signal. While it helps mitigate problems of duplication, it is just one signal. If enough other overriding factors exist, Google's automated ability to "figure things out" can become weak and confused. So for example, if all of the pages are linked to from multiple points within the site, or if any of that content is linked to from outside the site, there could be negative impact. It's not supposed to occur, however it can due to imperfect layering of multiple algorithms.
Ultimately, there may be a need to block some content more forcefully, such as via robots.txt file. Unfortunately there's no definitive method for evaluating this specific issue in an isolated manner without serious live site testing over extended periods of time. So my recommendation to clients is to not mess with it unless you've seen a serious drop in organic results, and only even consider testing if you can't identify other potential primary causes.
-
RE: Can anyone show me a good example of Microdata for a site
Structured data is extremely beneficial when executed properly and under the right circumstances. It's not usable for every single piece or type of content on a site though. So the first challenge is to determine what on your site is viable for being coded with structured data markup.
For example, in the article you referenced, Google gives some examples of the types of content that would benefit from this.
There are many more uses nowadays - article authorship information, publisher information, and a host of possibilities thanks to Schema.org - a collaborative effort between Google, Bing and Yahoo.
If you want to go even further, you can then layer the Facebook OpenGraph system on top of that - a completely different way to mark up content information for crawlability improvements.
As you scan through all the various types of data that could qualify, from there it's a matter of following implementation guidelines.
If you've done any type of search recently for sports scores, weather, phrases with "reviews" in them, or "current events in ____", you've likely seen structured data integrated into some of the organic results. When you see it, it's pretty obvious: it provides very specific types of information, presented in very clean visual ways, as compared to a plain page title and two-line description.
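As one concrete example, here's a minimal microdata sketch for a product review using schema.org types - the product, reviewer, and values are all hypothetical:

```html
<div itemscope itemtype="http://schema.org/Review">
  <span itemprop="author">Jane Doe</span> reviewed
  <span itemprop="itemReviewed" itemscope itemtype="http://schema.org/Product">
    <span itemprop="name">Acme Blender</span>
  </span>:
  <span itemprop="reviewRating" itemscope itemtype="http://schema.org/Rating">
    <span itemprop="ratingValue">4</span> out of
    <span itemprop="bestRating">5</span>
  </span> stars.
  <span itemprop="reviewBody">Blends everything I throw at it.</span>
</div>
```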
-
RE: Where Are My Listings??
It's not just the "average position" that matters - the total number of impressions is a critical consideration in understanding what's going on. When you've got fewer than 50 impressions, it means your listing showed up in results fewer than 50 times, regardless of position (not all instances necessarily had a particular entry in the actual #1 position). It's not unusual for a new site, or new content within an existing site, to fluctuate in rankings.
Google has several algorithms - as one is run, a particular page might be placed in a certain ranking position temporarily. Then when another algorithm is run, the results could impact the original placement up or down. Over time, part of the process also includes Google varying placement in order to gauge reactions - if not enough people click on a listing in a certain ranked position, that will further impact results the next time a similar search is performed.
The key is to focus on the longer-term experience and steadily build signals that reinforce "this page deserves to be ranked highly".
-
RE: Adding academic content for a school in a sub folder, sub domain, or different site?
If the information contained in the academic policies does not need to be kept confidential, and if that information is valuable to students (or even to people considering becoming students, or to parents of existing or potential students), then for those reasons it would be legitimate to make it accessible from the main site without them having to hunt for it.
Given those reasons, I believe it would be perfectly valid to have it be crawlable and indexable by search engines.
I would also group it all together in a dedicated location (such as a sub-folder, hierarchically) with its own sectional sub-navigation, because it's no different than any other quality content - grouping topically focused, similar content is proper for user experience.
As for the student profiles, that's a completely different issue. This one involves the reality that most student profiles are most likely going to have very little depth of unique content. I assume students will fill out the content themselves, and that leaves the door open for all sorts of good, bad and ugly.
Further, if there is some reason for students, faculty or other staff to be able to access it without having to sign in to a secure area, that is not a reason to have it found in search engines.
There are privacy concerns (so a secure area is then in fact, the best option if that's the case).
With that content most likely being "thin", low quality, or perceived as duplicate, if it's not hidden behind a log-in it really should be blocked from search engines via the robots.txt file, or use noindex,nofollow meta tags (there's no valid reason here to do noindex,follow).
Having said all that, I would suggest it could just as well go in a sub-folder of the main site or a separate sub-domain. Since it will be blocked from indexation/crawling, either would work.
One final reason it shouldn't be indexed or followed: as students come and go, you won't need to worry about a 301 redirect system to deal with departed profiles.