How to deal with old, indexed hashbang URLs?

Celts18

I inherited a site that used to be in Flash and used hashbang URLs (i.e. www.example.com/#!page-name-here). We're now off of Flash and have a "normal" URL structure that looks something like this: www.example.com/page-name-here

Here's the problem: Google still has thousands of the old hashbang (#!) URLs in its index. These URLs still work because the web server doesn't actually read anything that comes after the hash. So, when the web server sees this URL www.example.com/#!page-name-here, it basically renders this page www.example.com/# while keeping the full URL structure intact (www.example.com/#!page-name-here). Hopefully, that makes sense. So, in Google you'll see this URL indexed (www.example.com/#!page-name-here), but if you click it you essentially are taken to our homepage content (even though the URL isn't exactly the canonical homepage URL...which s/b www.example.com/).

My big fear here is a duplicate content penalty for our homepage. Essentially, I'm afraid that Google is seeing thousands of versions of our homepage. Even though the hashbang URLs are different, the content (ie. title, meta descrip, page content) is exactly the same for all of them. Obviously, this is a typical SEO no-no. And, I've recently seen the homepage drop like a rock for a search of our brand name which has ranked #1 for months. Now, admittedly we've made a bunch of changes during this whole site migration, but this #! URL problem just bothers me. I think it could be a major cause of our homepage tanking for brand queries.

So, why not just 301 redirect all of the #! URLs? Well, the server won't accept traditional 301s for the #! URLs because the # seems to screw everything up (server doesn't acknowledge what comes after the #).

I "think" our only option here is to try and add some 301 redirects via Javascript. Yeah, I know that spiders have a love/hate (well, mostly hate) relationship w/ Javascript, but I think that's our only resort.....unless, someone here has a better way?

If you've dealt with hashbang URLs before, I'd LOVE to hear your advice on how to deal w/ this issue.

Best,

-G

belasco

Celts,

Did you ever resolve this? What you were discussing back in 2012 is called a "hashbang", and you can learn more about it here on Google. It is technically a way to get AJAX-loaded pages indexed on their own URL.

You asked this question a couple of years ago, and things have changed since then with push states and HTML 5 being preferred over hashbangs, and not loading a page's content with AJAX still the recommendation when possible.

Celts18

Thanks for your answer. Yeah, I've seen the hash tag function as you've described it when being used for named anchors. However, in my case, Google IS indexing the URLs that contain the #! and it is also grabbing my homepage's title and using it in the SERPs on those results. So, given that that's happening, I'm concerned that the #! IS hurting me in this case.

In thinking more about this, I think what I'll do is put a canonical tag on the homepage and that should hopefully provide the extra guidance/insurance that I need to tell spiders that there is only ONE version of the homepage.

jenmcardle

Google ignores the hash tag when indexing URLs. You can offer your home page with various versions of hash tags appended to the end of the URL and Google will not mind a bit. It will not case any issue for SEO.

A few more notes:

Hash tags are used in HTML as an onpage anchor. Wikipedia is a good example. Take a look at the following page: http://en.wikipedia.org/wiki/Guitar. If you hover over the HISTORY link in the Table of Contents at the top of the page, notice the URL for the HISTORY link is http://en.wikipedia.org/wiki/Guitar#History. When you click the link, you remain on the same page but move to the History part of the page.

If you search Google.com for "Guitar History" you will notice the WIki page is listed first. (see attachment). The URL offered by Google is the page URL without any hash tag. Google does offer the ability to "Jump to History" which includes the hash tag link. That is a benefit to using anchor text on a page. Otherwise Google does not take the hash tag nor anything after it into account when indexing pages.

Rand offers a short video on this exact topic: http://www.seomoz.org/blog/whiteboard-friday-using-the-hash

I am not familiar with the exclamation point (bang) being used after the hash tag outside of twitter. The standard twitter URLs use it.

Summary - the hash bag is not the reason for your recent drop in rankings.

I am unclear what you mean by "Google still has thousands of the old hashbang (#!) URLs in its index." Can you share an example?

google-hashtag.png

Welcome to the Q&A Forum

Browse the forum for helpful insights and fresh discussions about all things SEO.

How to deal with old, indexed hashbang URLs?

Browse Questions

Explore more categories

Related Questions

If I block a URL via the robots.txt - how long will it take for Google to stop indexing that URL?

How to do Country specific indexing ?

Why is page still indexing?

No-index pages with duplicate content?

Yoast SEO Plugin: To Index or Not to index Categories?

Removing old versions of a page.

What Should I Do With My URL Names?

Best solution to get mass URl's out the SE's index