Internal file extension canonicalization
-
OK, no doubt this is straightforward, but I'm finding it hard to find a simple answer: our website's internal pages have the extension .html, and trying to navigate to an internal URL without the .html extension results in a 404.
The question is: should a 301 redirect be used to point to the extension-less URL to future-proof? And should internal links point to the extension-less URL for the same reason?
Hopefully that makes sense, and apologies if this has a straightforward answer.
-
As above:
example/abc rewrites to example/abc.html
example/abc.html redirects to example/abc
and all internal links point to example/abc
-
Thank you for the replies.
I will try to clarify what I am getting at; apologies in advance for any naivety.
I understand homepage canonicalization; the confusion is over how this applies to internal pages.
Logically, I am struggling to see how internal pages are any different from the homepage in terms of the need to avoid multiple URLs, and so an extension-less URL seemed appropriate. Not to mention the benefit of cleaner URLs: easier to link to, easier to remember, etc.
i.e.
example/abc
example/abc.html
example/abc/index.html
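Conceptually, the goal is that all three variants above collapse onto one canonical address. A tiny illustrative sketch of that mapping (a hypothetical helper, not any particular CMS's or server's code):

```python
def canonical(path: str) -> str:
    """Collapse the duplicate URL variants above onto one
    extension-less canonical path (illustrative only)."""
    for suffix in ("/index.html", ".html"):
        if path.endswith(suffix):
            path = path[: -len(suffix)]
            break  # strip at most one suffix
    return path or "/"  # the bare homepage collapses to "/"

# All three duplicates map to the same canonical URL:
print(canonical("/abc"))             # -> /abc
print(canonical("/abc.html"))        # -> /abc
print(canonical("/abc/index.html"))  # -> /abc
```

Whatever tool enforces it (web server rules, CMS routing), this is the mapping you want every duplicate to resolve to, with one 301 hop at most.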
-
As Nick said, you don't need to do this, but if you are:
1. REWRITE the new URL to the old URL, as your web server needs to know the extension.
2. REDIRECT the old URL to the new one, in case you already have links to the old URLs; you don't want duplicate content.
3. Make sure that all internal links point to the new URL; you don't want unnecessary redirects, as they leak link juice.
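The steps above can be sketched in an Apache .htaccess, assuming mod_rewrite is enabled and the .html files sit in the document root (a hedged sketch to adapt, not a drop-in config):

```apache
RewriteEngine On

# Step 2: 301-redirect old /abc.html requests to /abc.
# Matching THE_REQUEST (the raw request line from the browser)
# prevents a loop with the internal rewrite below.
RewriteCond %{THE_REQUEST} \s/([^\s?]+)\.html[\s?]
RewriteRule ^ /%1 [R=301,L]

# Step 1: internally rewrite /abc back to the real file /abc.html
# so the web server can still serve it (no redirect, no loop).
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^(.+)$ $1.html [L]
```

Step 3 (updating internal links to the extension-less form) has to be done in your HTML itself; the server config only cleans up requests that still use the old URLs.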
-
I'm about to make a whole lot of assumptions about your website to give this answer, so just be aware.
Your website is built static, using HTML, hence the .html file extension. If you're seeing websites that don't have file extensions, they are most likely using content management systems (or have some serious /folder/index.html stuff going on).
Having a file extension like .html or .aspx or .php is not a bad thing. On websites like yours, it is required (unless you use the subfolder approach above) because the browser is fetching an actual file rather than something dynamically generated by a CMS. It has nothing to do with future-proofing.
As for 301'ing non-extension URLs to extension'd ones... well, I don't know why you'd need to do that for your type of site.