How to properly abandon mod_rewrite?
-
Hi,
Several years ago we set up mod_rewrite so that our .php files are served under .htm URLs for SEO purposes.
This has become a hassle when adding new pages, and I'd like to make a clean break from the .htm URLs and move to the real file names and/or directories (e.g. company.htm --> /company/).
What kind of ranking penalty am I looking at if we switch? We're a small company with billion-dollar competitors, so a rank loss would be fairly devastating.
I assume I'd need to set up 301 redirects for all of the old file names (obviously yes for pages that change to directories), but does that mean one for each individual page?
Thanks,
Matt
-
Maybe I am missing something, but wouldn't a rewrite that removes all the .php instances solve this problem site-wide? Or are you doing it file by file and leaving some pages as-is?
Something like this in your .htaccess should do it.

To remove .php site-wide:

RewriteEngine on
# Skip requests for real directories
RewriteCond %{REQUEST_FILENAME} !-d
# Only rewrite when a matching .php file actually exists
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^(.*)$ $1.php [L]

Or to change to .htm site-wide:
RewriteEngine on
RewriteBase /
# Serve any .htm request from the matching .php file
RewriteRule ^([^.]+)\.htm$ $1.php [L]

Another way is to name the files with .htm and use this in your .htaccess to send .htm and .html through your PHP handler:
AddType application/x-httpd-php .htm .html .php
AddHandler application/x-httpd-php .htm .html

If you use rewrites like those, you won't be able to also use 301s for the affected URIs, as that would probably create a redirect loop.
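To see why, here is one way such a loop can arise. This is a hypothetical pair of rules (not from the post above) that combines an internal .htm rewrite with a 301 in the opposite direction:

# Internal rewrite: serve /page.htm from page.php
RewriteRule ^([^.]+)\.htm$ $1.php [L]
# 301 the old .php URLs to their .htm form
RewriteRule ^([^.]+)\.php$ /$1.htm [R=301,L]
# A request for /page.htm is rewritten to page.php, which re-enters
# the rule set and gets 301'd back to /page.htm: an endless loop.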
In a perfect world, you would 301 redirect every page whose URL changes once you stop using the .php-to-.htm rewrites. If there are simply too many for that to be practical, you could redirect just the most important pages and leave out any that don't have many inbound links pointing to them. What I often do in cases like this is set up redirects for the important pages, then keep an eye on Google Webmaster Tools. Webmaster Tools will show you the 404 errors and where it found the links, so you can pick the URLs with a lot of links and 301 those a few at a time. Tedious, but if you chip away at it in your spare time, eventually you will get them all fixed.
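For the one-at-a-time approach, a minimal sketch (the page names here are hypothetical, so substitute your own):

RewriteEngine on
# One rule per high-value page; add more as Webmaster Tools
# surfaces 404s that have inbound links
RewriteRule ^company\.htm$ /company/ [R=301,L]
RewriteRule ^products\.htm$ /products/ [R=301,L]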
If you can implement a "set it and forget it" rewrite so you don't have to add a new rewrite for each file, you won't have to worry about 301 redirecting all those old pages.
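If the old-to-new mapping is regular, a single pattern can cover every page. A minimal sketch, assuming every URL follows the company.htm --> /company/ pattern from your example:

RewriteEngine on
# Blanket 301: send any old .htm URL to its new directory form
RewriteRule ^([^.]+)\.htm$ /$1/ [R=301,L]

Keep that rule above any remaining internal rewrites so the redirect wins, and spot-check a few URLs before you rely on it.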
Otherwise, there really shouldn't be any major loss of rank from dropping the file types.
All that said, there isn't much of a reason to remove the file type extensions, other than to shorten addresses by a few characters and just look a little cleaner.