htaccess, Apache, And Rewrites
Posted by CTX Admin on 22 October 2014 03:22 PM

As a web designer or developer, it is important to know how to use the htaccess file to your advantage. It is a very powerful tool, and can even work as a deterrent for bandwidth thieves, exploits, and hackers. Below are some common examples of rules to consider when developing websites.

Rewrite Engine

For any of these rules to work, you first have to turn on the rewrite engine.

Options -Indexes +FollowSymLinks
RewriteEngine On

The first line sets two server options. -Indexes disables directory listings, which is a security measure, and +FollowSymLinks tells the server to follow symbolic links (if you are a Windows user, these are similar to shortcuts). Rewrite rules in an htaccess file only work when FollowSymLinks (or SymLinksIfOwnerMatch) is enabled. In most cases, this is already set up by your host, but it won't hurt to list it again here. You may not need the first line, but it is important to understand why you might need to add it.

The second line is what actually turns the rewrite engine on. By itself it changes nothing, though; it only makes the rules that follow it take effect.

Basic Redirect

There are a couple of reasons why you would use a basic redirect. Let's say you just re-structured and organized your files, but you wanted visitors who were still using the old filename to be able to access the new one. They might have bookmarked the page, or found it in a search engine.

RewriteRule ^old-filename\.php$ /new-filename.php [R=301,L]

The first part of the rule looks for a request for old-filename.php. If it finds one, it redirects the visitor to new-filename.php, and the address shown in the browser changes. In the last part of the rule, we're using the redirect flag with a 301 status code, which is a permanent redirect. If we had just used [R] without the code, the status would have been a 302, which is a temporary redirect. We want a permanent one so search engines know that the old file has moved for good.

There is also a caret in front of old-filename.php and a dollar sign at the end. These are regular expression anchors: the caret says the requested path must start here, while the dollar sign says it must end here. We will talk more about flags and regular expressions in an upcoming article.
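To make the effect of the anchors concrete, here is a small sketch that contrasts the anchored rule with an unanchored one. The second rule and the path archive/old-filename.php.bak are illustrations only, not something you would want in a real file. The backslash in front of the dot makes it match a literal dot rather than any character.

```apache
# Anchored: matches only when the requested path is exactly old-filename.php
RewriteRule ^old-filename\.php$ /new-filename.php [R=301,L]

# Unanchored (illustration only): the pattern may match anywhere in the path,
# so a request for archive/old-filename.php.bak would be redirected as well
RewriteRule old-filename\.php /new-filename.php [R=301,L]
```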

One more thing about that forward slash in front of new-filename.php: if we had specified the base path for rewriting, we would not need it, but since we did not, we have to add it. If we wanted to drop the slash, we would add the following directive after we turn the rewrite engine on:

RewriteBase /

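It just depends on what you find easier. In these examples the base for rewriting URLs is the root of your website, assuming your htaccess file sits there; a file in a subdirectory would use that directory's URL path instead.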
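Put together, the variant with a rewrite base might look like the sketch below. It assumes the htaccess file lives in the root of the website, so the base is a single slash.

```apache
RewriteEngine On
RewriteBase /

# With RewriteBase set, the target can be written without a leading slash
RewriteRule ^old-filename\.php$ new-filename.php [R=301,L]
```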

There is also another way to perform redirects without using mod_rewrite. We can use mod_alias instead:

Redirect 301 /old-filename.php http://www.domain.com/new-filename.php
Redirect Permanent /old-filename.php http://www.domain.com/new-filename.php

Both lines of code work, but you only need to use one. We could also do a temporary redirect:

Redirect /old-filename.php http://www.domain.com/new-filename.php
Redirect 302 /old-filename.php http://www.domain.com/new-filename.php
Redirect Temp /old-filename.php http://www.domain.com/new-filename.php

All three lines would work, but we only need to use one. If you don't specify anything after the Redirect, it will use the default temporary redirect (a 302), just like our redirect flag.

HTML to PHP

What if you recently converted your site to PHP, but all of the old filenames were using the .html extension? It wouldn't make sense to create a redirect rule for each file. So instead you could do this:

RewriteRule ^(.*)\.html$ /$1.php [R=301,L]

Anytime a request for a file with the .html extension is made, it will be redirected to the same file but with the new .php extension.
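As with the basic redirect, mod_alias offers a way to do this without the rewrite engine. The sketch below uses its RedirectMatch directive, which takes a regular expression instead of a fixed path. Note that it matches against the full URL path, so the pattern starts with a slash. The domain is the same placeholder used in the earlier mod_alias examples.

```apache
# Redirect every .html request to the same path with a .php extension
RedirectMatch 301 ^/(.*)\.html$ http://www.domain.com/$1.php
```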

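Of course, you could always change the way Apache handles HTML files, letting them be processed as PHP instead. The exact directive depends on how PHP is installed on your server; on a server running PHP as an Apache module, it often looks like this: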

AddType application/x-httpd-php .php .html .htm

Remove File Extensions

But file extensions are so ugly! Maybe you want to give the illusion that your individual files are actually directories:

RewriteRule ^(.*)/$ /$1.php [L]

So we could have a bunch of files in our root directory that looked like this:

http://www.domain.com/services.php
http://www.domain.com/products.php
http://www.domain.com/about.php
http://www.domain.com/links.php
http://www.domain.com/contact.php

And they could instead look like this:

http://www.domain.com/services/
http://www.domain.com/products/
http://www.domain.com/about/
http://www.domain.com/links/
http://www.domain.com/contact/
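One thing to watch for is that the rule above rewrites every request ending in a slash, including ones that have no PHP file behind them. A hedged variant is sketched below: it adds a condition so the rewrite only happens when the matching .php file exists. It assumes the htaccess file sits in the document root and that %{DOCUMENT_ROOT} points at the real directory on disk, which may not hold on every shared host.

```apache
# Only rewrite when a matching .php file actually exists on disk.
# $1 refers to the part captured by the RewriteRule pattern below.
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^(.*)/$ /$1.php [L]
```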

Add WWW To Domain

One of the first things you should decide when you create a new site, or at least early on, is if you're going to have the 'WWW' in your domain or not. This is important not only aesthetically, but for search engine optimization.

By forcing visitors and search engines to your preferred domain, you can guarantee that you won't end up with duplicate results or different page ranks for your domain with or without the 'WWW'. I personally think domains look naked without them, and there is a reason why we use the 'WWW' in the first place.

Not everyone enters the 'WWW' when they type in a domain. Some leave it out, while others always type it in. If you're familiar with the keyboard shortcuts built in to your browser, typing the domain without the TLD extension (domain instead of domain.com) and then pressing CTRL + Enter will add the 'www.' before the domain name, and a '.com' after it (other keyboard shortcuts will use '.org' or '.net').

You might have been to a site before and noticed that if you don't type the 'WWW' in, you get an error page telling you that the page cannot be found. It all depends on how your host or server administrator has decided to set this up, but usually your domain will work with or without the 'WWW'.

RewriteCond %{HTTP_HOST} ^domain\.com$ [NC]
RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]

The first line is a rewrite condition that is looking at the current hostname, which can be something like www.domain.com, domain.com, sub.domain.com, or an IP address.

The pattern on this line checks whether the hostname is exactly domain.com, that is, without a 'WWW' in front of it. If it is, processing moves to the second line, which redirects the request to the same path on the domain with the 'WWW' in front of it. If the 'WWW' is already there, then this rule is skipped because the condition in the first line is not met.

The NC flag at the end of the first line just means to ignore the case of what the request is, so we could type in DoMain.com and it wouldn't matter. The R=301 flag means we are going to visually redirect the visitor to our domain with the 'WWW'.

A 301 code is a permanent redirect, and that is important because it also tells search engines that this isn't a temporary thing (which is what a 302 code would be).

The L flag means this is the last rule: when it matches, the rules that come after it are not applied to the request.
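If you would rather not hard-code the domain name, a more generic version can be sketched as follows. It is an assumption-laden variant rather than a drop-in replacement: it adds 'www.' to any hostname that does not already start with it, which also affects subdomains and requests made by IP address.

```apache
# Redirect any hostname that does not start with "www." to the www version
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^ http://www.%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
```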

Remove 'WWW' From Domain

Others prefer to remove the 'WWW', and for this rule, we're just doing the exact opposite of the previous example.

RewriteCond %{HTTP_HOST} ^www\.domain\.com$ [NC]
RewriteRule ^(.*)$ http://domain.com/$1 [R=301,L]
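The same idea works without hard-coding the domain. In the sketch below, the condition captures everything after the 'www.' and the rule reuses it as %1, which is how a rewrite rule refers to a group captured in the preceding condition.

```apache
# Capture the hostname without the leading "www." and redirect to it
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
RewriteRule ^ http://%1%{REQUEST_URI} [R=301,L]
```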

Add Trailing Slash

The use of a trailing slash is also important to consider. By default, Apache adds a trailing slash when a directory is requested without one. It makes sense to have the trailing slash too, except for files, because it means we're in a directory.

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ http://www.domain.com/$1/ [R=301,L]

The first line is a special variant that says if a file is requested and it exists, then we should ignore this rule and not add a trailing slash to it. Remember, file names do not have a trailing slash after them, but directories do.

The second line checks to see if a trailing slash already exists, and if it does, then it can ignore this rule too because it does not need to add one.

The last line, of course, adds a trailing slash when both conditions pass, meaning the request is not an existing file and does not already end in a slash. We are using the R and L flags here like we did in the previous two examples. You will see both of these flags throughout our examples.

There are times when you do not want the trailing slash. If you want to do something like this, we can specify which directories to not apply this rule to.

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !directory/(.*)$
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ http://www.domain.com/$1/ [R=301,L]

The second line is a new condition we have added to our rule. It is saying that if the requested URL contains a directory named directory, that this directory, and everything inside of it, should not receive a trailing slash.

If your host has mod_dir enabled, make sure that you turn off the directory slash, which is enabled by default. This directive will add a trailing slash at the end of a directory regardless of the rules you set up. To disable this, add this to the top of your htaccess file:

DirectorySlash Off

Be careful when turning off the trailing slash. Please read the following warning from the Apache documentation:

Turning off the trailing slash redirect may result in an information disclosure. Consider a situation where mod_autoindex is active (Options +Indexes) and DirectoryIndex is set to a valid resource (say, index.html) and there's no other special handler defined for that URL. In this case a request with a trailing slash would show the index.html file. But a request without trailing slash would list the directory contents.

If all you need is a trailing slash on real directories, mod_dir already takes care of that and you don't even need the rewrite rule. But if you want a specific directory not to receive a slash, you are going to want both the slash rule with the exclusion and this directive.

Remove Trailing Slash

Just like with the 'WWW' example, some prefer to remove the trailing slash. It's a commonly debated question that you'll find around the Internet but it just depends on what you prefer.

Your server, by default, adds a trailing slash to a directory, and it does so for a reason. If you must strip the trailing slash, though, this is how you would do it:

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} ^(.+)/$
RewriteRule ^(.+)/$ http://www.domain.com/$1 [R=301,L]

The explanation for this rule is the same as it is for when we want to add a trailing slash, just in reverse. We can also specify directories that we don't want to apply this rule to.

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !directory/(.*)$
RewriteCond %{REQUEST_URI} ^(.+)/$
RewriteRule ^(.+)/$ http://www.domain.com/$1 [R=301,L]

Please see the note about mod_dir and the DirectorySlash directive in the previous example. You might need to turn this directive off.

Remove Query String

If you've taken the time to convert your ugly URLs to pretty ones, you may not want people typing in query strings. This rule will strip any query string attached to the end of a URL.

RewriteCond %{QUERY_STRING} .
RewriteRule ^(.*)$ http://www.domain.com/$1? [R=301,L]

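The first line checks whether there is a query string by matching any single character in it. If there is one, the question mark at the end of the target URL on the second line replaces it with an empty query string, which removes it.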
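On Apache 2.4 and later there is also a dedicated flag for this, QSD (query string discard). The sketch below assumes such a server; on older versions the flag is not available and the trailing question mark remains the way to do it.

```apache
# Apache 2.4+: discard the query string with the QSD flag
RewriteCond %{QUERY_STRING} .
RewriteRule ^(.*)$ http://www.domain.com/$1 [QSD,R=301,L]
```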

Remove Trailing Question Mark

Sometimes a rogue question mark will remain at the end of your URL. It's certainly not pretty, and this rule will make sure you never see it again.

RewriteCond %{THE_REQUEST} \?\ HTTP [NC]
RewriteRule .? http://www.domain.com%{REQUEST_URI}? [R=301,L]

If you're using the previous rule to remove the query string entirely, you will want to use this rule after that one to remove the rogue question mark.

Remove Trailing Index File

With more sites and freely available PHP software using rewritten URLs, a lot of people don't like to have any files in their URL. With the index.php file being the most common, we can strip this out of the URL. A common place for this is the very root of your website:

RewriteCond %{THE_REQUEST} /index\.php\ HTTP
RewriteRule (.*)index\.php$ /$1 [R=301,L]

The first line is looking at the request, and if it finds index.php, it moves to the second line, where it is stripped from the end. If you had other file extensions, like .html, we could add those to the condition as well:

RewriteCond %{THE_REQUEST} /index\.(php|html)\ HTTP
RewriteRule (.*)index\.(php|html)$ /$1 [R=301,L]

We just wrap the file extensions in parentheses and separate them with a pipe.

Remove Index File From URL

You might have been to a website where the URL looks something like this:

http://www.domain.com/index.php/blog/my-first-article/

Blogs and CMS applications typically have an index file that handles all of the requests, which is why you see this. It's definitely worthless, and quite ugly. Not a problem though, we can easily remove that.

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php/$1 [L]

We have seen the first special variant before; the second one checks whether the request points to a directory that exists. If the request is neither an existing file nor an existing directory, we pass it to our index.php file, but without showing index.php in the URL.

HTTP to HTTPS

This comes in handy if you have something like a secure order form or login area.

We can make sure people are using the secure version for these sections with the following rule. Just replace the word "directory" on the second line with the name of the directory that needs to use the secure version.

RewriteCond %{HTTPS} !=on
RewriteRule ^directory(/.*)?$ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
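If the whole site should use the secure version rather than a single directory, the same condition can be paired with a rule that matches every request. This is only a sketch, and it assumes HTTPS is handled by Apache itself so that %{HTTPS} reflects the real connection; behind a proxy or load balancer the condition would need to check something else.

```apache
# Send every non-secure request to the same URL over HTTPS
RewriteCond %{HTTPS} !=on
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
```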

HTTPS to HTTP

If visitors to your site are leaving a secure area and going back to other non-secure portions of your site it's a good idea to make sure that we go back to the standard HTTP protocol.

On the second line, enter the name of the directory where you do not want to send visitors to the non-secure version. It is most likely going to be the same as the directory you entered in the previous example.

RewriteCond %{HTTPS} =on
RewriteCond %{REQUEST_URI} !^/directory
RewriteRule ^(.*)$ http://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]

Deny Access To Htaccess

If you don't want people looking at your .htaccess file, then this rule will take care of that.

<files .htaccess>
    order allow,deny
    deny from all
</files>
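The order and deny directives above are the Apache 2.2 way of writing this. On Apache 2.4 the access control syntax changed to Require, so a version that tries to cover both might look like the sketch below. Which branch applies depends on the modules your host has loaded, so treat it as a starting point.

```apache
<Files ".htaccess">
    <IfModule mod_authz_core.c>
        # Apache 2.4 and later
        Require all denied
    </IfModule>
    <IfModule !mod_authz_core.c>
        # Apache 2.2 and earlier
        Order allow,deny
        Deny from all
    </IfModule>
</Files>
```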

Enable Caching

It's a good idea to cache files that hardly ever change. How long the caching lasts for is up to you. If you change things up frequently, try an hour or two. In this example, we're caching media, JavaScript, and CSS files.

# 1 Hour  = 3600
# 1 Day   = 86400
# 1 Week  = 604800
# 1 Month = 2592000  (30 days)
# 1 Year  = 31536000 (365 days)

<filesMatch "\.(flv|gif|jpg|jpeg|png|ico|swf|js|css)$">
    Header set Cache-Control "max-age=2592000, public"
</filesMatch>

If you need to enter an expiration time that isn't listed here, you can use Google to convert it to seconds for you.
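If you would rather not work in seconds at all, mod_expires accepts expiration times in plain words. The sketch below is an alternative to the Header rule, assuming your host has mod_expires enabled; the IfModule wrapper keeps the file from causing an error when it is not.

```apache
<IfModule mod_expires.c>
    <FilesMatch "\.(flv|gif|jpg|jpeg|png|ico|swf|js|css)$">
        # Cache matching files for one month from the time they are requested
        ExpiresActive On
        ExpiresDefault "access plus 1 month"
    </FilesMatch>
</IfModule>
```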

Disable Caching On Dynamic Pages

There are times when you want to disable caching completely, and you would typically do this for dynamic pages that change a lot, like a blog article or forum post.

<filesMatch "\.(php|cgi)$">
    Header set Cache-Control "max-age=0, private, no-store, no-cache, must-revalidate"
</filesMatch>

Example Htaccess File

Now that we've seen examples for individual rules, here is what your htaccess file might look like:

Options -Indexes +FollowSymLinks
 
RewriteEngine On
 
##### Add WWW ###############################################
RewriteCond %{HTTP_HOST} ^domain\.com$ [NC]
RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]
 
##### Remove query string ###################################
RewriteCond %{QUERY_STRING} .
RewriteRule ^(.*)$ http://www.domain.com/$1? [R=301,L]
 
##### Remove trailing question mark #########################
RewriteCond %{THE_REQUEST} \?\ HTTP [NC]
RewriteRule .? http://www.domain.com%{REQUEST_URI}? [R=301,L]
 
##### Remove trailing index file ############################
RewriteCond %{THE_REQUEST} /index\.php\ HTTP [NC]
RewriteRule (.*)index\.php$ /$1 [R=301,L]
 
##### Add trailing slash ####################################
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ http://www.domain.com/$1/ [R=301,L]
 
##### Deny access to htaccess ###############################
<files .htaccess>
    order allow,deny
    deny from all
</files>
 
##### Cache media files #####################################
<filesMatch "\.(flv|gif|jpg|jpeg|png|ico|swf|js|css)$">
    Header set Cache-Control "max-age=2592000, public"
</filesMatch>

##### Don't cache dynamic pages #############################
<filesMatch "\.(php|cgi)$">
    Header set Cache-Control "max-age=0, private, no-store, no-cache, must-revalidate"
</filesMatch>