Section 7.2. Authentication Methods

7.2. Authentication Methods

This section discusses three widely deployed authentication methods:

Basic authentication
Digest authentication
Form-based authentication

The first two are built into the HTTP protocol and defined in RFC 2617, "HTTP Authentication: Basic and Digest Access Authentication" (http://www.ietf.org/rfc/rfc2617.txt). Form-based authentication is a way of moving the authentication problem from a web server to the application.

Other authentication methods exist (Windows NT challenge/response authentication and the Kerberos-based Negotiate protocol), but they are proprietary to Microsoft and of limited interest to Apache administrators.

7.2.1. Basic Authentication

Authentication methods built into HTTP use headers to send and receive authentication-related information. When a client attempts to access a protected resource the server responds with a challenge. The response is assigned a 401 HTTP status code, which means that authentication is required. (HTTP uses the word "authorization" in this context but ignore that for a moment.) In addition to the response code, the server sends a response header WWW-Authenticate, which includes information about the required authentication scheme and the authentication realm. The realm is a case-insensitive string that uniquely identifies (within the web site) the protected area. Here is an example of an attempt to access a protected resource and the response returned from the server:

$ telnet www.apachesecurity.net 80
Trying 217.160.182.153...
Connected to www.apachesecurity.net.
Escape character is '^]'.
GET /review/ HTTP/1.0
Host: www.apachesecurity.net
   
HTTP/1.1 401 Authorization Required
Date: Thu, 09 Sep 2004 09:55:07 GMT
WWW-Authenticate: Basic realm="Book Review"
Connection: close
Content-Type: text/html

The first HTTP 401 response returned when a client attempts to access a protected resource is normally not displayed to the user. The browser reacts to such a response by displaying a pop-up window, asking the user to type in the login credentials. After the user enters her username and password, the original request is attempted again, this time with more information.

$ telnet www.apachesecurity.net 80
Trying 217.160.182.153...
Connected to www.apachesecurity.net.
Escape character is '^]'.
GET /review/ HTTP/1.0
Host: www.apachesecurity.net
Authorization: Basic aXZhbnI6c2VjcmV0
   
HTTP/1.1 200 OK
Date: Thu, 09 Sep 2004 10:07:05 GMT
Connection: close
Content-Type: text/html

The browser has added an Authorization request header, which contains the credentials collected from the user. The first part of the header value contains the authentication scheme (Basic in this case), and the second part contains a base-64 encoded combination of the username and the password. The aXZhbnI6c2VjcmV0 string from the header decodes to ivanr:secret. (To experiment with base-64 encoding, use the online encoder/decoder at http://makcoder.sourceforge.net/demo/base64.php.) Provided valid credentials were supplied, the web server proceeds with the request normally, as if authentication was not necessary.

Nothing in the HTTP protocol suggests a web server should remember past authentication requests, regardless of if they were successful. As long as the credentials are missing or incorrect, the web server will keep responding with status 401. This is where some browsers behave differently than others. Mozilla will keep prompting for credentials indefinitely. Internet Explorer, on the other hand, gives up after three times and displays the 401 page it got from the server. Being "logged in" is only an illusion provided by browsers. After one request is successfully authenticated, browsers continue to send the login credentials until the session is over (i.e., the user closes the browser).

Basic authentication is not an ideal authentication protocol. It has a number of disadvantages:

Credentials are transmitted over the wire in plaintext.
There are no provisions for user logout (on user request, or after a timeout).
The login page cannot be customized.
HTTP proxies can extract credentials from the traffic. This may not be a problem in controlled environments when proxies are trusted, but it is a potential problem in general when proxies cannot be trusted.

An attempt to solve some of these problems was made with the addition of Digest authentication to the HTTP protocol.

7.2.2. Digest Authentication

The major purpose of Digest authentication is to allow authentication to take place without sending user credentials to the server in plaintext. Instead, the server sends the client a challenge. The client responds to the challenge by computing a hash of the challenge and the password, and sends the hash back to the server. The server uses the response to determine if the client possesses the correct password.

The increased security of Digest authentication makes it more complex, so I am not going to describe it here in detail. As with Basic authentication, it is documented in RFC 2617, which makes for interesting reading. The following is an example of a request successfully authenticated using Digest authentication:

$ telnet www.apachesecurity.net 80
Trying 217.160.182.153...
Connected to www.apachesecurity.net.
Escape character is '^]'.
GET /review/ HTTP/1.1
Host: www.apachesecurity.net
Authorization: Digest username="ivanr", realm="Book Review",
nonce="OgmPjb/jAwA=7c5a49c2ed9416dba1b04b5307d6d935f74a859d",
uri="/review/", algorithm=MD5, response="3c430d26043cc306e0282635929d57cb",
qop=auth, nc=00000004, cnonce="c3bcee9534c051a0"
   
HTTP/1.1 200 OK
Authentication-Info: rspauth="e18e79490b380eb645a3af0ff5abf0e4",
cnonce="c3bcee9534c051a0", nc=00000004, qop=auth
Connection: close
Content-Type: text/html

Though Digest authentication succeeds in its goal, its adoption on the server side and on the client side was (is) very slow, most likely because it was never deemed significantly better than Basic authentication. It took years for browsers to start supporting it fully. In Apache, the mod_auth_digest module used for Digest authentication (described later) is still marked "experimental." Consequently, it is rarely used today.

Digest authentication suffers from several weaknesses:

Though user passwords are stored in a form that prevents an attacker from extracting the actual passwords, even if he has access to the password file, the form in which the passwords are stored can be used to authenticate against a Digest authentication-protected area.
Because the realm name is used to convert the password into a form suitable for storing, Digest authentication requires one password file to exist for each protection realm. This makes user database maintenance much more difficult.
Though user passwords cannot be extracted from the traffic, the attacker can deploy what is called a "replay attack" and reuse the captured information to access the authenticated areas for a short period of time. How long it can do so depends on server configuration. With a default Apache configuration, the maximum duration is five minutes.
The most serious problem is that Digest authentication simply does not solve the root issue. Though the password is somewhat protected (admittedly, that can be important in some situations), an attacker who can listen to the traffic can read the traffic directly and extract resources from there.

Engaging in secure, authenticated communication when using an unencrypted channel is impossible. Once you add SSL to the server (see Chapter 4), it corrects most of the problems people have had with Basic authentication. If using SSL is not an option, then deployment of Digest authentication is highly recommended. There are many freely available tools that allow almost anyone (since no technical knowledge is required) to automatically collect Basic authentication passwords from the traffic flowing on the network. But I haven't seen any tools that automate the process of performing a replay attack when Digest authentication is used. The use of Digest authentication at least raises the bar to require technical skills on the part of the attacker.

There is one Digest authentication feature that is very interesting: server authentication. As of RFC 2617 (which obsoletes RFC 2609), clients can use Digest authentication to verify that the server does know their password. Sounds like a widespread use of Digest authentication could help the fight against numerous phishing attacks that take place on the Internet today (see Chapter 10).

7.2.3. Form-Based Authentication

In addition to the previously mentioned problems with HTTP-based authentication, there are further issues:

HTTP is a stateless protocol. Therefore, applications must add support for sessions so that they can remember what the user did in previous requests.
HTTP has no provisions for authorization. Even if it had, it would only cover the simplest cases since authorization is usually closely integrated with the application logic.
Programmers, responsible for development and maintenance of applications, often do not have sufficient privileges to do anything related to the web servers, which are maintained by system administrators. This has prompted programmers to resort to using the authentication techniques they can control.
Having authentication performed on the web-server level and authorization on the application level complicates things. Furthermore, there are no APIs developers could use to manage the password database.

Since applications must invest significant resources for handling sessions and authorization anyway, it makes sense to shift the rest of the responsibility their way. This is what form-based authentication does. As a bonus, the boundary between programmers' and system administrators' responsibilities is better defined.

Form-based authentication is not a protocol since every application is free to implement access control any way it chooses (except in the Java camp, where form-based authentication is a part of the Servlets specification). In response to a request from a user who has not yet authenticated herself, the application sends a form (hence the name form-based) such as the one created by the following HTML:

<form action="/login.php" method="POST">
<input type="text" name="username"><br>
<input type="password" name="password"><br>
<input type="submit" value="Submit"><br>
</form>

The user is expected to fill in appropriate username and password values and select the Submit button. The script login.php then examines the username and password parameters and decides whether to let the user in or send her back to the login form.

HTTP-based authentication does not necessarily need to be implemented on the web server level. Applications can use it for their purposes. However, since that approach has limitations, most applications implement their own authentication schemes. This is unfortunate because most developers are not security experts, and they often design inadequate access control schemes, which lead to insecure applications.

Authentication features built into Apache (described below) are known to be secure because they have stood the test of time. Users (and potential intruders) are not allowed to interact with an application if they do not authenticate themselves first. This can be a great security advantage. When authentication takes place at the application level (instead of the web-server level), the intruder has already passed one security layer (that of the web server). Applications are often given far less testing than the web server and potentially contain more security issues. Some files in the application, for example, may not be protected at all. Images are almost never protected. Often applications contain large amounts of code that are executed prior to authentication. The chances of an intruder finding a hole are much higher when application-level authentication is used.

When deploying private applications on the public Internet, consider using web-server authentication in addition to the existing application-based authentication. In most cases, just a simple outer protection layer where everyone from the organization shares one set of credentials will do.