Understanding an HTTP Transaction
When you go to a browser and request a Web page, the sequence of events that follows can be considered an HTTP transaction. Here is what is actually going on under the hood:
The user types the URL http://feedster.com/status.php into a browser.
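The browser parses this URL and decides the following: the protocol to use is HTTP, the server to contact is feedster.com, and the resource to request is /status.php.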
This bit of information is translated to an HTTP transaction that looks like the following lines of text:
GET /status.php HTTP/1.1
Accept: image/gif, image/png, image/jpeg, */*
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; .NET CLR 1.1.4322)
Although some of these transaction items are optional, these few lines of ASCII text accompany virtually every HTTP transaction on the Web. Here's what they mean:
GET /status.php HTTP/1.1: What to do. This is called an HTTP method and it says, "Give me the information located in /status.php and send it back to me using the 1.1 version of the HTTP protocol."
Accept: "I can understand information in these formats."
Accept-Language: "The language I understand is English, the U.S. dialect." This allows the server to respond with different content tailored to the language specified.
Accept-Encoding: "It's okay to send me data in compressed form because I understand both gzip and deflate types of compression." Just because the browser understands compression doesn't mean the server will use it. Most servers on the Internet don't compress content unless the administrator specifically turned compression on.
User-Agent: "The type of browser I am is Microsoft Internet Explorer 6 running on Windows 98."
Host: "Pull the /status.php information from the computer located at Feedster.com."
Connection: "Keep the HTTP connection open until the browser specifically closes it." This improves performance because the connection doesn't have to be closed (and then opened again) for each request. Without Keep-Alive, loading a Web page with three images on it technically would take four connections (one for each of the images and one for the page itself).
Of these different lines of text that make up an HTTP request, only the first contains an actual HTTP method, a command to do something. The other lines are called headers and carry metadata about the overall transaction.
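As a rough illustration of the steps so far, the following Python sketch splits the URL into its parts, assembles the request text, and then separates the method line from the headers. The function names build_request and split_request are hypothetical, chosen only for this example, and the header values are copied from the sample request. A real browser sends more headers than this and handles many more cases.

```python
from urllib.parse import urlsplit


def build_request(url, headers):
    """Turn a URL plus a dict of headers into raw HTTP/1.1 request text."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # The request line carries the method; everything after it is a header.
    lines = [f"GET {target} HTTP/1.1", f"Host: {parts.hostname}"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # HTTP lines end in CRLF, and a blank line marks the end of the headers.
    return "\r\n".join(lines) + "\r\n\r\n"


def split_request(raw):
    """Separate the request line (the method) from the header lines."""
    head = raw.split("\r\n\r\n", 1)[0]
    request_line, *header_lines = head.split("\r\n")
    method, target, version = request_line.split(" ")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return method, target, version, headers


raw = build_request(
    "http://feedster.com/status.php",
    {
        "Accept": "image/gif, image/png, image/jpeg, */*",
        "Accept-Encoding": "gzip, deflate",
    },
)
method, target, version, headers = split_request(raw)
print(method, target, version)  # GET /status.php HTTP/1.1
print(headers["Host"])          # feedster.com
```

Only the first line names a method. Everything the sketch stores in the headers dictionary is metadata about the request.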
When a Web server receives a request like this, it has to respond. It does so as follows:
Look for the information on the server that is represented by /status.php.
If the information actually exists on the server, send it back to the client (browser) as follows:
HTTP/1.1 200 OK
Date: Mon, 08 Dec 2003 16:46:40 GMT
Server: Apache/1.3.27 (Unix) mod_throttle/3.1.2 PHP/4.3.2
Content-Type: text/html; charset=utf-8

<html lang="en-US" xml:lang="en-US" xmlns="http://www.w3.org/1999/xhtml">
[REST OF WEB PAGE OMITTED FOR SPACE REASONS]
When you look at this HTTP response, there are two parts. The beginning is a bit of information about the information that was requested. This is called the response header. Then there is a blank line and the information that was actually requested follows. This second part is called the body, the entity, or the entity-body. Here's what the different headers mean:
The first line tells the client (browser) how the information will be sent (HTTP protocol, version 1.1) and that the requested information was found correctly. An HTTP status code of 200 means "Everything is fine; I found the document and it's about to come to you."
Date: Tells the client the date and time on the server where the information comes from. The standard for this is GMT, Greenwich Mean Time.
Server: What type of server is providing the information.
What tool is powering the server (PHP, of course).
What tool is enhancing the server's performance. (These two X-headers are optional and specific to a particular server configuration.)
Connection: Tells the client that the connection will be closed after the server finishes sending information.
Content-Type: Tells the client what type of content is being sent down.
The character set can also be specified in this header, as charset=utf-8 is here.
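To make the two-part structure concrete, here is a minimal Python sketch that splits a raw response into its status line, headers, and body, and then pulls the character set out of the Content-Type value. The names parse_response and split_content_type are hypothetical, and the body is a short stand-in for the omitted page. The sketch assumes a small, well-formed response and is not a complete HTTP parser.

```python
RAW_RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "Date: Mon, 08 Dec 2003 16:46:40 GMT\r\n"
    "Server: Apache/1.3.27 (Unix) mod_throttle/3.1.2 PHP/4.3.2\r\n"
    "Content-Type: text/html; charset=utf-8\r\n"
    "\r\n"
    "<html>...</html>"  # stand-in for the real page body
)


def parse_response(raw):
    """Split raw response text into status code, reason, headers, and body."""
    # The first blank line separates the response header from the body.
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them lowercased.
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers, body


def split_content_type(value):
    """Return (media_type, charset) from a Content-Type header value."""
    media_type, *params = [part.strip() for part in value.split(";")]
    charset = None
    for param in params:
        key, _, val = param.partition("=")
        if key.strip().lower() == "charset":
            charset = val.strip().strip('"')
    return media_type, charset


code, reason, headers, body = parse_response(RAW_RESPONSE)
media_type, charset = split_content_type(headers["content-type"])
print(code, reason)         # 200 OK
print(media_type, charset)  # text/html utf-8
```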
A very useful tool for understanding HTTP requests is the HTTP Headers bookmarklet, which shows you the HTTP headers for any Web page. You can download this from http://tantek.com/favelets/.
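If you would rather inspect headers from a script, something like the following Python sketch can play both sides of a transaction on your own machine. It is only an illustration: the PAGES table, the Handler class, and the fetch_headers function are hypothetical names, the page content is made up, and the 404 reply for unknown paths is an assumption about what a server does when the information does not exist. It uses Python's standard http.server and http.client modules over the loopback address, so no outside network is involved.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical content table standing in for files on the server.
PAGES = {"/status.php": b"<html><body>All systems go</body></html>"}


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Look for the information represented by the requested path.
        body = PAGES.get(self.path)
        if body is None:
            self.send_error(404)
            return
        # It exists, so send the status line, the headers, and then the body.
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example's output limited to the headers


def fetch_headers(host, port, path):
    """Request a path and return (status, reason, list of header pairs)."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        response.read()  # drain the body before closing
        return response.status, response.reason, response.getheaders()
    finally:
        conn.close()


# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    status, reason, headers = fetch_headers(
        "127.0.0.1", server.server_port, "/status.php"
    )
    print(status, reason)
    for name, value in headers:
        print(f"{name}: {value}")
finally:
    server.shutdown()
    server.server_close()
```

Pointing fetch_headers at a real host and port 80 would show that server's response headers in the same way.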