The HTTP request

When it arrives at your server, an HTTP request consists of two or possibly three parts, depending on the request.

The header is a series of lines, each terminated by a carriage return/line feed (CRLF) pair (hex 0D 0A). The first of these lines defines the actual request, and the remaining lines carry information from the client about the request and the session. Any cookie data, for example, is in there.

Then there is a blank line (a lone CRLF, so two appear together: one ending the last header line and one for the blank line itself, as in CRLFCRLF) that indicates the end of the header.

Following that there may be more data, for example form data if the form method is POST or XML from a SOAP request.

So basically the text string you receive in an HTTP request looks like this:

request line CRLF

options and other stuff on multiple lines CRLF

CRLF                       <– blank line/end of header indicator

payload                 <– for POST type requests from a client.
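As a rough sketch of that layout in code, the Python below splits a raw request at the first CRLFCRLF and then breaks the header into its lines. The function name split_request and the sample request are made up for illustration, and it assumes the whole request has already been read into a single bytes buffer, which a real server cannot always count on.

```python
CRLF = b"\r\n"  # the bytes 0x0D 0x0A


def split_request(raw: bytes):
    """Split a raw request into (request_line, header_lines, payload).

    Hypothetical helper: assumes the complete request is already in `raw`.
    """
    # The first blank line (CRLF CRLF) marks the end of the header.
    head, _, payload = raw.partition(CRLF + CRLF)
    lines = head.split(CRLF)
    request_line = lines[0].decode("latin-1")
    header_lines = [line.decode("latin-1") for line in lines[1:]]
    return request_line, header_lines, payload


# A made-up POST request, used only to exercise the function.
raw = (
    b"POST /cgi/form HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Content-Length: 7\r\n"
    b"\r\n"
    b"a=1&b=2"
)
request_line, header_lines, payload = split_request(raw)
```

For a request with no payload (a typical GET), payload simply comes back as an empty bytes object.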

The request line

The first line of the request tells you what resource the client wants. It is split into three basic components as follows:

The call type

Whilst there are more call types (HTTP calls them methods), I am just going to deal with GET and POST. For a GET request, the resource name (next section) tells the server what resource to return and where to find it. Usually the path in the resource name matches the directory structure on the server, but it does not have to; you can map things differently. With a POST request, the resource name is usually the name of something on the server to be executed (a script, say), while the data for the request is contained in the third part of the request (the payload).

The Resource name

When you type an address into a browser address bar, the path part of it, for example /home/index/page1.htm, becomes the resource name that is passed to the server. If you don’t enter any resource name (for example, you enter just the site address), the resource name defaults to ‘/’, which means the root document of your web server. What the root document is, is up to you. It could be a static page or it could invoke a script to build the page dynamically.

The resource name may also include parameters encoded as part of it but I shall go into those later.
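To make the mapping idea concrete, here is one possible way to turn a resource name into a file path for a GET request. Everything named here is an assumption for the sake of the example: DOC_ROOT, ROOT_DOCUMENT and resource_to_path are invented, and a real server is free to map resource names however it likes. The sketch simply ignores any parameters after a '?'.

```python
import posixpath

DOC_ROOT = "/var/www"        # hypothetical document root
ROOT_DOCUMENT = "index.htm"  # hypothetical choice of root document


def resource_to_path(resource: str) -> str:
    """Map a resource name onto a path under DOC_ROOT (illustrative only)."""
    # Ignore any parameters encoded after '?'.
    path = resource.split("?", 1)[0]
    # Collapse "." and ".." so the result cannot climb above DOC_ROOT.
    path = posixpath.normpath("/" + path.lstrip("/"))
    # A bare '/' means the root document, whatever you decide that is.
    if path == "/":
        path = "/" + ROOT_DOCUMENT
    return DOC_ROOT + path
```

With these assumed settings, '/home/index/page1.htm' maps to '/var/www/home/index/page1.htm' and '/' maps to '/var/www/index.htm'.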

Protocol being used by the client

Finally on the first line is the protocol and version being used by the client. Typically this says HTTP/1.1, which means the client is using the 1.1 version of the HTTP protocol. This is far more complex than the 1.0 version, but the cool thing is that you don’t have to use what the client is using in your reply. The HTTP 1.0 protocol is much simpler and easier to implement, at least initially. All you need to do in your replies is say that you are using the 1.0 version of the protocol, and the client (well, most modern browsers anyway) should be able to accept the lower level of service. After all, there may well still be a few old 1.0 web servers out there in the real world, so browsers have to be able to handle the responses from them.
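Putting the three components together, a minimal sketch might pull the request line apart and then answer with an HTTP/1.0 status line whatever version the client announced. The names parse_request_line and build_response are invented for this example, and the reply contains only the bare minimum of header lines; treat it as an illustration rather than a complete implementation.

```python
def parse_request_line(line: str):
    """Return (call_type, resource, protocol) from e.g. 'GET / HTTP/1.1'.

    Raises ValueError if the line does not have exactly three parts.
    """
    call_type, resource, protocol = line.split(" ")
    return call_type, resource, protocol


def build_response(body: bytes, content_type: str = "text/html") -> bytes:
    """Build a bare-bones reply that declares itself as HTTP/1.0."""
    head = (
        "HTTP/1.0 200 OK\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"  # blank line: end of the reply header
    )
    return head.encode("latin-1") + body


call_type, resource, protocol = parse_request_line(
    "GET /home/index/page1.htm HTTP/1.1"
)
reply = build_response(b"<p>hi</p>")
```

Note that the reply follows the same layout as the request: a first line, further header lines, a blank line, then the payload.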
