WEB DEVELOPMENT

Build Your Own Web Server

Get an overview of the basic mechanics of serving a Web page by building your own Java-based server.

 DOWNLOAD (13,255 bytes) -- Accompanying source listings file:
+ WebServer.java -- main class for the Web Server application
+ ClientRequest.java -- class that signifies a specific client request
+ WSUtil.java -- class includes methods used by other classes in the Web server application
+ WSApplet.java
+ MyHomePage.html
+ webserver.properties
By Partha Sarathi Kuchana

This is a tutorial on how to create a basic Web server to serve HTML Web pages.

Overview

Highly versatile, feature-rich commercial Web servers are available -- such as Microsoft's Internet Information Server -- but they are not open for someone who is interested in learning the inner details of Web page serving. And open source projects, such as Apache, require huge amounts of programming and make it difficult to grasp the fundamentals. My objective here is to demonstrate the basic mechanics of serving a Web page. If you're interested, you can extend this idea further to enhance the abilities of the server.

Because you'll be building the Web server from scratch, I'll present an overview of the workings of a Web server. This will give you a more controlled, streamlined approach to enhancing the capabilities of the server.

Requirements

To develop and test the Web server, you need the following:

  • Sun Java Development Kit (JDK) 1.1.2 or newer
  • A Web browser, such as Netscape Navigator or Microsoft Internet Explorer
  • A text editor for editing Java classes and HTML files

Concepts

Java language concepts

I'll assume you have a good understanding of the Java thread model, I/O, and network programming. Here are some features of threads and sockets in Java that you'll use in the Web server implementation:
  • A thread is a line of execution.
  • The Thread class in the java.lang package lets you start multiple threads of execution in a Java program.
  • Threads can be assigned priority.
  • When Thread.start() is called, Runnable's run method executes on a new thread.
  • Java networking classes are available in the java.net package.
  • A socket is one endpoint of a two-way communication link between two programs running on the network.
  • A socket is bound to a port number. The Web server you'll build listens on the port specified in the webserver.properties file.
  • You can use the ServerSocket class to listen on a specific port for connection requests from clients. This class is available in the java.net package.
  • A ServerSocket accepts a client connection using the accept() method.
  • The accept method waits until a client requests a connection on the host and port of this server.
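
The notes above can be tried out in one small program. The following is a minimal sketch, not code from the download: the class name SocketThreadDemo and its exchange() method are hypothetical, and it uses the loopback address and an operating-system-assigned port rather than the port from webserver.properties. One thread plays the client while the calling thread listens with a ServerSocket and blocks in accept().

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo class; it is not part of the article's download.
class SocketThreadDemo {

    // Starts a listener, connects to it from a second thread, and returns
    // the single line of text the listener received.
    static String exchange(String message) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the operating system for any free port.
        try (ServerSocket listener = new ServerSocket(0, 1, loopback)) {
            int port = listener.getLocalPort();

            // Thread.start() runs the Runnable's run() method on a new thread.
            Thread client = new Thread(() -> {
                try (Socket socket = new Socket(loopback, port);
                     PrintWriter out = new PrintWriter(socket.getOutputStream(), true)) {
                    out.println(message);
                } catch (IOException e) {
                    e.printStackTrace();
                }
            });
            client.start();

            // accept() blocks until a client connects to this host and port.
            try (Socket connection = listener.accept();
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(connection.getInputStream()))) {
                String line = in.readLine();
                client.join();
                return line;
            }
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(exchange("hello"));
    }
}
```

The socket returned by accept() is the server's endpoint of the two-way link; the listening ServerSocket itself stays free to accept further clients.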

Web application communication model

The interaction between the client and the server side in a typical Web application is shown in figure 1.



Figure 1: Internet application communication model -- Internet application communications typically involve at least three or four components, each one sending and receiving requests and responses.

Consider the parts of this interaction:

Client -- Typically, the client is a Web browser. The client initiates a request to the Web server for a server-side resource (such as an HTML page or an image).

Web server -- Typically, a Web server first receives the request from the client, and serves the requested resources to the client. A Web server processes requests from clients for static content. A Web server can't directly process a request for dynamic content.

An example is displaying data from a database. In such cases, the Web server relies on the application server to process the request. If the application server is to process the request, the request will be forwarded to the application server. In a sense, you can think of a Web server as an application that constantly looks for client requests and processes them as they arrive, either directly or in conjunction with the application server.

Application server -- The application server does the processing and sends results to the Web server. The Web server forwards these results to the client in the form of HTML as part of the response. The application server interacts with other necessary resources, such as a database, in order to process the request.

In some implementations, the Web server and application server aren't distinct entities; rather, functionality for both is combined into one server application. In this tutorial, I'll concentrate on the Web server part of the communication flow.

Keep in mind that the Web server you'll create is of limited capability. During the testing phase, I'll demonstrate how the Web server can serve HTML pages with images and applets.

Hyper Text Transfer Protocol (HTTP)

In this tutorial, I'll discuss the features of HTTP from the point of view of implementing a simple Web server.

Overview

HTTP is an application-level protocol that is generic, stateless, and object-oriented. It has the lightness and speed necessary for distributed, collaborative, hypermedia information systems.

The World Wide Web global information initiative has used HTTP since 1990 for communicating with a Web server. HTTP is based on a request/response paradigm: A client establishes a connection with a server and sends a request to it, and the server responds to that request.

HTTP is "connectionless," meaning the connection established by the client prior to each request is closed by the server after it sends the response.

Most HTTP communication is initiated by a user agent and consists of a request to be applied to a resource on some server. In the simplest case, a single connection between the user agent and the server accomplishes this.

On the Internet, HTTP communication generally takes place over TCP/IP connections. The default port is TCP 80, but you can use other ports.

Method definitions

The two most widely used methods for HTTP 1.0 are GET and POST. A third method, HEAD, is similar to GET. Here's a synopsis of the two methods:

GET -- The GET method retrieves whatever information is identified by the Request-URI (Uniform Resource Identifier) -- e.g., a simple HTML file, text file, or image. If the Request-URI refers to a data-producing process, such as a JavaServer Pages (JSP) or Active Server Pages (ASP) script, the data produced by that process is returned as the entity in the response, not the source text of the process (unless that text happens to be the output of the process).

POST -- The POST method requests that the destination server accept the entity enclosed in the request as a new subordinate of the resource identified by the request URI in the Request-Line -- e.g., submitting an HTML form. The posted data is considered subordinate to that URI. The action performed by the POST method might not result in a resource that a URI can identify.

Status codes

As part of the response, the server sends a status code back to the client. Here's the list of possible status codes:
Code Status
  • 200 OK
  • 201 Created
  • 202 Accepted
  • 204 No Content
  • 301 Moved Permanently
  • 302 Moved Temporarily
  • 304 Not Modified
  • 400 Bad Request
  • 401 Unauthorized
  • 403 Forbidden
  • 404 Not Found
  • 500 Internal Server Error
  • 501 Not Implemented
  • 502 Bad Gateway
  • 503 Service Unavailable
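
A server usually keeps this list as a lookup table so that it can build the Status-Line from a numeric code. Here is one possible sketch; the class name HttpStatus and the "Unknown" fallback for unlisted codes are my own assumptions, not part of the download.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that maps the status codes listed above to their reason phrases.
class HttpStatus {
    private static final Map<Integer, String> REASONS = new LinkedHashMap<>();

    static {
        REASONS.put(200, "OK");
        REASONS.put(201, "Created");
        REASONS.put(202, "Accepted");
        REASONS.put(204, "No Content");
        REASONS.put(301, "Moved Permanently");
        REASONS.put(302, "Moved Temporarily");
        REASONS.put(304, "Not Modified");
        REASONS.put(400, "Bad Request");
        REASONS.put(401, "Unauthorized");
        REASONS.put(403, "Forbidden");
        REASONS.put(404, "Not Found");
        REASONS.put(500, "Internal Server Error");
        REASONS.put(501, "Not Implemented");
        REASONS.put(502, "Bad Gateway");
        REASONS.put(503, "Service Unavailable");
    }

    // Returns the reason phrase for a code, or "Unknown" (an assumption) if it is not listed.
    static String reasonPhrase(int code) {
        return REASONS.getOrDefault(code, "Unknown");
    }

    public static void main(String[] args) {
        System.out.println(404 + " " + reasonPhrase(404));
    }
}
```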

Request/response headers

As I stated, the HTTP protocol is based on a request/response paradigm. A client establishes a connection with a server and sends a request to that server in the form of a request method, URI, and protocol version, followed by a message containing request modifiers, client information, and possible body content. The server responds with a Status-Line, including the message's protocol version and a success or error code, followed by a message containing server information and possible body content.

Request

A Full-Request generally consists of the Request-Line, followed by Request-Header and Entity-Header fields. The end of the header fields is marked by an empty line, that is, a CRLF sequence on a line by itself. The Entity-Body, if any, follows this empty line.

A Request-Line message from a client to a server includes the method to be applied to the resource, the identifier of the resource, and the protocol version in use. Here's an example of a Request-Line:

GET /MyHomePage.html HTTP/1.1

Here, GET is the method, /MyHomePage.html is the resource (Request-URI), and HTTP/1.1 is the protocol version. The Request-URI is transmitted as an encoded string.
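
Splitting a Request-Line into its three parts takes only a few lines of Java. The sketch below is illustrative only: the RequestLine class is hypothetical, and it accepts only lines that have exactly a method, a Request-URI, and a protocol version.

```java
// Hypothetical value class for the three parts of a Request-Line.
class RequestLine {
    final String method;
    final String uri;
    final String version;

    private RequestLine(String method, String uri, String version) {
        this.method = method;
        this.uri = uri;
        this.version = version;
    }

    // Parses a line such as "GET /MyHomePage.html HTTP/1.1".
    static RequestLine parse(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Malformed Request-Line: " + line);
        }
        return new RequestLine(parts[0], parts[1], parts[2]);
    }

    public static void main(String[] args) {
        RequestLine request = parse("GET /MyHomePage.html HTTP/1.1");
        System.out.println(request.method + " | " + request.uri + " | " + request.version);
    }
}
```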

The Request-Header also lets the client pass additional information about the request or about the client itself using fields such as Authorization, From, Referer, and User-Agent.

Response

An HTTP response message is sent back to the client after the server receives and interprets the request message. A Full-Response generally consists of a Status-Line, followed by Response-Header and Entity-Header fields. The end of the header fields is marked by an empty line, that is, a CRLF sequence on a line by itself. The Entity-Body follows this empty line and contains the requested resource.

The first line of a Full-Response message is the Status-Line. The syntax of a Status-Line is as follows:

HTTP-Version SP Status-Code SP Reason-Phrase CRLF
e.g., HTTP/1.0 200 OK
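
In Java, SP is a single space and CRLF is the two-character sequence "\r\n". A small sketch of the syntax above (the StatusLine class and its format() method are hypothetical names):

```java
// Hypothetical helper that assembles a Status-Line from its three parts.
class StatusLine {
    static final String SP = " ";
    static final String CRLF = "\r\n";

    static String format(String httpVersion, int statusCode, String reasonPhrase) {
        return httpVersion + SP + statusCode + SP + reasonPhrase + CRLF;
    }

    public static void main(String[] args) {
        // print(), not println(), because the line already ends with CRLF.
        System.out.print(format("HTTP/1.0", 200, "OK"));
    }
}
```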

The server can use Response-Header fields, such as Location, Server, and WWW-Authenticate, to pass the client additional information about the response that can't be placed in the Status-Line.

The Entity-Header contains fields that let the server define meta-information about the Entity-Body content. The Entity-Header fields are as follows:

Allow -- Lists the set of methods supported by the resource identified by the Request-URI -- e.g., Allow: GET.

Content-Encoding -- Used as a modifier to the media type -- e.g., Content-Encoding: x-gzip.

Content-Length -- Indicates the size of the Entity-Body -- e.g., Content-Length: 1340.

Content-Type -- Indicates the media type of the Entity-Body -- e.g., Content-Type: text/html.

Expires -- Specifies the date/time after which the entity should be considered stale -- e.g., Expires: Thu, 20 Sep 2001 16:00:00 GMT.

Last-Modified -- Indicates the date and time at which the resource was last modified -- e.g., Last-Modified: Wed, 19 Sep 2001 16:00:00 GMT.
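
Date values for Expires and Last-Modified are written in GMT with a three-letter English month name. The following sketch shows one way to produce that format. It uses the java.time API, which requires a much newer JDK than the 1.1.2 listed in the requirements, and the HttpDates class name is hypothetical.

```java
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper for formatting HTTP header dates.
class HttpDates {
    // Fixed English names and a literal "GMT" suffix, independent of the default locale.
    private static final DateTimeFormatter HTTP_DATE =
            DateTimeFormatter.ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US);

    static String format(ZonedDateTime when) {
        // Convert to UTC first so the literal "GMT" suffix is accurate.
        return HTTP_DATE.format(when.withZoneSameInstant(ZoneOffset.UTC));
    }

    public static void main(String[] args) {
        ZonedDateTime modified = ZonedDateTime.of(2001, 9, 19, 16, 0, 0, 0, ZoneOffset.UTC);
        System.out.println("Last-Modified: " + format(modified));
    }
}
```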

Entity body

You've seen that the Entity-Body is part of both a Full-Request and a Full-Response. In a Full-Request, the presence of a Content-Length field in the Entity-Header indicates that an Entity-Body follows. In a Full-Response to a GET or POST request, every response must have an Entity-Body unless its status code is informational (1xx), No Content (204), or Not Modified (304).

For example, consider a user agent requesting a Web page (MyHomePage.html) from a Web server. Assume the requested Web page contains references to other resources such as an image (through an <IMG> tag) or an applet (through an <Applet> tag). For the Web browser to display the Web page in the correct format, it has to have the entire Web page, including the image, applet resources, and the other text/HTML content.

The client and the server communicate using request and response messages to make this possible. The Web browser sends a separate request message for each resource it requests from the Web server. This is true even if these resources are contained in the same Web page -- e.g., for MyHomePage.html:

GET /MyHomePage.html HTTP/1.1

Because the Web page is assumed to contain an image and an applet, the Web browser sends three request messages for the HTML file, image, and applet, respectively. The browser makes these requests in parallel. This is the reason why, when a Web page with multiple images is loaded, all images seem to be loading in parallel. For this to happen, the Web server has to be able to process multiple requests at the same time. I'll make use of the Java thread model to implement this functionality.

On the server side, the Web server sends each response message as a Status-Line and Entity-Header, followed by the actual Entity-Body, which contains the resource data.

For example, the Status-Line and Entity-Header from the server side for the MyHomePage.html Web page look like this:

HTTP/1.0 200 OK
Allow: GET
MIME-Version: 1.0
Content-type: text/html
Content-length: 154
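
To see how these lines reach the browser, here is a sketch that writes the same header fields, the empty line that ends them, and then the body to an output stream. It is not the code from the download: the ResponseWriter class and its methods are hypothetical, and the Content-length value is computed from the body rather than fixed at 154.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that writes a complete "200 OK" response.
class ResponseWriter {
    static void writeOk(OutputStream out, String contentType, byte[] body) throws IOException {
        String head = "HTTP/1.0 200 OK\r\n"
                + "Allow: GET\r\n"
                + "MIME-Version: 1.0\r\n"
                + "Content-type: " + contentType + "\r\n"
                + "Content-length: " + body.length + "\r\n"
                + "\r\n";                       // the empty line ends the header fields
        out.write(head.getBytes(StandardCharsets.US_ASCII));
        out.write(body);                        // the Entity-Body: the resource itself
        out.flush();
    }

    // Convenience wrapper that renders the response into a String for inspection.
    static String render(String contentType, String body) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            writeOk(buffer, contentType, body.getBytes(StandardCharsets.ISO_8859_1));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(buffer.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.print(render("text/html", "<html></html>"));
    }
}
```

In a server, the OutputStream would come from the client socket's getOutputStream() method.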

Design

The three classes shown in figure 2 are required for the implementation of a simple Web server.



Figure 2: Class diagram -- Implementing a simple Web server involves interaction among three main classes.

I'll discuss these classes in detail:

WebServer class -- This class receives client requests and starts a thread (an instance of the ClientRequest class) for each of them. This enables requests to be processed in parallel, not in sequence.
ClientRequest class -- This class represents a single client request received by the WebServer class. It is derived from the java.lang.Thread class, and has methods to transmit resources from the server to the client.
WSUtil class -- This class has miscellaneous utility methods, which are used by both the WebServer and ClientRequest classes. The methods in this class provide functionality for logging and reading property files.
Associations -- As you can see in the class diagram in figure 2, an instance of WebServer can be associated with zero or more instances of the ClientRequest class. Both the WebServer and ClientRequest classes depend on the WSUtil class and can instantiate it.

Web server properties

There are three parameters you can specify through the webserver.properties file:

HOME_DIRECTORY -- This parameter should have the directory path, which serves as the Web server root. The Web server looks for the requested resource relative to the value specified using this parameter.
LOG_FILE -- This parameter should have the full path to the log file.
PORT_NUMBER -- This parameter should specify the port on which the Web server should listen.

(A sample webserver.properties file is included in the download for this article.)
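
As an illustration of how such a file can be read, the sketch below loads the three parameters with java.util.Properties. The key names follow the parameter list above and should be checked against the sample file in the download. The values, the ServerConfig class, and its parse() method are hypothetical. Note that a backslash is an escape character in a properties file, so a Windows path must be written with forward slashes or doubled backslashes.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical holder for the three Web server parameters.
class ServerConfig {
    final String homeDirectory;
    final String logFile;
    final int portNumber;

    private ServerConfig(Properties props) {
        homeDirectory = require(props, "HOME_DIRECTORY");
        logFile = require(props, "LOG_FILE");
        portNumber = Integer.parseInt(require(props, "PORT_NUMBER"));
    }

    private static String require(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException("Missing property: " + key);
        }
        return value.trim();
    }

    // Parses properties-file text; a server would read the file with a FileReader instead.
    static ServerConfig parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new ServerConfig(props);
    }

    public static void main(String[] args) {
        // Example values only; forward slashes avoid the backslash-escape problem.
        ServerConfig config = parse(
                "HOME_DIRECTORY=/WebServer/Home\n"
              + "LOG_FILE=/WebServer/server.log\n"
              + "PORT_NUMBER=8080\n");
        System.out.println(config.homeDirectory + " " + config.logFile + " " + config.portNumber);
    }
}
```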

The functionality and versatility of the Web server are kept to a minimum so that the design stays simple and easy to comprehend. The Web server I'll build processes requests with the GET method only.

Implementation

You implement the Web server design using the Java programming language for the benefit of portability. Java features a powerful thread model and offers capabilities for network programming.

(Fully functional Java code is provided in the accompanying download file.)

Overview of the implementation algorithm

When you first start the Web server, it creates a server socket on the specified port. The server waits for the client requests to arrive. For every request, the server creates an instance of ClientRequest and starts it.
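
The accept-and-dispatch cycle described in this paragraph can be sketched as follows. This is a simplified stand-in, not the WebServer.java from the download: the MiniServer class, its nested ClientRequest class, the fixed "ok" response, and the demo() method are all assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class MiniServer {

    // Stand-in for the article's ClientRequest class: one thread per client connection.
    static class ClientRequest extends Thread {
        private final Socket connection;

        ClientRequest(Socket connection) {
            this.connection = connection;
        }

        @Override
        public void run() {
            try (Socket socket = connection) {
                BufferedReader in = new BufferedReader(new InputStreamReader(
                        socket.getInputStream(), StandardCharsets.US_ASCII));
                String line = in.readLine();              // the Request-Line
                while (line != null && !line.isEmpty()) {
                    line = in.readLine();                 // skip header fields up to the empty line
                }
                OutputStream out = socket.getOutputStream();
                out.write(("HTTP/1.0 200 OK\r\nContent-type: text/plain\r\n"
                        + "Content-length: 2\r\n\r\nok").getBytes(StandardCharsets.US_ASCII));
                out.flush();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

    // Waits for clients and starts a ClientRequest for each one until the listener is closed.
    static void serve(ServerSocket listener) {
        while (!listener.isClosed()) {
            try {
                new ClientRequest(listener.accept()).start();
            } catch (IOException e) {
                if (!listener.isClosed()) {
                    e.printStackTrace();                  // unexpected failure; keep serving
                }
            }
        }
    }

    // Runs the server on a free loopback port, sends one request, and returns the Status-Line.
    static String demo() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket listener = new ServerSocket(0, 50, loopback)) {
            new Thread(() -> serve(listener)).start();
            try (Socket client = new Socket(loopback, listener.getLocalPort())) {
                OutputStream out = client.getOutputStream();
                out.write("GET /MyHomePage.html HTTP/1.0\r\n\r\n".getBytes(StandardCharsets.US_ASCII));
                out.flush();
                BufferedReader in = new BufferedReader(new InputStreamReader(
                        client.getInputStream(), StandardCharsets.US_ASCII));
                return in.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because each connection gets its own thread, a slow client does not hold up the accept loop, which is what lets a browser fetch a page's images in parallel.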

Inside the run() method, the ClientRequest object checks the name of the Web resource requested by the Web browser. A typical request from the Web browser will be in the form GET /MyHomePage.html HTTP/1.1.

The run() method invokes the sendHeader(resourceName) method, which sends appropriate headers to the Web browser. The actual resource transmission begins after the headers are sent. The headers send information such as the type and size of the data being transmitted from the server to the Web browser.
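
The header values depend on the requested resource: its type can be derived from the file extension and its size from the file itself. The sketch below shows that idea only. The HeaderSketch class, its method names, the extension table, and the 404 handling are assumptions, and the real sendHeader(resourceName) in the download may work differently.

```java
import java.io.File;
import java.util.Locale;

// Hypothetical helper that derives header values from a resource under the home directory.
class HeaderSketch {

    // Maps a few common extensions to media types; anything else is sent as raw bytes.
    static String contentTypeFor(String resourceName) {
        String lower = resourceName.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".html") || lower.endsWith(".htm")) return "text/html";
        if (lower.endsWith(".txt")) return "text/plain";
        if (lower.endsWith(".gif")) return "image/gif";
        if (lower.endsWith(".jpg") || lower.endsWith(".jpeg")) return "image/jpeg";
        return "application/octet-stream";
    }

    // Builds the header block for a resource, or a 404 Status-Line if the file does not exist.
    static String headersFor(File homeDirectory, String resourceName) {
        File resource = new File(homeDirectory, resourceName);
        if (!resource.isFile()) {
            return "HTTP/1.0 404 Not Found\r\n\r\n";
        }
        return "HTTP/1.0 200 OK\r\n"
                + "Content-type: " + contentTypeFor(resourceName) + "\r\n"
                + "Content-length: " + resource.length() + "\r\n"
                + "\r\n";
    }

    public static void main(String[] args) {
        System.out.print(headersFor(new File("."), "/MyHomePage.html"));
    }
}
```

This sketch does not reject resource names containing "..", so a hardened server would need to verify that the resolved file stays inside the home directory.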

All events/errors are logged to the log file. The server uses simple file logging to keep the design simple.
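
Simple file logging can be as small as a method that appends one timestamped line per event. The following sketch is an assumption about what such a method might look like; the SimpleLog class is hypothetical and is not the WSUtil code from the download.

```java
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.Date;

// Hypothetical append-only logger.
class SimpleLog {
    private final String logFile;

    SimpleLog(String logFile) {
        this.logFile = logFile;
    }

    // synchronized so that lines written by concurrent request threads do not interleave.
    synchronized void log(String message) {
        // The second FileWriter argument (true) appends instead of overwriting.
        try (PrintWriter out = new PrintWriter(new FileWriter(logFile, true))) {
            out.println(new Date() + " " + message);
        } catch (IOException e) {
            System.err.println("Could not write to log file: " + e.getMessage());
        }
    }

    public static void main(String[] args) {
        new SimpleLog("server.log").log("Server Started Successfully");
    }
}
```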

Testing

Installation

I'll assume the following locations:

Web Server = \WebServer
Web Server Home Directory = \WebServer\Home
  1. To compile the Web server classes, copy WebServer.java, ClientRequest.java, and WSUtil.java from the Source Listings link to the location \WebServer.
  2. Execute the command javac *.java in that location.
  3. To create the properties file, create the webserver.properties file with values for the parameters described earlier (the log file path, HOME_DIRECTORY, and PORT_NUMBER). Then place the webserver.properties file in the \WebServer location. (I've provided a sample webserver.properties file under the Source Listings link.)
  4. Compile the applet class for testing. First, copy the applet class WSApplet.java from the Source Listings link to the location \WebServer\Home. Then, execute the command javac WSApplet.java.
  5. Create a Web page for testing. Copy the Web page MyHomePage.html from the Source Listings link to the location \WebServer\Home. Make sure the directories \WebServer and \WebServer\Home are included in the classpath to avoid "incorrect classpath" errors.

Start Web server

Execute the following command at the command prompt:

java WebServer

You will get a message stating "Server Started Successfully."

Start Web client

Start any Web browser, such as Netscape Navigator or Microsoft Internet Explorer.

Request Web page

Request the MyHomePage.html Web page with either of the following URLs:

Use the second format to access the page from a different computer. Check the server log file to see the details.

Yours to enhance

This implementation of the Web server is by no means complete. As I mentioned in the overview, this Web server is quite basic: it serves only static resources, such as HTML pages, in response to GET requests. However, you can enhance the server, and even implement features found in most commercial servers.


Web server glossary

Client -- An application program that establishes connections for the purpose of sending requests.
Connection -- A transport-layer virtual circuit established between two application programs for the purpose of communication.
Request -- An HTTP request message.
Response -- An HTTP response message.
Resource -- A network data object or service identifiable by a URI.
Server -- An application program that accepts connections in order to service requests from client applications by sending back responses.
URI (Uniform Resource Identifier) -- Formatted string that identifies a network resource by means of a name, location, or any other characteristic.
User agent -- A client that initiates a request. These are often browsers, spiders, or other user tools.

Partha Sarathi Kuchana is a software consultant with several years of experience in designing and developing Web-based applications. Partha Kuchana is a Sun Certified Enterprise Architect. He has published several technical articles on the Internet and has been offering freeware applications on the Web. http://members.ITJobsList.com/partha, ParthaSarathi_K@yahoo.com.

ARTICLE INFO

Web Edition: 2002.01.21, Doc #09115

Portions copyright ©1983-2010 Advisor Media, LLC. All Rights Reserved.