A basic understanding of HTTP and socket programming

Table of contents

Roles on the Internet
What is HTTP?
Layers of the Internet
Coding
Conclusion

Roles on the Internet

Developers and designers can arrange their computers in different architectures to build an Internet service. This article covers the most commonly used one, the client-server model. There are two main roles in this model:

client: A client computer initiates a request for the resources and the information from the server computer.
server: A server computer processes the client request and responds with appropriate information.

Fig.1: Client-Server model

What is HTTP?

For a client and a server to communicate, they need a common language. We could invent any language we like for this, but to make it usable, compatible, and widely accessible we need a standardized language (a protocol). This is where HTTP comes in. HTTP stands for Hypertext Transfer Protocol.

HTTP defines the structure of request and response messages. It is like sending a parcel to a friend: the address follows an agreed structure of street name, city name, state name, zip code, and country code.
HTTP defines the operations that can be performed on a given resource. These are called HTTP request methods, also referred to as HTTP verbs. Some examples are:
- The GET method requests a representation/state of the specified resource/data. It should only retrieve data.
- The POST method submits an entity to the specified resource, often causing a change in the server.
HTTP also defines sets of status codes that indicate the outcome of a request: informational responses (100-199), successful responses (200-299), redirection messages (300-399), client error responses (400-499), and server error responses (500-599).
HTTP requests and responses share a similar structure and are composed of:
- A single start-line describing the request to be performed, or the status of the response (whether it succeeded or failed).
- An optional set of HTTP headers specifying the request/response or describing the body included in the message.
- A blank line indicating all meta-information for the request has been sent.
- An optional body containing data associated with the request (like the content of an HTML form) or the document associated with a response. The presence of the body and its size is specified by the start-line and HTTP headers.
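
The status-code ranges listed above can be turned into a small lookup. This is only a sketch: the class name StatusClass and the method describe are made up for illustration, and the labels simply repeat the ranges from this article.

```java
// Hypothetical helper that maps a status code to its class.
class StatusClass {

    // Returns a short label for the range the code falls into.
    static String describe(int code) {
        if (code < 100 || code > 599) {
            return "unknown";
        }
        switch (code / 100) {
            case 1:  return "informational";
            case 2:  return "successful";
            case 3:  return "redirection";
            case 4:  return "client error";
            default: return "server error"; // only 5xx can reach this branch
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(200)); // successful
        System.out.println(describe(404)); // client error
        System.out.println(describe(503)); // server error
    }
}
```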

Fig.2: Structure of HTTP message
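
To make the message structure concrete, here is a minimal sketch that splits a raw request into its start-line, headers, and body. The class HttpMessageDemo, the ParsedRequest holder, and the sample request are assumptions made up for this example. A real parser has to handle many more cases (repeated headers, malformed input, binary bodies).

```java
import java.util.LinkedHashMap;
import java.util.Map;

class HttpMessageDemo {

    // Hypothetical holder for the parts of one request.
    static final class ParsedRequest {
        final String method;
        final String target;
        final String version;
        final Map<String, String> headers;
        final String body;

        ParsedRequest(String method, String target, String version,
                      Map<String, String> headers, String body) {
            this.method = method;
            this.target = target;
            this.version = version;
            this.headers = headers;
            this.body = body;
        }
    }

    static ParsedRequest parse(String raw) {
        // The first blank line (CRLF CRLF) separates the head from the body.
        int split = raw.indexOf("\r\n\r\n");
        String head = split >= 0 ? raw.substring(0, split) : raw;
        String body = split >= 0 ? raw.substring(split + 4) : "";

        String[] lines = head.split("\r\n");
        // Start-line: method, request target, HTTP version.
        String[] startLine = lines[0].split(" ", 3);

        // Every following line is a "Name: value" header.
        Map<String, String> headers = new LinkedHashMap<>();
        for (int i = 1; i < lines.length; i++) {
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                headers.put(lines[i].substring(0, colon).trim(),
                            lines[i].substring(colon + 1).trim());
            }
        }
        return new ParsedRequest(startLine[0], startLine[1], startLine[2], headers, body);
    }

    public static void main(String[] args) {
        String raw = "POST /cats HTTP/1.1\r\n"
                   + "Host: localhost:9090\r\n"
                   + "Content-Type: text/plain\r\n"
                   + "Content-Length: 4\r\n"
                   + "\r\n"
                   + "meow";
        ParsedRequest r = parse(raw);
        System.out.println(r.method + " " + r.target + " " + r.version);
        System.out.println(r.headers);
        System.out.println(r.body);
    }
}
```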

Layers of the Internet

The software model of the Internet can be described by two layered models: the OSI model and the Internet protocol suite (the IP suite model, also known as TCP/IP). OSI stands for Open Systems Interconnection. It is a conceptual standard for networking, which is why it is also called the reference model. The IP suite model is the one that is implemented and used on the Internet in practice. Without going into much depth, the main points are:

Every computer (client or server) and network device that supports the Internet implements the IP suite model.
These models are structured as layers, hence the name layered structure.
Each layer has its own responsibility and hides its implementation (inner workings) from the other layers.
Each layer provides an interface to the layers immediately above and below it.

Note:
In the IP suite model, the Application layer combines the top three layers of the OSI model, i.e. Application + Presentation + Session.
In the IP suite model, the Link layer combines the lower two layers of the OSI model, i.e. Physical + Data link.
The lowermost layer from each model connects with physical network devices like routers, switches, etc.

Fig.3: Layered Model of a network
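
The correspondence between the two models can also be written down as a simple table. The sketch below is only an illustration. The class name LayerMapping and the method osiToSuite are made up, and the layer names follow the common four-layer description of the Internet protocol suite.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LayerMapping {

    // OSI layer (top to bottom) -> the IP suite layer that covers it.
    static Map<String, String> osiToSuite() {
        Map<String, String> layers = new LinkedHashMap<>();
        layers.put("Application", "Application");
        layers.put("Presentation", "Application");
        layers.put("Session", "Application");
        layers.put("Transport", "Transport");
        layers.put("Network", "Internet");
        layers.put("Data link", "Link");
        layers.put("Physical", "Link");
        return layers;
    }

    public static void main(String[] args) {
        // Prints one "OSI layer -> IP suite layer" pair per line.
        osiToSuite().forEach((osi, suite) -> System.out.println(osi + " -> " + suite));
    }
}
```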

HTTP belongs to the Application layer, which is accessible to user processes. It uses the services of the Transport layer to establish and conduct communication with other computers on the Internet (TCP for HTTP/1.1 and HTTP/2; HTTP/3 runs over QUIC, which is built on UDP). To glue these two layers together, the operating system provides an interface called the socket API. Programming that uses the socket API is called socket programming.

A pair of connected sockets is like the two ends of a water pipe that carries a flow of water. The only difference is that information flows in both directions between sockets. A socket is identified by a socket address, which combines the protocol type, IP address, and port number. The socket API makes it easy to use different protocols and build communication over a network. The following figure shows the general flow of communication using the socket API. We will look at these socket functions more closely in the coding part.

Note:
This figure shows the socket functions implemented in the C language. In this article, we will be using socket functions from Java.

Fig.4: Flow of communication between client-server socket using TCP (in C)
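
The same flow can be tried in Java inside a single process. The sketch below is an illustration only. The class LoopbackEcho and its methods are made up, port 0 asks the operating system for any free port, and the message is assumed to be a single line of text.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class LoopbackEcho {

    static BufferedReader reader(Socket s) throws IOException {
        return new BufferedReader(
                new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
    }

    static PrintWriter writer(Socket s) throws IOException {
        // autoFlush = true makes println() send the line immediately.
        return new PrintWriter(
                new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true);
    }

    // Sends one line to a throwaway local server and returns its reply.
    static String roundTrip(String message) {
        // Server side: create the socket, bind it to a free port, and listen.
        try (ServerSocket listener = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread server = new Thread(() -> {
                // accept() blocks until a client connects.
                try (Socket peer = listener.accept()) {
                    // Receive one line, then send a reply.
                    writer(peer).println("echo: " + reader(peer).readLine());
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            server.start();

            // Client side: connect, send, receive, close.
            try (Socket client = new Socket(listener.getInetAddress(), listener.getLocalPort())) {
                writer(client).println(message);
                return reader(client).readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("ping")); // prints "echo: ping"
    }
}
```

Both directions of the connection are used here: the client writes and then reads on the same socket, and the server does the reverse.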

Coding

Basic files and project structure

Note: Please install a Java JDK for this project first, set up your environment variables, and check that everything works.

The main project folder will contain a public folder and a resources folder. In the public folder, we will store all our .html pages, and in the resources folder, we will store our resources such as images and videos.
Our main server program, WebServer.java, will be in the project's root directory.

<!-- ./public/index.html -->
<!DOCTYPE html>
<html>
<head>
    <meta charset="UTF-8">
    <title>Cat Paws</title>
</head>
<body>
    <div>
        <h1  style="text-align: center;">Cat Paws</h1>
        <a  style="display: flex; justify-content: center;" href="cats">Serve me some cat videos</a>
    </div>
</body>
</html>

<!-- ./public/cats.html -->
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Cat Paws</title>
</head>
<body>
    <h1 style="text-align: center;">Cat-videos</h1>
    <div style="display: flex; justify-content: center;">
        <video width="640" height="480" controls>
            <!-- 'watch' will be replaced later by the server -->
            <source src="watch/cat-test-video" type="video/mp4">
        </video>
    </div>
</body>
</html>

If you are coding along with this article, please store a video inside the ./resources folder and name it cat-test-video.mp4.
The project structure should look something like this:

Fig.5: Project Structure

Creating a socket and listening for new connection

// Inside WebServer.java file
// Some necessary imports
import java.net.*;
import java.io.*;

import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Inside WebServer.java file
public class WebServer {
    public static void main(String[] args) throws IOException, NoSuchAlgorithmException {
        int serverPort = 9090;
        // allocating resources for server's socket and assigning a port number
        ServerSocket listenSocket = new ServerSocket(serverPort);
        try {
            System.out.println("Server program started...");
            while (true) {
                // listening to the assigned port number and accepting
                // new client connection
                System.out.println("\nListening for new connection...");
                Socket clientSocket = listenSocket.accept();
                System.out.println("\nA client is connected.....");

                // Initializing 'Connection' class with
                // the socket connected to the client
                Connection c = new Connection(clientSocket);
            }
        } catch (IOException e) {
            System.out.println("Listen error:" + e.getMessage());
        } finally {
            listenSocket.close();
        }
    }
}

The above code does the following things:

Creates a server socket and assigns it a port number to listen on (the server's socket address).
The server listens on the assigned port for new connection requests from clients.
When a connection is accepted, accept() returns a new Socket that is connected to that client.
Initializes the Connection class with the client's socket to process the client's HTTP request and send back an HTTP response.

Creating a Connection class

// Inside WebServer.java file
// Thread class is extended to use multi-threading facility
class Connection extends Thread {
    DataInputStream in;
    DataOutputStream out;
    Socket clientSocket;

    public Connection (Socket clientSocket) {
        try {
            this.clientSocket = clientSocket;
            // accessing input stream from the given socket
            this.in = new DataInputStream( this.clientSocket.getInputStream());
            // accessing output stream from the given socket
            this.out = new DataOutputStream( this.clientSocket.getOutputStream());
            // starts a new thread
            this.start();
        } catch(IOException e) {
            System.out.println("Connection: "+e.getMessage());
        }
    }

    // continue...

The above code does the following things:

Accesses the input stream from the link that bridges the server and the client socket. This stream is used to read the client's HTTP request.
Accesses the output stream from the link that bridges the server and the client socket. This stream is used to send an HTTP response from the server computer.

Processing client's HTTP request and sending HTTP response from server

    // continue Connection class...

    // overriding run method from Thread class
    @Override
    public void run() {
        // Initializing a Scanner class to read stream input from the
        // socket connection between server and client
        Scanner scan = new Scanner(this.in, "UTF-8");
        try {
            // separating the HTTP request header from rest of the HTTP message
            // "\\r\\n\\r\\n" below refers to the 'empty line' from Fig.2
            String requestHeader = scan.useDelimiter("\\r\\n\\r\\n").next();

            System.out.println("*********Header Data***********");
            System.out.println("*******************************");
            System.out.println(requestHeader);
            System.out.println("*******************************\n\n");

            // decomposing request header to understand client request message
            String[] headerLines = requestHeader.split("\r\n");
            String[] startLine = headerLines[0].split(" ");

            String method = startLine[0];  // e.g. GET
            String path = startLine[1];    // e.g. / or /cats
            String version = startLine[2]; // HTTP version

            // handling GET request from the client
            if (method.equals("GET")) {
                // calling helper method to search and get the path that
                // leads to the resource client is looking for
                Path filePath = getFilePath(path);

                System.out.println("file path: " + filePath);
                // checking if resource path exists
                if (Files.exists(filePath)) {
                    // file exists
                    // calling helper method to guess the content type
                    // of the file, like text/html, video/mp4, etc
                    String contentType = guessContentType(filePath);

                    System.out.println("file exist");
                    System.out.println("content type:" + contentType);

                    // calling helper method and setting standard HTTP
                    // status code, response type, and response content
                    sendResponse("200 OK", contentType, "", Files.readAllBytes(filePath));

                } else {
                    // file not found
                    System.out.println("File not found");

                    byte[] notFoundContent = "<h1> File not found :( </h1>".getBytes("UTF-8");

                    sendResponse("404 Not Found", "text/html", "", notFoundContent);
                }
                out.flush();
            }
        } catch (Exception e) {
            System.out.println("EOF:" + e);
        } finally {
            scan.close();
            try {
                clientSocket.close();
            } catch (IOException e) {
                /*close failed*/
            }
        }
    }
    // continue...
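
The delimiter trick used in run() can be tried in isolation, without a socket. The sketch below feeds a hard-coded request through the same Scanner call. The class HeaderSplitDemo and the sample request are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

class HeaderSplitDemo {

    // Returns the part of a raw HTTP message that comes before the first blank line.
    static String headOf(String rawMessage) {
        ByteArrayInputStream in =
                new ByteArrayInputStream(rawMessage.getBytes(StandardCharsets.UTF_8));
        try (Scanner scan = new Scanner(in, "UTF-8")) {
            // The delimiter is a regular expression that matches CRLF CRLF.
            return scan.useDelimiter("\\r\\n\\r\\n").next();
        }
    }

    public static void main(String[] args) {
        String raw = "GET /cats HTTP/1.1\r\nHost: localhost:9090\r\n\r\n";
        String[] startLine = headOf(raw).split("\r\n")[0].split(" ");
        System.out.println(startLine[0]); // GET
        System.out.println(startLine[1]); // /cats
        System.out.println(startLine[2]); // HTTP/1.1
    }
}
```

Note that next() throws NoSuchElementException on empty input, which is one reason the server wraps this call in a try block.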

Some helper methods

    // continue Connection class...

    // helper method: getting the path to access the requested resource/information hosted by the server (aka routing)
    private static Path getFilePath(String path) {
        Matcher match = Pattern.compile("/watch/").matcher(path);
        if (match.find()) {
            // replacing "watch" with "resources" because we have our videos inside that directory
            path = path.replace("/watch/", "./resources/");
            path = path + ".mp4";
            return Paths.get(path);
        }

        if ("/".equals(path)) {
            path = "index.html";
        } else if ("/cats".equals(path)) {
            path = "cats.html";
        }

        return Paths.get("./public/", path);
    }

    // helper method: guessing the content type of the resource, e.g. text/html, video/mp4
    private static String guessContentType(Path filePath) throws IOException {
        String type = Files.probeContentType(filePath);
        // probeContentType returns null when the type cannot be determined
        return type != null ? type : "application/octet-stream";
    }

    // helper method: creating response message for the client's request
    private void sendResponse(String status, String contentType, String additional_header, byte[] content) throws IOException {
        // Content-Length tells the client where the body ends
        byte[] response = (
            "HTTP/1.1 " + status + "\r\n"
            + "Content-Type: " + contentType + "\r\n"
            + "Content-Length: " + content.length + "\r\n"
            + additional_header
            + "\r\n").getBytes("UTF-8");
        out.write(response, 0, response.length);
        // the body is sent as-is; nothing may be appended after it
        out.write(content, 0, content.length);
    }
}
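
The routing helper joins the request path onto ./public/ without checking it, so a request path containing .. could reach files outside that folder. Below is a minimal sketch of one way to guard against this. The class SafeRouting and the method resolve are hypothetical and not part of WebServer.java, and the public folder is assumed to sit in the working directory.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class SafeRouting {

    // Absolute, normalized location of the folder we are willing to serve from.
    static final Path PUBLIC_ROOT = Paths.get("public").toAbsolutePath().normalize();

    // Returns the resolved file, or null when the request tries to leave PUBLIC_ROOT.
    static Path resolve(String requestPath) {
        String relative = requestPath.startsWith("/") ? requestPath.substring(1) : requestPath;
        if (relative.isEmpty()) {
            relative = "index.html";
        }
        // normalize() collapses "." and ".." segments before the check.
        Path candidate = PUBLIC_ROOT.resolve(relative).normalize();
        return candidate.startsWith(PUBLIC_ROOT) ? candidate : null;
    }

    public static void main(String[] args) {
        System.out.println(resolve("/cats.html"));          // a path inside the public folder
        System.out.println(resolve("/../WebServer.java"));  // null
    }
}
```

A caller would treat a null result like a missing file and answer with 404 Not Found.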

Compiling the WebServer.java file

Open a command line in the project directory and compile the Java file with the command javac WebServer.java. Two class files, WebServer.class and Connection.class, will be created.
Then run the server with the command java WebServer.
Open the browser, which will be our client. You can open the inspect window and then the network tab to see the network activity.
Type localhost:9090 in the address bar to send a request to our server. Our index.html page will then be loaded.
Now click the blue link Serve me some cat videos to request a video from the server.
In the network tab, we can see that after clicking the link our client sends two requests to the server: one for the cats.html page and one for the video file.
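
The browser is not the only possible client. As a rough sketch, the same request can be sent by hand over a plain socket. The class RawHttpClient and its methods are made up for illustration, and the example assumes the server from this article is already running on localhost:9090.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of WebServer.java.
class RawHttpClient {

    // Builds the text of a minimal GET request: start-line, headers, blank line.
    static String buildRequest(String host, int port, String path) {
        return "GET " + path + " HTTP/1.1\r\n"
             + "Host: " + host + ":" + port + "\r\n"
             + "Connection: close\r\n"
             + "\r\n";
    }

    // Sends the request and returns only the first line of the response.
    static String fetchStatusLine(String host, int port, String path) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            OutputStream out = socket.getOutputStream();
            out.write(buildRequest(host, port, path).getBytes(StandardCharsets.UTF_8));
            out.flush();

            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
            return in.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the server from this article is listening on port 9090.
        System.out.println(fetchStatusLine("localhost", 9090, "/"));
    }
}
```

With the server running and index.html in place, the printed line should be the status line written by sendResponse, such as HTTP/1.1 200 OK.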

Conclusion

What we have learned in this article is just a speck of the technology used on the Internet. We should be thankful for all the brilliant minds and engineering that make the Internet possible. Most importantly, stay curious and keep learning.