HTTP in detail

15 min read · Apr 16, 2023

HTTP stands for HyperText Transfer Protocol. It is an application layer protocol used to request resources from a web server. It was developed in 1989 by Tim Berners-Lee at CERN.

Through this article, you will learn:

HTTP Status Codes
HTTP Headers
HTTP Methods
HTTP Versions

HTTP Status Codes

HTTP status codes can be broken down into five categories:

Information Responses (100–199)

Informational responses are interim replies. For example, 100 Continue tells the client that the first part of its request has been accepted and that it should continue sending the rest of the request.

Successful Responses (200–299)

This category of status codes indicates that the client’s request was successful. Here are some common successful response codes sent by the web server to the client:

200-OK: The request was completed successfully.

201-Created: The request succeeded and a new resource has been created (for instance, a new user or document). This is the response generally sent after POST requests and after certain PUT requests.

204-No Content: There is no content to send for this request, but headers may be useful.

Redirection Messages (300–399)

These status codes are used to redirect the client’s request to another resource (for instance, a different webpage). Here are some common redirection codes that you may come across:

301-Moved Permanently: The URL of the resource you want to access has been changed permanently. The web server responds with the new URL of the requested resource.

302-Found: This status code means that the URI of a resource has been changed temporarily. Further changes in the URI might be made in the future.

303-See Other: The server sends this response to direct the client to get the requested resource at another URI with a GET request.

304-Not Modified: This status code is used for caching purposes. It tells the client that the requested resource has not been modified, so the client can keep using its cached version of the response.

Client Error Responses (400–499)

These status codes are used to inform the client that its request was incorrect. Here are a few common client error status codes:

400-Bad Request: This tells the client that something was wrong or missing in their request (e.g., malformed request syntax, invalid request message framing).

401-Unauthorized: This tells the client that it needs to authenticate itself (e.g., using its credentials) to get the requested response.

403-Forbidden: This tells the client that it does not have the permission to see the requested resource whether it is authenticated or not.

404-Not Found: This tells the client that the server cannot find the requested resource (the resource you requested does not exist).

NOTES📝:

When using an API, this can also mean that the endpoint is valid, but the resource itself does not exist.
Servers may also send this response instead of 403 Forbidden to hide the existence of a resource from an unauthorized client.

405-Method Not Allowed: This tells the client that the requested resource does not allow the method it used.

Server Error Responses (500–599)

These status codes are used to indicate errors on the server-side.

Here are a few common server error status codes:

500-Internal Server Error: The server has encountered an error when trying to handle a client request (it does not know how to handle the request).

503-Service Unavailable: The server cannot handle the client request because it is either overloaded or down for maintenance. This status code may also be returned when the server is facing a DoS or DDoS attack.
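To make the five categories concrete, here is a minimal Python sketch that derives the category of a status code from its first digit. The function name status_category is hypothetical and only used for illustration; HTTPStatus comes from Python's standard http module.

```python
from http import HTTPStatus

# Category names follow the five groups described in this section.
CATEGORIES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_category(code: int) -> str:
    """Return the category of an HTTP status code (hypothetical helper)."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return CATEGORIES[code // 100]


for code in (200, 301, 404, 503):
    # HTTPStatus(code).phrase is the standard reason phrase, e.g. "Not Found".
    print(code, HTTPStatus(code).phrase, "->", status_category(code))
```

Relying on the first digit keeps the sketch short, and it also covers codes that are not listed in this article.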

HTTP Headers

HTTP headers let the client and the server pass additional information with an HTTP request or response. An HTTP header consists of its case-insensitive name followed by a colon (:), then by its value. Whitespace before the value is ignored (source: Mozilla).

As mentioned in the definition above, there are two types of HTTP headers: the HTTP Request headers and the HTTP Response headers.
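The header format described above (a case-insensitive name, a colon, then a value with leading whitespace ignored) can be sketched in a few lines of Python. The helper parse_header_line is a hypothetical name chosen for this illustration, not a function from any library.

```python
def parse_header_line(line: str) -> tuple[str, str]:
    """Split a raw header line into a (lowercased name, value) pair."""
    # Only the first colon separates the name from the value,
    # so values such as "medium.com:443" stay intact.
    name, separator, value = line.partition(":")
    if not separator:
        raise ValueError(f"missing colon in header line: {line!r}")
    # Names are case-insensitive, so we normalise them to lowercase.
    return name.strip().lower(), value.strip()


print(parse_header_line("Content-Type:    text/html"))
print(parse_header_line("HOST: medium.com:443"))
```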

HTTP Request Headers

HTTP Request headers are headers that belong to the client.

Here are a few HTTP request headers that can be interesting to take a look at.

User-Agent

The User-Agent is an HTTP request header that tells the server about the client software, for instance which browser, browser version, and operating system the client is using.

Here is the syntax of a User-Agent header:

# User-Agent header syntax
User-Agent: <product> / <product-version> <comment>

# Common format for web browsers
User-Agent: Mozilla/5.0 (<system-information>) <platform> (<platform-details>) <extensions>

NOTE📝: It is totally possible for you to change your User-Agent. This can be handy when performing activities such as web scraping, for instance.
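As a rough illustration of changing the User-Agent, the sketch below builds a request object with Python's standard urllib.request module. The User-Agent value "my-crawler/1.0" is made up for the example, and the request is only constructed here, never sent.

```python
from urllib.request import Request

# Hypothetical User-Agent value, used only for illustration.
custom_agent = "my-crawler/1.0"

# Building the Request does not open any network connection.
request = Request(
    "https://httpbin.org/get",
    headers={"User-Agent": custom_agent},
)

# urllib stores header names in capitalized form ("User-agent").
print(request.get_header("User-agent"))
```

Passing the request to urllib.request.urlopen would actually send it with the custom header.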

Cookie

A cookie is a small piece of text that is sent from a website to a user’s browser and stored on the user’s computer or mobile device. Cookies are commonly used by websites to store user preferences, login information, session data, and other information that can be used to personalize the user’s experience on the site.

The Cookie HTTP request header contains the stored cookies that the server previously sent to the client (through the Set-Cookie response header).

NOTE📝: The Cookie header is optional and may be omitted, for instance if the client’s browser blocks cookies.

Here is the syntax of a Cookie header:

# Cookie header syntax
Cookie: name=value
Cookie: name=value; name2=value2; name3=value3

# Setting a cookie http request header
e.g. Cookie: PHPSESSID=134562198; csrftoken=uch382n89023

In this example, the Cookie header sent to the server carries two cookies, namely PHPSESSID and csrftoken.

This header will be sent to the server each time the client makes a new request to that site.
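On the server side, a Cookie header like the one above has to be split back into name/value pairs. Here is a minimal sketch using SimpleCookie from Python's standard http.cookies module; the wrapper function parse_cookie_header is a hypothetical helper written for this example.

```python
from http.cookies import SimpleCookie


def parse_cookie_header(value: str) -> dict[str, str]:
    """Turn the value of a Cookie header into a plain dictionary."""
    jar = SimpleCookie()
    jar.load(value)
    # Each entry is a Morsel object; its .value attribute holds the cookie value.
    return {name: morsel.value for name, morsel in jar.items()}


cookies = parse_cookie_header("PHPSESSID=134562198; csrftoken=uch382n89023")
print(cookies)
```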

Host

The Host request header is used to indicate the host and port number of the server to which the request is being sent. This is particularly important when a server is hosting multiple virtual hosts, which are distinct websites or domains that share the same IP address.

Here is the syntax of a Host header:

# Host header syntax
Host: <host>:<port>

# Setting a host header
e.g. Host: medium.com:443

In this example, we tried to connect to the host medium.com using the port 443.

NOTES📝:

If no port is specified, the default one will be used (80 for HTTP and 443 for HTTPS).
A Host header field must be sent in all HTTP/1.1 requests (more on that in the section about HTTP versions).
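The default-port rule from the notes above can be sketched as follows. The helper split_host and its scheme parameter are hypothetical names for this illustration; the defaults are the ones stated above (80 for HTTP and 443 for HTTPS).

```python
DEFAULT_PORTS = {"http": 80, "https": 443}


def split_host(host_header: str, scheme: str = "https") -> tuple[str, int]:
    """Return (hostname, port) for a Host header value."""
    host, separator, port = host_header.rpartition(":")
    # An explicit port is only present if the text after the last colon is numeric.
    if separator and port.isdigit():
        return host, int(port)
    # No port given: fall back to the default port of the scheme in use.
    return host_header, DEFAULT_PORTS[scheme]


print(split_host("medium.com:443"))
print(split_host("medium.com", scheme="http"))
```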

Connection

The Connection HTTP request/response header is used to control the persistence of the connection between the client and the server. The client uses it to tell the server whether it wants to keep the connection open after the current request has been processed (persistent connection) or close it.

# Connection header syntax
Connection: keep-alive
Connection: close

Referer

The Referer HTTP request header is a header that contains the URL of the webpage that the user was on before accessing the current page. This allows a server to identify referring pages that users are visiting from or where requested resources are being used.

# Referer header syntax
Referer: <url>

For instance, Referer: https://medium.com tells the server receiving our request that we were browsing medium.com just before making this request.

Upgrade-Insecure-Requests

The Upgrade-Insecure-Requests is an HTTP request header used by the client to tell the server that it prefers an encrypted (HTTPS) response and that it can handle being upgraded from HTTP to HTTPS.

Depending on its configuration, the server can then decide whether or not to upgrade the connection to HTTPS.

# Upgrade-Insecure-Requests header syntax
# 1 means that the header is enabled
Upgrade-Insecure-Requests: 1

Excellent! Now that we’ve seen some interesting HTTP Request headers, let’s take a look at the HTTP Response headers.

HTTP Response Headers

HTTP Response headers are headers that belong to the server.

Here are a few interesting HTTP Response headers:

Server

The Server header describes the software used by the web server. This may include the software version as well.

# Server header Syntax
Server: <product>

# Server header example
e.g. Server: Apache/2.4.1 (Ubuntu)

In the example above, Server: Apache/2.4.1 (Ubuntu) lets the client know that the server is running version 2.4.1 of the Apache web server on an Ubuntu operating system.

This can be a gold mine for attackers, especially if the version used by the web server has a critical vulnerability and a known public exploit.

That’s the reason why it is advisable to avoid overly detailed Server values.

X-Powered-By

The X-Powered-By is an optional HTTP response header that indicates the technology or programming language used by the web application or website in the back-end.

# X-Powered-By header syntax
X-Powered-By: PHP/7.3.27

The example above indicates that the web application is powered by version 7.3.27 of PHP, which lets the client know that the server is using PHP as its back-end programming language. In addition, the client also learns the exact PHP version used by the server, which is a lot of information.

Though this header can be useful for troubleshooting and debugging, it presents a security risk, since attackers can use it to target known vulnerabilities specific to the technology or language running on the server.

That’s why unsetting this header helps avoid exposing potential vulnerabilities.

Content-Type

The Content-Type header is an HTTP request/response header used to indicate the type of data being sent in the message body. It specifies the MIME (Multipurpose Internet Mail Extensions) type of the data, which is a standard way of identifying the format of files and data on the Internet.

# Content-Type header syntax
Content-Type: text/html; charset=utf-8
Content-Type: multipart/form-data; boundary=something

For instance, the header Content-Type: application/json specifies that the message body contains JSON data.

Content-Length

The Content-Length header is an HTTP request/response header used to tell the recipient the size of the body in bytes.

# Content-Length header syntax
Content-Length: <length>

For instance, Content-Length: 1000 tells the recipient that the body is one thousand bytes long (which is not necessarily one thousand characters, depending on the encoding).
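Since Content-Length counts bytes, the value has to be computed on the encoded body rather than on the text itself. The short sketch below assumes a UTF-8 encoded body; content_length is a hypothetical helper name.

```python
def content_length(body: str, encoding: str = "utf-8") -> int:
    """Return the number of bytes the body occupies once encoded."""
    return len(body.encode(encoding))


text = "café"
# Four characters, but "é" takes two bytes in UTF-8.
print(len(text), content_length(text))
```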

Expires

The Expires HTTP header contains the date/time after which the response is considered expired.

# Expires header syntax
Expires: <http-date>

For instance, Expires: Fri, 11 Oct 2024 07:28:00 GMT specifies the date and time after which the response will expire.
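HTTP dates follow a fixed format that includes the day of the week, so it is safer to generate them than to write them by hand. Below is a minimal sketch using Python's standard email.utils module; the helper is_expired is a hypothetical name for this illustration.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

# Build the header value from a timezone-aware UTC datetime.
expires_at = datetime(2024, 10, 11, 7, 28, 0, tzinfo=timezone.utc)
header_value = format_datetime(expires_at, usegmt=True)
print("Expires:", header_value)


def is_expired(expires_value: str, now: datetime) -> bool:
    """Return True if 'now' is later than the date in the Expires value."""
    return now > parsedate_to_datetime(expires_value)


print(is_expired(header_value, datetime.now(timezone.utc)))
```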

HTTP Strict-Transport-Security (HSTS)

The Strict-Transport-Security is an HTTP response header used by web servers to inform the client that the site should only be accessed using HTTPS, and that any future attempts to access it using HTTP should automatically be converted to HTTPS.

NOTE📝: This is more secure than simply configuring an HTTP to HTTPS (301) redirect on your server, where the initial HTTP connection is still vulnerable to a man-in-the-middle attack.

# Strict-Transport-Security header syntax
Strict-Transport-Security: max-age=<expire-time>; includeSubDomains; preload

e.g. Strict-Transport-Security: max-age=63072000; includeSubDomains; preload

Let’s break down the different fields that can compose a Strict-Transport-Security response header:

max-age=<expire-time>: sets the amount of time (in seconds), that the client should remember that a site is only to be accessed using HTTPS.
includeSubDomains: indicates that the HSTS policy should be applied to all subdomains of the domain in the Strict-Transport-Security header. For example, if example.com includes the includeSubDomains attribute in its HSTS header, the HSTS policy will also be applied to subdomains such as www.example.com and api.example.com.
preload: indicates that the domain should be included in the HSTS preload list maintained by web browsers. The preload list is a list of websites that are hardcoded into web browsers to always use HTTPS, even if the user has never visited the site before. This attribute can help to ensure that the HSTS policy is always enforced, even for first-time visitors.

Therefore, when the Strict-Transport-Security header includes all three attributes (fields), it means that the HSTS policy should be enforced for the specified max-age time period, applied to all subdomains, and included in the preload list maintained by web browsers.
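The three directives described above can be read from a header value with a small parser. This is only a sketch of the idea: parse_hsts and the keys of the dictionary it returns are hypothetical names, and real browsers apply stricter parsing rules.

```python
def parse_hsts(value: str) -> dict:
    """Extract max-age, includeSubDomains and preload from an HSTS value."""
    policy = {"max_age": None, "include_subdomains": False, "preload": False}
    for directive in value.split(";"):
        name, _, argument = directive.strip().partition("=")
        name = name.strip().lower()  # directive names are case-insensitive
        if name == "max-age":
            policy["max_age"] = int(argument.strip().strip('"'))
        elif name == "includesubdomains":
            policy["include_subdomains"] = True
        elif name == "preload":
            policy["preload"] = True
    return policy


policy = parse_hsts("max-age=63072000; includeSubDomains; preload")
print(policy)
# 63072000 seconds correspond to 730 days, i.e. two years.
print(policy["max_age"] // 86400, "days")
```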

Set-Cookie

Set-Cookie is an HTTP response header used by the server to send a cookie to the client. When the client receives the cookie, it stores it in the browser.

To send multiple cookies, you need to define multiple Set-Cookie headers in the same response.

# Syntax
Set-Cookie: <cookie-name>=<cookie-value>; Domain=<domain-value>; Expires=<date>; Secure; HttpOnly; Path=<path-value>

# Set-Cookie example
e.g. Set-Cookie: session-id=1234567890; Path=/; Domain=example.com; Secure; HttpOnly; Expires=Wed, 22 Mar 2023 12:00:00 GMT

Let’s explain the different fields that can compose a Set-Cookie HTTP response header:

<cookie-name>=<cookie-value>: defines the cookie name and its value.
Domain=<domain-value>: defines the domain to which the cookie will be sent (optional).
Expires=<date>: defines the date and time after which the cookie should expire. After the expiration date, the cookie will no longer be sent to the server.
Secure: indicates that the cookie should only be sent over a secure HTTPS connection.
HttpOnly: indicates that the cookie should not be accessible to JavaScript code running on the page, which can help mitigate cross-site scripting (XSS) attacks.
Path=<path-value>: indicates the path that must exist in the requested URL for the browser to send the Cookie header. For example, if a cookie is set with a Path attribute of /blog, the cookie will only be sent to the server with requests for URLs that start with /blog, such as example.com/blog, example.com/blog/post-1, and so on.

NOTE📝: None of these attributes is mandatory; only the cookie name and value are required.
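Here is a minimal sketch of how a server-side script might build such a header with SimpleCookie from Python's standard http.cookies module. The cookie name and value are taken from the example above; the attribute order in the generated header is decided by the library and may differ from the example.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["session-id"] = "1234567890"

# Attributes are set on the cookie's Morsel object.
cookie["session-id"]["path"] = "/"
cookie["session-id"]["domain"] = "example.com"
cookie["session-id"]["secure"] = True
cookie["session-id"]["httponly"] = True

# output() renders the complete "Set-Cookie: ..." header line.
set_cookie_header = cookie.output()
print(set_cookie_header)
```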

X-Frame-Options

The X-Frame-Options HTTP response header is used to control whether a page may be displayed inside an iframe.

This header tells the browser whether to allow or deny the page to be displayed in an iframe. Setting the value of this header to “DENY” will prevent the page from being displayed in an iframe altogether, while setting it to “SAMEORIGIN” will only allow the page to be displayed in an iframe on the same origin as the page.

# X-Frame-Options header syntax
X-Frame-Options: DENY
X-Frame-Options: SAMEORIGIN

The SAMEORIGIN value means that the page can only be displayed in an iframe on a page that has the same origin (the same protocol, same host, and same port) as the page itself.

Therefore, if the page is hosted on “https://example.com”, it can be displayed in an iframe on pages with the same origin, such as “https://example.com/page1”. However, it cannot be displayed in an iframe on pages with a different origin, such as “https://subdomain.example.com”, “https://anotherdomain.com” or “https://example.net”.

When using the DENY value, the page will not be displayed in an iframe, regardless of the origin.

Overall, the goal of using this header is to defend against a web attack called clickjacking.
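To see what "same origin" means in practice, the sketch below compares the scheme, host and port of two URLs with Python's standard urllib.parse module. The helpers origin and same_origin are hypothetical names, and the check is a simplification of what browsers actually do.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple:
    """Return the (scheme, host, port) triple that identifies an origin."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return parts.scheme, parts.hostname, port


def same_origin(url_a: str, url_b: str) -> bool:
    return origin(url_a) == origin(url_b)


page = "https://example.com"
for framing_page in (
    "https://example.com/page1",
    "https://subdomain.example.com",
    "http://example.com",
    "https://example.net",
):
    print(framing_page, same_origin(page, framing_page))
```

Note that a subdomain and a different protocol both count as different origins, even though the domain name looks similar.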

HTTP Methods

HTTP supports several methods. Here are a few of them:

GET

The GET method is an HTTP method used to retrieve a resource, such as a webpage, an image, or a video, from the web server.

Here is a simple GET method syntax:

GET /resource_path HTTP/1.1
Host: <hostname>

In this example, I sent an HTTP GET request to the web server (107.22.139.22) and asked it for the /get page located in its root directory (httpbin.org/get).

POST

The POST method is an HTTP method used to send data to the web server.

Here is a simple POST method syntax:

POST /resource_path HTTP/1.1
Host: <hostname>
Content-Type: application/json
Content-Length: <length>

data_to_send

NOTE: It is mandatory to leave an empty line between the last header (here ‘Content-Length’) and the ‘data_to_send’; otherwise the server will typically answer with a 400 Bad Request response. This empty line is what separates the HTTP headers from the body.
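The structure of a POST request (request line, headers, empty line, body) can be sketched by assembling the raw bytes by hand. The function build_post_request, the /post path and the JSON payload are assumptions made for this example; nothing is sent over the network.

```python
def build_post_request(path: str, host: str, body: str, content_type: str) -> bytes:
    """Assemble a raw HTTP/1.1 POST request as bytes."""
    payload = body.encode("utf-8")
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Content-Type: {content_type}\r\n"
        # Content-Length is the size of the encoded body in bytes.
        f"Content-Length: {len(payload)}\r\n"
        # The empty line below marks the end of the headers.
        "\r\n"
    )
    return head.encode("ascii") + payload


raw_request = build_post_request(
    "/post", "httpbin.org", '{"id": 1}', "application/json"
)
print(raw_request.decode("utf-8"))
```

Lines in an HTTP/1.1 message end with a carriage return and a line feed, which is why the sketch uses "\r\n" rather than a plain newline.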

The figure below is an illustration of how you can use this method:

Is it possible to send data in a format other than json?

Yes, this is absolutely feasible.

Let’s take a look at the example below:

HTTP POST Method using a URL encoding MIME

As you can see, the ‘Content-Type’ used in this example was ‘application/x-www-form-urlencoded’. It is a MIME type (URL-encoded form data) commonly used for sending form data over HTTP. Moreover, it’s the default encoding type used when submitting HTML forms.
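To give an idea of what URL-encoded form data looks like, here is a short sketch using urlencode from Python's standard urllib.parse module. The field names and values are invented for the example.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical form fields, used only for illustration.
form_fields = {"username": "alice", "message": "hello world & more"}

# Spaces become "+" and reserved characters such as "&" are percent-encoded.
encoded_body = urlencode(form_fields)
print(encoded_body)

# The server can decode the body back into the original fields.
print(parse_qs(encoded_body))
```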

NOTES📝:

The Content-Type and Content-Length headers should be sent along with the data when using the POST method. If they are omitted, many servers reject the request, for instance with a 400 Bad Request or a 411 Length Required response. To easily get the Content-Length of the data you want to send, you can use the echo command as follows:

Other well-known Content-Type values are text/html and multipart/form-data.

Well! Now you may be wondering whether data can also be sent using the GET method.

The answer is yes.

This can be done using the URL’s query parameters. Simply put, the data is appended to the URL as query parameters and sent along with the GET request.

The figure below is how this can be done:

NOTE📝: When using the GET method to send data, we may face two major problems namely:

URL length restriction: in practice, a URL has a limited length (a commonly quoted limit is about 2048 characters, depending on the browser and the server), which means that users can quickly be limited in the amount of data they send.
Security risk: sending data in the URL is not appropriate when the data is sensitive, because URLs can end up in places such as the browser history and server logs.

For these reasons, it is preferable to use the POST method instead.
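The sketch below shows how data could be attached to a GET request as query parameters, with a simple length check. The helper build_get_url and the parameter names are hypothetical, and the 2048 limit is the practical figure quoted above, not a rule of the protocol.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

MAX_URL_LENGTH = 2048  # practical limit quoted above, not part of the HTTP standard


def build_get_url(base_url: str, params: dict) -> str:
    """Append query parameters to a URL and refuse overly long results."""
    url = f"{base_url}?{urlencode(params)}"
    if len(url) > MAX_URL_LENGTH:
        raise ValueError("URL too long, consider sending the data with POST")
    return url


url = build_get_url("https://httpbin.org/get", {"q": "http headers", "page": 2})
print(url)

# Everything after "?" is visible to anyone who sees the URL.
print(parse_qs(urlsplit(url).query))
```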

HEAD

The HEAD method is an HTTP method used to request only the headers of the web server’s response, without the response body.

It is almost identical to the GET method. However, whereas a GET request returns the response headers as well as the body, a HEAD request only returns the response headers.

Here is a simple HEAD method syntax:

HEAD /resource_path HTTP/1.1
Host: <hostname>

PUT

The PUT method is an HTTP method used to update (replace) data on a web server. If the resource does not exist yet, the server may create it instead.

Syntax:
PUT /resource_path HTTP/1.1
Host: <hostname>
Content-Type: text/html
Content-Length: <length>

data_to_send

NOTE📝: It is mandatory to leave an empty line between the last header (here ‘Content-Length’) and the ‘data_to_send’; otherwise the server will typically answer with a 400 Bad Request response.

The figure below is an illustration of how this can be used:

DELETE

The DELETE method is an HTTP method used to delete a resource on a web server.

Here is a simple DELETE method syntax:
DELETE /resource_path HTTP/1.1
Host: <hostname>

To learn more about other HTTP methods, you can check out this article.

HTTP Versions

HTTP has several versions, such as HTTP/0.9, HTTP/1.0, HTTP/1.1, HTTP/2 and HTTP/3.

Here we will focus on the first three, since they are the best known.

HTTP/0.9

HTTP/0.9 is known as the simplest HTTP version. Indeed, it had no HTTP headers and only supported the GET method.

The figure below is an illustration of HTTP request and response using HTTP/0.9:

HTTP/1.0

Due to the limitations of HTTP/0.9, HTTP/1.0 was introduced. This HTTP version supports HTTP headers. Moreover, the server returns a status code that lets clients know whether their request was successfully processed or not.

The figure below is an illustration of HTTP request and response using HTTP/1.0:

Despite these significant improvements, HTTP/1.0 did not keep connections open by default, which means that a new connection was created for each new request. This consumed lots of resources and wasn’t efficient.

To cope with that, HTTP/1.0 was replaced with HTTP/1.1.

HTTP/1.1

In contrast to HTTP/1.0, HTTP/1.1 keeps connections open by default, which means that a client can use the same connection to send multiple requests instead of creating a new one for each request, as was the case with HTTP/1.0.

This feature is called persistent connections or keep-alive connections and is controlled through the HTTP “Connection” header (for example, Connection: close asks for the connection to be closed once the current request has been processed).
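The difference between the two versions can be summed up in a small decision function. This is a simplified sketch: is_persistent is a hypothetical helper, and it assumes that HTTP/1.1 connections stay open unless "close" is requested, while HTTP/1.0 connections stay open only when "keep-alive" is requested.

```python
def is_persistent(version: str, connection_header: str = "") -> bool:
    """Decide whether the connection stays open after the current request."""
    # The Connection header may hold several comma-separated, case-insensitive tokens.
    tokens = {t.strip().lower() for t in connection_header.split(",") if t.strip()}
    if "close" in tokens:
        return False
    if version == "HTTP/1.1":
        return True  # persistent by default
    if version == "HTTP/1.0":
        return "keep-alive" in tokens  # must be requested explicitly
    return False


print(is_persistent("HTTP/1.1"))
print(is_persistent("HTTP/1.1", "close"))
print(is_persistent("HTTP/1.0"))
print(is_persistent("HTTP/1.0", "keep-alive"))
```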

The figure below is an illustration of HTTP requests and responses using HTTP/1.1:

Moreover, HTTP/1.1 allows you to host multiple hostnames with the same IP address on the same server. This awesome feature is called virtual hosting. Virtual hosts are hostnames that share the same IP address.

To access a specific hostname, you need to use the ‘Host’ header. That’s the reason why the ‘Host’ header is required when using HTTP/1.1.
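Conceptually, a server doing virtual hosting looks at the Host header to decide which website should answer. The sketch below illustrates the idea with a plain dictionary; the hostnames, directories and the resolve_site helper are all made up for this example and do not reflect how any particular web server is configured.

```python
from typing import Optional

# Hypothetical virtual hosts sharing the same IP address.
VIRTUAL_HOSTS = {
    "blog.example.com": "/var/www/blog",
    "shop.example.com": "/var/www/shop",
}


def resolve_site(host_header: str) -> Optional[str]:
    """Return the site directory matching the Host header, if any."""
    # Drop an optional ":port" suffix and ignore letter case.
    hostname = host_header.split(":")[0].strip().lower()
    return VIRTUAL_HOSTS.get(hostname)


print(resolve_site("blog.example.com"))
print(resolve_site("SHOP.example.com:443"))
print(resolve_site("unknown.example.com"))
```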

To learn more about the evolution of HTTP versions, you can visit this website.

Super! Let’s now sum up all the things we’ve seen so far.

Let’s recap!

In this article, we learnt what HTTP is and we covered:

HTTP status codes, which are grouped into five categories.
HTTP methods (GET, POST, HEAD, PUT, DELETE).
HTTP Headers (cookie, user-agent, server, x-powered-by, host).
HTTP versions (HTTP/0.9, HTTP/1.0, HTTP/1.1).

That’s all guys! Hope you learnt something!
