knowledge/technology/internet/HTTP.md
2024-01-17 09:00:45 +01:00

aliases: HyperText Transfer Protocol
source: https://developer.mozilla.org/en-US/docs/Web/HTTP
obj: concept

HyperText Transfer Protocol

Hypertext Transfer Protocol (HTTP) is an application-layer protocol for transmitting hypermedia documents, such as HTML. It was designed for communication between web browsers and web servers, but it can also be used for other purposes. HTTP follows a classical client-server model, with a client opening a connection to make a request, then waiting until it receives a response. HTTP is a stateless protocol, meaning that the server does not keep any data (state) between two requests.

Requests

HTTP Request Structure

A typical HTTP request consists of the following components:

GET /path/to/resource HTTP/1.1
Host: www.example.com
Accept: text/html
  • Request Line: Specifies the HTTP method, the resource's URI, and the HTTP version.
  • Headers: Additional information about the request, such as the host, accepted content types, etc.
  • Body: Optional data sent with the request, typically used in POST or PUT requests.
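
As a rough illustration of these three components, the following Python sketch splits a raw request into its request line, headers, and body. The function name parse_request is hypothetical, and the parser assumes a well-formed HTTP/1.1 message with CRLF line endings; it is not a complete implementation.

```python
def parse_request(raw: bytes):
    """Split a raw HTTP/1.1 request into request line, headers, and body."""
    # The empty line (CRLF CRLF) separates the head from the optional body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # Request line: method, request target, HTTP version.
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return method, target, version, headers, body


raw = (
    b"GET /path/to/resource HTTP/1.1\r\n"
    b"Host: www.example.com\r\n"
    b"Accept: text/html\r\n"
    b"\r\n"
)
method, target, version, headers, body = parse_request(raw)
```

Here the body comes back empty, since a GET request like this one usually carries none.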

HTTP Response Structure

A typical HTTP response consists of the following components:

HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 123

<!DOCTYPE html>
<html>
<head>
    <title>Example</title>
</head>
<body>
    <p>Hello, World!</p>
</body>
</html>
  • Status Line: Specifies the HTTP version, status code, and a brief status message.
  • Headers: Provide additional information about the response, such as content type, length, etc.
  • Body: Contains the actual data or resource requested.
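
The response layout can be sketched the same way. The helpers build_response and parse_response below are hypothetical names; the sketch assumes a simple response whose body length is given by Content-Length, and it computes that length from the body rather than hard-coding it.

```python
def build_response(status: int, reason: str, body: bytes, content_type: str) -> bytes:
    """Serialize a status line, two headers, an empty line, and the body."""
    head = (
        f"HTTP/1.1 {status} {reason}\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


def parse_response(raw: bytes):
    """Split a raw response into version, status code, reason, headers, and body."""
    head, _, rest = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    # Read exactly Content-Length bytes of body when the header is present.
    length = int(headers.get("content-length", len(rest)))
    return version, int(code), reason, headers, rest[:length]


html = b"<!DOCTYPE html>\n<html><body><p>Hello, World!</p></body></html>\n"
raw = build_response(200, "OK", html, "text/html")
version, code, reason, headers, body = parse_response(raw)
```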

HTTP Multipart

Multipart is a family of MIME media types (such as multipart/form-data) that HTTP can use to carry several pieces of data, each with its own type, in a single request or response body. This is particularly useful when dealing with file uploads, form submissions, or any scenario where different types of data need to be sent together.

Overview

HTTP Multipart messages are used to send multiple parts of data as a single message body. Each part can have its own content type and can be of different data types, such as text or binary. Multipart messages are identified by a unique boundary string that separates each part within the message.

Structure of a Multipart Message

A Multipart message typically consists of the following structure:

POST /upload HTTP/1.1
Host: example.com
Content-Type: multipart/form-data; boundary=boundary_string

--boundary_string
Content-Disposition: form-data; name="text_data"

This is a text field.

--boundary_string
Content-Disposition: form-data; name="file"; filename="example.txt"
Content-Type: text/plain

Contents of the file go here.

--boundary_string--
  • The Content-Type header specifies that the message body is multipart form data and declares the boundary string.
  • Each part of the message is separated by the boundary string.
  • Parts can include text data or binary data, and they are defined by the Content-Disposition header.
  • The name attribute in the Content-Disposition header specifies the name of the field or parameter.
  • For file uploads, the filename attribute is included, along with the Content-Type header specifying the type of file.
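
To make the structure above concrete, here is a minimal Python sketch that assembles a multipart/form-data body by hand. The function encode_multipart is a hypothetical helper, the boundary is a random hex string, and field names and filenames are assumed to need no escaping; real clients usually rely on a library for this.

```python
import uuid


def encode_multipart(fields, files):
    """Return (content_type, body) for a multipart/form-data request.

    fields: dict of name -> text value
    files:  dict of name -> (filename, content_type, data_bytes)
    """
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    lines = []
    for name, value in fields.items():
        lines += [
            delimiter,
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",  # empty line between part headers and part content
            value.encode("utf-8"),
        ]
    for name, (filename, ctype, data) in files.items():
        lines += [
            delimiter,
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"'.encode(),
            f"Content-Type: {ctype}".encode(),
            b"",
            data,
        ]
    # The closing delimiter carries two extra trailing hyphens.
    lines += [delimiter + b"--", b""]
    return f"multipart/form-data; boundary={boundary}", b"\r\n".join(lines)


content_type, body = encode_multipart(
    {"text_data": "This is a text field."},
    {"file": ("example.txt", "text/plain", b"Contents of the file go here.")},
)
```

The returned content_type string is what would go into the request's Content-Type header, so the receiver can find the boundary.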

HTTP Methods

HTTP defines several methods or verbs that indicate the desired action to be performed on a resource. Common methods include:

  • GET: Retrieve data from the server.
  • POST: Submit data to the server to be processed.
  • PUT: Create or replace the resource at the target URI with the request payload.
  • DELETE: Remove a resource on the server.
  • HEAD: Retrieve metadata about a resource without the actual data.
  • OPTIONS: Query the server for the allowed methods on a resource.
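
As a small sketch of how a client selects a method, Python's standard urllib.request.Request accepts a method argument and otherwise defaults to GET, or to POST when a body is supplied. The URL below is a placeholder and no request is actually sent.

```python
from urllib.request import Request

url = "http://www.example.com/items/1"  # placeholder URL, never contacted here

get_req = Request(url)                                    # no body: defaults to GET
post_req = Request(url, data=b"name=value")               # body present: defaults to POST
put_req = Request(url, data=b"name=value", method="PUT")  # explicit method
head_req = Request(url, method="HEAD")
delete_req = Request(url, method="DELETE")
options_req = Request(url, method="OPTIONS")

methods = [
    req.get_method()
    for req in (get_req, post_req, put_req, head_req, delete_req, options_req)
]
```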

HTTP Headers

Headers can be grouped according to their contexts:

  • Request headers contain more information about the resource to be fetched, or about the client requesting the resource.
  • Response headers hold additional information about the response, like its location or about the server providing it.
  • Representation headers contain information about the body of the resource, like its MIME type, or encoding/compression applied.
  • Payload headers contain representation-independent information about payload data, including content length and the encoding used for transport.

Authentication Headers

  • Authorization:
    The HTTP Authorization request header can be used to provide credentials that authenticate a user agent with a server, allowing access to a protected resource.
    Authorization: <auth-scheme> <authorization-parameters>
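
As one example of the scheme-plus-parameters form, the widely used Basic scheme sends base64-encoded username:password as its parameter. The sketch below uses hypothetical helper names and made-up credentials; Basic is only one of several schemes and provides no confidentiality by itself.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an Authorization header for the Basic scheme."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def parse_basic_auth(value: str):
    """Recover (username, password) from a Basic Authorization value."""
    scheme, _, params = value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic credential")
    # Only the first colon separates the username from the password.
    username, _, password = base64.b64decode(params).decode("utf-8").partition(":")
    return username, password


header_value = basic_auth_header("aladdin", "opensesame")  # made-up credentials
```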
    

Caching Headers

  • Age:
    The Age header contains the time in seconds the object was in a proxy cache.
    Age: <delta-seconds>
    
  • Expires:
    The Expires HTTP header contains the date/time after which the response is considered expired.
    Expires: <http-date>
    Expires: Wed, 21 Oct 2015 07:28:00 GMT
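
The http-date format can be handled with Python's standard email.utils module. The following sketch, with the hypothetical helper is_expired, checks a response against its Expires value; it deliberately ignores other caching headers, so it is not a full freshness calculation.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def is_expired(expires_value: str, now: datetime) -> bool:
    """Return True if `now` is later than the date in the Expires header."""
    return now > parsedate_to_datetime(expires_value)


expires = "Wed, 21 Oct 2015 07:28:00 GMT"
before = datetime(2015, 10, 21, 7, 0, 0, tzinfo=timezone.utc)
after = datetime(2015, 10, 21, 8, 0, 0, tzinfo=timezone.utc)

# Formatting a UTC datetime back into the <http-date> form.
round_trip = format_datetime(parsedate_to_datetime(expires), usegmt=True)
```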
    

Content negotiation Headers

  • Accept:
    The Accept request HTTP header indicates which content types, expressed as MIME types, the client is able to understand. The server uses content negotiation to select one of the proposals and informs the client of the choice with the Content-Type response header. Browsers set required values for this header based on the context of the request. For example, a browser uses different values in a request when fetching a CSS stylesheet, image, video, or a script.
    Accept: <MIME_type>/<MIME_subtype>
    Accept: <MIME_type>/*
    Accept: */*
    
    // Multiple types, weighted with the quality value syntax:
    Accept: text/html, application/xhtml+xml, application/xml;q=0.9, image/webp, */*;q=0.8
    
  • Accept-Encoding:
    The Accept-Encoding request HTTP header indicates the content encoding (usually a compression algorithm) that the client can understand. The server uses content negotiation to select one of the proposals and informs the client of that choice with the Content-Encoding response header.
    Accept-Encoding: gzip
    Accept-Encoding: compress
    Accept-Encoding: deflate
    Accept-Encoding: br
    Accept-Encoding: identity
    Accept-Encoding: *
    
    // Multiple algorithms, weighted with the quality value syntax:
    Accept-Encoding: deflate, gzip;q=1.0, *;q=0.5
    
  • Accept-Language:
    The Accept-Language request HTTP header indicates the natural language and locale that the client prefers. The server uses content negotiation to select one of the proposals and informs the client of the choice with the Content-Language response header. Browsers set required values for this header according to their active user interface language. Users rarely change it, and such changes are not recommended because they may lead to fingerprinting.
    Accept-Language: <language>
    Accept-Language: *
    
    // Multiple types, weighted with the quality value syntax:
    Accept-Language: fr-CH, fr;q=0.9, en;q=0.8, de;q=0.7, *;q=0.5
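
The quality value syntax shared by Accept, Accept-Encoding, and Accept-Language can be sketched as follows. The function parse_quality_list is a hypothetical helper: it treats a missing q as 1.0 and orders entries by weight only, which simplifies the full negotiation rules (for instance, it does not rank specific types above wildcards).

```python
def parse_quality_list(header_value: str):
    """Return (item, q) pairs sorted from most to least preferred."""
    items = []
    for part in header_value.split(","):
        token, *params = [piece.strip() for piece in part.split(";")]
        if not token:
            continue
        q = 1.0  # default weight when no q parameter is given
        for param in params:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                q = float(value)
        items.append((token, q))
    # sorted() is stable, so entries with equal q keep their header order.
    return sorted(items, key=lambda item: item[1], reverse=True)


accept = parse_quality_list(
    "text/html, application/xhtml+xml, application/xml;q=0.9, image/webp, */*;q=0.8"
)
languages = parse_quality_list("fr-CH, fr;q=0.9, en;q=0.8, de;q=0.7, *;q=0.5")
```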
    
Cookie Headers

  • Cookie:
    The Cookie HTTP request header contains stored HTTP cookies associated with the server (i.e. previously sent by the server with the Set-Cookie header or set in JavaScript using Document.cookie).
    Cookie: <cookie-list>
    Cookie: name=value
    Cookie: name=value; name2=value2; name3=value3
    
  • Set-Cookie:
    The Set-Cookie HTTP response header is used to send a cookie from the server to the user agent, so that the user agent can send it back to the server later. To send multiple cookies, multiple Set-Cookie headers should be sent in the same response.
    Set-Cookie: <cookie-name>=<cookie-value>
    Set-Cookie: <cookie-name>=<cookie-value>; Domain=<domain-value>
    Set-Cookie: <cookie-name>=<cookie-value>; Expires=<date>
    Set-Cookie: <cookie-name>=<cookie-value>; HttpOnly
    Set-Cookie: <cookie-name>=<cookie-value>; Max-Age=<number>
    Set-Cookie: <cookie-name>=<cookie-value>; Partitioned
    Set-Cookie: <cookie-name>=<cookie-value>; Path=<path-value>
    Set-Cookie: <cookie-name>=<cookie-value>; Secure
    
    Set-Cookie: <cookie-name>=<cookie-value>; SameSite=Strict
    Set-Cookie: <cookie-name>=<cookie-value>; SameSite=Lax
    Set-Cookie: <cookie-name>=<cookie-value>; SameSite=None; Secure
    
    // Multiple attributes are also possible, for example:
    Set-Cookie: <cookie-name>=<cookie-value>; Domain=<domain-value>; Secure; HttpOnly
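
A minimal sketch of both directions, using Python's standard http.cookies module: the server side builds a Set-Cookie line with several attributes, and a Cookie header with several pairs is parsed back into names and values. The cookie name and value are made up, and the samesite attribute assumes Python 3.8 or later.

```python
from http.cookies import SimpleCookie

# Server side: build a Set-Cookie header with several attributes.
jar = SimpleCookie()
jar["session"] = "abc123"            # hypothetical cookie name and value
jar["session"]["path"] = "/"
jar["session"]["secure"] = True
jar["session"]["httponly"] = True
jar["session"]["samesite"] = "Lax"
set_cookie_line = jar.output()       # one "Set-Cookie: ..." line per cookie

# Client side: the Cookie request header carries only name=value pairs.
incoming = SimpleCookie()
incoming.load("name=value; name2=value2; name3=value3")
pairs = {name: morsel.value for name, morsel in incoming.items()}
```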
    

CORS Headers

  • Access-Control-Allow-Origin:
    The Access-Control-Allow-Origin response header indicates whether the response can be shared with requesting code from the given origin.
    Access-Control-Allow-Origin: *
    Access-Control-Allow-Origin: <origin>
    Access-Control-Allow-Origin: null
    
  • Access-Control-Allow-Headers:
    The Access-Control-Allow-Headers response header is used in response to a preflight request which includes the Access-Control-Request-Headers to indicate which HTTP headers can be used during the actual request.
    Access-Control-Allow-Headers: [<header-name>[, <header-name>]*]
    Access-Control-Allow-Headers: *
    
  • Access-Control-Allow-Methods:
    The Access-Control-Allow-Methods response header specifies one or more methods allowed when accessing a resource in response to a preflight request.
    Access-Control-Allow-Methods: <method>, <method>, …
    Access-Control-Allow-Methods: *
    
  • Access-Control-Expose-Headers:
    The Access-Control-Expose-Headers response header allows a server to indicate which response headers should be made available to scripts running in the browser, in response to a cross-origin request.
    Access-Control-Expose-Headers: [<header-name>[, <header-name>]*]
    Access-Control-Expose-Headers: *
    
  • Access-Control-Max-Age:
    The Access-Control-Max-Age response header indicates how long the results of a preflight request (that is the information contained in the Access-Control-Allow-Methods and Access-Control-Allow-Headers headers) can be cached.
    Access-Control-Max-Age: <delta-seconds>
    
  • Access-Control-Request-Headers:
    The Access-Control-Request-Headers request header is used by browsers when issuing a preflight request to let the server know which HTTP headers the client might send when the actual request is made (such as with setRequestHeader()). The complementary server-side header of Access-Control-Allow-Headers will answer this browser-side header.
    Access-Control-Request-Headers: <header-name>, <header-name>, …
    
  • Access-Control-Request-Method:
    The Access-Control-Request-Method request header is used by browsers when issuing a preflight request, to let the server know which HTTP method will be used when the actual request is made. This header is necessary as the preflight request is always an OPTIONS and doesn't use the same method as the actual request.
    Access-Control-Request-Method: <method>
    
  • Origin:
    The Origin request header indicates the origin (scheme, hostname, and port) that caused the request. For example, if a user agent needs to request resources included in a page, or fetched by scripts that it executes, then the origin of the page may be included in the request.
    Origin: null
    Origin: <scheme>://<hostname>
    Origin: <scheme>://<hostname>:<port>
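
The interplay of the request and response headers above during a preflight can be sketched as a single server-side function. Everything configurable here (the allow-lists, the 600-second cache lifetime, and the name preflight_headers) is an assumption for illustration, not a recommended policy.

```python
ALLOWED_ORIGINS = {"https://app.example.com"}        # hypothetical allow-list
ALLOWED_METHODS = {"GET", "POST", "PUT"}
ALLOWED_HEADERS = {"content-type", "authorization"}  # compared in lowercase


def preflight_headers(request_headers: dict) -> dict:
    """Return CORS response headers for an OPTIONS preflight, or {} to refuse."""
    origin = request_headers.get("Origin")
    method = request_headers.get("Access-Control-Request-Method")
    requested = [
        name.strip().lower()
        for name in request_headers.get("Access-Control-Request-Headers", "").split(",")
        if name.strip()
    ]
    if origin not in ALLOWED_ORIGINS or method not in ALLOWED_METHODS:
        return {}
    if any(name not in ALLOWED_HEADERS for name in requested):
        return {}
    return {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": ", ".join(sorted(ALLOWED_METHODS)),
        "Access-Control-Allow-Headers": ", ".join(sorted(ALLOWED_HEADERS)),
        "Access-Control-Max-Age": "600",  # let the browser cache this answer for 600 s
    }


allowed = preflight_headers({
    "Origin": "https://app.example.com",
    "Access-Control-Request-Method": "PUT",
    "Access-Control-Request-Headers": "Content-Type",
})
refused = preflight_headers({
    "Origin": "https://other.example.net",
    "Access-Control-Request-Method": "PUT",
})
```

Echoing the specific origin rather than * is a design choice in this sketch; it keeps the answer tied to the allow-list.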
    

Download Headers

  • Content-Disposition:
    In a regular HTTP response, the Content-Disposition response header is a header indicating if the content is expected to be displayed inline in the browser, that is, as a Web page or as part of a Web page, or as an attachment, that is downloaded and saved locally.
    Content-Disposition: inline
    Content-Disposition: attachment
    Content-Disposition: attachment; filename="filename.jpg"
    

Message body Headers

  • Content-Length:
    The Content-Length header indicates the size of the message body, in bytes, sent to the recipient.
    Content-Length: <length>
    
  • Content-Type:
    The Content-Type representation header is used to indicate the original media type of the resource (prior to any content encoding applied for sending).
    Content-Type: text/html; charset=utf-8
    Content-Type: multipart/form-data; boundary=something
    
  • Content-Encoding:
    The Content-Encoding representation header lists any encodings that have been applied to the representation (message payload), and in what order. This lets the recipient know how to decode the representation in order to obtain the original payload format. Content encoding is mainly used to compress the message data without losing information about the origin media type.
    Content-Encoding: gzip
    Content-Encoding: compress
    Content-Encoding: deflate
    Content-Encoding: br
    
    // Multiple, in the order in which they were applied
    Content-Encoding: deflate, gzip
    
  • Content-Language:
    The Content-Language representation header is used to describe the language(s) intended for the audience, so users can differentiate it according to their own preferred language.
    Content-Language: de-DE
    Content-Language: en-US
    Content-Language: de-DE, en-CA
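
How Content-Encoding, Content-Type, and Content-Length relate can be shown with a short gzip round trip. The helpers encode_body and decode_body are hypothetical, and the sketch only checks whether gzip is listed in Accept-Encoding; it ignores quality values, including q=0.

```python
import gzip


def encode_body(body: bytes, accept_encoding: str):
    """Compress with gzip when the client lists it; otherwise send the body as is."""
    offered = {token.split(";")[0].strip().lower() for token in accept_encoding.split(",")}
    headers = {"Content-Type": "text/html; charset=utf-8"}  # media type before encoding
    if "gzip" in offered:
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    # Content-Length counts the bytes actually sent, after any encoding.
    headers["Content-Length"] = str(len(body))
    return headers, body


def decode_body(headers: dict, body: bytes) -> bytes:
    """Undo the content encoding to recover the original representation."""
    if headers.get("Content-Encoding") == "gzip":
        return gzip.decompress(body)
    return body


original = b"<p>Hello, World!</p>" * 50
headers, wire_body = encode_body(original, "deflate, gzip;q=1.0, *;q=0.5")
plain_headers, plain_body = encode_body(original, "identity")
```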
    

Redirect Headers

  • Location:
    The Location response header indicates the URL to redirect a page to. It is only meaningful when served with a 3xx (redirection) or 201 (created) status response.
    Location: <url>
    

Request Context Headers

  • Referer:
    The Referer HTTP request header contains the absolute or partial address from which a resource has been requested. The Referer header allows a server to identify referring pages that people are visiting from or where requested resources are being used. This data can be used for analytics, logging, optimized caching, and more.
    Referer: <url>
    
  • Referrer-Policy:
    The Referrer-Policy HTTP header controls how much referrer information (sent with the Referer header) should be included with requests.
    Referrer-Policy: no-referrer
    Referrer-Policy: no-referrer-when-downgrade
    Referrer-Policy: origin
    Referrer-Policy: origin-when-cross-origin
    Referrer-Policy: same-origin
    Referrer-Policy: strict-origin
    Referrer-Policy: strict-origin-when-cross-origin
    Referrer-Policy: unsafe-url
    
  • User-Agent:
    The User-Agent request header is a characteristic string that lets servers and network peers identify the application, operating system, vendor, and/or version of the requesting user agent.
    User-Agent: <product> / <product-version> <comment>
    User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X x.y; rv:42.0) Gecko/20100101 Firefox/42.0
    

Range Headers

  • Accept-Ranges:
    The Accept-Ranges HTTP response header is a marker used by the server to advertise its support for partial requests from the client for file downloads. The value of this field indicates the unit that can be used to define a range.
    Accept-Ranges: bytes
    
  • Range:
    The Range HTTP request header indicates the parts of a resource that the server should return. Several parts can be requested at the same time in one Range header, and the server may send back these ranges in a multipart document. If the server sends back ranges, it uses the 206 Partial Content status code for the response. If the ranges are invalid, the server returns the 416 Range Not Satisfiable error.
    Range: <unit>=<range-start>-
    Range: <unit>=<range-start>-<range-end>
    Range: <unit>=<range-start>-<range-end>, <range-start>-<range-end>
    Range: <unit>=<range-start>-<range-end>, <range-start>-<range-end>, <range-start>-<range-end>
    Range: <unit>=-<suffix-length>
    
  • Content-Range:
    The Content-Range response HTTP header indicates where in a full body message a partial message belongs.
    Content-Range: bytes <range-start>-<range-end>/<size>
    Content-Range: bytes <range-start>-<range-end>/*
    Content-Range: bytes */<size>
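
The three single-range forms of the Range header, and the matching Content-Range value, can be sketched as below. The function names are hypothetical, only one bytes range is handled, and a range whose end precedes its start is simply treated as unsatisfiable here, which is a simplification.

```python
def resolve_byte_range(range_header: str, size: int):
    """Resolve a single-range `bytes=` header to inclusive (start, end) offsets.

    Returns None when the range cannot be satisfied (a 416 response).
    """
    unit, _, spec = range_header.partition("=")
    if unit.strip() != "bytes" or "," in spec:
        raise ValueError("only a single bytes range is handled in this sketch")
    first, _, last = spec.strip().partition("-")
    if first == "":                        # suffix form: bytes=-<suffix-length>
        length = int(last)
        if length == 0 or size == 0:
            return None
        return max(size - length, 0), size - 1
    start = int(first)
    end = int(last) if last else size - 1  # open-ended form: bytes=<range-start>-
    if start >= size or end < start:
        return None
    return start, min(end, size - 1)       # clamp to the last available byte


def content_range(start: int, end: int, size: int) -> str:
    """Format the Content-Range value for a 206 Partial Content response."""
    return f"bytes {start}-{end}/{size}"
```

For a 1000-byte resource, bytes=-200 resolves to offsets 800 through 999, the final 200 bytes.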
    

HTTP Response Status Codes

HTTP response status codes indicate whether a specific HTTP request has been successfully completed. Responses are grouped in five classes:

  • Informational responses (100–199)
  • Successful responses (200–299)
  • Redirection messages (300–399)
  • Client error responses (400–499)
  • Server error responses (500–599)
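
Because the class is determined by the first digit, it can be derived with integer division. The sketch below pairs that with the reason phrases from Python's standard http.HTTPStatus enum; the names CLASSES, status_class, and describe are hypothetical, and describe only works for codes that HTTPStatus knows.

```python
from http import HTTPStatus

CLASSES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def status_class(code: int) -> str:
    """Map a status code to its class using the hundreds digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not an HTTP status code: {code}")
    return CLASSES[code // 100]


def describe(code: int) -> str:
    """Combine the code, its standard reason phrase, and its class."""
    return f"{code} {HTTPStatus(code).phrase} ({status_class(code)})"
```

For example, describe(404) returns "404 Not Found (Client error)".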

Informational responses

  • 100 Continue: This interim response indicates that the client should continue the request or ignore the response if the request is already finished.
  • 101 Switching Protocols: This code is sent in response to an Upgrade request header from the client and indicates the protocol the server is switching to.
  • 102 Processing (WebDAV): This code indicates that the server has received and is processing the request, but no response is available yet.

Successful responses

  • 200 OK: The request succeeded.
  • 201 Created: The request succeeded, and a new resource was created as a result. This is typically the response sent after POST requests, or some PUT requests.
  • 202 Accepted: The request has been received but not yet acted upon. It is noncommittal, since there is no way in HTTP to later send an asynchronous response indicating the outcome of the request. It is intended for cases where another process or server handles the request, or for batch processing.
  • 204 No Content: There is no content to send for this request, but the headers may be useful. The user agent may update its cached headers for this resource with the new ones.
  • 205 Reset Content: Tells the user agent to reset the document which sent this request.
  • 206 Partial Content: This response code is used when the Range header is sent from the client to request only part of a resource.
  • 207 Multi-Status (WebDAV): Conveys information about multiple resources, for situations where multiple status codes might be appropriate.
  • 208 Already Reported (WebDAV): Used inside a <dav:propstat> response element to avoid repeatedly enumerating the internal members of multiple bindings to the same collection.

Redirection messages

  • 300 Multiple Choices: The request has more than one possible response. The user agent or user should choose one of them. (There is no standardized way of choosing one of the responses, but HTML links to the possibilities are recommended so the user can pick.)
  • 301 Moved Permanently: The URL of the requested resource has been changed permanently. The new URL is given in the response.
  • 302 Found: This response code means that the URI of requested resource has been changed temporarily. Further changes in the URI might be made in the future. Therefore, this same URI should be used by the client in future requests.
  • 303 See Other: The server sent this response to direct the client to get the requested resource at another URI with a GET request.
  • 304 Not Modified: This is used for caching purposes. It tells the client that the response has not been modified, so the client can continue to use the same cached version of the response.
  • 307 Temporary Redirect: The server sends this response to direct the client to get the requested resource at another URI with the same method that was used in the prior request. This has the same semantics as the 302 Found HTTP response code, with the exception that the user agent must not change the HTTP method used: if a POST was used in the first request, a POST must be used in the second request.
  • 308 Permanent Redirect: This means that the resource is now permanently located at another URI, specified by the Location HTTP Response header. This has the same semantics as the 301 Moved Permanently HTTP response code, with the exception that the user agent must not change the HTTP method used: if a POST was used in the first request, a POST must be used in the second request.
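
The difference between these codes is mostly about which method the follow-up request uses. The sketch below encodes the rules stated above for 303, 307, and 308; for 301 and 302 it assumes the common user-agent behavior of turning a POST into a GET, and keeping HEAD for a 303 is also an assumption. The function name is hypothetical.

```python
def next_request_method(status: int, method: str) -> str:
    """Pick the method for the follow-up request after a redirect response."""
    if status in (307, 308):
        return method  # the user agent must not change the method
    if status == 303:
        # Fetch the other URI with GET (HEAD is kept as HEAD in this sketch).
        return "HEAD" if method == "HEAD" else "GET"
    if status in (301, 302):
        # Assumption: mirror the common habit of rewriting POST to GET.
        return "GET" if method == "POST" else method
    raise ValueError(f"{status} is not a redirect handled by this sketch")
```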

Client error responses

  • 400 Bad Request: The server cannot or will not process the request due to something that is perceived to be a client error (e.g., malformed request syntax, invalid request message framing, or deceptive request routing).
  • 401 Unauthorized: Although the HTTP standard specifies "unauthorized", semantically this response means "unauthenticated". That is, the client must authenticate itself to get the requested response.
  • 403 Forbidden: The client does not have access rights to the content; that is, it is unauthorized, so the server is refusing to give the requested resource. Unlike 401 Unauthorized, the client's identity is known to the server.
  • 404 Not Found: The server cannot find the requested resource. In the browser, this means the URL is not recognized. In an API, this can also mean that the endpoint is valid but the resource itself does not exist. Servers may also send this response instead of 403 Forbidden to hide the existence of a resource from an unauthorized client. This response code is probably the most well known due to its frequent occurrence on the web.
  • 405 Method Not Allowed: The request method is known by the server but is not supported by the target resource. For example, an API may not allow calling DELETE to remove a resource.
  • 406 Not Acceptable: This response is sent when the web server, after performing server-driven content negotiation, doesn't find any content that conforms to the criteria given by the user agent.
  • 408 Request Timeout: This response is sent on an idle connection by some servers, even without any previous request by the client. It means that the server would like to shut down this unused connection.
  • 409 Conflict: This response is sent when a request conflicts with the current state of the server.
  • 410 Gone: This response is sent when the requested content has been permanently deleted from the server, with no forwarding address. Clients are expected to remove their caches and links to the resource. The HTTP specification intends this status code to be used for "limited-time, promotional services". APIs should not feel compelled to indicate resources that have been deleted with this status code.
  • 411 Length Required: The server rejected the request because the Content-Length header field is not defined and the server requires it.
  • 412 Precondition Failed: The client has indicated preconditions in its headers which the server does not meet.
  • 413 Payload Too Large: The request entity is larger than the limits defined by the server. The server might close the connection or return a Retry-After header field.
  • 414 URI Too Long: The URI requested by the client is longer than the server is willing to interpret.
  • 415 Unsupported Media Type: The media format of the requested data is not supported by the server, so the server is rejecting the request.
  • 416 Range Not Satisfiable: The range specified by the Range header field in the request cannot be fulfilled. It's possible that the range is outside the size of the target URI's data.
  • 417 Expectation Failed: This response code means the expectation indicated by the Expect request header field cannot be met by the server.
  • 418 I'm a teapot: The server refuses the attempt to brew coffee with a teapot.
  • 421 Misdirected Request: The request was directed at a server that is not able to produce a response. This can be sent by a server that is not configured to produce responses for the combination of scheme and authority that are included in the request URI.
  • 422 Unprocessable Content (WebDAV): The request was well-formed but was unable to be followed due to semantic errors.
  • 423 Locked (WebDAV): The resource that is being accessed is locked.
  • 424 Failed Dependency (WebDAV): The request failed due to failure of a previous request.
  • 425 Too Early: Indicates that the server is unwilling to risk processing a request that might be replayed.
  • 426 Upgrade Required: The server refuses to perform the request using the current protocol but might be willing to do so after the client upgrades to a different protocol. The server sends an Upgrade header in a 426 response to indicate the required protocol(s).
  • 428 Precondition Required: The origin server requires the request to be conditional. This response is intended to prevent the 'lost update' problem, where a client GETs a resource's state, modifies it and PUTs it back to the server, when meanwhile a third party has modified the state on the server, leading to a conflict.
  • 429 Too Many Requests: The user has sent too many requests in a given amount of time ("rate limiting").
  • 431 Request Header Fields Too Large: The server is unwilling to process the request because its header fields are too large. The request may be resubmitted after reducing the size of the request header fields.
  • 451 Unavailable For Legal Reasons: The user agent requested a resource that cannot legally be provided, such as a web page censored by a government.

Server error responses

  • 500 Internal Server Error: The server has encountered a situation it does not know how to handle.
  • 501 Not Implemented: The request method is not supported by the server and cannot be handled. The only methods that servers are required to support (and therefore that must not return this code) are GET and HEAD.
  • 502 Bad Gateway: This error response means that the server, while working as a gateway to get a response needed to handle the request, got an invalid response.
  • 503 Service Unavailable: The server is not ready to handle the request. Common causes are a server that is down for maintenance or that is overloaded. A user-friendly page explaining the problem should be sent together with this response. This response should be used for temporary conditions, and the Retry-After HTTP header should, if possible, contain the estimated time before the recovery of the service. The webmaster should also take care with the caching-related headers sent along with this response, as these temporary-condition responses should usually not be cached.
  • 504 Gateway Timeout: This error response is given when the server is acting as a gateway and cannot get a response in time.
  • 505 HTTP Version Not Supported: The HTTP version used in the request is not supported by the server.
  • 506 Variant Also Negotiates: The server has an internal configuration error: the chosen variant resource is configured to engage in transparent content negotiation itself, and is therefore not a proper end point in the negotiation process.
  • 507 Insufficient Storage (WebDAV): The method could not be performed on the resource because the server is unable to store the representation needed to successfully complete the request.
  • 508 Loop Detected (WebDAV): The server detected an infinite loop while processing the request.
  • 510 Not Extended: Further extensions to the request are required for the server to fulfill it.
  • 511 Network Authentication Required: Indicates that the client needs to authenticate to gain network access.