Uniform Resource Locator


(URL, previously "Universal") A standard way of specifying the location of an object, typically a web page, on the Internet. Other types of object are described below. URLs are the form of address used on the World-Wide Web. They are used in HTML documents to specify the target of a hypertext link, which is often another HTML document (possibly stored on another computer).

Here are some example URLs:


The part before the first colon specifies the access scheme or protocol. Commonly implemented schemes include ftp, http (web), gopher and WAIS. The "file" scheme should only be used to refer to a file on the same host. Other, less commonly used schemes include news, telnet and mailto (e-mail).

The part after the colon is interpreted according to the access scheme. In general, two slashes after the colon introduce a hostname (host:port is also valid, or for FTP user:passwd@host or user@host). The port number is usually omitted and defaults to the standard port for the scheme, e.g. port 80 for HTTP.
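The split into scheme, host, port and optional user details can be sketched with Python's standard urllib.parse module. This is only an illustration: the example.org URLs, the describe function and the small DEFAULT_PORTS table are assumptions made for this sketch, not part of any standard API.

```python
from urllib.parse import urlsplit

# Illustrative table of well-known default ports; deliberately not exhaustive.
DEFAULT_PORTS = {"http": 80, "ftp": 21, "gopher": 70, "telnet": 23}


def describe(url):
    """Break a URL into the parts described above (hypothetical helper)."""
    parts = urlsplit(url)
    # If no port was written in the URL, fall back to the scheme's default.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": port,
        "path": parts.path,
    }


# Hypothetical URLs used purely as examples.
print(describe("http://www.example.org/docs/index.html"))
print(describe("ftp://anonymous:guest@ftp.example.org:2121/pub/README"))
```

The first call reports port 80 because none was given; the second keeps the explicit port 2121 and exposes the user name and password that preceded the "@".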

For an HTTP or FTP URL the next part is a pathname which is usually related to the pathname of a file on the server. The file can contain any type of data but only certain types are interpreted directly by most browsers. These include HTML and images in GIF or JPEG format. The file's type is given by a MIME type in the HTTP headers returned by the server, e.g. "text/html", "image/gif", and is usually also indicated by its filename extension. A file whose type is not recognised directly by the browser may be passed to an external "viewer" application, e.g. a sound player.
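As a rough sketch of how a filename extension hints at a MIME type, Python's standard mimetypes module can guess a type from a pathname. The INLINE_TYPES set and the handling_for function are hypothetical names invented here, and a real browser would rely on the type in the server's HTTP headers rather than on the extension alone.

```python
import mimetypes

# Types the text above says most browsers interpret directly (assumed set).
INLINE_TYPES = {"text/html", "image/gif", "image/jpeg"}


def handling_for(path):
    """Guess a MIME type from the extension and say how it might be handled."""
    mime_type, _encoding = mimetypes.guess_type(path)
    if mime_type in INLINE_TYPES:
        return mime_type, "display inline"
    # Unknown or unsupported types would go to an external viewer.
    return mime_type, "pass to external viewer"


# Hypothetical pathnames used purely as examples.
for path in ("/docs/index.html", "/images/logo.gif", "/sounds/chime.wav"):
    print(path, handling_for(path))
```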

The last (optional) part of the URL may be a query string preceded by "?" or a "fragment identifier" preceded by "#". The latter indicates a particular position within the specified document.
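A minimal sketch of separating the query string and fragment identifier, again using Python's urllib.parse. The URL and the parameter names "term" and "lang" are made up for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL carrying both a query string and a fragment identifier.
parts = urlsplit("http://www.example.org/search?term=url&lang=en#results")

# The text after "?" and before "#".
print(parts.query)
# The text after "#", naming a position within the document.
print(parts.fragment)

# parse_qs splits the query into names and lists of values.
query_params = parse_qs(parts.query)
print(query_params)
```

Note that the fragment is interpreted by the client after the document is retrieved, so it is kept apart from the path and query here.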

Only alphanumerics, reserved characters (:/?#"<>%+) used for their reserved purposes and "$", "-", "_", ".", "&", "+" are safe and may be transmitted unencoded. Other characters are encoded as a "%" followed by two hexadecimal digits. Space may also be encoded as "+". Standard SGML "&<name>;" character entity encodings (e.g. "&eacute;") are also accepted when URLs are embedded in HTML. The terminating semicolon may be omitted if &<name> is followed by a non-letter character.
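The "%" plus two hexadecimal digits encoding can be sketched as follows. The percent_encode_char helper is hypothetical and assumes UTF-8 bytes; the library functions quote, quote_plus, unquote and unquote_plus are from Python's urllib.parse, whose set of safe characters follows later standards and so may differ slightly from the list given above.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus


def percent_encode_char(ch):
    """Encode one character as "%XX" per byte (assumes UTF-8; illustrative only)."""
    return "".join("%{:02X}".format(byte) for byte in ch.encode("utf-8"))


# A space is byte 0x20, so it becomes "%20".
print(percent_encode_char(" "))

# Library equivalents: quote uses "%20" for a space, quote_plus uses "+".
print(quote("my file.html"))
print(quote_plus("fish and chips"))

# Decoding reverses the process.
print(unquote("my%20file.html"))
print(unquote_plus("fish+and+chips"))
```

The "+" form for spaces is conventionally used in query strings, which is why the library keeps it in separate quote_plus and unquote_plus functions.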

