A computer network is created by connecting cables between two or more computers and allowing them to send data back and forth through these cables. When machines all over the planet are connected this way, shooting bits at each other, the resulting network is called the Internet.
For communication to be effective, the computers on both ends must know what the bits are supposed to represent. A network protocol describes a style of communication over a network. There are protocols for sending and fetching email, for sharing files, and even for controlling computers that happen to be infected by malicious software.
All Internet-connected devices “speak” the Transmission Control Protocol (TCP), and most communication on the Internet is built on top of it.
A TCP connection works as follows:
- One computer must be waiting, or listening, for other computers to start talking to it.
- To be able to listen for different kinds of communication at the same time on a single machine, each listener has a number (called a port) associated with it.
- Most protocols specify which port should be used by default. For example, when we want to send an email using the SMTP protocol, the machine through which we send it is expected to be listening on port 25.
- Another computer can then establish a connection by connecting to the target machine using the correct port number.
- If the target machine can be reached and is listening on that port, the connection is successfully created.
- The listening computer is called the server, and the connecting computer is called the client.
- Such a connection acts as a two-way pipe through which bits can flow—the machines on both ends can put data into it.
- Once the bits are successfully transmitted, they can be read out again by the machine on the other side.
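The steps above can be sketched with Python's standard `socket` module. This is only an illustration under a few assumptions: both ends run on the same machine (127.0.0.1), the port is chosen by the operating system rather than fixed by a protocol, and the function names `run_echo_server`, `talk_to_server` and `demo` are invented for the example.

```python
import socket
import threading


def run_echo_server(listener: socket.socket) -> None:
    """Server side: accept one client and send its data back upper-cased."""
    conn, _addr = listener.accept()      # blocks until a client connects
    with conn:
        data = conn.recv(1024)           # read bytes out of the pipe
        conn.sendall(data.upper())       # write bytes back into the same pipe


def talk_to_server(port: int, message: bytes) -> bytes:
    """Client side: connect to the given port, send a message, read the reply."""
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(message)
        return client.recv(1024)


def demo() -> bytes:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        # Port 0 asks the operating system for any free port. This is an
        # assumption made so the sketch runs anywhere; a real service would
        # listen on the port its protocol specifies.
        listener.bind(("127.0.0.1", 0))
        listener.listen()
        port = listener.getsockname()[1]

        server = threading.Thread(target=run_echo_server, args=(listener,))
        server.start()
        reply = talk_to_server(port, b"hello")
        server.join()
        return reply
```

Calling `demo()` plays both roles in one process: the thread is the listening server, and the main thread is the client that connects to it.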
The World Wide Web (not to be confused with the Internet as a whole) is a set of protocols and formats that allow us to visit web pages in a browser.
- To become part of the Web, connect a machine to the Internet and have it listen on port 80 with the HTTP protocol, so that other computers can ask it for documents.
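As a rough sketch of a machine that answers requests for documents, the following uses Python's built-in `http.server` and `http.client` modules. It makes some simplifying assumptions: the server listens on a free local port instead of port 80 (which usually needs administrator rights), it serves a single hard-coded document, and the names `DocumentHandler` and `fetch_once` are hypothetical.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class DocumentHandler(BaseHTTPRequestHandler):
    """Answer every GET request with the same small HTML document."""

    def do_GET(self) -> None:
        body = b"<h1>Hello from a tiny web server</h1>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args) -> None:
        pass  # keep the demo quiet instead of logging each request


def fetch_once() -> bytes:
    # Port 0 means "any free port"; a public web server would use port 80.
    server = HTTPServer(("127.0.0.1", 0), DocumentHandler)
    port = server.server_address[1]
    thread = threading.Thread(target=server.handle_request, daemon=True)
    thread.start()                       # serve exactly one request
    try:
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/index.html")
        body = conn.getresponse().read()
        conn.close()
        return body
    finally:
        thread.join(timeout=5)
        server.server_close()
```

`fetch_once()` starts the server, asks it for a document as another computer would, and returns the bytes it received.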
Machines connected to the Internet get an IP address, which is a number that can be used to send messages to that machine, and looks something like 18.104.22.168 or 2001:4860:4860::8888.
You can register a domain name to point at the IP address of a machine you control.
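To make the "an IP address is a number" point concrete, here is a small sketch using Python's standard `ipaddress` module on the two example addresses above. The helper name `address_info` is made up for this illustration.

```python
import ipaddress


def address_info(text: str) -> tuple:
    """Return (IP version, address as a plain integer) for a textual address."""
    ip = ipaddress.ip_address(text)      # accepts both IPv4 and IPv6 notation
    return ip.version, int(ip)


# The dotted and colon-separated forms are just readable spellings of a number.
v4_version, v4_number = address_info("18.104.22.168")
v6_version, v6_number = address_info("2001:4860:4860::8888")
```

The first address is parsed as IPv4 (a 32-bit number) and the second as IPv6 (a 128-bit number).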
Each document on the Web is named by a Uniform Resource Locator (URL), which is made up of three parts:
| Protocol | Server (the domain name where the document is located) | Path (of the requested document) |
If you type such a URL into your browser’s address bar, the browser will try to retrieve and display the document at that URL.
- First, the browser finds out which IP address the domain name in the URL refers to.
- Then, using the HTTP protocol, it makes a connection to the server at that address and asks for the resource named by the path, for example /13_browser.html.
- If all goes well, the server sends back a document, which your browser then displays on your screen.
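The way a URL breaks down into its parts can be illustrated with `urllib.parse.urlsplit` from Python's standard library. The host `example.com` below is a placeholder chosen for the example, not a server named in these notes, and `split_url` is a hypothetical helper.

```python
from urllib.parse import urlsplit


def split_url(url: str) -> tuple:
    """Return the (protocol, server, path) parts of a URL."""
    parts = urlsplit(url)
    return parts.scheme, parts.hostname or "", parts.path


# example.com is a placeholder domain used only for illustration.
protocol, server, path = split_url("http://example.com/13_browser.html")
```

A browser would next look up the address of `server`, connect to it using `protocol`, and ask for `path`.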
The Hypertext Transfer Protocol (HTTP) is a mechanism for retrieving named resources (chunks of information, such as web pages or pictures).
- It specifies that the side making the request starts by naming the resource and the version of the protocol that it is trying to use: GET /index.html HTTP/1.1
- HTTP treats the network as a streamlike device into which you can put bits and have them arrive at the correct destination in the correct order.
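A minimal sketch of what such a request looks like as bytes on the wire, assuming HTTP/1.1 and a placeholder host name. The helpers `build_get_request` and `parse_request_line` are invented for this example and cover only the simplest case.

```python
def build_get_request(path: str, host: str) -> bytes:
    """Build the bytes a client would write into the TCP connection."""
    lines = [
        f"GET {path} HTTP/1.1",   # method, resource name, protocol version
        f"Host: {host}",          # HTTP/1.1 requires a Host header
        "Connection: close",      # ask the server to close after replying
        "",                       # an empty line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_request_line(line: str) -> tuple:
    """Split a request line into (method, resource, version)."""
    method, resource, version = line.split(" ")
    return method, resource, version


# example.com is a placeholder host used only for illustration.
request = build_get_request("/index.html", "example.com")
```

Because TCP delivers the bytes in order, the server can read the first line, split it as `parse_request_line` does, and know which resource is being requested.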
Isolating a programming environment, so that the code running in it cannot affect the rest of the machine, is called sandboxing; the idea is that the program is harmlessly playing in a sandbox.
When you open a web page in your browser, the browser retrieves the page’s HTML text and parses it.
- The browser builds up a Document Object Model (DOM) of the document’s structure and uses this model to draw the page on the screen.
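The idea of parsing HTML text into a tree can be sketched with Python's standard `html.parser` module. This is a toy model, not how a browser builds its DOM: it assumes well-formed HTML in which every opening tag has a matching closing tag, ignores attributes, and uses invented names (`TreeBuilder`, `parse_html`, `outline`).

```python
from html.parser import HTMLParser


class TreeBuilder(HTMLParser):
    """Build a nested tree of dicts from well-formed HTML."""

    def __init__(self) -> None:
        super().__init__()
        self.root = {"tag": "#document", "children": []}
        self._stack = [self.root]        # path from the root to the open element

    def handle_starttag(self, tag, attrs):
        node = {"tag": tag, "children": []}
        self._stack[-1]["children"].append(node)
        self._stack.append(node)         # following content goes inside this node

    def handle_endtag(self, tag):
        if len(self._stack) > 1:
            self._stack.pop()            # back to the parent element

    def handle_data(self, data):
        text = data.strip()
        if text:                         # skip whitespace-only text
            self._stack[-1]["children"].append(text)


def parse_html(html: str) -> dict:
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def outline(node, depth: int = 0) -> list:
    """Return one indented line per node, a crude stand-in for drawing."""
    if isinstance(node, str):
        return ["  " * depth + repr(node)]
    lines = ["  " * depth + node["tag"]]
    for child in node["children"]:
        lines.extend(outline(child, depth + 1))
    return lines


tree = parse_html("<html><body><h1>Title</h1><p>Hello</p></body></html>")
```

Printing `"\n".join(outline(tree))` shows the nesting of the document, which is the kind of structure the browser walks when it draws the page.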