In this chapter, we study the conceptual and implementation aspects of network applications.
We begin by defining key application-layer concepts, including network services required by applications, clients and servers, processes, and transport-layer interfaces.
We examine several network applications in detail, including the Web, e-mail, DNS, peer-to-peer (P2P) file distribution, and video streaming.
We then cover network application development, over both TCP and UDP.
In particular, we study the socket interface and walk through some simple client-server applications in Python.
We also provide several fun and interesting socket programming assignments at the end of the chapter.

2.1 Principles of Network Applications

At the core of network application development is writing programs that run on different end systems and communicate with each other over the network.
Servers often (but certainly not always) are housed in a data center.

When developing a network application, you need to write software that will run on multiple end systems, but you do not need to write software that runs on network-core devices, such as routers or link-layer switches.
Network-core devices do not function at the application layer but instead function at lower layers, specifically at the network layer and below.
This basic design, namely confining application software to the end systems, has facilitated the rapid development and deployment of a vast array of network applications.

2.1.1 Network Application Architectures

  • client-server architecture

    • server(服务器)
      an always-on host, which services requests from many other hosts, called clients(客户端).
  • P2P architecture
    In a P2P architecture, there is minimal (or no) reliance on dedicated servers in data centers.
    Instead, the application exploits direct communication between pairs of intermittently connected hosts, called peers(对等节点).
    One of the most compelling features of P2P architectures is their self-scalability.

2.1.2 Processes Communicating

  • process(进程)
    A process can be thought of as a program that is running within an end system.
    In this book, we are not particularly interested in how processes in the same host communicate, but instead in how processes running on different hosts (with potentially different operating systems) communicate.

Client and Server Processes

A network application consists of pairs of processes that send messages to each other over a network.
For each pair of communicating processes, we typically label one of the two processes as the client and the other process as the server.
In some applications, a process can act as both a client and a server; for example, a process in a P2P file-sharing system can both upload and download files.

In the context of a communication session between a pair of processes, the process that initiates the communication (that is, initially contacts the other process at the beginning of the session) is labeled as the client. The process that waits to be contacted to begin the session is the server.

The Interface Between the Process and the Computer Network

Any message sent from one process to another must go through the underlying network.
A process sends messages into, and receives messages from, the network through a software interface called a socket(套接字).
A socket is the interface between the application layer and the transport layer within a host, and it’s also referred to as the Application Programming Interface (API) between the application and the network, since the socket is the programming interface with which network applications are built.

The application developer has control of everything on the application-layer side of the socket but has little control of the transport-layer side. The only control the developer has on the transport-layer side is:

  • the choice of transport protocol
  • perhaps the ability to fix a few transport-layer parameters
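
As a rough illustration of these two points, the sketch below uses Python's standard `socket` module. The helper name `open_sockets` and the buffer size 65536 are assumptions made for this example only; the operating system is free to adjust the requested buffer size.

```python
import socket

def open_sockets():
    """Create one TCP socket and one UDP socket (hypothetical helper)."""
    # Choice of transport protocol: SOCK_STREAM selects TCP, SOCK_DGRAM selects UDP.
    tcp_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    udp_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # One of the few tunable parameters: request a receive buffer size.
    # 65536 is an arbitrary illustrative value; the OS may round or cap it.
    udp_sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 65536)
    return tcp_sock, udp_sock

tcp_sock, udp_sock = open_sockets()
print("TCP socket type:", tcp_sock.type)
print("UDP socket type:", udp_sock.type)
print("UDP receive buffer granted:",
      udp_sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))
tcp_sock.close()
udp_sock.close()
```

Everything else about how the bytes are carried is decided by the transport layer and the operating system, not by the application.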

Addressing Processes

To identify the receiving process, two pieces of information need to be specified:
1. the address of the host
The host is identified by its IP address. For now, it is enough to know that an IP address is a 32-bit quantity that we can think of as uniquely identifying the host.
2. an identifier that specifies the receiving process in the destination host (more specifically, the receiving socket)
This information is needed because in general a host could be running many network applications.
A destination port number serves this purpose.
Popular applications have been assigned specific port numbers.
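
The two pieces of information above can be sketched in Python as an (IP address, port number) pair. The dictionary `WELL_KNOWN_PORTS` and both helper functions are hypothetical names introduced for this illustration; the port numbers listed (80 for HTTP, 25 for SMTP, 53 for DNS) are the standard assignments.

```python
import ipaddress

# A few popular applications and their assigned port numbers.
WELL_KNOWN_PORTS = {"HTTP": 80, "SMTP": 25, "DNS": 53}

def as_32_bit_integer(ip_text):
    """Show that a dotted-decimal IPv4 address is just a 32-bit number."""
    return int(ipaddress.IPv4Address(ip_text))

def socket_address(ip_text, application):
    """Combine the host address and the application's port number."""
    ip = ipaddress.IPv4Address(ip_text)  # raises ValueError if malformed
    return (str(ip), WELL_KNOWN_PORTS[application])

print(as_32_bit_integer("127.0.0.1"))
print(socket_address("127.0.0.1", "HTTP"))
```

The pair returned by `socket_address` has the same shape that Python's socket calls such as `connect` and `bind` expect for IPv4.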

2.1.3 Transport Services Available to Applications

What are the services that a transport-layer protocol can offer to applications invoking it? We can broadly classify the possible services along four dimensions: reliable data transfer, throughput, timing, and security.

Reliable Data Transfer

Throughput(吞吐量)

In the context of a communication session between two processes along a network path, throughput is the rate at which the sending process can deliver bits to the receiving process.
The available throughput can fluctuate with time.
These observations lead to another natural service that a transport-layer protocol could provide, namely, guaranteed available throughput at some specified rate.
Applications that have throughput requirements are said to be bandwidth-sensitive applications.

Timing

A transport-layer protocol can also provide timing guarantees.

Security

A transport protocol can provide an application with one or more security services.

2.1.4 Transport Services Provided by the Internet

The Internet (and, more generally, TCP/IP networks) makes two transport protocols available to applications: UDP and TCP.

TCP Services

The TCP service model includes a connection-oriented service and a reliable data transfer service.

  • connection-oriented service
    • handshaking procedure
      TCP has the client and server exchange transport-layer control information with each other before the application-level messages begin to flow, allowing them to prepare for an onslaught of packets.
      The connection is a full-duplex connection in that the two processes can send messages to each other over the connection at the same time.
      When the application finishes sending messages, it must tear down the connection.
  • reliable data transfer service
    The communicating processes can rely on TCP to deliver all data sent without error and in the proper order.
    Separately from reliable data transfer, TCP also includes a congestion-control mechanism that throttles a sending process (client or server) when the network is congested between sender and receiver.
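
The TCP service model can be tried out with a minimal sketch that runs both processes on one machine over the loopback address. The function names `serve_one_client` and `tcp_round_trip` are hypothetical, and the uppercase echo is only a stand-in for a real application.

```python
import socket
import threading

def serve_one_client(listener):
    """Server role: wait to be contacted, then echo data back in uppercase."""
    conn, _client_addr = listener.accept()  # returns once a connection is established
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:                    # empty read: the client finished sending
                break
            conn.sendall(data.upper())

def tcp_round_trip(message):
    """Client role: initiate the connection, send a message, read the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))     # port 0: let the OS pick a free port
        listener.listen(1)
        port = listener.getsockname()[1]
        server = threading.Thread(target=serve_one_client, args=(listener,))
        server.start()

        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            client.connect(("127.0.0.1", port))  # triggers the TCP handshake
            client.sendall(message)
            client.shutdown(socket.SHUT_WR)      # done sending; can still receive
            chunks = []
            while True:
                data = client.recv(1024)
                if not data:                     # server closed its side
                    break
                chunks.append(data)
        server.join()
    return b"".join(chunks)

print(tcp_round_trip(b"hello over tcp"))
```

Data flows in both directions over the same connection, and leaving each `with` block closes the socket, which tears the connection down.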

UDP Services

UDP is a no-frills, lightweight transport protocol, providing minimal services.
UDP is connectionless, so there is no handshaking before the two processes start to communicate.
UDP does not include a congestion-control mechanism, so the sending side of UDP can pump data into the layer below (the network layer) at any rate it pleases.
Note, however, that the actual end-to-end throughput may be less than this rate due to the limited transmission capacity of intervening links or due to congestion.
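
For contrast with TCP, here is a minimal UDP sketch over the loopback address. The function name `udp_round_trip` and the 2-second timeout are assumptions for this example. The timeout is there because UDP itself gives no delivery guarantee, even though loss on the loopback interface is unlikely in practice.

```python
import socket

def udp_round_trip(message):
    """Send one datagram to a local receiver without any handshake."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
        receiver.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        receiver.settimeout(2.0)          # give up if the datagram never arrives
        destination = receiver.getsockname()

        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
            # No connect() and no accept(): the datagram is simply sent.
            sender.sendto(message, destination)
            data, sender_addr = receiver.recvfrom(2048)
    return data, sender_addr

data, sender_addr = udp_round_trip(b"hello over udp")
print(data, "from", sender_addr)
```

Each `sendto` call names its destination explicitly, since there is no connection to remember it.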

TLS

Neither TCP nor UDP provides any encryption—the data that the sending process passes into its socket is the same data that travels over the network to the destination process.
TCP-enhanced-with-TLS not only does everything that traditional TCP does but also provides critical process-to-process security services, including encryption, data integrity, and end-point authentication.
TLS (Transport Layer Security) is not a third Internet transport protocol, on the same level as TCP and UDP, but instead is an enhancement of TCP, with the enhancements being implemented in the application layer.
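
Because TLS is implemented in the application layer, an application typically gets it from a library that wraps an ordinary TCP socket. The sketch below uses Python's standard `ssl` module. The helper names `make_tls_context` and `open_tls_connection` are hypothetical, and the default port 443 is an assumption (it is the port commonly used for HTTP over TLS). The connection function is only defined here, not called, since it needs a reachable TLS server.

```python
import socket
import ssl

def make_tls_context():
    """Build a client-side TLS context with certificate checks enabled."""
    return ssl.create_default_context()

def open_tls_connection(host, port=443):
    """Open a plain TCP connection, then layer TLS on top of it."""
    context = make_tls_context()
    tcp_sock = socket.create_connection((host, port), timeout=5)
    # wrap_socket runs the TLS handshake over the existing TCP connection
    # and returns a socket whose send/recv calls encrypt and decrypt data.
    return context.wrap_socket(tcp_sock, server_hostname=host)

context = make_tls_context()
print("hostname checking enabled:", context.check_hostname)
print("certificate required:", context.verify_mode == ssl.CERT_REQUIRED)
```

The application still reads and writes through a socket-like object; underneath, the bytes that TCP carries are encrypted.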

Services Not Provided by Internet Transport Protocols

We have organized transport protocol services along four dimensions: reliable data transfer, throughput, timing, and security.
Today’s Internet can often provide satisfactory service to time-sensitive applications, but it cannot provide any timing or throughput guarantees.

2.1.5 Application-Layer Protocols

An application-layer protocol defines:

  • The types of messages exchanged, for example, request messages and response messages
  • The syntax of the various message types, such as the fields in the message and how the fields are delineated
  • The semantics of the fields, that is, the meaning of the information in the fields
  • Rules for determining when and how a process sends messages and responds to messages
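
To make the four items above concrete, consider a toy text protocol invented purely for illustration (it is not a real Internet protocol). Requests are `ECHO <text>` or `LENGTH <text>`, responses are `OK <payload>` or `ERR <reason>`, and every message ends with CRLF. The function names are hypothetical.

```python
def parse_request(line):
    """Syntax: split 'COMMAND [argument]' terminated by CRLF into two fields."""
    if not line.endswith("\r\n"):
        raise ValueError("request must end with CRLF")
    command, _, argument = line[:-2].partition(" ")
    return command, argument

def handle_request(line):
    """Semantics and rules: every request gets exactly one response."""
    try:
        command, argument = parse_request(line)
    except ValueError as error:
        return f"ERR {error}\r\n"
    if command == "ECHO":        # meaning: return the argument unchanged
        return f"OK {argument}\r\n"
    if command == "LENGTH":      # meaning: return the argument's length
        return f"OK {len(argument)}\r\n"
    return f"ERR unknown command {command}\r\n"

print(repr(handle_request("ECHO hi there\r\n")))
print(repr(handle_request("LENGTH abcd\r\n")))
print(repr(handle_request("PING\r\n")))
```

The message types are the two requests and two responses, the syntax is the space-separated fields plus the CRLF terminator, the semantics live in `handle_request`, and the rule is that the server replies once per request.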

2.1.6 Network Applications Covered in This Book

In this chapter, we discuss five important applications: the Web, electronic mail, directory service, video streaming, and P2P applications.

2.2 The Web and HTTP

  • World Wide Web(万维网)

2.2.1 Overview of HTTP

The HyperText Transfer Protocol (HTTP), the Web’s application-layer protocol, is at the heart of the Web.
HTTP is implemented in two programs: a client program and a server program.
The client program and server program, executing on different end systems, talk to each other by exchanging HTTP messages.
HTTP defines the structure of these messages and how the client and server exchange the messages.
A Web page (also called a document) consists of objects. An object is simply a file, such as an HTML file or an image, that is addressable by a single URL.
Each URL has two components: the host name of the server that houses the object and the object’s path name.
Because Web browsers (such as Internet Explorer and Chrome) implement the client side of HTTP, in the context of the Web, we will use the words browser and client interchangeably.
Web servers, which implement the server side of HTTP, house Web objects, each addressable by a URL.
HTTP defines how Web clients request Web pages from Web servers and how servers transfer Web pages to clients.
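
The two components of a URL mentioned above can be pulled apart with Python's standard `urllib.parse` module. The URL below is a made-up example on the reserved `example.com` domain, and `url_components` is a hypothetical helper name.

```python
from urllib.parse import urlsplit

def url_components(url):
    """Return (host name, path name) for a URL."""
    parts = urlsplit(url)
    return parts.hostname, parts.path

host, path = url_components("http://www.example.com/some/dir/picture.gif")
print("host name:", host)
print("path name:", path)
```

A browser uses the host name to decide which server to contact and sends the path name to that server to identify the object.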

HTTP uses TCP as its underlying transport protocol, so it need not worry about lost data or the details of how TCP recovers from loss or reordering of data within the network; that is the job of TCP and the protocols in the lower layers of the protocol stack.
Because an HTTP server maintains no information about the clients, HTTP is said to be a stateless protocol.
The Web uses the client-server application architecture.
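
A small local experiment can illustrate both the client-server exchange and statelessness. The sketch below uses Python's standard `http.server` and `http.client` modules on the loopback address. The handler class, the function names, and the response body are assumptions for this example, not part of HTTP itself.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class PathHandler(BaseHTTPRequestHandler):
    """Server side: answer each GET using only what the request contains."""

    def do_GET(self):
        body = f"you asked for {self.path}".encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet; the default prints one line per request

def fetch(port, path):
    """Client side: send one GET request and return (status, body)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request("GET", path)
    response = conn.getresponse()
    result = (response.status, response.read().decode("ascii"))
    conn.close()
    return result

def fetch_same_page_twice():
    server = HTTPServer(("127.0.0.1", 0), PathHandler)  # port 0: OS picks a port
    port = server.server_address[1]
    thread = threading.Thread(target=server.serve_forever)
    thread.start()
    try:
        return fetch(port, "/index.html"), fetch(port, "/index.html")
    finally:
        server.shutdown()
        thread.join()
        server.server_close()

print(fetch_same_page_twice())
```

The handler stores nothing between requests, so the second request is served exactly like the first, as if the server had never seen this client before.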

2.2.2 Non-Persistent and Persistent Connections

  • non-persistent connections(非持久连接), persistent connections(持久连接)
    Should each request/response pair be sent over a separate TCP connection, or should all of the requests and their corresponding responses be sent over the same TCP connection?
    Although HTTP uses persistent connections in its default mode, HTTP clients and servers can be configured to use non-persistent connections instead.

HTTP with Non-Persistent Connections