Network sockets (a.k.a. Internet sockets) are software endpoints that carry data between a process on one networked machine and a process on another. The term was originally defined in 1971, in RFC 147, to describe network communication within the ARPA network (R).
Sockets are essential to network communication, but they are sometimes discussed in ways that lead to unclear definitions. At their core, sockets are software constructs that serve as connection endpoints between processes on different network end systems. Another way to define a socket is as the interface through which a process sends and receives data.
Overview
Technical implementations of sockets have evolved greatly since their inception. Initially, a network socket was defined as follows:
… the unique identification to or from which information is transmitted in the network. The socket is specified as a 32-bit number with even sockets identifying receiving sockets and odd sockets identifying sending sockets. A socket is also identified by the host in which the sending or receiving processer is located.
Conceptually, network sockets serve as bridges between network applications such that information from a sender can arrive safely at a receiver. In many modern networked applications, the sender/receiver relationship can be described as client/server where a server socket is established and “listens” for incoming client socket connections. Beyond this conceptual distinction, network sockets can be categorized by several technical means as well.
What is a Socket, really?
As a simple concept, sockets can be thought of as connections between end systems that are facilitated by the transport layer: bridges between river banks. In more technical terms, sockets are endpoint instances within a network, and a single physical end system can maintain multiple sockets connected to multiple other end systems.
Host Address + Port Number
A host address is, ultimately, a machine-friendly identifier that helps one end system locate another within a networked environment. Hosts are commonly referred to by human-friendly names such as www.example.com, which the Domain Name System (DNS) translates into machine-readable IP addresses.
Ports are locational constructs that allow different processes to communicate. They are purely software-based (there are no physical-layer ports) and are managed by a computer’s operating system. In the context of networking, ports are identifiers used to locate a process on a local machine so that a remote machine can connect to it. The host address gets the bits to the right neighborhood and the port gets them to the right house, as it were.
The combination of host:port addressing helps a process on one machine connect to another machine and locate a corresponding process there. The initial specification for TCP treats a specific host:port address as an endpoint and calls that construct a socket (R). It then defines a TCP connection as a pair of sockets.
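As a minimal sketch of host:port addressing, the snippet below uses Python's standard `socket` module to turn a host and port into the address tuples a socket would connect to. The helper name `resolve` is hypothetical and chosen only for illustration, and the loopback address is used so that the example does not depend on an external DNS lookup.

```python
import socket


def resolve(host, port):
    """Return the distinct (ip, port) pairs that host:port maps to (IPv4 only)."""
    infos = socket.getaddrinfo(
        host, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    pairs = []
    # Each entry is (family, type, proto, canonname, sockaddr); for IPv4 the
    # sockaddr is an (ip, port) tuple.
    for *_, sockaddr in infos:
        pair = (sockaddr[0], sockaddr[1])
        if pair not in pairs:
            pairs.append(pair)
    return pairs


if __name__ == "__main__":
    # A numeric address needs no lookup; a name such as "localhost" or
    # "www.example.com" would be passed through the system resolver first.
    print(resolve("127.0.0.1", 80))
```

Passing a host name instead of a numeric address exercises the name-to-IP translation described above. The port stays unchanged either way.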
Types of Network Sockets
There are several types of network sockets, though stream sockets and datagram sockets are the most popular. These types support TCP and UDP communications, the two most widely implemented transport-layer protocols among Internet applications (R).
Stream Socket
Stream sockets are used with Transmission Control Protocol (TCP) or Stream Control Transmission Protocol (SCTP) applications and provide the reliable, ordered data transfer inherent to each.
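To make the stream model concrete, here is a small sketch that sends bytes over a TCP stream socket on the loopback interface. The function names `recv_exact` and `stream_round_trip` are illustrative, not part of any library. The example assumes a small payload so that a single thread can act as both ends without blocking.

```python
import socket


def recv_exact(sock, n):
    """Read exactly n bytes. A stream has no message boundaries, so recv()
    may return fewer bytes than requested and must be called in a loop."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed before all bytes arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def stream_round_trip(payload):
    """Send payload over a loopback TCP connection and return what arrives."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(1)
        with socket.create_connection(listener.getsockname(), timeout=2.0) as client:
            conn, _ = listener.accept()
            with conn:
                client.sendall(payload)  # small payload fits in kernel buffers
                return recv_exact(conn, len(payload))


if __name__ == "__main__":
    print(stream_round_trip(b"hello, stream"))
```

The loop in `recv_exact` is the design point. TCP delivers an ordered byte stream rather than discrete messages, so the receiver decides where one message ends and the next begins.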
Datagram Socket
Datagram sockets support User Datagram Protocol (UDP) applications that rely on connectionless data transfer. Each packet sent via a datagram socket is individually addressed and routed, but the protocol takes no measures to ensure ordering or arrival. Datagram sockets are therefore considered an “unreliable” transport service.
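A comparable sketch for datagram sockets follows, again over the loopback interface and with an illustrative function name (`datagram_round_trip`). No connection is set up. Each call to `sendto` names its destination explicitly, and a timeout is set because UDP itself gives no delivery guarantee.

```python
import socket


def datagram_round_trip(payload):
    """Send one UDP datagram over loopback.

    Returns (data, source_address_seen_by_receiver, sender_local_address).
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(2.0)         # do not wait forever for a lost datagram
        sender.bind(("127.0.0.1", 0))
        # No connect/accept step: the destination is given per datagram.
        sender.sendto(payload, receiver.getsockname())
        data, source = receiver.recvfrom(65535)
        return data, source, sender.getsockname()


if __name__ == "__main__":
    print(datagram_round_trip(b"hello, datagram"))
```

Unlike the stream case, each `recvfrom` call returns one whole datagram together with the address it came from.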
Raw Socket
Raw sockets allow sending and receiving Internet Protocol packets at the network layer without being tied to a specific transport protocol (TCP, UDP, etc.). As such, headers are specified at the application layer when sending, and much of the encapsulation is left up to application developers.
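Opening a raw socket (`socket.SOCK_RAW` in Python) usually requires elevated privileges, so the sketch below shows only the part that moves into the application: building a header by hand. It packs the four 16-bit fields of a UDP header, which is one kind of header an application might prepend when writing to a raw socket. The function name `build_udp_header` is hypothetical, and the checksum is left at zero as a simplifying assumption.

```python
import struct


def build_udp_header(src_port, dst_port, payload):
    """Pack an 8-byte UDP header: source port, destination port, length, checksum."""
    length = 8 + len(payload)  # header size plus payload size, in bytes
    checksum = 0               # assumption: 0 signals "no checksum" for UDP over IPv4
    # "!" selects network (big-endian) byte order; "H" is an unsigned 16-bit field.
    return struct.pack("!HHHH", src_port, dst_port, length, checksum)


if __name__ == "__main__":
    header = build_udp_header(12345, 53, b"abcd")
    print(header.hex())
```

With stream or datagram sockets the operating system fills in these fields. With a raw socket, getting them right becomes the developer's job.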
Client Sockets vs. Server Sockets
At the Internet Protocol (IP) level there is no inherent difference between a client socket and a server socket. Both are implemented to map a process on one network machine to a process on another network machine. At the application level, however, these types of network sockets can be distinguished as follows:
Client Socket – Created to send a connection request to a specific host:port combination. The request waits in the server-side connection queue until accepted.
Server Socket – Created to listen for incoming client connection requests on a specific port on a given host. Often created on a well-known port assigned to a particular protocol, such as port 80 for HTTP requests.
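The client/server distinction and the connection queue can be sketched as follows. The function name `connect_before_accept` is illustrative, and an OS-assigned loopback port stands in for a well-known port such as 80, which typically requires elevated privileges to bind.

```python
import socket


def connect_before_accept():
    """Show that a client connection waits in the server's queue until accepted.

    Returns (client_local_address, peer_address_reported_by_accept).
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.listen(5)               # allow up to 5 pending connections
        # The client connects while the server has not yet called accept();
        # the operating system completes the handshake and queues the connection.
        with socket.create_connection(server.getsockname(), timeout=2.0) as client:
            conn, peer = server.accept()  # take the queued connection
            with conn:
                return client.getsockname(), peer


if __name__ == "__main__":
    print(connect_before_accept())
```

Note that `accept()` returns a new socket for the conversation, while the listening socket stays available for further clients.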
Socket Protocols
In addition to being bound to different ports, sockets can be configured for different transport-layer protocols such as TCP or UDP. The nature of TCP and UDP helps define the differences in socket type:
UDP Socket – Uses a two-tuple {host:port} set for only the destination address.
TCP Socket – Uses a four-tuple {source:port, host:port} set for both source and destination addresses.
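UDP is a connectionless protocol that makes no provision for data reliability. As such, a “return address” is not needed to deliver a datagram to a socket, because the recipient is not required to acknowledge anything to the sender. Each datagram still carries its source address, so an application can reply if it chooses to.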
TCP is a connection-oriented protocol that guarantees data arrives in order and without loss, and that applies congestion control to avoid overloading the network. As such, TCP needs to facilitate full-duplex communication, which requires knowing both the sender and receiver addresses.
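The four-tuple idea can be observed directly. In this sketch (function name `four_tuples` is illustrative) several clients connect to the same server port on the loopback interface, and each accepted connection reports its own source and destination addresses.

```python
import socket


def four_tuples(n_clients=2):
    """Connect n_clients to one server port and return each connection's
    (source_ip, source_port, dest_ip, dest_port) as seen by the server."""
    tuples = []
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.listen(n_clients)
        clients = [
            socket.create_connection(server.getsockname(), timeout=2.0)
            for _ in range(n_clients)
        ]
        try:
            for _ in clients:
                conn, _peer = server.accept()
                with conn:
                    src_ip, src_port = conn.getpeername()  # the client side
                    dst_ip, dst_port = conn.getsockname()  # the server side
                    tuples.append((src_ip, src_port, dst_ip, dst_port))
        finally:
            for client in clients:
                client.close()
    return tuples


if __name__ == "__main__":
    for entry in four_tuples():
        print(entry)
```

Every connection shares the same destination address and port, yet each is a distinct socket because the source port differs. This is why TCP needs all four values to tell connections apart.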
Review
If you’ve made it this far you’re likely now aware that the term socket is fairly abstract. On one hand, it’s a very concrete construct that connects two network end systems. On the other, different ports can be used to create many virtual endpoints on a single host system. The more easily defined attributes of sockets are that they require host: