The internet is nothing but a bunch of data sent from here to there, but… how is it sent? In packets and frames.
Let’s learn how data is divided into smaller pieces and transmitted over the internet in today’s #FromZeroToHacker lesson.
Table of contents
- Introduction
- What I have learnt today?
- Stats
- Resources
Introduction
Data over the internet isn’t sent willy-nilly: we have to follow a set of protocols to send the data. But we also need to connect (at least!) two devices. What types of connections exist? Which one is better for our needs?
Time to learn it, then.
What I have learnt today?
What are packets and frames?
Packets and frames are small pieces of data that, when added together, make a larger file or message.
A packet is the unit of data used in the network layer (layer 3), while a frame is the unit of data used in the data link layer (layer 2).
Wrapping a packet inside a frame is called encapsulation. When we talk about anything involving IP addresses, we are talking about packets. The frame is the wrapper around the packet: it carries the MAC addresses, and when it is stripped away, the packet is what is left.
Packets are small pieces of data that are sent in parts to prevent bottlenecking a network with big messages.
Packets have different structures, depending on the type of packet sent. Within each packet, there is a set of headers that contain meta information. The most used headers are:
- Time to Live: Sets an expiry timer for the packet, so it does not clog the network
- Checksum: Provides integrity checking for protocols such as TCP/IP
- Source address: The IP of the device that sent it
- Destination address: The IP of the device the packet is sent to.
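To make those headers a bit more tangible, here is a minimal sketch that builds and reads back a bare 20-byte IPv4 header using only Python's standard library. The function names (`build_header`, `parse_header`, `ipv4_checksum`) and the example addresses are my own choices for illustration, and the header has no options and no payload, so treat it as a toy rather than a real network stack.

```python
import socket
import struct

# Layout of a 20-byte IPv4 header without options (network byte order)
IPV4_FORMAT = "!BBHHHBBH4s4s"

def ipv4_checksum(header: bytes) -> int:
    """One's-complement sum of the 16-bit words in the header."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    # Fold any carry bits back into the low 16 bits
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_header(src: str, dst: str, ttl: int = 64) -> bytes:
    """Build an illustrative IPv4 header (no options, no payload)."""
    fields = [
        (4 << 4) | 5,           # version 4, header length of 5 32-bit words
        0,                      # type of service
        20,                     # total length: just the header
        0,                      # identification
        0,                      # flags and fragment offset
        ttl,                    # Time to Live
        6,                      # protocol number 6 = TCP
        0,                      # checksum placeholder
        socket.inet_aton(src),  # source address
        socket.inet_aton(dst),  # destination address
    ]
    raw = struct.pack(IPV4_FORMAT, *fields)
    fields[7] = ipv4_checksum(raw)
    return struct.pack(IPV4_FORMAT, *fields)

def parse_header(raw: bytes) -> dict:
    """Pull the four headers discussed above out of the raw bytes."""
    unpacked = struct.unpack(IPV4_FORMAT, raw[:20])
    return {
        "ttl": unpacked[5],
        "checksum": unpacked[7],
        "source": socket.inet_ntoa(unpacked[8]),
        "destination": socket.inet_ntoa(unpacked[9]),
    }

if __name__ == "__main__":
    header = build_header("192.168.1.10", "93.184.216.34")
    print(parse_header(header))
    # Re-running the checksum over a valid header gives 0
    print(ipv4_checksum(header))
```

A receiver can verify integrity by running the same checksum over the header it received: if nothing was corrupted on the way, the result is 0.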
A quick explanation:
- Your computer creates a packet with the destination IP and wraps it in an “envelope” called a frame, addressed to your router’s MAC address.
- The router opens the frame, reads the destination IP on the packet and puts the packet into a new frame addressed to the next router on the way.
- This repeats from router to router until the packet reaches the destination’s network.
- The last router puts the packet into a frame addressed to the MAC address of the destination device, which takes the packet out.
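The envelope idea can be sketched in a few lines of Python. This is only a model, not real networking: the `Packet` and `Frame` classes, the `deliver` function and the MAC addresses are all made up for the example. What it tries to show is that the packet (with its IP addresses) stays the same along the whole trip, while the frame around it is replaced on every link.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    payload: str

@dataclass(frozen=True)
class Frame:
    src_mac: str
    dst_mac: str
    packet: Packet  # the packet travels inside the frame

def deliver(packet: Packet, macs: List[str]) -> List[Frame]:
    """Carry a packet across every link on a path.

    `macs` lists the MAC address of each device on the path, sender first.
    Returns the frame used on each link.
    """
    frames = []
    for src_mac, dst_mac in zip(macs, macs[1:]):
        frame = Frame(src_mac, dst_mac, packet)  # encapsulate for this link
        frames.append(frame)
        packet = frame.packet                    # receiver strips the frame
    return frames

if __name__ == "__main__":
    packet = Packet("192.168.1.10", "10.0.0.5", "hello")
    path = ["AA:AA:AA:AA:AA:01", "AA:AA:AA:AA:AA:02", "AA:AA:AA:AA:AA:03"]
    for frame in deliver(packet, path):
        print(frame.src_mac, "->", frame.dst_mac, "carrying", frame.packet.dst_ip)
```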
TCP/IP (The three-way handshake)
TCP, or Transmission Control Protocol, is another protocol used in networking. The TCP/IP model has four layers and is similar to the OSI model:
- Application
- Transport
- Internet
- Network interface
Information is added to each layer of the TCP/IP model as the packet goes through it. This process is called encapsulation. The reverse process is, therefore, decapsulation.
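As a rough illustration of encapsulation and decapsulation, the sketch below adds a fake text "header" for each of the four layers on the way down and removes them in reverse order on the way up. Real headers are binary structures, not bracketed labels, and `encapsulate` / `decapsulate` are names I made up for this example.

```python
# The four TCP/IP layers, from the top (closest to the user) to the bottom
LAYERS = ["Application", "Transport", "Internet", "Network interface"]

def encapsulate(data: str) -> str:
    """Wrap the data with one pretend header per layer, top layer first."""
    for layer in LAYERS:
        data = f"[{layer}]{data}"
    return data

def decapsulate(message: str) -> str:
    """Strip the pretend headers in reverse order, outermost first."""
    for layer in reversed(LAYERS):
        header = f"[{layer}]"
        if not message.startswith(header):
            raise ValueError(f"missing {layer} header")
        message = message[len(header):]
    return message

if __name__ == "__main__":
    wire = encapsulate("GET /")
    print(wire)               # [Network interface][Internet][Transport][Application]GET /
    print(decapsulate(wire))  # GET /
```

The outermost header belongs to the lowest layer, which is why the receiver has to peel them off in the opposite order.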
As we saw yesterday, TCP is connection-based, so TCP must establish a connection between two devices before data is sent. This process is called the Three-way handshake.
One feature of TCP is that delivery of the data is guaranteed: anything that gets lost on the way is sent again.
TCP packets contain headers with metadata. The most crucial headers are:
- Source port: Port opened by the sender to send the packet.
- Destination port: Port that an application or service is running on the destination host.
- Source IP: Self-explanatory (strictly speaking, it travels in the IP header that wraps the TCP data)
- Destination IP: Self-explanatory (same as above)
- Data: Where the data sent is stored
- Flag: Determines how the packet should be handled by either device
- And more
Here are the messages used during a TCP connection. The first three (SYN, SYN/ACK and ACK) make up the Three-way handshake:
- SYN: A SYN message is sent by the client. This is used to initiate a connection and SYNchronise the two devices together.
- SYN/ACK: Sent by the receiver, acknowledges the synchronisation attempt.
- ACK: The acknowledgement packet can be used by either the client or server to acknowledge that a number of packets have been received.
- DATA: Once a connection is established, data is sent.
- FIN: This packet is used to properly close a connection.
- RST: This packet ends all communication abruptly. Sent as a last resort when there is a problem during the process.
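During the handshake, each device picks a random starting number called the ISN (Initial Sequence Number). The SYN itself uses up one number, so the first piece of data is numbered ISN + 1. Everything sent afterwards is numbered from there, and the receiver uses these sequence numbers to put the data back together in the right order.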
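The sequence numbers in the handshake are easier to follow with concrete values. The sketch below only models the three messages: no network is involved, the `Segment` class and `three_way_handshake` function are hypothetical names, and the ISNs are passed in by hand (a real TCP stack picks them itself).

```python
from dataclasses import dataclass
from typing import List, Optional

SEQ_SPACE = 2 ** 32  # sequence numbers are 32-bit and wrap around

@dataclass
class Segment:
    flags: str
    seq: int
    ack: Optional[int] = None

def three_way_handshake(client_isn: int, server_isn: int) -> List[Segment]:
    """Return the three segments exchanged to open a connection."""
    # 1. Client announces its Initial Sequence Number
    syn = Segment("SYN", seq=client_isn)
    # 2. Server announces its own ISN and acknowledges the client's
    syn_ack = Segment("SYN/ACK", seq=server_isn,
                      ack=(syn.seq + 1) % SEQ_SPACE)
    # 3. Client acknowledges the server's ISN; its next number is ISN + 1
    ack = Segment("ACK", seq=(syn.seq + 1) % SEQ_SPACE,
                  ack=(syn_ack.seq + 1) % SEQ_SPACE)
    return [syn, syn_ack, ack]

if __name__ == "__main__":
    for segment in three_way_handshake(client_isn=1000, server_isn=5000):
        print(segment)
```

With ISNs of 1000 and 5000, the server acknowledges 1001 and the client acknowledges 5001: each side confirms the other's starting number by replying with "the next one I expect".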
TCP closing a connection
When TCP closes a connection, it also follows a protocol and rules:
- FIN: A FIN message is sent to the other device.
- FIN/ACK: Sent by the receiver, acknowledges the closing request.
- ACK: The device acknowledges the process and terminates the connection.
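In practice you never send SYN or FIN by hand: the operating system does it when a program opens or closes a TCP socket. Here is a small sketch, assuming Python's standard `socket` module and a throwaway server on localhost, where connecting triggers the handshake and shutting down the sending side triggers the FIN. The uppercase-echo behaviour and the function names are just for the example.

```python
import socket
import threading

def run_echo_server(server_sock: socket.socket) -> None:
    """Accept one client and send back everything it says, in uppercase."""
    conn, _ = server_sock.accept()  # the OS has completed the handshake
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:            # empty read: the client sent its FIN
                break
            conn.sendall(data.upper())
    # leaving the `with` block closes the socket, sending the server's FIN

def tcp_round_trip(message: bytes) -> bytes:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.listen(1)
        port = server.getsockname()[1]
        worker = threading.Thread(target=run_echo_server, args=(server,))
        worker.start()

        # create_connection() returns once SYN, SYN/ACK and ACK are done
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(message)
            client.shutdown(socket.SHUT_WR)  # "I am done sending": FIN
            chunks = []
            while True:
                chunk = client.recv(1024)
                if not chunk:                # the server closed its side too
                    break
                chunks.append(chunk)
        worker.join()
    return b"".join(chunks)

if __name__ == "__main__":
    print(tcp_round_trip(b"hello"))
```

Running a packet capture tool such as Wireshark on the loopback interface while this script runs should let you see the handshake and closing messages described above.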
UDP/IP
The User Datagram Protocol is another protocol used to communicate data between devices.
UDP is a stateless protocol that doesn’t require a connection between the two devices (therefore, no Three-way handshake is needed).
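Despite that, UDP is useful in some cases, and can even be a better choice than TCP, for example when speed matters more than receiving every single packet (video streaming, voice calls).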
As a Three-Way Handshake is not needed, UDP packets are lighter, having fewer headers. However, both protocols have shared headers:
- Time to Live: Sets an expiry timer for the packet, to not clog the network
- Source address: The IP of the device that sent it
- Destination address: The IP of the device the packet is sent to.
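As this process is stateless, the connection process is simpler: the sender just sends its packets and the receiver answers, with no handshake beforehand and no acknowledgement afterwards.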
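To see how little ceremony UDP needs, here is a minimal sketch with Python's standard `socket` module, again on localhost. There is no `connect`, `listen` or `accept`: the sender simply throws a datagram at an address. The function name `udp_round_trip` and the two-second timeout are my own choices for the example.

```python
import socket

def udp_round_trip(message: bytes) -> bytes:
    """Send one datagram to a local receiver and return what arrived."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(2.0)         # UDP gives no delivery guarantee

        # No handshake: the datagram is sent straight away
        sender.sendto(message, receiver.getsockname())

        data, _sender_address = receiver.recvfrom(1024)
        return data

if __name__ == "__main__":
    print(udp_round_trip(b"ping"))
```

On a real network the datagram could be lost and nobody would be told, which is why the receiver uses a timeout instead of waiting forever.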
Ports
Ports are an essential point in data exchange. To simplify this (and not just copy-paste text), think of a harbour and ports. One ship (data) is sent to a harbour (device) and has to dock at a port (device’s port). One cruiser can’t dock at a port made for a smaller ship, so every connection has a dedicated port.
The numerical value of the ports ranges between 0 and 65535.
There is a standard set of rules. Normally, web servers (HTTP) use port 80, while FTP uses 21, and SSH 22. Any port between 0 and 1023 is known as a common (or well-known) port. Here is a list of the common ports.
While these port numbers are the standard, nothing is stopping you from changing them. Running a web server on port 8080 is not unheard of. But applications sometimes assume the standard ports, so you may run into trouble if you do it. Consider yourself warned.
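The port rules above fit in a few lines of Python. This is a small sketch with names of my own (`COMMON_PORTS`, `is_valid_port`, `is_common_port`); the table only holds the three services mentioned in this post, and I am assuming the usual definition of the well-known range as 0 to 1023.

```python
# Only the services mentioned in this post; the real list is much longer
COMMON_PORTS = {"FTP": 21, "SSH": 22, "HTTP": 80}

def is_valid_port(port: int) -> bool:
    """Ports are 16-bit numbers, so they range from 0 to 65535."""
    return 0 <= port <= 65535

def is_common_port(port: int) -> bool:
    """Assumption: the well-known range is 0 to 1023."""
    return 0 <= port <= 1023

if __name__ == "__main__":
    for service, port in COMMON_PORTS.items():
        print(service, port, "common" if is_common_port(port) else "not common")
    # A web server on 8080 is valid, just not in the common range
    print(8080, is_valid_port(8080), is_common_port(8080))
```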
Stats
From 330.534th to 312.278th. Right now I’m sitting in the top 16%!
Here is also the Skill Matrix: