CA644, System Software
Dr. Niall McMahon
2022-10-25
If you print these slides, think about using two pages per sheet although don't worry too much about it!
Dr. Niall McMahon
Drawing on previous work by:
Dr. Michael Scriney
Dr. Long Cheng
And sources credited in the references.
Autumn 2022.
read() and write() work with sockets.
OSI is a conceptual model that characterises and standardises the communication functions of a telecommunication or computing system without regard to its underlying internal structure and technology. See OSI model at Wikipedia for more.
The OSI model has many layers; this increases the complexity of networking.
OSI | TCP/IP |
Application, Presentation, Session | Application |
Transport | Transport |
Network | Internet |
Data link, Physical | Network Interface |
TCP | Technologies and Protocols |
Application | HTTP, FTP, SMTP |
Transport | TCP, UDP |
Internet | IP, ARP (Address Resolution Protocol) |
Network Interface | Ethernet, FDDI (Fiber), ATM (Asynch. Transfer Mode) |
In the OSI seven-layer model, the lowest layer is Layer-1 (L1). The most abstracted layer is Layer-7 (L7).
A stream socket, SOCK_STREAM, is a reliable, two-way communication data stream.
A datagram socket, SOCK_DGRAM, does not guarantee delivery or ordering but has a higher efficiency.
TCP operates on streams, not packets: a single send does not necessarily send a single packet, and a single receive does not necessarily receive the amount of data that one send wrote. One machine may send twice, each time sending 5 bytes. The receiving machine may receive only once, getting all 10 bytes.
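The behaviour described above can be sketched as follows. This is a minimal illustration, not the course's own code: read_exact() and two_sends_one_receive() are hypothetical names, and a local socketpair() stands in for a TCP connection purely to keep the example self-contained.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: call read() repeatedly until exactly len bytes have
 * arrived. Returns the number of bytes read (less than len only if the peer
 * closed the stream early) or -1 on error. */
ssize_t read_exact(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t got = 0;

    assert(buf != NULL);
    while (got < len) {
        ssize_t n = read(fd, p + got, len - got);
        if (n == 0)
            break;                 /* end of stream */
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;
        }
        got += (size_t)n;
    }
    return (ssize_t)got;
}

/* Two 5-byte sends, collected by the receiver as one 10-byte message.
 * Assumption: socketpair() is used here only as a stand-in for TCP.
 * Returns 0 on success, -1 otherwise. */
int two_sends_one_receive(void)
{
    int sv[2];
    char buf[11] = {0};

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    if (write(sv[0], "hello", 5) != 5 || write(sv[0], "world", 5) != 5) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    ssize_t n = read_exact(sv[1], buf, 10);
    close(sv[0]);
    close(sv[1]);
    return (n == 10 && strcmp(buf, "helloworld") == 0) ? 0 : -1;
}
```

The receiver asks for a byte count, not for "one message"; the stream carries no record of where each send started or ended.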
A single message can be framed by sending information about its length or by using delimiters.
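As an illustration of length-based framing, here is a minimal sketch. The 4-byte big-endian length header and the names frame_message() and unframe_message() are assumptions made for this example; the slides do not prescribe a particular format.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Write a 4-byte network-order length followed by the payload into out.
 * Returns the total frame size, or 0 if out is too small. */
size_t frame_message(const void *payload, uint32_t len,
                     unsigned char *out, size_t out_size)
{
    uint32_t be_len = htonl(len);

    assert(payload != NULL && out != NULL);
    if (out_size < sizeof be_len || out_size - sizeof be_len < len)
        return 0;
    memcpy(out, &be_len, sizeof be_len);
    memcpy(out + sizeof be_len, payload, len);
    return sizeof be_len + len;
}

/* Look for one complete frame at the start of in. Returns 1 and sets
 * *payload and *len if a whole frame is present, or 0 if more bytes are
 * still needed from the stream. */
int unframe_message(const unsigned char *in, size_t in_size,
                    const unsigned char **payload, uint32_t *len)
{
    uint32_t be_len;

    assert(in != NULL && payload != NULL && len != NULL);
    if (in_size < sizeof be_len)
        return 0;                       /* header incomplete */
    memcpy(&be_len, in, sizeof be_len);
    *len = ntohl(be_len);
    if (in_size - sizeof be_len < *len)
        return 0;                       /* payload incomplete */
    *payload = in + sizeof be_len;
    return 1;
}
```

A delimiter-based scheme would instead scan the received bytes for a marker such as '\n'; the length prefix has the advantage that the payload may contain any byte value.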
Socket programming is based on the transport layer, i.e. TCP/UDP. When two computers communicate:
The destination process is identified using the IP address of its host and a port number.
See marked up code for a client and server.
Client | Server |
create socket; define server socket (name); connect (request connection between local and server sockets) | create socket; define server socket (name); bind (local socket is bound to server name); listen; accept (accept connection) |
Client/server session: write, then read | Client/server session: read, then write |
EOF: close | EOF: read, then close; back to the listen/accept step |
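The steps in the table can be sketched in a single process on the loopback interface. This is a sketch under stated assumptions rather than the course's client and server code: loopback_session() is a hypothetical name, port 0 is used so that the kernel picks a free port, and the messages "ping" and "pong" are made up. It works without a second process because the kernel completes the connection for a listening socket before accept() is called.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 0 if the whole client/server exchange worked, -1 otherwise. */
int loopback_session(void)
{
    struct sockaddr_in serv_addr;
    socklen_t addr_len = sizeof(serv_addr);
    char request[8] = {0}, reply[8] = {0};
    int serv_sock = -1, clnt_sock = -1, sock = -1, result = -1;

    /* Server: create socket, define its name, bind, listen. */
    memset(&serv_addr, 0, sizeof(serv_addr));
    serv_addr.sin_family = AF_INET;
    serv_addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    serv_addr.sin_port = htons(0);   /* port 0: the kernel picks a free port */
    serv_sock = socket(AF_INET, SOCK_STREAM, 0);
    if (serv_sock == -1 ||
        bind(serv_sock, (struct sockaddr *)&serv_addr, sizeof(serv_addr)) == -1 ||
        listen(serv_sock, 20) == -1 ||
        getsockname(serv_sock, (struct sockaddr *)&serv_addr, &addr_len) == -1)
        goto done;
    assert(addr_len == sizeof(serv_addr));  /* serv_addr now holds the real port */

    /* Client: create socket, connect to the server's name. */
    sock = socket(AF_INET, SOCK_STREAM, 0);
    if (sock == -1 ||
        connect(sock, (struct sockaddr *)&serv_addr, sizeof(serv_addr)) == -1)
        goto done;

    /* Server: accept the pending connection. */
    clnt_sock = accept(serv_sock, NULL, NULL);
    if (clnt_sock == -1)
        goto done;

    /* Session: client writes, server reads, server writes, client reads.
     * Messages this small normally arrive in one read(); real code should
     * loop until the expected number of bytes has arrived. */
    if (write(sock, "ping", 4) != 4 || read(clnt_sock, request, 4) != 4 ||
        write(clnt_sock, "pong", 4) != 4 || read(sock, reply, 4) != 4)
        goto done;

    /* EOF: the client closes; the server's next read() returns 0. */
    close(sock);
    sock = -1;
    if (read(clnt_sock, request + 4, 1) != 0)
        goto done;

    if (strcmp(request, "ping") == 0 && strcmp(reply, "pong") == 0)
        result = 0;

done:
    if (sock != -1) close(sock);
    if (clnt_sock != -1) close(clnt_sock);
    if (serv_sock != -1) close(serv_sock);
    return result;
}
```

A real server would return to accept() after closing clnt_sock, as the last row of the table indicates.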
Structures or struct
are data types that are built using things of other types.
This example is close to what Deitel and Deitel use in their C++ How to Program, in the chapter on classes and data abstraction.
struct Time {
    int hour;   // 0 - 23
    int minute; // 0 - 59
    int second; // 0 - 59
};
struct begins the definition. Time is the structure tag, i.e. the name of this new structure type. Variables are declared using the structure tag.
In this example, hour, minute and second are the members of Time. Members can be of any type, but a structure cannot include an instance of itself, i.e. in this example a member of type Time. However, a structure can contain a pointer to another Time structure, i.e. Time *timeptr.
To declare a structure variable of type Time, write Time NewTimeVariable. This is the C++ form; in C, write struct Time NewTimeVariable unless a typedef is used. Here, NewTimeVariable is created and is of type Time. The hour member of NewTimeVariable is accessed by writing NewTimeVariable.hour.
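A short C sketch of declaring a Time variable and reaching its members directly and through a pointer. The names seconds_since_midnight() and demo_time() and the time 13:30:15 are illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>

struct Time {
    int hour;    /* 0 - 23 */
    int minute;  /* 0 - 59 */
    int second;  /* 0 - 59 */
};

/* Hypothetical helper: seconds since midnight, reading members via a pointer. */
int seconds_since_midnight(const struct Time *t)
{
    assert(t != NULL);
    return t->hour * 3600 + t->minute * 60 + t->second;
}

int demo_time(void)
{
    struct Time NewTimeVariable;              /* C needs the struct keyword */
    struct Time *timeptr = &NewTimeVariable;  /* pointer to a Time structure */

    NewTimeVariable.hour = 13;                /* member access with '.' */
    NewTimeVariable.minute = 30;
    timeptr->second = 15;                     /* member access through a pointer */
    return seconds_since_midnight(timeptr);   /* 13*3600 + 30*60 + 15 */
}
```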
Structs are another kind of abstraction: another example of how things can be layered to create higher-level structures that are easier to work with.
From IBM, the sockaddr_in struct is defined as follows:
struct sockaddr_in {
    short sin_family;
    u_short sin_port;
    struct in_addr sin_addr;
    char sin_zero[8];
};
sin_len: This field contains the length of the address for UNIX 98 specifications. It is not part of the definition shown above. Note: The sin_len field is provided only for BSD 4.4 compatibility. It is not necessary to use this field even for BSD 4.4/UNIX 98 compatibility. The field is ignored on input addresses.
sin_family: This field contains the address family, which is always AF_INET when TCP or User Datagram Protocol (UDP) is used.
sin_port: This field contains the port number.
sin_addr: This field contains the IP address. sin_addr is of type struct in_addr, which on some systems wraps a C union, i.e. a value that may have any of several representations or formats within the same position in memory. When setting the IP address, the member to use therefore needs to be specified: s_addr specifies the IP address as one (4 byte) integer.
sin_zero: This field is reserved. Set this field to hexadecimal zeros.
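Filling in the fields described above might look like the following sketch. make_ipv4_address() is a hypothetical helper name, and the use of inet_pton() to parse a dotted-quad string is a choice made for this example, not something the slides specify.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: fill *addr for a dotted-quad ip and a host-order port.
 * Returns 0 on success, -1 if ip is not a valid IPv4 address. */
int make_ipv4_address(struct sockaddr_in *addr, const char *ip, uint16_t port)
{
    assert(addr != NULL && ip != NULL);
    memset(addr, 0, sizeof(*addr));      /* also sets sin_zero to zeros */
    addr->sin_family = AF_INET;          /* always AF_INET for TCP/UDP over IPv4 */
    addr->sin_port = htons(port);        /* port in network byte order */
    if (inet_pton(AF_INET, ip, &addr->sin_addr) != 1)
        return -1;                       /* ip could not be parsed */
    return 0;
}
```

For example, make_ipv4_address(&serv_addr, "127.0.0.1", 1234) leaves serv_addr.sin_addr.s_addr holding the address as one 4-byte integer in network byte order.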
There's a little more here about the memory allocated to the socket address structure.
From IBM:
void *memset(void *dest, int c, size_t count)
In this definition, memset() takes a pointer to an unspecified type, i.e. void *dest; dest contains an address in memory. The memset() function sets the first count bytes at dest to the value c. The value of c is passed in as an int but it is converted to an unsigned char.
The memset() function returns dest.
There's a nice description at A detailed tutorial on Memset in C/C++ with usage and examples.
memset(buffer, 0, sizeof(buffer));
In this example, memset() is used to set all bits of a character array, buffer, to 0. The array name buffer evaluates to the memory address of the starting point of the array.
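A small sketch of this call in context. The 40-byte size and the function name count_nonzero_after_memset() are assumptions for illustration; the buffer is first filled with a known non-zero value so that the effect of clearing it is visible.

```c
#include <assert.h>
#include <string.h>

/* Returns the number of non-zero bytes left in buffer after clearing it. */
int count_nonzero_after_memset(void)
{
    char buffer[40];
    int nonzero = 0;

    memset(buffer, 'x', sizeof(buffer));            /* fill with a known value */
    void *ret = memset(buffer, 0, sizeof(buffer));  /* then clear every byte */
    assert(ret == buffer);                          /* memset() returns dest */

    for (size_t i = 0; i < sizeof(buffer); i++)
        if (buffer[i] != 0)
            nonzero++;
    return nonzero;
}
```

Note that sizeof(buffer) gives the array's size in bytes only where buffer is declared as an array; inside a function that receives it as a pointer, the size has to be passed separately.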
From the International C Standard:
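If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static or thread storage duration is not initialized explicitly, then: if it has pointer type, it is initialized to a null pointer; if it has arithmetic type, it is initialized to (positive or unsigned) zero; if it is an aggregate, every member is initialized (recursively) according to these rules; if it is a union, the first named member is initialized (recursively) according to these rules.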
From IBM:
#include <sys/types.h>
#include <sys/socket.h>
int connect(int socket, struct sockaddr *address, int address_len);
Here, the first parameter is this socket's file descriptor id, socket. address is the start location of a socket address structure that contains the address of the second (target) socket. address_len is the length of the second (target) socket address structure.
From our client example,
connect(sock, (struct sockaddr*) &serv_addr, sizeof(serv_addr));
Here, the first parameter is the socket file descriptor id, sock, of the client process. The second parameter is the address of the socket address structure of the second (server) process, i.e. its name. Its type is cast, i.e. changed, to a pointer to a struct sockaddr using (struct sockaddr*). Finally, the size occupied by the socket address structure describing the second socket is passed as the third parameter.
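The client example does not show error handling; a hedged sketch of how the call might be wrapped follows. connect_ipv4() is a hypothetical helper, not part of the course code, and it assumes the server address is given as a dotted-quad string.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket and connect it to ip:port.
 * Returns the connected descriptor, or -1 with errno set. */
int connect_ipv4(const char *ip, uint16_t port)
{
    struct sockaddr_in serv_addr;

    assert(ip != NULL);
    memset(&serv_addr, 0, sizeof(serv_addr));
    serv_addr.sin_family = AF_INET;
    serv_addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &serv_addr.sin_addr) != 1) {
        errno = EINVAL;                 /* ip is not a valid IPv4 address */
        return -1;
    }

    int sock = socket(AF_INET, SOCK_STREAM, 0);
    if (sock == -1)
        return -1;

    /* connect() returns 0 on success and -1 on failure. */
    if (connect(sock, (struct sockaddr *)&serv_addr, sizeof(serv_addr)) == -1) {
        int saved = errno;              /* close() may overwrite errno */
        close(sock);
        errno = saved;
        return -1;
    }
    return sock;
}
```

The caller owns the returned descriptor and should close() it when the session ends.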
From IBM:
#include <sys/types.h>
#include <sys/socket.h>
int bind(int socket, struct sockaddr *address, int address_len);
Here, the first parameter is this socket's file descriptor id, socket. address is the start location of a socket address structure that contains the local address (name) to bind to this socket. address_len is the length of that socket address structure.
From our server example,
bind(serv_sock, (struct sockaddr*) &serv_addr, sizeof(serv_addr));
Here, the first parameter is the socket file descriptor id, serv_sock, of the server process. The second parameter is the address of the socket address structure that holds the server's own address, i.e. its name. Its type is cast, i.e. changed, to a pointer to a struct sockaddr using (struct sockaddr*). Finally, the size occupied by that socket address structure is passed as the third parameter.
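bind() returns 0 on success and -1 on failure, so its result is worth checking. The sketch below is an illustration with assumed names (bind_same_name_twice() is hypothetical): it binds one socket to a loopback port chosen by the kernel, then tries to bind a second socket to the same name, which is expected to fail because the name is already in use.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns the errno from the second bind(), 0 if it unexpectedly
 * succeeded, or -1 if the setup itself failed. */
int bind_same_name_twice(void)
{
    struct sockaddr_in serv_addr;
    socklen_t len = sizeof(serv_addr);
    int result = -1;

    int first = socket(AF_INET, SOCK_STREAM, 0);
    int second = socket(AF_INET, SOCK_STREAM, 0);
    if (first == -1 || second == -1)
        goto done;

    memset(&serv_addr, 0, sizeof(serv_addr));
    serv_addr.sin_family = AF_INET;
    serv_addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    serv_addr.sin_port = htons(0);           /* the kernel chooses a free port */

    if (bind(first, (struct sockaddr *)&serv_addr, sizeof(serv_addr)) == -1 ||
        getsockname(first, (struct sockaddr *)&serv_addr, &len) == -1)
        goto done;
    assert(ntohs(serv_addr.sin_port) != 0);  /* a real port was assigned */

    /* serv_addr now holds the name already bound to first. */
    if (bind(second, (struct sockaddr *)&serv_addr, sizeof(serv_addr)) == -1)
        result = errno;
    else
        result = 0;

done:
    if (first != -1) close(first);
    if (second != -1) close(second);
    return result;
}
```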
From IBM:
#include <sys/types.h>
#include <sys/socket.h>
int listen(int socket, int backlog);
The first parameter is this socket's file descriptor id, socket. The second, backlog, is the number of client connections that are allowed to wait before further connection requests are rejected. listen() indicates that socket is where connection requests will be accepted.
From our server example,
listen(serv_sock, 20);
So serv_sock is the socket to direct connection requests to, and there are to be no more than 20 connection requests waiting for serv_sock.
From IBM:
#include <unistd.h>
ssize_t read(int socket, void *buf, ssize_t N);
Where the first parameter is the file descriptor id of the socket to read from, buf is the memory address of the buffer to write the received information into and N is the length in bytes of the buffer that buf points to. read() returns a signed size in bytes (ssize_t): the number of bytes read, 0 at end of file, or -1 on error.
As a note: write() does not immediately transmit data to the network. It first writes the data into the socket's output buffer, and then the TCP protocol sends the data from the buffer to the target machine.
write()
function.
From our client example,
read(sock, buffer, sizeof(buffer)-1);
Here, the first parameter is the socket file descriptor id, sock, of the client process. The second parameter is the address of the start of the buffer array; remember that an array name is the start address of the array. Finally, the size of the buffer minus one is passed as the third parameter. One character in buffer is reserved for the termination character, meaning that the available size of buffer is sizeof(buffer) - 1. Character arrays used as strings must be null terminated, i.e. the final character must be '\0'.
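A sketch of the sizeof(buffer) - 1 idiom, using read()'s return value to place the terminator. read_string() and demo_read_string() are hypothetical names, the 8-byte buffer and the message are made up, and a socketpair() stands in for the client's connection.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read at most size - 1 bytes into buffer and null
 * terminate what was received. Returns read()'s result. */
ssize_t read_string(int sock, char *buffer, size_t size)
{
    assert(buffer != NULL && size > 0);
    ssize_t n = read(sock, buffer, size - 1);
    buffer[n > 0 ? n : 0] = '\0';   /* terminate right after the received bytes */
    return n;
}

/* Sends 12 bytes but reads into an 8-byte buffer: only 7 bytes fit, and the
 * eighth position holds the terminator. Returns 0 on success. */
int demo_read_string(void)
{
    int sv[2];
    char buffer[8];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    if (write(sv[0], "Hello World!", 12) != 12) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    ssize_t n = read_string(sv[1], buffer, sizeof(buffer));
    close(sv[0]);
    close(sv[1]);
    return (n == 7 && strcmp(buffer, "Hello W") == 0) ? 0 : -1;
}
```

The remaining 5 bytes stay in the socket's input buffer until the next read().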
Similar to read(). As a note: if the space available in the write() buffer is less than the data to be sent, then write() will be blocked until the data in the buffer is sent to the target machine. For a blocking socket, write() will not return until all data is written into the buffer.
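POSIX still allows write() to return a count smaller than requested, for instance when a signal interrupts it or when the socket is non-blocking, so defensive code often wraps it in a loop. The following is a minimal sketch; write_all() and demo_write_all() are hypothetical names and a socketpair() again stands in for a TCP connection.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all len bytes have been
 * handed to the kernel. Returns len on success or -1 on error. */
ssize_t write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t sent = 0;

    assert(buf != NULL);
    while (sent < len) {
        ssize_t n = write(fd, p + sent, len - sent);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }
        sent += (size_t)n;
    }
    return (ssize_t)sent;
}

/* Writes 1000 bytes, closes the sending end, and counts what arrives.
 * Returns the number of bytes received, or -1 on failure. */
int demo_write_all(void)
{
    int sv[2];
    char out[1000], in[256];
    long total = 0;
    ssize_t n;

    memset(out, 'a', sizeof(out));
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    if (write_all(sv[0], out, sizeof(out)) != (ssize_t)sizeof(out)) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    close(sv[0]);                   /* the receiver sees EOF after the data */
    while ((n = read(sv[1], in, sizeof(in))) > 0)
        total += n;
    close(sv[1]);
    return (int)total;
}
```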
TCP/IP uses network byte ordering. Before a 16-bit integer (short) or a 32-bit integer (long) is sent from a host (a server is a kind of host), it is converted from host to network order (HTON) with htons() or htonl(). Similarly, integers received from the network are converted from network to host order (NTOH) with ntohs() or ntohl().
Byte ordering:
Conversion:
htons(), htonl(): host to network short/long. Short is 16 bit, long is 32 bit.
ntohs(), ntohl(): network to host short/long. Short is 16 bit, long is 32 bit.
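To make the conversions concrete, this sketch exposes the bytes that htons() and htonl() produce in memory. network_bytes16() and network_bytes32() are hypothetical helper names and the values are arbitrary.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy the in-memory bytes of htons(value) into out[0..1]. */
void network_bytes16(uint16_t value, unsigned char out[2])
{
    uint16_t net = htons(value);

    assert(out != NULL);
    memcpy(out, &net, sizeof net);
}

/* Copy the in-memory bytes of htonl(value) into out[0..3]. */
void network_bytes32(uint32_t value, unsigned char out[4])
{
    uint32_t net = htonl(value);

    assert(out != NULL);
    memcpy(out, &net, sizeof net);
}
```

On any host, network_bytes16(0x1234, b) leaves b[0] == 0x12 and b[1] == 0x34, because network byte order puts the most significant byte first. Converting back with ntohs() or ntohl() recovers the original value.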
What is converted?
Denary | Binary | Hexadecimal |
0 | 0000 | 0 |
1 | 0001 | 1 |
2 | 0010 | 2 |
3 | 0011 | 3 |
4 | 0100 | 4 |
5 | 0101 | 5 |
6 | 0110 | 6 |
7 | 0111 | 7 |
8 | 1000 | 8 |
9 | 1001 | 9 |
10 | 1010 | A |
11 | 1011 | B |
12 | 1100 | C |
13 | 1101 | D |
14 | 1110 | E |
15 | 1111 | F |
By R. S. Shaw - Own work, Public Domain (Wikipedia).
A computer uses the same endianness to store and to read back an integer value, so the result is the same as on another machine that uses the other endianness.
However, problems can happen when memory is addressed using bytes instead of integers, or when memory contents are transmitted between computers with different endianness.
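Addressing memory by bytes is exactly how a program can observe its own endianness. The sketch below is illustrative, with hypothetical function names: the first function inspects the lowest-addressed byte of an integer, and the second rebuilds an integer from bytes received most significant byte first in a way that does not depend on the host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if this host stores the least significant byte first
 * (little-endian), 0 otherwise. */
int host_is_little_endian(void)
{
    uint32_t value = 0x0A0B0C0D;
    unsigned char bytes[4];

    memcpy(bytes, &value, sizeof value);   /* look at the integer byte by byte */
    return bytes[0] == 0x0D;
}

/* Rebuild a 32-bit integer from 4 bytes sent most significant byte first.
 * Shifts work on values, not memory, so the result is the same on any host. */
uint32_t from_big_endian(const unsigned char b[4])
{
    assert(b != NULL);
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}
```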