Notes Home

Dr. Niall McMahon
Lecture Notes
Table of Contents

Operating Systems, Part 1

CA644, System Software

Dr. Niall McMahon

2022-09-27

If you print these slides, think about using two pages per sheet, although don't worry too much about it!

Credits

Dr. Niall McMahon
Drawing on previous work by:
Dr. Michael Scriney
Dr. Long Cheng
With help from Mark Humphrys, and sources credited in the references.
Autumn 2022.

Introduction

History

Take a look at the revised notes for Lecture 4: History, Part 1.

Evolution of Contemporary Architecture

Modern computers are digital and electronic. A digital computer is one where numbers are stored whole, i.e. fractions of one unit are not possible. Analogue machines, conversely, represent values using physical phenomena that allow continuous variation of a value within a range, e.g. using wheels or voltages.

There are many super books on the topic of the evolution of computer architecture. Andrew S. Tanenbaum lists a few - which I haven't read - in his book, Structured Computer Organization, in Section 1.2, Milestones in Computer Architecture. In this section, Tanenbaum divides the history of computer architecture into five generations. Tanenbaum has several other books about operating systems; his book, Operating Systems: Design and Implementation, was sold with Minix, his Unix-like system developed for learning. Linus Torvalds first mentioned Linux publicly as a Minix-related project on comp.os.minix. An old but useful paperback text I consulted was Computers by K.N. Dodd; this was first published in 1966.

Tanenbaum's Five Generations

  • The Zeroth Generation: mechanical computers.
  • The First Generation: vacuum tubes and plug boards (1945 - 1955).
  • The Second Generation: transistors and batch systems (1955 - 1965).
  • The Third Generation: integrated circuits (ICs) and multiprogramming (1965 - 1980).
  • The Fourth Generation: large scale integration and personal computers (1980 +).
  • The Fifth Generation: invisible computers.

Computer Types

(With very approximate costings, after Tanenbaum.)

  • Disposable computer (< €1).
  • Microcontroller (€10s).
  • Games consoles (€100s).
  • Personal computers and workstations (€100s - €1000s).
  • Server (€1000s).
  • Mainframes (€10s of millions).
  • Data centres (€10s - €100s of millions).
  • Super computers (€100s of millions).

Data Scales

Units and Prefixes

  • One single 1 or 0 value is a bit.
  • A byte is 8 bits.
  • A kilobyte (kB) is 1000 bytes (or 1024 bytes, depending on the context).
  • A megabyte (MB) is 1000 kB.
  • A gigabyte (GB) is 1000 MB.
  • A terabyte (TB) is 1000 GB.
  • A petabyte (PB) is 1000 TB.
  • An exabyte (EB) is 1000 PB.
  • A zettabyte (ZB) is 1000 EB.

In Context

  • 90 bytes is approximately equivalent to one line of text in a book.
  • 4 kB is approximately one page of text from a book.
  • There are about 173 million books and documents in the US Library of Congress. About 51 million of these are books. The total data content is around 15 TB.
  • The amount of data is growing exponentially. Around 2010, the amount of data generated annually was of the order of 1-2 ZB; in 2021, almost 80 ZB of data was generated.

Computer Programs

Definition

  • A sequence of instructions for a computer describing how to perform a certain task.
  • Programs are executed one instruction at a time.
  • The computer has no memory of past instructions and no knowledge of future instructions.

Algorithms

  • An algorithm is a finite sequence of well-defined instructions for solving a problem.
  • Algorithms define programs for computers.

Implementation

  1. Dedicated hardware with fixed input data.
  2. Hardware that runs the program, but with some input data read at run-time by the machine, e.g. calculators, factory machines, routers, ASICs.
  3. General-purpose hardware; the program and input data are read at run-time, e.g. PCs.

General Purpose Computers

  • The hardware, i.e. the electronic circuits, of a computer can recognise and execute a set of simple instructions, e.g. add two numbers; compare two numbers; copy data from one location to another.
  • Programs must eventually consist of some combination of these instructions.
  • This basic instruction set is the machine language for that computer.
  • Machine languages are tedious to program.

Aside: Dodd's Program for a Simple Calculator

This is useful to review in conjunction with the Feynman lecture.

Description

Imagine a simple calculator that has built-in functions, +, -, x and /. Such a calculator may have two memory locations.

The first, a five-digit register, is used to store input values. A register has the ordinary meaning of a recording device, often to do with money. The second, a ten-digit accumulator, stores the result of the calculation.

0 4 6 3 7

Register holding the value 4637.

0 0 0 7 2 5 5 5 8 3

Accumulator holding the value 7255583.

We can imagine that the calculator allows input using buttons or perhaps by turning wheels associated with each position in the register. There are buttons associated with each built-in function and one additional button (Z) that resets the accumulator to zero.

Sample Program

The problem is to multiply 17 by 18, to multiply 19 by 20, to add the results of these two multiplications together and to divide by 2. The program is:

  1. Press Z.
  2. Enter 17 into the register.
  3. Press +.
  4. Enter 18 into the register.
  5. Press x.
  6. The accumulator displays 306. Write this down.
  7. Press Z.
  8. Enter 19 into the register.
  9. Press +.
  10. Enter 20 into the register.
  11. Press x. The accumulator displays 380.
  12. Enter 306 into the register, reading from where you wrote it down.
  13. Press +. The accumulator displays 686.
  14. Enter 2 into the register.
  15. Press /.
  16. The accumulator displays 343. Write this down.

Important Steps

  1. Providing numerical information to the machine via the register.
  2. Indicating the operation to be performed, i.e. +, -, x, /, Z.
  3. Removing information via the accumulator by writing it down.

Automation

  • Moving from the calculator in the example to a general computer involves automating the three important steps.
  • Steps 1 and 2 involve sending information to the machine and Step 3 involves receiving information.
  • Steps 1 and 2 are input and Step 3 is output, together input/output or I/O.

Levels, Layers and Organising Computers

Simplifying the Machine Language

  • Writing code in machine language is tedious and impossible for complicated programs.
  • If the machine language is considered to be the lowest level language, or L0, an L1 can be imagined that combines L0 commands together into more convenient shortcuts.
  • An L1 shortcut will need to be either translated or interpreted into L0.

Translation and Interpretation

  • If the L1 program is first converted entirely into equivalent L0 code before execution, this is called translation. The translator is a program written in L0.
  • If instead, the program in L0 takes each L1 instruction one at a time and executes the equivalent L0 instructions, this is called interpretation. The L0 program is an interpreter.
  • L1 can be considered as the machine language for a virtual machine M1.
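
To make the distinction concrete, here is a minimal interpreter sketch in C for a made-up L1 with four opcodes; the loop takes each L1 instruction one at a time and carries out equivalent host-language operations. A translator would instead convert the whole program first and only then run the result.

    #include <stdio.h>

    /* A made-up L1: four opcodes, each interpreted by host code. */
    enum { PUSH, ADD, PRINT, HALT };

    int main(void)
    {
        int program[] = { PUSH, 2, PUSH, 3, ADD, PRINT, HALT };
        int stack[16], sp = 0, pc = 0;

        for (;;) {                      /* The interpretation loop.   */
            switch (program[pc++]) {
            case PUSH:  stack[sp++] = program[pc++];        break;
            case ADD:   sp--; stack[sp - 1] += stack[sp];   break;
            case PRINT: printf("%d\n", stack[sp - 1]);      break;
            case HALT:  return 0;
            }
        }
    }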

Additional Levels

  • In order to build a practical interpreter or translator, L1 cannot be too different from L0. This means that L1 may still be far from ideal for programming with ease.
  • So an L2 language can be constructed that is more user-friendly.
  • And this process can be repeated to create higher level (or layer) languages. Levels and virtual machines are interchangeable ideas.

Multilevel Machines

  • Level 5
    Problem-oriented language level.
    For example, C, C++, Java and so on.
  • Level 4
    Assembly language level.
    Compiled from Level 5.

    ----
    Levels 1 - 3 are the domain of system programming and system software.
    ----

  • Level 3
    Operating system (OS) machine level.
    Assembled from Level 4.
    Some instructions are carried out directly by the L2 microprogram; others are interpreted by the OS.
    System calls are L3 instructions that differ from those in L2.
  • Level 2
    Instruction set architecture (ISA) level.
    Partial interpretation from Level 3.
    Microprogram or hardware execution circuit interpreter.
  • Level 1
    Microarchitecture.
    Interpretation or direct execution from Level 2.
    Collection of registers (built from 16, 32 or 64 1-bit memories, in turn built from gates).
    Also the arithmetic logic unit (ALU) circuit.
  • Level 0
    Digital logic level.
    Hardware. Logical gates consisting of a few transistors.

The Instruction Set Architecture (ISA) Level

The Intel Instruction Set

The complete Intel x86 instruction set is here and the official reference guides are at Intel. You can find the x86-64 architecture here.

If you write a program in the low-level instructions, it is likely to be much more efficient but you can only run the program on that particular hardware.

If you write a program in a high level language, the same program can be (re)compiled into the low-level instruction set of a different machine. This is an automated process and much easier than having to re-write the program from scratch.

One high level instruction such as:

x = x + y

Might translate into low-level instructions something like:

  • Find the memory location represented by x.
  • Read from memory into a register.
  • Do the same with y into another register.
  • Carry out various arithmetic operations.
  • Retrieve the results from some further register.
  • Write back into a memory location.
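
You can watch this translation happen with a C compiler. A small sketch, assuming gcc is available: compiling with gcc -S stops after translation and writes the generated assembly to a .s file for inspection.

    #include <stdio.h>

    int main(void)
    {
        int x = 5, y = 7;
        x = x + y;     /* One high-level statement; "gcc -S" on this */
                       /* file shows the loads, the add and the      */
                       /* store it becomes on your machine.          */
        printf("%d\n", x);
        return 0;
    }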

BIOS

  • Each computer hardware manufacturer provides a Basic Input Output System (BIOS) that initialises the system when it starts, setting it up to start the full OS.
  • It also provides some support for the OS when it's up and running.

Applications

Once the OS is running, other software can be installed to run with the OS for specific applications. Common software applications include:

  • Office tools, e.g. Microsoft Office, LibreOffice, Adobe Acrobat.
  • Web browsers, e.g. Google Chrome, Firefox, MS Edge.

Computer Architecture

In this lecture, computer means the entire system, not only the CPU. The architecture described here was set out by John von Neumann in a 1945 draft report he co-authored, First Draft of a Report on the EDVAC. You can find a copy of the report thanks to MIT.

Whatever their external appearance, all machines consist of the following logical units or sections:

  • Input unit: this receives information from input devices. The information is made available to other units for processing.
  • Central processing unit (CPU): supervises the other units and executes instructions.
  • Arithmetic and logic unit (ALU): carries out calculations and comparisons.
  • Memory unit: usually utilising random access memory (RAM).
  • Output unit: takes processed information and places it on various output devices. This allows the information to be used outside the computer.
  • Secondary storage unit: long-term storage, e.g. disk drives, solid state storage, tape, cloud.

Input Devices

  • Keyboard.
  • Mouse.
  • Microphone.
  • Camera.
  • Touchpad.
  • Internet/network.

Output Devices

  • Screens, projectors and printers.
  • Speakers.
  • Braille readers.
  • Internet/network.

CPU

  • Similar to the data processor or filing clerk from the Feynman lecture.
  • Accesses instructions sequentially.
  • Supervises the operation of other units, i.e. instructing input to move information to memory.
  • Built with silicon chips that now have billions of transistors:
  • The Intel 8088 chip that launched the IBM PC in 1981 had about 29,000 transistors.
  • Intel Pentium 4s of the early 2000s had about 55,000,000 transistors.
  • Intel 11th generation Core chips have about 250,000,000 transistors in an area of about 1 mm², meaning that the transistor count is in the tens of billions per processor.

ALU

  • Performs the arithmetic look-ups - actually calculations - from the Feynman story.
  • Addition, subtraction, multiplication, division and equality.

Primary Memory

  • Also known as simply memory.
  • Used by the CPU during processing.
  • Stores information from input unit for processing and processed information.
  • Relatively low capacity, although these days most machines have memories of at least a few gigabytes.
  • Usually random access memory (RAM), meaning that the things stored can be accessed in any order; you don't have to start at the first location and work through until you find what you're looking for.
  • Back in the 80s, working memory was measured in kilobytes.

What are Operating Systems?

Definition

Operating systems turn ugly hardware into beautiful abstractions.

- Andrew Tanenbaum, Modern Operating Systems.

  • The most basic software, designed to operate the machine and provide a platform for other software packages to run.
  • They manage hardware resources and manage the other programs.
  • They are written for particular CPUs or CPU types.

Functions

  • Process management.
  • Inter-process communication (IPC) and networking.
  • I/O and file systems.
  • Memory management.
  • Thread support.
  • Security, i.e. safeguarding stability and system access.

Contemporary OSes

  • Microsoft Windows.
  • Linux.
  • Other Unix-like systems, e.g. Mac OS.
  • iOS.
  • Android.
  • Chrome OS.

Design

  • An operating system is a large piece of software. The Linux kernel has about 30 million lines of source code. Windows contains approximately 50 million lines of source code.
  • Linux and Windows are monolithic kernels; although the kernel is composed of a number of modules, they are combined into a single program.
  • The kernel is the core part of an operating system. It's one of the first programs loaded at startup and remains in memory as long as the machine is in operation.
  • Unix (and Linux) is written in C, with some parts in assembly language.

Evolution of OSes

  • Early machines could only handle one job or task at a time, i.e. single user batch computing.
  • In batch computing, the computer runs a single program at a time but data is processed in groups or batches.
  • Often, programs were written on punched cards.
  • Operating systems developed to make switching between programs easier.
  • As computing evolved, resources were used more efficiently by allowing multiple programs to run at once, sharing the computer's resources; this is called multiprogramming.
  • Multiprogramming, still on large central machines, allowed universities to develop timesharing operating systems. These allowed many users to use the system at once.
  • In timesharing, the operating system switches access to resources between users very many times per second. This gives the illusion of simultaneous operation.
  • Timesharing meant that users got results from programs faster.

CPU Modes

User and Kernel

The CPU operates in two modes:

  1. User mode: subset of instructions available.
  2. Kernel mode: all instructions available. Also known as system or monitor mode.

In user mode, the CPU can execute only a subset of its instruction set; the dangerous instructions that might crash the machine are disallowed for security reasons.

In kernel mode no such restriction applies; the full instruction set, including dangerous instructions, is available to the CPU.

Scheduling

The kernel is always in memory, i.e. it is not scheduled. The kernel operates in kernel/system mode and includes:

  • Memory management.
  • Process scheduling.
  • Device I/O.

Other parts of the OS are scheduled. They are loaded in and out of memory. These operate in user mode and include:

  • Command line/shell.
  • GUIs.
  • Other utilities.

Regular applications are all scheduled and operate in user mode.

Application Programming

  • Application programmers can access kernel mode only via trusted services, i.e. system calls.
  • Linux system calls are written in the C language; the simplest way of invoking them is from a C program.
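
As a minimal sketch: the same bytes written twice, once through the C library's write() wrapper and once through glibc's generic syscall() interface.

    #define _GNU_SOURCE
    #include <unistd.h>          /* write(), syscall() */
    #include <sys/syscall.h>     /* SYS_write          */

    int main(void)
    {
        /* The usual route: the C library wrapper around the syscall. */
        write(1, "hello via write()\n", 18);

        /* The direct route: name the syscall number yourself. */
        syscall(SYS_write, 1, "hello via syscall()\n", 20);
        return 0;
    }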

Interfaces

  • OSes have two primary interfaces, the hardware interface and the software interface.
  • Applications talk to the operating system by invoking the system services it provides.
  • This is called system interface programming, i.e. writing code that makes system calls using the software interface.
  • The hardware interface is where the operating system meets the hardware and carries out input and output (I/O) for the user.

The Software Interface

System Calls

  • A system call, or syscall, is a request made to the kernel for access to a resource.
  • Compilers translate high-level language I/O requests into read and write system calls.
  • System services are what we can ask of the operating system and are sometimes referred to as the software instruction set.
  • Assembly instructions define interaction with the CPU.
  • There are six categories of system call.

System Call Categories

  1. Process control.
  2. File management.
  3. Device management.
  4. Information maintenance.
  5. Communication.
  6. Protection.

Process Control

  • Create, e.g. fork.
  • Terminate.
  • Load and execute.
  • Get and set attributes.
  • Wait for time or event. Signal event.
  • Allocate and free memory.
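
A quick sketch of the first few of these calls on a Unix-like system (create, terminate, wait):

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>          /* fork(), getpid() */
    #include <sys/wait.h>        /* waitpid()        */

    int main(void)
    {
        pid_t pid = fork();            /* Create: one process becomes two. */
        if (pid < 0) {
            perror("fork");
            exit(1);
        }
        if (pid == 0) {                /* Child runs here.                 */
            printf("child, pid %d\n", getpid());
            exit(0);                   /* Terminate.                       */
        }
        waitpid(pid, NULL, 0);         /* Wait for event: the child exits. */
        printf("parent: child %d finished\n", (int) pid);
        return 0;
    }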

File Management

  • Create, delete.
  • Open, close.
  • Read, write, reposition.
  • Get and set attributes.
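
The same calls sketched in C, using a hypothetical file demo.txt:

    #include <fcntl.h>           /* open()                            */
    #include <unistd.h>          /* read(), write(), lseek(), close() */

    int main(void)
    {
        /* Create/open (the pathname is just for illustration). */
        int fd = open("demo.txt", O_CREAT | O_RDWR | O_TRUNC, 0644);
        if (fd < 0)
            return 1;

        char buf[6];
        write(fd, "abcdef", 6);          /* Write.      */
        lseek(fd, 0, SEEK_SET);          /* Reposition. */
        read(fd, buf, sizeof buf);       /* Read.       */
        close(fd);                       /* Close.      */
        return 0;
    }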

Device Management

  • Request, release.
  • Read, write, reposition.
  • Get and set attributes.
  • Attach or detach.

Information Maintenance

  • Get and set system information, e.g. time, date.
  • Get and set process/file/device metadata.

Communication

  • Create, delete connection.
  • Send, receive messages.
  • Transfer status information.
  • Attach or detach remote devices.

Protection

  • Get and set file permissions.
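
On a Unix-like system, getting and setting permissions looks like this sketch (again with a hypothetical demo.txt):

    #include <stdio.h>
    #include <sys/stat.h>        /* stat(), chmod() */

    int main(void)
    {
        struct stat st;
        if (stat("demo.txt", &st) == 0)               /* Get permissions. */
            printf("mode: %o\n", st.st_mode & 0777);
        chmod("demo.txt", 0600);         /* Set: owner read/write only.   */
        return 0;
    }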

Making a Read System Call

Invoking the Call

  • To invoke the read system call, a program will write to registers with parameters that the kernel requires for that call.
  • A register is a small amount of fast storage close to the CPU.
  • Four 16-bit registers, AX, BX, CX and DX were on board the original x86 CPUs.
  • The EAX register indicates the system call to be invoked, i.e. the syscall number is written to the EAX register.
  • This is a general-purpose 32-bit location and an extended (E) version of the original AX register.
  • To hand over to the kernel, the program generates interrupt number 0x80 (hexadecimal; 10000000 in binary), the Linux system call interrupt.
  • Other registers already contain the information that the kernel requires to carry out the system call.
  • For read, this would include the file to read from, how much data to read and where to store it.
  • With the system call and its parameters, the program hands control to the kernel.
  • It does this by generating an interrupt.
  • This interrupt is generated by software - the running program - and not by hardware.
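
Putting the steps above together, a sketch for 32-bit x86 Linux only (build with gcc -m32; 64-bit programs use the syscall instruction instead). On i386, read is syscall number 3; EAX carries the number and EBX, ECX and EDX carry the parameters:

    #include <stdio.h>

    int main(void)
    {
        char buf[64];
        long nread;

        /* i386 Linux convention: EAX = syscall number (3 = read), */
        /* EBX = file descriptor, ECX = buffer, EDX = byte count.  */
        /* "int $0x80" raises the software interrupt; the result   */
        /* comes back in EAX.                                      */
        __asm__ volatile ("int $0x80"
                          : "=a" (nread)
                          : "a" (3L), "b" (0L), "c" (buf), "d" (sizeof buf - 1));

        if (nread > 0) {
            buf[nread] = '\0';
            printf("read %ld bytes: %s", nread, buf);
        }
        return 0;
    }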

Kernel Mode

  • When the interrupt occurs the kernel takes over and the CPU switches to kernel mode.
  • The kernel looks up how to handle interrupt number 0x80 in its Interrupt Descriptor Table (IDT).
  • This leads the kernel to the associated system call handling code.
  • The kernel then looks up how to handle this particular system call, i.e. read, in its System Call Table (SCT).
  • The kernel executes the code that handles this system call before returning to continue with the rest of the program.


(Figure: system call flow.)

Interrupts

Interrupt Categories

The three types of interrupt are:

  • Software, generated by system calls, discussed above.
  • Hardware, generated by external devices.
  • Exceptions, generated by the CPU when errors happen.

Why Interrupts?

  • Interrupts allow the OS to maintain control.
  • The CPU processes instructions one at a time; the OS must be able to retake control from whatever process is running.
  • Interrupts alert the OS to a process requiring attention.

Hardware Interface

  • How are data and commands transferred between attached hardware devices and the operating system?
  • To operate a device we need two registers, status and data.
  • The operating system reads the status register to check the device status, e.g. it must be in a READY state before sending it commands.
  • The operating system writes to the status register to send commands such as READ and WRITE to the device.
  • The data register is used as storage when transmitting data to/from the device.
				    CPU           Main Memory
				     |                |
				     ------------------
				             |
				          Data Bus
				             |
				      Status / Data
				             |
				       Device (e.g. Disk)
				

Programmed I/O

The simplest approach to performing I/O is known as programmed I/O; this is where the CPU does all the I/O work. For example, reading 1 MB of data from the disk using programmed I/O happens like this:

  1. CPU issues a READ command to the device.
  2. CPU reads the status register.
  3. If status != READY, go to 2.
  4. CPU reads a byte from the data register.
  5. CPU stores the byte in main memory.
  6. If not done, go to 1.
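
In C, the six steps might look like the sketch below. The register addresses are invented for illustration; volatile stops the compiler optimising the polling away:

    #include <stdint.h>

    /* Hypothetical memory-mapped device registers. */
    #define DEV_CMD    (*(volatile uint8_t *)0x40001008)
    #define DEV_STATUS (*(volatile uint8_t *)0x40001000)
    #define DEV_DATA   (*(volatile uint8_t *)0x40001004)
    #define CMD_READ   0x02
    #define READY      0x01

    /* Programmed I/O: the CPU itself moves every byte. */
    void pio_read(uint8_t *dst, int n)
    {
        for (int i = 0; i < n; i++) {
            DEV_CMD = CMD_READ;              /* 1. Issue READ.              */
            while ((DEV_STATUS & READY) == 0)
                ;                            /* 2-3. Poll until READY.      */
            dst[i] = DEV_DATA;               /* 4-5. Copy byte into memory. */
        }                                    /* 6. Next byte until done.    */
    }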

Problems with Programmed I/O

  • The repeated polling of the device status register is inefficient and wastes CPU cycles.
  • The device has no way of attracting the CPU's attention; this means that the CPU must poll all devices to check if they have something interesting to say, e.g. to check if the network card has received any new packets.
  • A more sophisticated approach to I/O is required that allows devices to alert the CPU when they require attention.
  • The way a device alerts the CPU is by generating an interrupt; this more efficient approach is called interrupt-driven I/O.
  • Decoupling of I/O and computation was a major advance for operating systems.

Interrupt-Driven I/O

In this case, when a device needs attention:

  • It alerts the CPU with an interrupt.
  • The CPU finishes executing the current instruction.
  • The CPU saves its current context.
  • The CPU asks the interrupting device to identify itself and acknowledges the interrupt so that the device can stop interrupting.
  • The CPU runs the interrupt service routine of the corresponding device.
  • The saved context is restored and the CPU continues from where it left off before being interrupted. (A sketch of a simple service routine follows the diagram below.)
				 Interrupt line to CPU from device
				 |
				 -->CPU           Main Memory
				 |   |                |
				 |   ------------------
				 |           |
				 |        Data Bus
				 |           |
				 |    Status / Data
				 |           |
				 ---- Device (e.g. Disk)
				
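
A sketch of what a service routine might look like, reusing the invented registers from the programmed I/O sketch plus an acknowledge register; how the routine is registered with the hardware is platform-specific and omitted:

    #include <stdint.h>

    /* Hypothetical device registers, as before. */
    #define DEV_DATA (*(volatile uint8_t *)0x40001004)
    #define DEV_ACK  (*(volatile uint8_t *)0x4000100C)

    static uint8_t buffer[512];
    static volatile unsigned head;

    /* Runs only when the device raises an interrupt, so there is */
    /* no polling loop; the CPU is free between interrupts.       */
    void disk_isr(void)
    {
        DEV_ACK = 1;               /* Acknowledge: stop the device interrupting. */
        buffer[head] = DEV_DATA;   /* Move one byte into memory.                 */
        head = (head + 1) % sizeof buffer;
    }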

Interrupt Numbers

How does the CPU know which interrupt service routine to run?

  • The CPU uses the interrupt number.
  • This was sent back when the CPU asked the interrupting device to identify itself.
  • The interrupt number is used as an index into the Interrupt Descriptor Table (IDT), where it finds a pointer to the code for handling that interrupt; a miniature version of this dispatch is sketched after this list.
  • That code is in the device driver for the interrupting device.
  • Until it has to process an interrupt, the CPU can be getting on with useful work; the needless polling of devices is avoided.
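
The dispatch idea in miniature: an ordinary C array of function pointers standing in for the IDT, with a made-up vector number for the disk:

    #include <stdio.h>

    #define NVECTORS 256

    typedef void (*handler_t)(void);

    static void unexpected(void)   { puts("unhandled interrupt"); }
    static void disk_handler(void) { puts("disk driver runs"); }

    /* The table: interrupt number in, handler code out. */
    static handler_t idt[NVECTORS];

    static void dispatch(int irq)
    {
        idt[irq]();                 /* Index the table, jump to the handler. */
    }

    int main(void)
    {
        for (int i = 0; i < NVECTORS; i++)
            idt[i] = unexpected;
        idt[14] = disk_handler;     /* Install the device driver's routine.  */
        dispatch(14);               /* Simulate the interrupt arriving.      */
        return 0;
    }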

Interrupt Priority

Are all interrupts equal?

  • No, interrupts have different priority levels.
  • When servicing an interrupt the CPU runs at the corresponding interrupt level and cannot be interrupted by an equal or lower priority interrupt.
  • Blocking is handled by the programmable interrupt controller.
  • The system clock has the highest priority interrupt; it fires regularly, e.g. every 10 milliseconds, allowing the operating system to regain control of the system.
  • Lower priority interrupts can be interrupted by higher priority interrupts, e.g. an audio device interrupt will interrupt the handling of a disk interrupt.

Problems with Interrupt-Driven I/O

Consider playing an MP3 file through an audio device:

  • Each time the audio device runs out of data it will interrupt the CPU demanding more data be supplied.
  • If the data is not supplied in time, the result is audible glitching in the output audio stream.
  • With a single data register, the audio device will very regularly run out of data and the CPU will be forced to deal with a lot of interrupts.
  • Tying the CPU up in the mundane task of simply moving data between memory and the device is wasteful.
  • Can we do better? Can we free the CPU of this mundane task?

Direct Memory Access

I/O devices are much slower than the CPU. If a device delivers, say, one character per second and the CPU takes 100 microseconds or less to put that character into memory, then for 999,900 microseconds of each second the CPU is waiting.

  • Using Direct Memory Access (DMA), the CPU programs a DMA controller to do the transfer.
  • Device controllers operate in parallel.
  • The CPU specifies how much data to read/write and where to read it from/write it to.
  • The DMA controller does the transfer in the background, in parallel.
  • It interrupts the CPU only when the entire I/O operation is complete.
  • In the above example the CPU would program the DMA controller to read the entire MP3 file from memory and write it to the audio device.
  • A DMA controller may be shared by a number of devices or may be dedicated to one specific device.
  • While DMA is underway, the CPU can get on with useful work; a register-level sketch follows the diagram below.
			         Disk              Printer
			         |                 |
			         Disk Controller   Printer Controller
			         |                 |
			     CPU -------------------
			         |
			         Memory Controller
			         |
			         Memory
			    
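
A register-level sketch of programming such a controller; all the registers and addresses are invented for illustration:

    #include <stdint.h>

    /* Hypothetical DMA controller registers. */
    #define DMA_SRC   (*(volatile uint32_t *)0x40002000)  /* Device address. */
    #define DMA_DST   (*(volatile uint32_t *)0x40002004)  /* Memory address. */
    #define DMA_COUNT (*(volatile uint32_t *)0x40002008)  /* Bytes to move.  */
    #define DMA_CTRL  (*(volatile uint32_t *)0x4000200C)
    #define DMA_GO    0x01

    /* The CPU programs the transfer once and returns to useful work;  */
    /* the controller interrupts only when the whole transfer is done. */
    void dma_read(uint32_t device_addr, void *dst, uint32_t nbytes)
    {
        DMA_SRC   = device_addr;
        DMA_DST   = (uint32_t)(uintptr_t)dst;
        DMA_COUNT = nbytes;
        DMA_CTRL  = DMA_GO;    /* Start; completion arrives as an interrupt. */
    }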

References

  • Histories of Unix include: UNIX Past at Unix.org, History of Unix by Jeff Lessem at the University of Colorado, and The invention of Unix at (Nokia) Bell Labs.
  • Structured Computer Organization by Andrew S. Tanenbaum.
  • Operating Systems: Design and Implementation by Andrew S. Tanenbaum; sold with Minix, his Unix-like system developed for learning.
  • Computers by K.N. Dodd, first published in 1966.