CA644, System Software

C, Part 1 - The System Language

Dr. Niall McMahon

2022-10-11

If you print these slides, think about using two pages per sheet although don't worry too much about it!

Credits

Dr. Niall McMahon
Drawing on previous work by:
Dr. Michael Scriney
Dr. Long Cheng
And sources credited in the references.
Autumn 2022.

Introduction

System Programs

  • System programs are those that work at the interfaces between applications and hardware. Examples include:
  • Operating systems.
  • Microcontrollers, e.g. dedicated systems used in airplanes, cars, trains, factories etc.
  • Embedded processors, e.g. in portable electronics.
  • Digital signal processing, e.g. analogue to digital conversion and so on.

Origins of C

  • C was designed to create system programs; its structure is close to the underlying machine instructions, i.e. it is a low-level, high-level language.
  • Originally developed by Dennis Ritchie at AT&T Bell Labs in 1972 to build utilities for Unix.
  • Unix was rebuilt using C in 1973.
  • The most recent version of the standard is C17, published in 2018; it is a bug-fix revision of C11, which was released in 2011.
  • Hardware drivers etc. are often written in C.

Development of C

  • Derivatives of C include C++, C# and others, and it has influenced many other languages, including Python.
  • It is fast but lacks some of the features of later languages, e.g. object-oriented structures, exception handling and more.
  • C is a relatively low-level language with few built-in safety checks and can cause system problems if used incorrectly.
  • In real life, programs ought to be tested in a debugging environment first and never run as root on a Unix-like system.

Machine Code

  • As we discussed before, machine code refers to the numeric instruction set; this is specific to each processor architecture.
  • Assembly is a human-readable version of machine code that uses symbolic names (mnemonics) to refer to instructions.

Using C

Basics

  • Programs are written in text files with a .c extension, e.g. hello_world.c.
  • gcc (GNU Compiler Collection) is used to compile the program so that it can be run.
  • Internally, gcc first produces an object file, conventionally with the extension .o. The object file is then linked to build the executable file.
  • The syntax is: gcc hello_world.c -o hello_world.
  • The -o flag names the output file; here it is the executable, which can be run using the command ./hello_world.
  • Adding the -Wall flag, i.e. gcc -Wall hello_world.c -o hello_world, enables (most) compiler warnings.

Debugging

  • Programs can be run and debugged using the command-line debugger, gdb.
  • Compile with the -g flag to include debugging information, i.e. gcc -g hello_world.c -o hello_world; the program can then be debugged by using: gdb ./hello_world.
  • I'll provide links to online/offline resources with more about gdb.

Steps in Building C Programs

  • Preprocess: gcc -E hello_world.c > hello_world.i. The -E flag stops after the preprocessor has run.
  • Compile: gcc -S hello_world.i. The -S flag stops after compilation and creates assembly code output, a .s file.
  • Assemble: gcc -c hello_world.s. The -c flag stops after assembly and creates an object file, a .o file. This is machine code.
  • Link: gcc -o hello_world hello_world.o. The linker creates the executable by finding the missing functions, for example, printf() and scanf(), in the standard C library, libc; the -o flag names the output file.
  • You can create the executable in one go using gcc -o hello_world hello_world.c.

IDEs

  • We will create and edit programs using Vim or Nano, or a text editor.
  • When developing seriously, programmers use Integrated Development Environments (IDEs), e.g. NetBeans (for Java), Eclipse and others.
  • An IDE is a single application that allows code editing, compilation and debugging all in one place, usually with a nice GUI.
  • Complex programs benefit most from IDEs.

Hello World

/* Hello world! */
/* This is a comment
that spans
multiple lines. */
//This is a single line comment.
//Comments, as usual, are ignored by the compiler.

#include <stdio.h>
/* stdio.h is a header file that declares input and output functions, including printf(). */

int main(void)
{
    printf("Hello World!\n");

    return 0; //Return 0 means success.
}

Header Files

Commonly used header files include:

  • stdlib.h: defines four variable types, macros and functions. General purpose.
  • stdio.h: standard input and output definitions, macros and functions.
  • string.h: type definition, one macro and functions for manipulating character arrays.
  • time.h: time type definitions, macros and functions.
  • ctype.h: functions useful for testing and mapping characters, i.e. deciding which class a character falls into - if it is an alphabetic character, a control character etc.
  • math.h: math functions and one macro.

Variables

Variables are declared before use and types include:

  • int: basic integer type (16 bit minimum size; 32 bits on most current systems).
  • float: single precision floating point type (32 bits on most systems).
  • char: this is the smallest addressable unit of the machine that can contain a basic character set. It is an integer type (8 bit minimum size).

And others.

Area of a Rectangle

/* Calculate the area of a rectangle */

#include <stdio.h>

int main()
{
float height = 5.0;
float width = 15.0;
float area = height*width;
printf("Area is: %f\n", area);
}

The specifier "%f" tells printf to format area as a floating-point number; it does not cast the value.

Reading in Characters

  • getchar() is used to read a single character from standard input, e.g. typed at the command line.
  • Each character is returned as an integer; for ordinary text this is the decimal value of a 7-bit code, defined by the ASCII table.
  • The character '0' is represented by the number 48, i.e. 011 0000.

If we wish to read in a single digit as a number then the following code would be used:

int fromInput = getchar();
int myInt = fromInput - '0'; //The character '0' is 48 in ASCII.

For Loops

To print out the numbers 1 - 5, use:

int i;
for(i = 1; i < 6; i++)
{
 printf("%d\n",i);
}

The specifier "%d" tells printf to format i as a decimal integer.

Pointers

  • Useful feature of C.
  • Pointers allow access to a variable's location in memory.

Create an integer variable myint and assign it the value 5.

int myint = 5;

The next line prints the address in memory where myint is stored. The "%p" specifier is used for addresses; it expects a pointer to void, hence the cast.

printf("%p", (void *)&myint);

We can assign this address to a pointer. The declaration int * creates a pointer to an integer value.

int *newpointer;

The next line assigns the address of the variable myint to the pointer variable newpointer.

newpointer = &myint;

Printing to screen newpointer will give the same result as printing to screen the address of myint, i.e. &myint.

printf("%p", (void *)newpointer);

However, printing to screen *newpointer will give the contents of the memory location that newpointer points to, i.e. 5 in this example.

printf("%d", *newpointer);

Why are pointers useful?

Pointers let a function modify variables that belong to its caller; they are also how arrays and other large data are passed into functions in C.

Functions

  • C always passes parameters (or arguments) by value; this differs from some other programming languages.

What does this code snippet do?

#include <stdio.h>
void add(int x, int y){
x = x + 1;
y = y + 1;
}

int main()
{
int x = 2;
int y = 3;
add(x,y);
printf("X is %d, Y is %d",x,y);
}

  • The function add, above, gets its own copy of x and y.
  • The original variables in main are unchanged.
  • This is call by value.
  • To modify the original x and y, we must call by reference.
  • Use pointers for this.
  • Instead of taking the values of x and y as parameters, the function takes their addresses in memory, i.e. pointers to them.
  • This is how to modify the original values.

Call by Reference

The code is now:

#include <stdio.h>
void add(int *x, int *y){
*x = *x+1;
*y = *y+1;
}
int main()
{
int x = 2;
int y = 3;
add(&x,&y);
printf("X is %d, Y is %d",x,y);
}

In this case, the function call was:

add(&x,&y);

The function is defined as taking pointers to integers, not integers:

void add(int *x, int *y)

Libraries

The C POSIX Library is the specification of a standard library for POSIX systems; these include Unix-like systems such as Linux. The unistd.h header file provides access to the POSIX OS API, including access to system calls such as fork. The stdio.h header file declares functions that deal with standard input and output; on POSIX systems it also declares functions that are not part of standard C, e.g. fdopen(). The sys/types.h header file defines system data types such as pid_t, the type used for process IDs.

System Calls

Linux API

  • A system call is a request from a program to the kernel for a service that the kernel provides; the request is typically made through a software interrupt or a dedicated CPU instruction.
  • See The Linux Kernel API and the Linux Man Pages.

exit()

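  • The exit() call stops the program.
  • In C, returning from main has the same effect. In a program that does not use the C runtime, e.g. one written directly in assembly, leaving out the exit call lets execution run on past the end of the code and cause a segmentation error, i.e. usually trying to read or write to a memory location that is not available to the process.
  • In C, the main C library, libc, handles this call: exit() is a library function that cleans up, e.g. flushes output buffers, and then makes the underlying system call.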

getpid()

getpid(): Description

  • The function getpid() is used to return the current process id of a program.
  • Write a C program called pid.c which prints out its process id.
  • getpid returns a type pid_t; it is called like this:

    pid_t proc_id; //declare variable called proc_id of type pid_t
    proc_id = getpid();

  • Include unistd.h; it declares getpid() and gives access to POSIX system calls.
  • To print it you use printf("my process id is: %d\n",<variable>).

getpid(): Example

Make sure you can implement this:

#include <stdio.h>
#include <unistd.h>

int main(void){
pid_t proc_id;
proc_id = getpid();
printf("The process ID is: %d\n", proc_id);
return 0;
}

fork()

fork(): Description

  • The fork system call will create a new process; it does this by creating a copy of the process that calls it.
  • A single call to fork returns twice, each time with a value of type pid_t: once in the parent process and once in the child process.
    • The parent will get the process ID of the child created.
    • The child will get the number 0.
    • If no child could be created, fork returns -1 to the parent.
  • You can use this to determine if the parent or child process is running, i.e. if the return from fork() is 0, then the child process is running.

if(return_from_fork == 0){
// code to execute in child
}
else{
// code to execute in parent
}

fork(): Exercise

  • Write a program called fork.c which will create a child process and print out the process IDs of the parent and child.
  • It should not print out the process ID of 0.

fork(): Example 1

In the following code, a process is forked.

Each of the two processes then continues to run, each printing out its process ID and a number from 1 to 100.

By running this program, you will see how the processes are interleaved.

The order of printouts will change each time the program is run - the order depends on when the CPU context switches between the two processes; this is hard to predict.

#include <stdio.h>
#include <unistd.h>

#define MAXIMUM_COUNT 100

int main(void){
  pid_t pid;
  int i;

  fork(); //From here on, two processes run this code.
  pid = getpid();
  for (i = 1; i <= MAXIMUM_COUNT; i++) {
    printf("This line is from pid %d, value = %d\n", pid, i);
  }
  return 0;
}

Try this yourself.

fork(): Example 2

In the following code, the returned integer from the fork() function is used to determine if the process is the parent or child.

#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>

#define MAXIMUM_COUNT 100

int ChildCode(void); /* Declare a function that will be called if the process is the child process. */
int ParentCode(void); /* Declare a function that will be called if the process is the parent process. */
//Main function.
int main(void)
{
  //Declare variable pid of type pid_t.
  pid_t pid;
  //Call fork() and assign the value to pid.
  //Remember that 0 means the child process. The parent will have a different number.
  pid = fork();
  //If this process is the child process:
  if (pid == 0)
    ChildCode();
  //Else, if it's the parent process:
  else
    ParentCode();
  return 0;
}
 
//This is the code for the child process after forking.
int ChildCode(void)
{
  int i;
  for (i = 1; i <= MAXIMUM_COUNT; i++)
    printf("Child process. Count number: %d\n", i);
  printf("\n End child process. \n");
  return 0; //Success.
}

//This is the code for the parent process after forking.
int ParentCode(void)
{
  int i;
  for (i = 1; i <= MAXIMUM_COUNT; i++)
    printf("Parent process. Count number: %d\n", i);
  printf("\n End parent process. \n");
  return 0; // Success.
}

Try this yourself.

execl()

execl(): Definition

Once a child process has been created, the programmer can load in a new process image, i.e. a new program, for the child process to execute. This is the main reason that processes are forked. The exec family of functions is used to do this. These are:

int execl ( const char *path, const char *arg, ... );
int execlp( const char *file, const char *arg, ... );
int execle( const char *path, const char *arg, ..., char *const envp[] );
int execv ( const char *path, char *const argv[] );
int execvp( const char *file, char *const argv[] );
int execve( const char *path, char *const argv[], char *const envp[] );

The first three are execl calls, i.e. exec followed by l. On Linux, only execve is a true system call; the others are library functions built on top of it.

execl accepts a variable number of arguments as a list terminated by a NULL entry. The execv calls use an array (or vector) in place of a list of variables.

The variants of execl and execv that end in p will search for the new program file using the PATH environment variable; those that end in e accept an array of strings that sets the environment variables of the new program. This array of strings must end with NULL, i.e. a null pointer, as the last entry.

Here, we'll use execl to load a new program into the child process. execl loads a program using the following syntax:

execl("./hello_world","Hello World",(char *)NULL);

This assumes that the "Hello World" program is saved as hello_world in the current working directory. The second argument becomes argv[0] of the new program; by convention it is the program name, and most programs ignore it. The argument list is terminated by a null pointer.

execl(): Exercise 1

  • The exec system call replaces the program running in the current process with the one passed to exec; in this exercise, use execl.
  • Write a simple Hello World operation in a file called test.c.
  • It should print out Hello!.

execl(): Exercise 2

  • Write a program called exec.c that calls execl to replace the current process with the Hello! program.
  • The syntax of execl is: execl(<path to program>,<name of program>, NULL);

open()

  • Use the open system call to open a file.
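  • Include fcntl.h; this contains the flags we need.
  • The syntax is open("path to file", <flag_1>|<flag_2>); when O_CREAT is used, add a third argument giving the permissions of the new file, e.g. 0644.
  • open returns an integer, a file descriptor ID, or -1 if the file could not be opened.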
  • The flag O_CREAT will create the file if it doesn't exist.
  • The flag O_RDONLY will open the file for reading.
  • Write a program called open.c to create a file called test.txt.
  • Close the file after it is opened using the close system call: close(<file descriptor id>);

write()

  • Create a file called writer.c.
  • Open the file created before in write-only mode, i.e. O_WRONLY.
  • Use write(<file descriptor>, <string to write>, <size of the string>).

For example, write(<file descriptor>, "Hello\n", strlen("Hello\n")); here strlen is declared in string.h.

Don't forget to close the file.

read()

  • Create a file called reader.c.
  • To read from a file, we need a buffer to use as a temporary data store; this is a char array in RAM.
  • Create one that is 100 bytes in size: char *buffer = (char *) calloc(100, sizeof(char)); here calloc is declared in stdlib.h.
  • Open the file we wrote in the write() exercise, using read-only mode, i.e. O_RDONLY.
  • The syntax of read is: int bytes = read(<file descriptor id>, <buffer>, <number of bytes to read>);
  • read returns an integer, i.e. the number of bytes read, or -1 on error.
  • Ask for at most 99 bytes so that there is room for the terminator: buffer[bytes]='\0'; // this terminates the string.
  • To print out what is read, use printf("Read:\n%s\n from file\n",buffer);.

References

  • Structured Computer Organization by Andrew Tanenbaum. Tanenbaum has several other books about operating systems; his book, Operating Systems: Design and Implementation, was sold with Minix, his Unix-like system developed for learning. Linus Torvalds first mentioned Linux publicly as a Minix-related project on comp.os.minix.