Type Class Project: Online Classes, computer science, mooc, operating systems

# Introduction to Operating Systems (Link)

• Georgia Tech: CS8803

Recommended textbooks:

• Operating System Concepts by Silberschatz, Galvin, and Gagne
• Modern Operating Systems by Tanenbaum and Bos
• Operating Systems: Three Easy Pieces by Arpaci-Dusseau

Basically, you need to know a little C and be comfortable with command-line linux stuff.

## 3: P1L2: Introduction to Operating Systems

• 0:02:28: switching between user and kernel mode is not cheap

## 4: P2L1: Processes and Process Management

A brief and fairly general overview of virtual memory and process scheduling, and a mention of IPC.

## 5: P2L2: Threads and Concurrency

On synchronization, mutexes, signals.

The lecture goes over mostly the same material as this report (as is mentioned in the introductory video).

pthread_t aThread;

void * (*start_routine)(void *),
void *arg);



The pthread_attr_t item contains settings for how the thread is to be created.

/* PThread Creation */

#include <stdio.h>

void *foo (void *arg) { /* thread main */
printf("Foobar!\n");
}

int main (void) {

int i;

return 0;
}

// locks the mutex if it is unlocked, but otherwise returns immediately
// rather than blocking as pthread_mutex_lock does


Conditions are like Birrell:

int pthread_cond_init(pthread_cond_t *cond, const pthread_condattr_t *attr);



## 7: Problem Set 1

Write a multi-threaded C program that gives readers priority over writers concerning a shared (global) variable. Essentially, if any readers are waiting, then they have priority over writer threads – writers can only write when there are no readers. This program should adhere to the following constraints:

• Multiple readers/writers must be supported (5 of each is fine)
• Writers must write the shared variable X number of times
• Writers must print:
• The written value
• The number of readers present were when value is written (should be 0)
• Before a reader/writer attempts to access the shared variable it should wait some random amount of time
• Note: This will help ensure that reads and writes do not occur all at once

### Simple Socket: Client

Write a simple C program that creates, initializes, and connects a client socket to a server socket. You should provide a way to specify the connecting server address and port. This can be hardcoded or passed via the command line.

### Simple Socket: Server

Write a simple C program that creates and initializes a server socket. Once initialized, the server should accept a client connection, close the connection, and then exit.

You should provide a way to specify the server's listening port. This can be hardcoded or passed via the command line.

### The Echo Protocol

The Echo Protocol

In C, write a server and client that implement the fictitious "echo protocol". To implement the protocol, the server should accept any string from the client, and then return that string with all letters capitalized (if letters exist).

Echo Protocol Example

• Client sends "Hello, wOrlD"
• Server echoes "HELLO, WORLD"

As soon as the server responds to a client, it may close. And, as soon as the clients receives a response, it may close. Feel free to reuse portions of the Simple Socket: Client and Simple Socket: Server programs.

## 8: P2L4: Thread Design Considerations

### Interrupts and signals

Interrupts are generated externally to the CPU–by timers, IO devices, or other CPUs, for example.

Signals are triggered within the CPU, such as software which is running.

Interrupts no longer come across dedicated wires (as they were on e.g. the 6502); they are sent through the same mechanism as other data (e.g. over the PCIe bus), and are associated with an MSI number.

## 9: P2L5: Thread Performance Considerations

Discusses Flash: An Efficient and Portable Web Server during the first half, then discusses the design of experiments during the second half.

## 11: P3L1: Scheduling

The simplest scheduler would be a run-to-completion scheduler: once a task is scheduled, it is allowed to run to completion without preemption. The algorithm used to determine in which order to schedule tasks can have substantial impact on performance metrics.

• First-come first-serve (FCFS)
Schedule tasks in the order they arrive. A FIFO queue would be a suitable structure for the runqueue.
• Shortest job first (SJF)
Schedule tasks in order of execution time (which must be known in advance). The runqueue should be an ordered structure.

However, since we may care about responsiveness (i.e. wait time), we might prefer a preemptive scheduler: the scheduler will interrupt running tasks if higher priority (in SJF, shorter completion time) tasks arrive.

### Priority scheduling

With a priority scheduler, there is a danger of starvation–low-priority tasks may be preempted indefinitely by higher-priority tasks. This can be avoided by implementing priority aging: as a task becomes older, its priority effectively increases, so even low-priority tasks will eventually run.

If a high-priority task needs a resource that is locked by a low-priority task, it may not be able to complete until the lower-priority task releases the lock. However, other medium priority tasks may preempt the low-priority task, delaying the high-priority task's execution indefinitely. This is called priority inversion. It can be solved by (temporarily) assigning to the low-priority task holding the lock the priority of the high-priority task, in order to allow it to execute until the lock is released, freeing the resource for the use of the high-priority task.

### Round robin scheduling

Tasks can yield their spot on the CPU when e.g. waiting on I/O, moving to the back of the queue. The scheduler switches when tasks complete or yield.

The scheduler may also implement timeslicing: each task is given a certain amount of time to execute, after which it will be preempted by the next task (and placed at the back of the queue). This allows timesharing of CPU-bound tasks.

The length of a timeslice should be much larger than the time to perform a context switch.

### Multi-level feedback queue (MLFQ)

Created by Fernando Corbato. Multiple queues are maintained with difference timeslice lengths, and tasks are pushed up or down the levels depending on whether they use up the whole timeslice or not.

### Linux O(1) scheduler

Introduced in linux 2.5.

Performs scheduling in constant time w.r.t. task count. Maintains queues in 140 priority levels, with user tasks occupying 100-139 (0-indexed), defaulting to 120.

### Linux completely fair scheduler (CFS)

Introduced in kernel version 2.6.23.

Implemented with a red-black tree. Selects tasks in O(1) time, adds new tasks in O(log n). Tasks have a vruntime which records how much CPU time they have consumed, scaled by priority–low-priority tasks have their vruntime incremented more quickly than high-priority tasks. The task with the lowest vruntime will be selected to run. This ensures that tasks execute, in the long run, for a fraction of time proportional to their priorities.

### Scheduling on multi-CPU systems

We wish to schedule tasks on the same CPU each time, to improve cache performance. This is processor-affinity.

CPUs that support hyperthreading provide additional registers to hold the state of tasks that are not currently running, to reduce the cost of context switching.

Chip multithreading systems need a new operating system scheduler is discussed.

## 12: P3L2: Memory Management

Modern hardware has a memory management unit (MMU) as part of the CPU package that is responsible for handling translation of virtual to physical memory addresses. The MMU may contain a translation lookaside buffer (TLB) that speeds up the process by caching valid translations.

## 13: P3L3: Inter-Process Communication

IPC mechanisms may be message-passing or memory-based.

### Message-passing

Using message-passing methods requires to OS to mediate communication, so there will be substantial overhead due to the required system calls.

Message-passing IPC methods:

• sockets
• pipes
• message queues

### Memory-based

Memory-based IPC methods:

• shared memory
• memory-mapped files
• RPC

## 14: P3L4: Synchronization Constructs

### Hardware support

Hardware support is needed to implement synchronization constructs. The hardware must support atomic instructions. These will perform some group of instructions as a whole or not at all, serializing operations from multiple threads to ensure consistency. Different hardware platforms support different atomic instructions, so synchronization constructs must be ported to use the correct low-level procedures for each platform.

### Semaphores

A semaphore is like a mutex with an integer value. A maximum value is assigned when the semaphore is initialized, and each thread using the resource protected by the semaphore will test if the value of the semaphore is non-zero (wait), and proceed if so, decrementing the semaphore. When a thread completes, it should increment the semaphore (post). A semaphore with a maximum value of 1 is equivalent to a mutex.

#include <semaphore.h>

sem_t sem;
sem_init(sem_t *sem, int pshared, int count);
sem_wait(sem_t *sem);
sem_post(sem_t *sem);


If data will not be mutated, it is safe to access it with multiple readers. If it will be mutated, it must be exclusively locked for a single writer. These may also be called shared and exclusive locks.

#include <linux/spinlock.h>

rwlock_t m;
// critical section

write_lock(m);
// critical section
write_unlock(m);


### Spinlocks

Read The Performance of Spin Lock Alternatives for Shared-Memory Multiprocessors.

The test_and_set spinlock:

spinlock_init(lock):
lock = free;
spinlock_lock(lock):
while(test_and_set(lock) == busy);
spinlock_unlock(lock):
lock = free;


This will work quickly, but generate lots of contention. It's very inefficient.

An alternative, the test_and_test_and_set spinlock.

spinlock_lock(lock):
while((lock == busy) OR (test_and_set(lock) == busy))


This modification allows us to make use of caches, only using the atomic instructions that bypass cache if the cache indicates the lock is free. This isn't too bad if we have write-update cache behavior, but it will not be good otherwise.

## 15: P3L5: I/O Management

There are several general categories of device:

• character devices support getting/putting a single character at a time, such as a keyboard or TTY
• network devices may produce and consume data in blocks, but do not provide random access

The CPU can communicate with devices via either interrupts or polling.

With programmed I/O (PIO), the CPU can directly control a device by writing to and reading from its registers. For example, the CPU could command a NIC to send a packet, writing the data into the NIC's data register (repeatedly, until the whole packet has been sent).

Direct memory access (DMA), on the other hand, allows devices to get data 'directly' from memory, without the CPU having to loop through a bunch of store instructions. Instead, the CPU, for example, notifies the NIC to send a certain amount of data located at a certain place in memory, and the NIC then manages this with the aid of the DMA controller. This is more complex than PIO, and may be less efficient for smaller amounts of data.

## 16: P3L6: Virtualization

• Virtual machine monitors: current technology and future trends

### Protection levels (rings)

x86 has 4 rings: ring 3 has the lowest privilege, and ring 0 has the highest privilege. User apps run in ring 3, and the OS runs in ring 0.

### Processor virtualization

#### Trap and emulate

Non-privileged guest instructions are run directly by the hardware, so there is not a speed penalty. Privileged instructions are trapped by the hypervisor, which emulates the instruction for the guest (or rejects the instruction, terminating the VM).

This did not work on pre-2005 x86 CPUs because there were 17 privileged instructions that would not trap to the hypervisor, and would instead fail silently.

#### Binary translation

Pioneered by Mendel Rosenblum of VMware. Dynamically modify the instructions to avoid running invalid instructions.

#### Paravirtualization

Originally used on Xen. The guest is modified so it knows it is running on a VM. Therefore, the guest can explicitly call the hypervisor when needed.

### Memory virtualization

The obvious way to provide memory to the guest is to provide the guest with virtual memory, just as the guest provides virtual memory to applications. But this requires double translation of memory addresses in order to access memory.

A faster option is for the hypervisor to intervene when virtual memory is mapped by the guest, in order to directly map virtual addresses to machine addresses. This allows the VM to take advantage of the hardware virtual memory support.

Another alternative is to use paravirtualized memory mapping. Then the guest OS can help coordinate mapping virtual memory addresses to machine addresses.

Modern hardware provides support for these to improve efficiency.

### Device virtualization

#### Passthrough model

The guest is provided with exclusive access to the device. The drive runs in the guest space and it has full control of the device. This makes sharing devices between multiple guests difficult. Additionally, since the VM expects a specific device to be present on the physical hardware (since it has a driver for that device), migration to other machines is inconvenient.

#### Hypervisor-direct model

The hypervisor intercepts all device accesses. Then the driver for the real hardware is owned by the hypervisor, which performs operations on behalf of the guest.

#### Split device driver model

A specialized driver runs in the guest which provides a device API, but does not directly control any hardware. It then communicates with a real device driver running in the service VM (or on the host, for type 2 VMs).

### Hardware virtualization

Around 2005, AMD and Intel began providing hardware features to support virtualization: AMD Pacifica and Intel Vanderpool Technology (Intel-VT).

## 17: P4L1: Remote Procedure Calls

An Interface Definition Language (IDL) is used to describe the capabilities and requirements of a server providing an RPC interface.

Name Role