OS Design: I/O optimizations


In this article, I will explain how the OS manages I/O devices. We know that I/O devices are slower than the CPU or RAM; for this reason, OS developers came up with the idea of adding special schedulers, caches, and buffers.

Windows, Linux and Mac OS

Before reading this one, I strongly recommend reading this article about processes, threads, and the dispatcher.

DMA and I/O bus

We all know that the CPU, RAM, and GPU are connected to a system bus. In hardware this is the PCI Express bus, which can transfer gigabytes of data in a very short time.

Look at the diagram below:

System bus and connected devices

But this connection scheme changed when DMA was developed.

DMA (Direct Memory Access) is a data-transfer mode in which I/O devices are not serviced directly by the CPU. Instead, I/O devices are connected to a separate I/O bus (for example, SATA). This technology increases data-transfer speed, because the CPU no longer has to handle the I/O devices' data itself.

Look at the diagram below:

DMA-supported system with I/O bus
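Application code never programs the DMA controller directly, but the same idea of letting data move without the CPU copying every byte shows up in user-space APIs. As a loose illustration (the file names here are hypothetical), Java's FileChannel.transferTo asks the OS to move file data straight to another channel; on many systems this becomes a zero-copy transfer where the kernel and the DMA-capable hardware do the work instead of the application shuffling buffers:

import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ZeroCopyExample {
    public static void main(String[] args) throws IOException {
        // Hypothetical file names, used only for illustration.
        Path source = Path.of("input.bin");
        Path target = Path.of("output.bin");

        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(target,
                     StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {

            long position = 0;
            long remaining = in.size();

            // transferTo() delegates the copy to the OS: instead of the
            // application reading bytes into its own buffer and writing them
            // back (the CPU doing all the work), the kernel can move the data
            // directly, similar in spirit to how DMA offloads transfers.
            while (remaining > 0) {
                long moved = in.transferTo(position, remaining, out);
                position += moved;
                remaining -= moved;
            }
        }
    }
}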

Caching and buffering

An HDD cache (buffer memory) is a special fast memory that stores the data used most often by processes and the CPU. Such a memory chip is located on every HDD/SSD.
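The idea of keeping the most frequently requested blocks in fast memory can be sketched in software with a small least-recently-used (LRU) cache. This is not how drive firmware is actually written; it is only a minimal Java sketch of the eviction policy such a cache might use, with hypothetical block numbers as keys:

import java.util.LinkedHashMap;
import java.util.Map;

// Minimal LRU cache: keeps the most recently used disk blocks in memory.
class BlockCache extends LinkedHashMap<Long, byte[]> {
    private final int capacity;

    BlockCache(int capacity) {
        // accessOrder = true means iteration order is least recently used first.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<Long, byte[]> eldest) {
        // Evict the least recently used block once the cache is full.
        return size() > capacity;
    }
}

A read path would first look the block number up in such a cache and only fall back to the slow platter when the block is missing.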

Imagine an HDD without a cache. We know that an HDD does not support parallel reading and writing, so when many processes ask the HDD to write something, a queue of requests builds up. Each process must wait until the HDD has performed the operations ahead of it in the queue, and we lose performance.

Let’s look at the diagram below:

No-cache HDD vs. HDD with cache and multibuffering

The cache provides noticeably faster read operations than an HDD/SSD would deliver without one.
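Buffering has a user-space counterpart as well. As a small sketch (the file name data.bin and the 64 KB buffer size are arbitrary choices), reading one byte at a time straight from a FileInputStream issues many small requests, while wrapping it in a BufferedInputStream fetches a large chunk once and serves the following reads from memory:

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class BufferedReadExample {
    public static void main(String[] args) throws IOException {
        long unbuffered = time(() -> readAll(new FileInputStream("data.bin")));
        long buffered = time(() -> readAll(
                new BufferedInputStream(new FileInputStream("data.bin"), 64 * 1024)));
        System.out.println("unbuffered: " + unbuffered + " ms, buffered: " + buffered + " ms");
    }

    // Reads the whole stream one byte at a time.
    private static void readAll(InputStream in) throws IOException {
        try (in) {
            while (in.read() != -1) { /* discard */ }
        }
    }

    private static long time(IoTask task) throws IOException {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    @FunctionalInterface
    interface IoTask { void run() throws IOException; }
}

On a large file the buffered version is typically much faster, for exactly the reason described above: far fewer trips to the slow device.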

Schedulers and their algorithms

We know that an HDD has a head that writes data to a magnetic disk or reads data from it. These physical operations are slow, so developers came up with an interesting solution: schedulers.

A scheduler is a program that manages I/O operations. For example, in the diagram above the scheduler is responsible for loading data from the HDD into the cache and, conversely, from the cache back to the HDD.

Let's look at a typical situation: we have one thousand points located randomly on the magnetic disk, and we need to read all of them in the shortest possible time using the HDD's head.

There are 3 types of scheduling:

  1. Short-term scheduling: reading/writing data on the HDD and sending it to the thread
  2. Medium-term scheduling: requesting that data be swapped between the HDD and RAM
  3. Long-term scheduling: adding the thread to the queue

There are 3 groups of algorithms of reading/writing points:

  1. FCFS and LCFS: handle requests in FIFO (First In, First Out) or LIFO (Last In, First Out) order. These algorithms are not very effective, and sometimes we may get resource starvation
  2. SSTF and SJF: serve the pending request with the shortest seek time (SSTF) or the shortest service time (SJF) first
  3. SCAN and LOOK: serve requests while the head sweeps from the beginning of the magnetic disk to its end, then do the same in reverse

Modern schedulers use algorithms that combine the advantages of the basic SSTF/SJF and SCAN/LOOK algorithms.
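To make the difference concrete, here is a small Java simulation (the request queue and the initial head position are just sample cylinder numbers) that compares the total head movement of FCFS with SSTF for the same queue of requests:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DiskSchedulingDemo {
    public static void main(String[] args) {
        List<Integer> requests = Arrays.asList(98, 183, 37, 122, 14, 124, 65, 67);
        int start = 53; // initial head position (cylinder number)

        System.out.println("FCFS total head movement: " + fcfs(requests, start));
        System.out.println("SSTF total head movement: " + sstf(requests, start));
    }

    // FCFS: serve requests in arrival order.
    static int fcfs(List<Integer> requests, int head) {
        int total = 0;
        for (int r : requests) {
            total += Math.abs(r - head);
            head = r;
        }
        return total;
    }

    // SSTF: always serve the pending request closest to the current head position.
    static int sstf(List<Integer> requests, int head) {
        List<Integer> pending = new ArrayList<>(requests);
        int total = 0;
        while (!pending.isEmpty()) {
            int nearest = pending.get(0);
            for (int r : pending) {
                if (Math.abs(r - head) < Math.abs(nearest - head)) {
                    nearest = r;
                }
            }
            total += Math.abs(nearest - head);
            head = nearest;
            pending.remove(Integer.valueOf(nearest));
        }
        return total;
    }
}

For this particular queue, FCFS moves the head 640 cylinders while SSTF needs only 236, which illustrates how much head movement pure FIFO ordering wastes.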
