Operating System Concepts: Loaders, Memory, Processes, Synchronization
Compile and Go Loaders
In this type of loader, the instructions of the source program are read one by one, translated into machine code, and placed directly into main memory at a known address. The assembler runs in one part of memory, and the assembled machine instructions and data are put directly into their assigned memory locations. When assembly is complete, control is transferred to the start of the newly assembled program (the location counter is set to its starting address). A typical example is WATFOR-77, a FORTRAN compiler which uses such a "load-and-go" scheme. This loading scheme is also called "assemble-and-go".
Advantages
- Simplicity
- Quick Testing
- No Separate Linking
- Immediate Feedback
- Low Resource Usage
Disadvantages
- Inefficient for Large Programs
- Lack of Modular Development
- Higher Compilation Overhead
- Limited Scalability
- Complex Error Debugging
Types of Loaders
- Compile and Go
- Absolute Loader
- General Loader Scheme
- Relocating Loaders
- Subroutine Linkage
- Direct Linkage Loader
General Loader Scheme
The general loader scheme is a systematic approach used to load programs into memory for execution. It ensures the program's object code is correctly placed in memory, resolved for dependencies, and prepared for execution. In this loader scheme, the source program is converted to an object program by a translator (assembler). The loader accepts these object modules and places machine instructions and data in executable form at their assigned memory locations. The loader itself occupies some portion of main memory.
Advantages
- Flexibility
- Efficient Memory Use
- Dynamic Updates
- Error Detection
- Scalability
Disadvantages
- Higher Overhead
- Memory Usage
- Complex Implementation
- Security Risks
Subroutine Linkages
Subroutine linkages refer to the mechanisms by which a program calls and returns from a subroutine. They involve managing control flow and data passing between the calling program (caller) and the subroutine (callee). Subroutine linkages include saving the return address, passing arguments, and restoring the caller’s environment.
Call Initiation
- Save the return address (where execution should continue after the subroutine finishes).
- Pass arguments to the subroutine.
Subroutine Execution
- Perform the subroutine's task using the provided arguments.
- Use a stack or dedicated registers to manage local variables and temporary data.
Return to Caller
- Restore the return address and the caller's environment.
- Pass the result (if any) back to the caller.
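The three phases above are normally generated by the compiler. A minimal C sketch is shown below; the comments describe what a typical compiler-generated linkage does under a stack-based calling convention (the exact registers and stack layout depend on the ABI and are assumptions here, not part of the original text).

```c
#include <stdio.h>

/* Callee: receives its arguments according to the platform's calling
 * convention (registers and/or stack) and returns a result to the caller. */
static int add(int a, int b) {
    int sum = a + b;   /* local variable lives in the callee's stack frame */
    return sum;        /* result is passed back, typically in a register */
}

int main(void) {
    /* Call initiation: the generated code saves the return address (the
     * instruction after the call) and passes the arguments.
     * Return: the callee restores the caller's environment and control
     * resumes at the saved return address. */
    int result = add(2, 3);
    printf("result = %d\n", result);
    return 0;
}
```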
Absolute Loaders
An absolute loader is one of the simplest types of loaders: it loads the program directly into the specific memory locations given in the program's object code. It performs no relocation or linking; the memory addresses in the object code are fixed (absolute). The loader is called absolute because no relocation information is needed; the absolute addresses are supplied by the programmer or the assembler. In this scheme, the programmer or assembler must therefore have knowledge of memory management. The programmer must manage two key aspects: first, specifying the starting address of each module to be used; second, knowing the absolute starting address of the respective module when branching from one segment to another, so that this address can be specified in the corresponding JMP instruction.
Dynamic Link Libraries (DLLs)
A Dynamic Link Library (DLL) is a file that contains code and data that can be used by multiple programs simultaneously. DLLs enable modular programming, allowing software to be broken into reusable components. Unlike static libraries, DLLs are not embedded into the executable file but are loaded into memory at runtime. Because linking is deferred to load or run time, a program can freely reference procedures and data that live in other modules; each module is translated from its source independently, and the references are resolved when the library is loaded.
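On Windows the runtime-loading calls are LoadLibrary/GetProcAddress; the sketch below uses the POSIX equivalents dlopen/dlsym for a shared object. It assumes a glibc-style system where the C math library is available as libm.so.6 (an assumption, not from the text), and it looks up the standard cos function by name.

```c
/* Build with: cc demo.c -ldl */
#include <dlfcn.h>
#include <stdio.h>

int main(void) {
    /* Load the shared library into the process at runtime. */
    void *handle = dlopen("libm.so.6", RTLD_LAZY);
    if (!handle) { fprintf(stderr, "dlopen: %s\n", dlerror()); return 1; }

    /* Look up an exported function by name. */
    double (*cosine)(double) = (double (*)(double))dlsym(handle, "cos");
    if (!cosine) { fprintf(stderr, "dlsym: %s\n", dlerror()); dlclose(handle); return 1; }

    printf("cos(0.0) = %f\n", cosine(0.0));
    dlclose(handle);   /* Unload the library when no longer needed. */
    return 0;
}
```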
Structure of DLL
- Exported Functions: Functions and data declared in the DLL and made accessible to other programs.
- Internal Functions: Functions used internally by the DLL and not accessible externally.
- Resources: Includes resources such as icons, strings, and images.
Need for DLL
- Code Reusability
- Efficient Memory Use
- Modular Development
- Easier Updates
- Smaller Executables
- Dynamic Linking
Advantages
- Memory Efficiency
- Modularity
- Smaller Executables
- Faster Updates
Disadvantages
- Dependency Issues
- Version Conflicts
- Security Risks
- Runtime Errors
Program Overlays
An overlay is a part of a program that has the same load origin as some other part(s) of the program; only the overlay currently required for execution needs to be in memory. A program containing multiple overlays is called an overlay-structured program. The program that controls the loading of overlays when they are required is called the overlay manager, and it is linked with the root segment.
Threads and Their Lifecycle
A thread is a dispatchable unit of work. It consists of a thread ID, program counter, stack, and register set. A thread is also called a Lightweight Process (LWP) because it consumes fewer resources than a full process. Threads are easy to create and destroy. A thread is the basic processing unit to which an operating system allocates processor time, and more than one thread can execute code inside a process. Each thread belongs to exactly one process, and no thread can exist outside a process. Each thread represents a separate flow of control.
Thread Lifecycle
- New: The thread is just created.
- Ready: The thread's start() method is invoked, and it is now ready for execution. The OS places the thread into the ready queue.
- Running: The highest-priority ready thread enters the running state. The thread is assigned a processor and is now executing.
- Blocked: This state occurs when a thread is waiting for a lock to access an object.
- Waiting: Here, a thread is waiting indefinitely for another thread to perform an action.
- Sleeping: A thread sleeps for a specified period. When the sleeping time expires, it enters the ready state. The CPU is not used by a sleeping thread.
- Dead: When the thread completes its task or operation.
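The state names above follow the Java threading model (start(), Blocked, Waiting, Sleeping, Dead). As a rough C counterpart, the pthreads sketch below walks a thread through the same lifecycle: created (new/ready), running, sleeping, and terminated (dead). Build with `cc demo.c -pthread`.

```c
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* Thread body: runs, sleeps briefly (sleeping state), then terminates. */
static void *worker(void *arg) {
    (void)arg;
    printf("worker: running\n");
    sleep(1);                 /* sleeping: the CPU is not used during this time */
    printf("worker: done\n");
    return NULL;              /* the thread enters the dead/terminated state */
}

int main(void) {
    pthread_t tid;
    pthread_create(&tid, NULL, worker, NULL);  /* new -> ready */
    pthread_join(tid, NULL);                   /* main waits for termination */
    return 0;
}
```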
Process Scheduling
CPU utilization is maximized through multiprogramming: rather than sitting idle, the processor is always executing some process. The process scheduler selects one process for execution from the ready queue. The aim is to assign processes to the processor in a way that meets system objectives such as response time, throughput, and processor efficiency.
Long-Term Scheduler
The long-term scheduler is also called the job scheduler. It determines which processes are admitted to the system for processing. Processes are selected from the queue and loaded into the main memory for execution. Long-term scheduling controls the degree of multiprogramming in multitasking systems. It provides a balanced mix of jobs, such as I/O-bound and CPU-bound. Long-term scheduling is used in real-time operating systems. Time-sharing operating systems typically have no long-term scheduler.
Medium-Term Scheduler
The medium-term scheduler is part of the swapping function. Sometimes, it removes processes from memory. It also reduces the degree of multiprogramming. If a process makes an I/O request while in memory, the operating system moves this process into a suspended state. Once suspended, the process cannot make any progress towards completion. In this situation, the process is removed from memory, freeing space for other processes. The suspended process is stored in a secondary storage device (e.g., hard disk). This process is called swapping.
Short-Term Scheduler
The short-term scheduler is also called the CPU scheduler. It selects a process from the ready queue and allocates the CPU to it. The short-term scheduler runs far more frequently than the long-term scheduler: a scheduling decision must be made at least once per time slice, and time slices are very short, so it must be fast. It is also known as the dispatcher.
Preemptive Scheduling | Non-Preemptive Scheduling |
---|---|
Preemptive scheduling allows a process to be interrupted in the midst of its execution, taking the CPU away and allocating it to another process. | Non-preemptive scheduling ensures that a process relinquishes control of the CPU only when it finishes with its current CPU burst. |
Preemptive scheduling incurs a cost associated with accessing shared data. | Non-preemptive scheduling does not increase the cost. |
It also affects the design of the operating system kernel. | It does not affect the design of the OS kernel. |
Preemptive scheduling is more complex. | Non-preemptive scheduling is simpler. |
Example: Round Robin. | Example: First-Come, First-Served (FCFS). |
Round Robin Scheduling
*Round Robin* is a preemptive scheduling algorithm. It is used in interactive systems. Here, processes are given a limited amount of processor time, called a *time slice* or *time quantum*. If a process does not complete before its quantum expires, the system preempts it and gives the processor to the next waiting process. The system then places the preempted process at the back of the ready queue. Processes are placed in the ready queue using a FIFO (First-In, First-Out) scheme. For the RR algorithm, the principal design issue is the length of the time quantum to be used. With a short time slice, processes will move through the system relatively quickly. However, it increases processing overhead.
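The sketch below simulates Round Robin for a handful of CPU bursts, assuming for simplicity that all processes arrive at time zero (so cycling over the array in order is equivalent to rotating a FIFO ready queue). The burst lengths and the quantum of 4 are illustrative values, not taken from the text.

```c
#include <stdio.h>

#define N 3
#define QUANTUM 4

int main(void) {
    int burst[N]     = {10, 5, 8};   /* total CPU time each process needs */
    int remaining[N] = {10, 5, 8};
    int time = 0, done = 0;

    while (done < N) {
        for (int i = 0; i < N; i++) {          /* rotate through the ready queue */
            if (remaining[i] == 0) continue;
            int run = remaining[i] < QUANTUM ? remaining[i] : QUANTUM;
            time += run;                        /* run for one quantum (or less) */
            remaining[i] -= run;
            if (remaining[i] == 0) {
                done++;
                printf("P%d finishes at t=%d (burst %d)\n", i, time, burst[i]);
            }
        }
    }
    return 0;
}
```

A shorter quantum makes short jobs finish sooner but increases the number of context switches, which is the overhead trade-off mentioned above.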
Semaphores
A *semaphore* is an operating system abstract data type, described by Dijkstra. It is a non-negative integer variable used as a flag. It takes only integer values. It's used to solve the *critical section problem*. A proper semaphore implementation requires that P (wait) and V (signal) be indivisible (atomic) operations, often achieved with hardware support. This atomicity ensures that no two processes can simultaneously execute P or V operations on the same semaphore.
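In POSIX, P and V correspond to sem_wait and sem_post. The minimal sketch below initialises a binary semaphore and brackets a critical section with the two operations; build with `cc demo.c -pthread`.

```c
#include <semaphore.h>
#include <stdio.h>

static sem_t s;

int main(void) {
    sem_init(&s, 0, 1);   /* binary semaphore, initial value 1 */

    sem_wait(&s);         /* P (wait): decrement; blocks while the value is 0 */
    printf("inside critical section\n");
    sem_post(&s);         /* V (signal): increment; wakes a waiting process if any */

    sem_destroy(&s);
    return 0;
}
```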
Semaphore Properties
- Machine independent.
- Simple to implement.
- Correctness is easy to determine.
- Unlike a simple lock, a counting semaphore can be used to manage many instances of a resource simultaneously.
Mutexes
A *mutex* is used to ensure that only one thread at a time can access the resource it protects. The thread that locks the mutex must be the one that unlocks it. A mutex is suitable for managing mutual exclusion on a shared resource, and mutexes are easy and efficient to implement. A mutex can be in one of two states, locked or unlocked, so it is often represented by a single bit: zero (0) typically means unlocked, and any other value means locked. It provides two primary operations: acquire() (lock) and release() (unlock). If the mutex is already locked, the calling thread is blocked and waits until the thread in the critical section finishes. A mutex variable is similar to a binary semaphore, but the two are not identical. A segment of code in which a thread may be accessing a shared variable is called a *critical region*.
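The pthreads API names the two operations pthread_mutex_lock and pthread_mutex_unlock rather than acquire()/release(). A minimal sketch: two threads increment a shared counter, and the mutex keeps the critical region (the increment) mutually exclusive. Build with `cc demo.c -pthread`.

```c
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long counter = 0;               /* shared variable */

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);     /* acquire: blocks if already locked */
        counter++;                     /* critical region */
        pthread_mutex_unlock(&lock);   /* release: the locking thread unlocks */
    }
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %ld\n", counter);  /* always 200000 with the mutex held */
    return 0;
}
```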
Monitors
A *monitor* is an object that contains both data and procedures needed to perform allocation of shared resources. It is a programming language construct that supports both data access synchronization and control synchronization. Monitors are implemented in programming languages like Pascal, Java, and C++. Java makes extensive use of monitors to implement mutual exclusion. A monitor is an abstract data type for which only one process may be executing a procedure at any given time. A monitor is a collection of procedures, variables, and data structures. Data inside the monitor may be either global to all routines within the monitor or local to a specific routine. Monitors are a higher-level concept than P and V operations (semaphores).
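C has no built-in monitors, so the sketch below only emulates one with a mutex and a condition variable: the data is private to the "monitor" struct, every procedure takes the monitor lock on entry (data access synchronization), and withdraw waits on a condition until a deposit makes funds available (control synchronization). The account_monitor type and its operations are illustrative names, not from the text.

```c
#include <pthread.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  funds_available;
    int             balance;
} account_monitor;

void account_init(account_monitor *m) {
    pthread_mutex_init(&m->lock, NULL);
    pthread_cond_init(&m->funds_available, NULL);
    m->balance = 0;
}

void deposit(account_monitor *m, int amount) {
    pthread_mutex_lock(&m->lock);          /* only one thread inside the monitor */
    m->balance += amount;
    pthread_cond_signal(&m->funds_available);
    pthread_mutex_unlock(&m->lock);
}

void withdraw(account_monitor *m, int amount) {
    pthread_mutex_lock(&m->lock);
    while (m->balance < amount)            /* wait until the condition holds */
        pthread_cond_wait(&m->funds_available, &m->lock);
    m->balance -= amount;
    pthread_mutex_unlock(&m->lock);
}

int main(void) {
    account_monitor m;
    account_init(&m);
    deposit(&m, 100);
    withdraw(&m, 40);
    return 0;
}
```

In Java the same effect is obtained by declaring the methods synchronized and using wait/notify, which is why Java is said to make extensive use of monitors.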
Dining Philosophers Problem
The Dining Philosophers Problem involves five philosophers seated around a table who alternate between thinking and eating. To eat, each philosopher needs two forks (one on the left and one on the right) but can only pick up one fork at a time. Since forks are shared, no two adjacent philosophers can eat simultaneously, requiring synchronization to prevent conflicts or deadlocks.
Solution
We consider each fork as a shared item, protected by a mutex lock. Before eating, each philosopher first locks the left fork and then the right fork. If a philosopher successfully acquires both locks (two forks), they can then eat. After finishing eating, this philosopher releases both forks and thinks.
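A pthreads sketch of this solution is given below, with one mutex per fork. Note one refinement beyond the description above: if every philosopher grabbed the left fork first, all five could hold one fork and deadlock, so the sketch acquires the two forks in a fixed global order (lower-numbered fork first), which breaks the circular wait. Build with `cc demo.c -pthread`.

```c
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

#define N 5
static pthread_mutex_t fork_lock[N];              /* one mutex per fork */

static void *philosopher(void *arg) {
    int i = *(int *)arg;
    int left = i, right = (i + 1) % N;
    int first  = left < right ? left : right;      /* always lock the */
    int second = left < right ? right : left;      /* lower index first */

    for (int round = 0; round < 3; round++) {
        /* think ... */
        pthread_mutex_lock(&fork_lock[first]);     /* pick up first fork */
        pthread_mutex_lock(&fork_lock[second]);    /* pick up second fork */
        printf("philosopher %d is eating\n", i);
        usleep(1000);
        pthread_mutex_unlock(&fork_lock[second]);  /* put both forks down */
        pthread_mutex_unlock(&fork_lock[first]);
    }
    return NULL;
}

int main(void) {
    pthread_t t[N];
    int id[N];
    for (int i = 0; i < N; i++) pthread_mutex_init(&fork_lock[i], NULL);
    for (int i = 0; i < N; i++) {
        id[i] = i;
        pthread_create(&t[i], NULL, philosopher, &id[i]);
    }
    for (int i = 0; i < N; i++) pthread_join(t[i], NULL);
    return 0;
}
```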
Producer-Consumer Problem
The Producer-Consumer Problem is a classic synchronization problem where two processes, the Producer and the Consumer, share a common buffer. The Producer generates data and adds it to the buffer, while the Consumer retrieves data from the buffer. Synchronization is required to ensure the Producer doesn’t add to a full buffer and the Consumer doesn’t remove from an empty buffer.
Solution
The Producer-Consumer Problem is solved using semaphores for synchronization. Three semaphores are used: empty (for available space in the buffer), full (for items in the buffer), and mutex (for mutual exclusion). The Producer waits on empty before adding an item to the buffer and signals full after adding. The Consumer waits on full before removing an item, then signals empty. The mutex ensures only one process accesses the buffer at a time. This solution prevents buffer overflow or underflow, ensures mutual exclusion, and allows both processes to operate concurrently without conflicts.
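A pthreads/POSIX-semaphore sketch of exactly this scheme is shown below (buffer size and item count are illustrative). Build with `cc demo.c -pthread`.

```c
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

#define BUF_SIZE 8
#define ITEMS    32

static int buffer[BUF_SIZE];
static int in = 0, out = 0;

static sem_t empty_slots;   /* counts free slots, initialised to BUF_SIZE */
static sem_t full_slots;    /* counts filled slots, initialised to 0 */
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

static void *producer(void *arg) {
    (void)arg;
    for (int i = 0; i < ITEMS; i++) {
        sem_wait(&empty_slots);            /* wait(empty): block if buffer full */
        pthread_mutex_lock(&mutex);
        buffer[in] = i;                    /* add item */
        in = (in + 1) % BUF_SIZE;
        pthread_mutex_unlock(&mutex);
        sem_post(&full_slots);             /* signal(full) */
    }
    return NULL;
}

static void *consumer(void *arg) {
    (void)arg;
    for (int i = 0; i < ITEMS; i++) {
        sem_wait(&full_slots);             /* wait(full): block if buffer empty */
        pthread_mutex_lock(&mutex);
        int item = buffer[out];            /* remove item */
        out = (out + 1) % BUF_SIZE;
        pthread_mutex_unlock(&mutex);
        sem_post(&empty_slots);            /* signal(empty) */
        printf("consumed %d\n", item);
    }
    return NULL;
}

int main(void) {
    sem_init(&empty_slots, 0, BUF_SIZE);
    sem_init(&full_slots, 0, 0);
    pthread_t p, c;
    pthread_create(&p, NULL, producer, NULL);
    pthread_create(&c, NULL, consumer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return 0;
}
```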
Deadlock Management
Deadlock Prevention
- Mutual Exclusion: It is necessary in any computer system because some resources (e.g., memory, CPU) must be exclusively allocated to one user at a time. No other process can use a resource while it is allocated to a process.
- Hold and Wait: If a process holding certain resources is denied a further request, it must release its original resources and, if required, request them again.
- No Preemption: This condition could be bypassed by allowing the operating system to deallocate resources from a process.
- Circular Wait: Circular wait can be bypassed if the operating system prevents the formation of a circular chain of processes waiting for resources.
Deadlock Avoidance
Deadlock avoidance depends on additional information about the long-term resource needs of each process. The system decides whether granting a resource is safe or not and only makes the allocation when it is safe. When a process is created, it must declare its maximum claim, i.e., the maximum number of resource units it may need. The resource manager can grant the request if the resources are available. For example, if Process 1 requests a resource held by Process 2, the system must ensure that Process 2 is not waiting for a resource held by Process 1.
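The classical realisation of this idea is the Banker's algorithm, which the text does not name explicitly. The sketch below shows only its safety check, for a single resource type and with illustrative numbers: a state is safe if every process can obtain its remaining claim in some order and run to completion.

```c
#include <stdio.h>
#include <stdbool.h>

#define P 4   /* number of processes (illustrative) */

int main(void) {
    int allocated[P] = {1, 1, 2, 4};   /* units currently held by each process */
    int max_claim[P] = {4, 3, 9, 7};   /* declared maximum need of each process */
    int available    = 2;              /* free units */
    bool finished[P] = {false};

    /* Safety check: repeatedly find a process whose remaining need can be
     * satisfied from the available units; assume it runs to completion and
     * releases everything it holds. */
    int completed = 0;
    bool progress = true;
    while (progress) {
        progress = false;
        for (int i = 0; i < P; i++) {
            if (!finished[i] && max_claim[i] - allocated[i] <= available) {
                available += allocated[i];
                finished[i] = true;
                completed++;
                progress = true;
            }
        }
    }
    printf("state is %s\n", completed == P ? "safe" : "unsafe");
    return 0;
}
```

A resource request is granted only if the state that would result still passes this check; otherwise the requesting process waits.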
Deadlock Detection
Deadlock detection is the process of determining that a deadlock exists and identifying the processes and resources involved in the deadlock. If processes are blocked on resources for an inordinately long time, the detection algorithm is executed to determine whether the current state is a deadlock. Deadlock detection algorithms can incur significant runtime overhead due to the runtime cost of maintaining necessary information and executing the detection algorithm.
Deadlock Recovery
Here, one or more processes will have to be preempted, thus releasing their resources so that the other deadlocked processes can become unblocked.
- Process Termination: Deadlock is removed by aborting a process. However, aborting a process is not always straightforward. Options include aborting all deadlocked processes or aborting one process at a time until the deadlock is resolved.
- Resource Preemption: Sometimes, a resource is temporarily taken away from its current process and allocated to another process.
- Recovery Through Rollback: The system periodically checkpoints its processes; when a deadlock is detected, a deadlocked process is rolled back to an earlier safe state by undoing the operations performed since that checkpoint, and the resources it held are released.
- Starvation: Starvation is a situation in which a process waits indefinitely for an event that might never occur in the system, often a side effect of recovery strategies.
Memory Fragmentation
Types of Fragmentation
- Internal Fragmentation: This occurs when there is wasted space internal to a memory partition because the block of data loaded is smaller than the allocated partition. Keeping track of this free space is overhead for the system.
- External Fragmentation: This occurs when enough total main memory space exists to satisfy a request, but it is not contiguous; storage is fragmented into a large number of small, non-contiguous holes. Solutions for external fragmentation include:
- Compaction
- Using a logical address space (e.g., paging or segmentation)
Inverted Page Tables
An *inverted page table* contains one entry for each physical frame (real page) of memory, so its size is proportional to physical memory and it occupies a fixed fraction of memory regardless of the size of the virtual address spaces in use. By contrast, traditional page table overhead grows with the address space size, and traditional page tables can become too big to fit in memory. The drawback is that lookup time increases, because translation requires a search of the inverted table. Inverted page tables are used by architectures such as Itanium, PowerPC, and UltraSPARC.
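A minimal sketch of the structure is shown below: each entry records which (process, virtual page) currently occupies that frame, and translation searches the table for a match, the frame number being simply the index of the matching entry. Real implementations hash on (pid, page) rather than searching linearly; the linear search and the field names here are illustrative only.

```c
#include <stdio.h>

#define NUM_FRAMES 8

/* One entry per physical frame: which (process, virtual page) occupies it. */
struct ipt_entry { int pid; int vpage; int valid; };
static struct ipt_entry ipt[NUM_FRAMES];

/* Translate (pid, virtual page); returns the frame number or -1 on a miss. */
static int lookup_frame(int pid, int vpage) {
    for (int f = 0; f < NUM_FRAMES; f++)
        if (ipt[f].valid && ipt[f].pid == pid && ipt[f].vpage == vpage)
            return f;
    return -1;   /* page fault: the page is not in physical memory */
}

int main(void) {
    ipt[3] = (struct ipt_entry){ .pid = 7, .vpage = 42, .valid = 1 };
    printf("frame = %d\n", lookup_frame(7, 42));   /* prints 3 */
    printf("frame = %d\n", lookup_frame(7, 43));   /* prints -1 (fault) */
    return 0;
}
```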
Segmentation | Paging |
---|---|
Program is divided into variable-size segments. | Program is divided into fixed-size pages. |
User (or compiler) is responsible for dividing the program into segments. | Division into pages is performed by the operating system. |
Segmentation is slower than paging. | Paging is faster than segmentation. |
Segmentation is visible to the user. | Paging is invisible to the user. |
Segmentation eliminates internal fragmentation. | Paging suffers from internal fragmentation. |
Segmentation suffers from external fragmentation. | There is no external fragmentation. |
Processor uses segment number, offset to calculate absolute address. | Processor uses page number, offset to calculate absolute address. |
Virtual Memory Concepts
*Virtual memory* is a method of using hard disk space to provide additional memory. It simulates additional main memory capacity. A *swap file* (or paging file) is an area of your hard disk set aside for virtual memory. Virtual memory data is stored in a secondary storage device. It helps to extend memory capacity and works with primary memory to load applications. Virtual memory reduces the cost of expanding physical memory capacity. Virtual memory uses two types of addresses: *virtual addresses* and *physical addresses*. A virtual address is referred to by a process, while a physical address is the actual address in main memory.
Virtual Memory with Paging
Each logical address used in a program consists of a page number and word number. This pair is written as (pi, wi), where pi is a page number and wi is the word number within page pi. In virtual memory with paging, the operating system views the logical address space of a program as a linear arrangement of fixed-sized components called *pages*. Paging is the most common memory management technique. The virtual space of a process is divided into fixed-size pages. A virtual address is composed of a page number and a page offset. Physical memory is divided into fixed-size *frames*, and a page in virtual space fits into a frame in physical memory.
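As a worked example of the (pi, wi) split, the sketch below divides a virtual address by an assumed 4 KiB page size to get the page number and offset, then combines an assumed frame number with the offset to form the physical address; the addresses and the frame mapping are illustrative, not from the text.

```c
#include <stdio.h>
#include <stdint.h>

#define PAGE_SIZE 4096u   /* illustrative page size: 4 KiB */

int main(void) {
    uint32_t vaddr  = 0x0000A3C4;                 /* example virtual address */
    uint32_t page   = vaddr / PAGE_SIZE;          /* page number p  (here 10) */
    uint32_t offset = vaddr % PAGE_SIZE;          /* word offset w  (here 964) */

    /* Suppose the page table maps page p to frame 9 (illustrative). */
    uint32_t frame  = 9;
    uint32_t paddr  = frame * PAGE_SIZE + offset; /* physical address */

    printf("page=%u offset=%u physical=0x%08X\n", page, offset, paddr);
    return 0;
}
```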
Virtual Memory with Segmentation
Each segment is a logical unit of a program as defined by the program's designer. Examples of segments include a subroutine, an array of data, a table of symbols, or a user's program. The segment number is mapped to a physical address using *segment descriptor tables*. These tables perform the same function as page tables in paging. Because segments can vary in size, a bounds check is also needed to ensure that the offset is within the segment. The function of the *Memory Management Unit* (MMU) is to map logical addresses into physical addresses, similar to the virtual memory mapping concept. A special hardware register, called the *segment table address register*, points to the segment table of a program. If the segment is present in memory, address translation is completed using the memory start address found in its segment table entry.
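The sketch below illustrates this translation for a two-entry segment table: the descriptor supplies the base (memory start address) and limit, the offset is bounds-checked against the limit, and the physical address is base plus offset. The table contents and the logical address are illustrative values.

```c
#include <stdio.h>
#include <stdint.h>

/* One segment descriptor: base address and segment length (limit). */
struct seg_desc { uint32_t base; uint32_t limit; };

int main(void) {
    /* Illustrative segment table for one program. */
    struct seg_desc table[] = {
        { .base = 0x1000, .limit = 0x0400 },   /* segment 0: code */
        { .base = 0x5000, .limit = 0x0200 },   /* segment 1: data */
    };

    uint32_t seg = 1, offset = 0x01F0;          /* logical address (seg, offset) */

    if (offset >= table[seg].limit) {           /* bounds check */
        printf("segmentation violation\n");
    } else {
        uint32_t paddr = table[seg].base + offset;
        printf("physical address = 0x%X\n", paddr);   /* prints 0x51F0 */
    }
    return 0;
}
```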
Thrashing in Virtual Memory
*Thrashing* occurs when a process does not have enough frames allocated to hold the pages it uses repeatedly, leading to a very high page fault rate. A process is thrashing if it spends more time paging in and out than executing. Because thrashing drives CPU utilization down, the operating system's medium- or long-term scheduler may react by increasing the degree of multiprogramming, which brings in more processes and only makes the thrashing worse. A policy based on the *local replacement mode* tends to limit the effect of thrashing, whereas a policy based on the *global replacement mode* is more likely to cause it: since all frames are available to all processes, a memory-intensive process may occupy a large portion of memory, making other processes susceptible to page faults and resulting in a system that thrashes. Thrashing can be mitigated by using the *working set model* and *page fault frequency* (PFF) algorithms.
Static Link Library (SLL) | Dynamic Link Library (DLL) |
---|---|
A library that gets embedded into the executable during compile time. | A library that is linked to the program at runtime. |
Linked at compile time. | Linked at runtime or load time. |
Executable size is larger, as the library code is embedded in the executable. | Executable size is smaller, as the library code remains external. |
Slightly faster at runtime; no additional linking is needed. | May incur a slight overhead during runtime due to dynamic linking. |
Cannot be shared among multiple programs; each executable contains its own copy. | Can be shared among multiple programs, reducing memory usage. |
Executable is self-contained, making it more portable. | Requires the library to be present on the target system at runtime. |
No external dependencies after compilation. | Requires the library to be available at runtime. |
Memory usage is higher, as each running program includes its own copy of the library. | Memory usage is lower, as multiple programs can share the same library in memory. |
Compiler | Interpreter |
---|---|
Translates the entire source code into machine code at once, creating an executable file. | Translates and executes source code line by line or statement by statement. |
Complete translation before execution. | Translation and execution occur simultaneously. |
Generates an executable file (e.g., .exe, .out). | No separate output file; executes directly. |
Generally faster execution after compilation. | Slower execution as each line is translated at runtime. |
Errors are detected after the entire code is compiled. | Errors are detected and reported line by line during execution. |
Requires more memory due to storing the entire translated code. | Generally uses less memory as it processes code line by line. |
Examples: C, C++, Java. | Examples: Python, JavaScript, Ruby. |