Linux Process Management (Part 2/3)

The Process Family Tree

Process Hierarchy

All Linux processes originate from the init process (PID 1), which is started by the kernel during boot.

Parent-Child Relationships

  • Each process has exactly one parent.
  • A parent can have multiple child processes.
  • Children of the same parent are siblings.
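
These relationships can be observed from user space. The following is a minimal sketch, assuming a POSIX system; the helper name count_children_with_our_ppid() is made up for illustration. It forks several children and checks that each one reports the same parent PID, which is what makes them siblings.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork up to `n` children and return how many of them saw the caller as
 * their parent. Children that could not be created are not counted.
 */
static int count_children_with_our_ppid(int n)
{
    pid_t parent = getpid();
    int created = 0, matched = 0;

    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;                          /* stop forking, still reap below */
        if (pid == 0)                       /* child: report via exit status */
            _exit(getppid() == parent ? 0 : 1);
        created++;
    }

    for (int i = 0; i < created; i++) {     /* parent: reap every child */
        int status;
        if (wait(&status) > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0)
            matched++;
    }
    return matched;
}
```

Calling count_children_with_our_ppid(3) should return 3 when all three forks succeed: one parent, three siblings.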

task_struct Relationships

Each task_struct records its family relationships in two members:

  • parent: a pointer to the parent’s task_struct
  • children: the head of a list of the process’s children, linked together through each child’s sibling member

Accessing the Parent Process

struct task_struct *my_parent = current->parent;

Iterating Over Children

struct task_struct *task;
struct list_head *list;

/* list_for_each() takes a pointer to the list head, hence the & */
list_for_each(list, &current->children) {
    task = list_entry(list, struct task_struct, sibling);
    /* task now points to one of current’s children */
}
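
Why list_entry() names the sibling member rather than children is easier to see in a small user-space model. The sketch below is not kernel code: struct toy_task, toy_task_init(), and sum_child_pids() are hypothetical names, and the list helpers are simplified stand-ins for those in <linux/list.h>.

```c
#include <stddef.h>

/* Minimal user-space stand-ins for the kernel's list helpers. */
struct list_head { struct list_head *next, *prev; };

#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))
#define list_for_each(pos, head) \
    for ((pos) = (head)->next; (pos) != (head); (pos) = (pos)->next)

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add_tail(struct list_head *node, struct list_head *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Toy process descriptor: only the fields needed for the family tree. */
struct toy_task {
    int pid;
    struct toy_task *parent;
    struct list_head children;  /* head of this task's list of children */
    struct list_head sibling;   /* this task's link in its parent's list */
};

static void toy_task_init(struct toy_task *t, int pid, struct toy_task *parent)
{
    t->pid = pid;
    t->parent = parent;
    list_init(&t->children);
    list_init(&t->sibling);
    if (parent)
        list_add_tail(&t->sibling, &parent->children);
}

/* Walk the children list the same way the kernel loop does. */
static int sum_child_pids(struct toy_task *t)
{
    struct list_head *pos;
    int sum = 0;

    list_for_each(pos, &t->children) {
        /* pos points at a child's sibling member, so recover the child from it */
        struct toy_task *child = list_entry(pos, struct toy_task, sibling);
        sum += child->pid;
    }
    return sum;
}
```

The parent’s children head and each child’s sibling node form one ring, so every node reached from the head is a sibling member embedded in a child.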


init_task

The task_struct of the init process is statically allocated as init_task.

Traversing Up to init

struct task_struct *task;
for (task = current; task != &init_task; task = task->parent)
;
/* task now points to init */
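
From user space, the same upward walk can be approximated by following parent PIDs through /proc. This is a sketch under stated assumptions, not kernel code: it assumes a Linux system with /proc mounted, and parent_of(), hops_to_init(), and my_hops_to_init() are hypothetical helper names.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Return the parent PID of `pid` read from /proc/<pid>/stat, or -1 on error. */
static pid_t parent_of(pid_t pid)
{
    char path[64], line[1024];
    char state;
    char *p;
    int ppid;
    FILE *f;

    snprintf(path, sizeof(path), "/proc/%d/stat", (int)pid);
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (!fgets(line, sizeof(line), f)) {
        fclose(f);
        return -1;
    }
    fclose(f);

    /* Format: "pid (comm) state ppid ..."; comm may itself contain ')'. */
    p = strrchr(line, ')');
    if (!p || sscanf(p + 1, " %c %d", &state, &ppid) != 2)
        return -1;
    return (pid_t)ppid;
}

/*
 * Count parent links from `pid` up to PID 1. The walk also stops if a
 * parent PID of 0 is reported, which can happen inside a container.
 * Returns -1 if /proc cannot be read.
 */
static int hops_to_init(pid_t pid)
{
    int hops = 0;

    while (pid > 1) {
        pid = parent_of(pid);
        if (pid < 0)
            return -1;
        hops++;
    }
    return hops;
}

/* Convenience wrapper: distance from the calling process to init. */
static int my_hops_to_init(void)
{
    return hops_to_init(getpid());
}
```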


Iterating Over All Processes

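The task list is a circular doubly linked list, linked through the tasks member of each task_struct.

The next_task(task) and prev_task(task) macros return the next and previous task in the list. They are shorthand for:

/* next_task(task) */
list_entry(task->tasks.next, struct task_struct, tasks);
/* prev_task(task) */
list_entry(task->tasks.prev, struct task_struct, tasks);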


Using for_each_process() to Iterate Over All Tasks

struct task_struct *task;
for_each_process(task) {
    /* Prints the name and PID of each process */
    printk("%s[%d]\n", task->comm, task->pid);
}
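NOTE: Iterating over every process is expensive on a system with many processes, so code should have a good reason to do it. In kernel code the walk must also be protected against concurrent changes to the list, for example with rcu_read_lock().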
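
A rough user-space counterpart of this loop scans /proc, where every numeric directory name is a PID. The sketch below assumes a Linux system with /proc mounted; list_processes() is a hypothetical helper, and the output format only imitates the printk() call above.

```c
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/*
 * Print "name[pid]" for every process visible in /proc and return how
 * many were found, or -1 if /proc cannot be opened.
 */
static int list_processes(FILE *out)
{
    DIR *proc = opendir("/proc");
    struct dirent *de;
    int count = 0;

    if (!proc)
        return -1;

    while ((de = readdir(proc)) != NULL) {
        char path[300], comm[64] = "?";
        FILE *f;

        if (!isdigit((unsigned char)de->d_name[0]))
            continue;                       /* not a PID directory */

        /* /proc/<pid>/comm holds the same short name as task->comm */
        snprintf(path, sizeof(path), "/proc/%s/comm", de->d_name);
        f = fopen(path, "r");
        if (f) {
            if (fgets(comm, sizeof(comm), f))
                comm[strcspn(comm, "\n")] = '\0';
            fclose(f);
        }
        fprintf(out, "%s[%s]\n", comm, de->d_name);
        count++;
    }
    closedir(proc);
    return count;
}
```

Like the kernel loop, this touches every process on the system, so its cost grows with the number of processes.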

Process Creation


Unix Process Creation Model

Unix follows a two-step process creation mechanism using fork() and exec(). This differs from many other operating systems, which typically use a single “spawn” function to create and run a new process. The separation of process duplication (fork()) and program execution (exec()) allows for greater flexibility and process management.

fork() – Creating a Child Process

The fork() system call creates a new child process, which is an exact copy of the parent process. However, there are a few key differences:

  • Process ID (PID) – The child receives a unique PID.
  • Parent Process ID (PPID) – The child’s PPID is set to the parent’s PID.
  • Certain resources and statistics – Some properties, such as pending signals, are not inherited by the child process.

exec() – Loading a New Program

Once a child process is created, it may need to run a different program. This is achieved using exec(), which replaces the child process’s address space with a new executable. The child process no longer runs the parent’s code but instead begins executing a completely different program.
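
The two steps look like this in practice. This is a minimal sketch, assuming a POSIX system; run_and_wait() is a hypothetical helper name, and execvp() is just one member of the exec() family.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run argv[0] (looked up in PATH) in a child process and return its exit
 * status. Returns 127 if exec failed, or -1 if the child could not be
 * created or did not exit normally.
 */
static int run_and_wait(char *const argv[])
{
    int status;
    pid_t pid = fork();             /* step 1: duplicate this process */

    if (pid < 0)
        return -1;

    if (pid == 0) {
        execvp(argv[0], argv);      /* step 2: replace the child's image */
        _exit(127);                 /* reached only if exec failed */
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Between fork() and execvp() the child is still running the parent’s code, which is where a program can redirect file descriptors or change other state before the new program starts.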

Comparison with Other Operating Systems

In non-Unix operating systems, process creation is often handled by a single “spawn” function, which both creates the process and loads the new program in one step. In contrast, Unix separates these actions, giving developers more control over process management and resource allocation.
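
For contrast, POSIX also defines a single-call interface, posix_spawn(), which the text above does not cover. The sketch below is an illustration under that assumption; spawn_and_wait() is a hypothetical helper name.

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/*
 * Create a process and load argv[0] into it in one call, then return the
 * program's exit status (-1 on error).
 */
static int spawn_and_wait(char *const argv[])
{
    pid_t pid;
    int status;

    /* NULL file actions and attributes: the child inherits the defaults */
    if (posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ) != 0)
        return -1;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With a spawn-style call, any setup for the child has to be described in advance through the file-actions and attribute arguments, whereas with fork() and exec() it is ordinary code that runs in the child.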

Copy-on-Write (COW) Optimization


Traditional Approach – An Inefficient Model

Before optimizations like Copy-on-Write (COW), when fork() was called, all resources of the parent (such as memory pages) were immediately duplicated for the child. This approach was inefficient because:

  • It copied a large amount of data unnecessarily.
  • Many copies were never used, especially when exec() was called immediately after fork().


Linux’s Copy-on-Write (COW) Implementation

To address these inefficiencies, Linux employs Copy-on-Write (COW), which delays or prevents unnecessary copying of data. Instead of immediately duplicating memory pages, the parent and child share the same memory until one of them modifies it.


How COW Works

  • When fork() is called, the child process initially shares the parent’s memory.
  • These shared memory pages are marked as read-only.
  • If either the parent or child attempts to modify the shared memory, a duplicate copy of that page is created for the modifying process.
  • If exec() is called immediately after fork(), no memory is copied at all, since the process image is completely replaced.
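
COW is invisible to programs, but its result can be observed: a write in the child never shows up in the parent. The following is a minimal sketch, assuming a POSIX system; the variable and the helper write_in_child() are hypothetical names.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Lives in a page that parent and child share right after fork(). */
static volatile int shared_until_written = 1;

/*
 * Fork, let the child overwrite the variable, and report what each side
 * sees afterwards. Returns 0 on success, -1 on error.
 */
static int write_in_child(int *parent_view, int *child_view)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* The write hits a read-only shared page, so the child gets its own copy. */
        shared_until_written = 2;
        _exit(shared_until_written);
    }

    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;

    *child_view = WEXITSTATUS(status);      /* value the child ended up with */
    *parent_view = shared_until_written;    /* parent's page was never touched */
    return 0;
}
```

After the call, the parent still sees 1 while the child saw 2.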


Reducing Overhead with COW

By using COW, Linux significantly reduces the overhead of fork(), making it a lightweight operation. The only work required for fork() is:

  • Duplicating the parent’s page tables (not the memory itself).
  • Creating a new process descriptor (task_struct) for the child.

This optimization aligns with the Unix philosophy of fast process execution, allowing quick process creation without wasting resources.

Important Points

-> Unix uses fork() and exec() separately, unlike other systems that use a single “spawn” function.
-> fork() creates a copy of the parent process, while exec() loads a new program into the process’s memory.
-> Copy-on-Write (COW) optimizes fork() by sharing memory pages instead of duplicating them immediately.
-> COW prevents unnecessary copying, significantly reducing overhead—especially when exec() follows fork().


Forking in Linux


Introduction

Forking in Linux is the process of creating a new child process from an existing parent process. The fork() system call is commonly used for this purpose, but internally, Linux uses a more flexible system call called clone(). The clone() function allows finer control over what resources the parent and child share. Functions such as fork(), vfork(), and __clone() are actually wrappers around clone() with specific flags.

The process-creation logic is implemented in the kernel function do_fork(), defined in kernel/fork.c (recent kernels name this function kernel_clone()). do_fork() calls copy_process(), which performs the core work of building the new process, and then starts the child running.

1. Forking via clone()

In Linux, the clone() system call is responsible for creating new processes. Unlike fork(), which duplicates nearly all of the parent’s resources, clone() allows finer control over what is shared between the parent and child. For example, it can be used to create threads that share memory, file descriptors, and signal handlers.

Functions like fork(), vfork(), and __clone() internally use clone() with specific flags to achieve different behaviors. The actual creation of the new process is handled in do_fork(), which calls copy_process() to perform the necessary setup before scheduling the new process for execution.


2. copy_process(): The Core of Forking

Step 1: dup_task_struct() – Creating a Process Descriptor

The function dup_task_struct() creates a new kernel stack, thread information structure, and task_struct for the child process. The child initially has an exact copy of the parent’s process descriptor, including its state, registers, and resources.

// Simplified Conceptual Example
struct task_struct *child_task = dup_task_struct(parent_task);

Step 2: Resource Limit Check

Before proceeding, Linux checks whether the user has exceeded the process creation limit. If the number of processes exceeds the limit, the fork fails.
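
From user space this limit is visible as RLIMIT_NPROC, and a fork() that runs into it fails with EAGAIN. The sketch below only reads the current limit; process_limit() is a hypothetical helper name.

```c
#include <sys/resource.h>

/*
 * Fetch the soft limit on the number of processes the current user may
 * have. Returns 0 on success, -1 on error.
 */
static int process_limit(rlim_t *soft)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NPROC, &rl) != 0)
        return -1;
    *soft = rl.rlim_cur;    /* may be RLIM_INFINITY, meaning no limit */
    return 0;
}
```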

Step 3: Differentiating the Child Process

The child’s task_struct is modified to distinguish it from the parent. Certain statistics, such as CPU usage, are reset, while most attributes remain unchanged.

Step 4: Setting the Child’s Initial State

The child’s process state is set to TASK_UNINTERRUPTIBLE, ensuring that it does not start executing immediately.

// Simplified Conceptual Example
child_task->state = TASK_UNINTERRUPTIBLE;

Step 5: Updating Process Flags

The copy_flags() function updates the flags member of the child’s task_struct:

  • Clears PF_SUPERPRIV, which records whether a task has used superuser privileges.
  • Sets PF_FORKNOEXEC, indicating that the process has not yet called exec().

// Simplified Conceptual Example: the net effect of copy_flags() on the child
child_task->flags &= ~PF_SUPERPRIV;
child_task->flags |= PF_FORKNOEXEC;

Step 6: Allocating a Unique PID

A new Process ID (PID) is assigned to the child using alloc_pid().

// Simplified Conceptual Example
child_task->pid = alloc_pid();

Step 7: Resource Duplication or Sharing

Depending on the flags passed to clone(), certain resources (such as open files, memory space, and signal handlers) are either:

  • Copied for independent child process execution.
  • Shared between parent and child, useful for thread creation.

Step 8: Cleanup and Returning the New Process

Once initialization is complete, copy_process() returns a pointer to the newly created child process, which is then scheduled for execution.

3. do_fork(): Finalizing the Fork

Once copy_process() completes successfully, the do_fork() function performs final steps to prepare the child for execution.

  • It wakes up the child process, allowing it to be scheduled.
  • Ideally, the kernel tries to run the child first to minimize Copy-on-Write (COW) overhead. If the child immediately calls exec(), no memory duplication is needed.
  • However, there is a note in the kernel stating that this optimization is not yet functioning correctly, but it remains a design goal.


4. vfork(): A Specialized Fork

Key Characteristics of vfork()

  • No memory duplication: Unlike fork(), vfork() does not copy the parent’s page tables; the child runs in the parent’s address space.
  • Parent is paused: The parent is blocked until the child calls exec() or exits.
  • Child must not modify memory: Because the address space is borrowed from the parent, the child is not allowed to write to it.

Historically, vfork() was an optimization before Copy-on-Write (COW) was introduced. Today, its only real advantage is avoiding page table duplication, which still makes it useful in certain scenarios.
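
A typical use of vfork() is to launch another program straight away. The following is a minimal sketch, assuming a Linux/glibc system; vfork_and_exec() is a hypothetical helper name. The child does nothing except call exec() or _exit(), which keeps it within the constraints listed above.

```c
#define _DEFAULT_SOURCE
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Launch argv[0] with vfork() + exec and return its exit status.
 * Returns 127 if exec failed, -1 on other errors.
 */
static int vfork_and_exec(char *const argv[])
{
    int status;
    pid_t pid = vfork();    /* parent is suspended until the child execs or exits */

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Running in the parent's address space: only exec or _exit here. */
        execvp(argv[0], argv);
        _exit(127);
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```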


vfork()’s Implementation in Linux

Step-by-Step Execution of vfork()

  • copy_process() sets the child’s vfork_done member to NULL.
  • If the vfork flag was given, do_fork() points vfork_done at a specific address that the parent can wait on.
  • The child runs first; instead of returning, the parent waits for the child to signal it through vfork_done.
  • When the child exits or calls exec(), mm_release() checks vfork_done and, if it is not NULL, signals the parent.
  • Back in do_fork(), the parent wakes up and returns.

While vfork() reduces overhead, its implementation is more complex and not as commonly needed today.

Important Points

-> fork() is the primary system call used for process creation in Linux.
-> Internally, clone() provides more flexibility in deciding which resources to share between processes.
-> copy_process() is the core function responsible for setting up a new process, duplicating resources, and assigning a new PID.
-> do_fork() schedules the new process for execution, ideally prioritizing the child first to reduce Copy-on-Write overhead.
-> vfork() is an optimized version of fork() that prevents memory duplication, but it has stricter constraints on execution.
-> Although vfork() was historically beneficial, its importance has diminished with Copy-on-Write optimizations.


The Linux Implementation of Threads

Threads as a Programming Abstraction

Threads allow for multiple execution flows within a single program, sharing the same memory space. They can also share resources like open files. This abstraction facilitates concurrent programming and enables true parallelism on multi-processor systems.

Linux’s Unique Approach

Linux doesn’t have a distinct “thread” concept at the kernel level. Instead, it implements threads as standard processes. The kernel doesn’t use specialized scheduling or unique data structures for threads. In Linux, a thread is simply a process that shares specific resources with other processes.

task_struct and Resource Sharing

Each thread in Linux has its own unique task_struct, making it appear as a normal process to the kernel. The key detail is that these processes are configured to share resources such as the address space.

Comparison with Other Operating Systems

Operating systems like Windows and Solaris include explicit kernel support for threads, often referred to as “lightweight processes.” In these systems, threads are treated as lighter and faster execution units compared to full processes.

In contrast, Linux views threads as a mechanism for sharing resources between processes, which are already designed to be lightweight.

Conceptual Difference

In other systems:

  • A single process descriptor manages shared resources.
  • Individual thread descriptors manage thread-specific resources.

In Linux:

  • Each thread is a separate process with its own task_struct.
  • These processes are configured to share necessary resources.

Example:
Consider a program with four threads.

  • Other Systems: One process descriptor and four thread descriptors.
  • Linux: Four individual process descriptors (task_structs), configured to share memory and other resources.

Result: Linux needs no separate thread descriptor or thread-specific scheduling; the same task_struct and scheduler serve both processes and threads.

Important Points

-> Linux implements threads as processes that share resources.
-> This differs significantly from operating systems with explicit thread support.
-> By using the task_struct structure for threads, Linux simplifies its kernel code.
-> This design enables a consistent and uniform system for managing both processes and threads.


Creating Threads

Thread Creation via clone()

Linux uses the clone() system call to create threads, much like it creates processes. The key difference lies in the flags passed to clone(), which define which resources are shared.

Example:

clone(CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND, 0);

This code creates a new task that shares the address space (CLONE_VM), filesystem resources (CLONE_FS), file descriptors (CLONE_FILES), and signal handlers (CLONE_SIGHAND) with its parent.
Source: notes.shichao.io

This shared-resource behavior is what defines threads in Linux.

Comparison with fork() and vfork()

  • A standard fork() can be represented as: clone(SIGCHLD, 0);
  • vfork() can be represented as: clone(CLONE_VFORK | CLONE_VM | SIGCHLD, 0);
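
The effect of the sharing flags can be seen from user space with the glibc clone() wrapper, whose signature differs from the two-argument form shown above. This is a sketch under stated assumptions: it assumes Linux with glibc, child_fn() and parent_view_after_clone() are hypothetical names, and SIGCHLD is added to the thread-style flags only so that the parent can use waitpid().

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (64 * 1024)

/* Runs in the new task: store 42 through the pointer it was given. */
static int child_fn(void *arg)
{
    *(int *)arg = 42;
    return 0;
}

/*
 * Create a task with clone(flags), wait for it to finish, and return the
 * value the parent then sees in its own variable (-1 on error).
 */
static int parent_view_after_clone(int flags)
{
    int value = 0;
    int status;
    pid_t pid;
    char *stack = malloc(CHILD_STACK_SIZE);

    if (!stack)
        return -1;

    /* The stack grows down on most architectures, so pass its top. */
    pid = clone(child_fn, stack + CHILD_STACK_SIZE, flags, &value);
    if (pid < 0) {
        free(stack);
        return -1;
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;      /* child may still be using the stack: do not free it */

    free(stack);
    return value;
}
```

With CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND | SIGCHLD the function returns 42, because the new task wrote into the shared address space. With only SIGCHLD, the fork()-like case, it returns 0, because the child wrote to its own copy.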


clone() Flags

The flags passed to clone() determine the degree of resource sharing between parent and child processes. These flags are defined in <linux/sched.h>.

Common clone() Flags and Their Meanings:

Flag                    Description
CLONE_FILES             Parent and child share open files
CLONE_FS                Parent and child share filesystem information
CLONE_IDLETASK          Set PID to zero (used by idle tasks)
CLONE_NEWNS             Create a new namespace for the child
CLONE_PARENT            Child has the same parent as its parent
CLONE_PTRACE            Continue tracing the child
CLONE_SETTID            Write the Thread ID (TID) back to user-space
CLONE_SETTLS            Create new Thread Local Storage (TLS) for the child
CLONE_SIGHAND           Parent and child share signal handlers and blocked signals
CLONE_SYSVSEM           Parent and child share System V SEM_UNDO semantics
CLONE_THREAD            Parent and child are in the same thread group
CLONE_VFORK             Parent sleeps until the child wakes it (vfork() behavior)
CLONE_UNTRACED          Prevent tracing processes from forcing CLONE_PTRACE on the child
CLONE_STOP              Start the process in the TASK_STOPPED state
CLONE_CHILD_CLEARTID    Clear the TID in the child
CLONE_CHILD_SETTID      Set the TID in the child
CLONE_PARENT_SETTID     Set the TID in the parent
CLONE_VM                Parent and child share the address space

THANK YOU
