Linux Process Management (Part 3/3)

Understanding Kernel Threads in Linux

Kernel threads are a fundamental part of the Linux operating system, enabling the kernel to perform background operations efficiently. Unlike user-space processes, kernel threads are standard processes that operate exclusively within kernel-space and are never associated with user-space execution.

What Makes Kernel Threads Different?

Kernel threads differ from normal user processes in several ways. The most notable distinction is that they have no address space of their own: their mm (memory management) pointer is NULL. They run only in kernel-space and never context switch into user-space.

Despite their isolation from user-space, kernel threads are schedulable and preemptable, just like any other process in the system. This means the kernel can switch between them based on scheduling policies, allowing for multitasking even within internal operations.

Where Are Kernel Threads Used?

Kernel threads are heavily relied upon in Linux for tasks that need to run in the background without user intervention. For instance, they handle:

Flush operations (writing data from memory to disk)
Soft interrupt handling, such as with the ksoftirqd daemon

“ksoftirqd: a kernel thread responsible for handling software interrupts (softirqs) in Linux.”

Interestingly, these threads are visible from user-space. You can spot them using the ps -ef command, where they appear alongside regular processes.
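Kernel threads can also be told apart programmatically. The kernel marks them with the PF_KTHREAD bit in the task flags, and /proc/<pid>/stat exposes those flags as its ninth field. The sketch below is illustrative only: the function names are invented for this article, and the PF_KTHREAD value is copied from the kernel's include/linux/sched.h because user-space headers do not export it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Value of PF_KTHREAD in the kernel's include/linux/sched.h. */
#define PF_KTHREAD 0x00200000UL

/*
 * Decide from one /proc/<pid>/stat line whether the task is a kernel thread.
 * Returns 1 for a kernel thread, 0 for a user task, -1 on a parse error.
 */
int stat_line_is_kthread(const char *line)
{
    const char *p;
    char state;
    int ppid, pgrp, session, tty_nr, tpgid;
    unsigned long flags;

    assert(line != NULL);

    /* The comm field is wrapped in parentheses and may itself contain
     * spaces or ')', so parsing resumes after the last ')'. */
    p = strrchr(line, ')');
    if (p == NULL)
        return -1;
    if (sscanf(p + 1, " %c %d %d %d %d %d %lu",
               &state, &ppid, &pgrp, &session, &tty_nr, &tpgid, &flags) != 7)
        return -1;
    return (flags & PF_KTHREAD) ? 1 : 0;
}

/* Hypothetical convenience wrapper: classify a live PID via /proc. */
int pid_is_kthread(int pid)
{
    char path[64], line[1024];
    FILE *f;

    snprintf(path, sizeof path, "/proc/%d/stat", pid);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fgets(line, sizeof line, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return stat_line_is_kthread(line);
}
```

On a typical non-containerized system, PID 2 is kthreadd, so pid_is_kthread(2) should report a kernel thread. Inside a container, PID 2 may be an ordinary process or may not exist at all.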

How Kernel Threads Are Created

Kernel threads are not spawned like regular user-space programs. Instead, they are typically created during system boot, often by other kernel threads. A critical fact to note is that only kernel threads can create other kernel threads. All of these threads ultimately originate from the kthreadd kernel process, which serves as their parent.

Using `kthread_create()` to Make Kernel Threads

The creation of kernel threads is done through the kthread_create() function, defined in <linux/kthread.h>.

For reference, its prototype is:

struct task_struct *kthread_create(int (*threadfn)(void *data),
                                   void *data,
                                   const char namefmt[],
                                   ...);

threadfn: This is the function the thread will execute.
data: The argument passed to the thread function.
namefmt: A printf-style format string to name the thread.

A key detail is that threads created with kthread_create() are not runnable immediately. After creation, they must be explicitly woken up using wake_up_process() to begin execution.

The `kthread_run()` Convenience Macro

To simplify the process, Linux provides a macro called kthread_run(). This macro wraps both kthread_create() and wake_up_process() into a single call, making thread creation more convenient.

#define kthread_run(threadfn, data, namefmt, ...)                    \
({                                                                   \
    struct task_struct *k;                                           \
    k = kthread_create(threadfn, data, namefmt, ## __VA_ARGS__);     \
    if (!IS_ERR(k))                                                  \
        wake_up_process(k);                                          \
    k;                                                               \
})

Using this macro reduces boilerplate and ensures that the thread starts running as soon as it is created.

How Kernel Threads Terminate

Kernel threads typically run until they explicitly exit using do_exit(), or they are externally stopped. Linux provides the kthread_stop() function for controlled termination:

int kthread_stop(struct task_struct *k);

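Here, k is the task_struct pointer returned by kthread_create(). kthread_stop() asks the thread to stop and then blocks until it has exited, ensuring a clean shutdown. The request is cooperative: the thread function is expected to check kthread_should_stop() regularly and return once it reports true.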

Important Points

Kernel threads are created inside the kernel through the same fork/clone code path that backs the clone() system call, with flags that determine how resources are shared.
They play a crucial role in performing internal kernel operations like I/O flushing and interrupt handling.
Functions like kthread_create() and the kthread_run() macro are standard mechanisms for spawning these threads.
Unlike user-space threads, kernel threads run entirely in kernel-space and have no user-space address space of their own.

The Linux Implementation of Threads

In modern programming, threads serve as an essential abstraction that allows multiple flows of execution within a single program. By design, threads share the same memory space and can also access shared resources such as open files. This capability not only enhances concurrent programming but also enables true parallelism on multi-core and multi-processor systems.

A Unique Perspective: How Linux Handles Threads

Unlike some other operating systems, Linux takes a unique approach to threading. At the kernel level, Linux does not have a distinct or special concept of a “thread.” Instead, threads are implemented using the same mechanisms as regular processes. From the kernel’s point of view, there are no separate data structures or scheduling strategies specifically for threads.

In Linux, a thread is essentially just a process that has been configured to share certain resources such as memory or file descriptors with other processes. This means the kernel does not need to treat threads any differently than processes, making the overall system more uniform and consistent.

The Role of `task_struct` and Resource Sharing

Every thread in Linux is represented by its own unique task_struct, which is the kernel’s internal data structure used to manage process information. As a result, each thread looks like an independent process from the kernel’s point of view. What binds them together as threads is the fact that these “processes” are deliberately set up to share specific resources, like the address space.

This design allows Linux to maintain its standard process management system while still supporting multithreaded applications effectively.

Comparing Linux to Other Operating Systems

Many other operating systems, such as Windows and Solaris, offer explicit kernel-level support for threads. These are often referred to as “lightweight processes” and are seen as faster and less resource-intensive than full-blown processes.

The difference can be summarized as follows:

Other Operating Systems:
- A single process descriptor manages shared resources.
- Separate thread descriptors handle thread-specific data.
Linux:
- Every thread is represented as a separate process with its own task_struct.
- These processes are configured to share memory and other required resources.

To visualize this, imagine a program with four threads:

In other systems, you would have one process descriptor and four thread descriptors.
In Linux, you would find four separate process descriptors (task_structs), each sharing the same memory space and other relevant resources.

Why This Matters: Efficiency Through Uniformity

Though it may seem unconventional, Linux’s approach is both elegant and efficient. By treating threads as specialized processes, the kernel avoids the complexity of maintaining separate mechanisms for processes and threads. This leads to a uniform management system that simplifies scheduling and resource handling.

Important Points

Linux implements threads as processes that share specific resources.
This is a notable departure from operating systems that offer specialized thread support.
Using the same task_struct mechanism for both threads and processes simplifies kernel code.
The result is a consistent, unified system that handles both processes and threads efficiently.

Creating Threads in Linux

In Linux, thread creation is handled using the clone() system call, which is closely related to the process creation mechanism. Both threads and processes are created through clone(); what differentiates a thread from a process is the set of flags passed to the call. These flags determine which resources the parent and child will share.

For example, the following code snippet creates a new task that shares several resources with its parent:

clone(CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND, 0);

Here, CLONE_VM allows the new task to share the address space, CLONE_FS shares the filesystem-related information, CLONE_FILES shares open file descriptors, and CLONE_SIGHAND shares signal handlers. This sharing behavior is what effectively defines threads in Linux: they are essentially processes with shared resources.
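The effect of CLONE_VM can be observed from user space with glibc's clone() wrapper. Unlike the two-argument shorthand shown above, the wrapper takes a function to run, a stack for the child, the flags, and an argument. The following is a minimal sketch under those assumptions; counter_after_child() and child_increments() are names invented for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (256 * 1024)

/* Runs in the new task; the return value becomes its exit status. */
static int child_increments(void *arg)
{
    int *counter = arg;

    *counter += 1;   /* reaches the parent only if the address space is shared */
    return 0;
}

/*
 * Run child_increments() in a task created with the given clone() flags and
 * return the counter value the parent observes afterwards (-1 on error).
 */
int counter_after_child(int flags)
{
    int counter = 0;
    int status;
    pid_t pid;
    char *stack = malloc(CHILD_STACK_SIZE);

    if (stack == NULL)
        return -1;

    /* The stack grows downward on common architectures, so pass its top.
     * SIGCHLD as the exit signal makes the child waitable with waitpid(). */
    pid = clone(child_increments, stack + CHILD_STACK_SIZE,
                flags | SIGCHLD, &counter);
    if (pid == -1 || waitpid(pid, &status, 0) != pid) {
        free(stack);
        return -1;
    }
    free(stack);
    assert(WIFEXITED(status));
    return counter;
}
```

With CLONE_VM set, the function returns 1 because the child wrote directly into the parent's memory. With no sharing flags it returns 0, because the child only modified its private copy of the address space.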

Comparison with fork() and vfork()

The fork() and vfork() system calls are simplified wrappers around clone() with specific flags:

A typical fork() call is equivalent to: clone(SIGCHLD, 0);
A vfork() call would be: clone(CLONE_VFORK | CLONE_VM | SIGCHLD, 0);

These simplified invocations make fork() and vfork() suitable for creating separate or semi-shared process contexts, whereas clone() offers fine-grained control over the sharing behavior.
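The fork() equivalence can be tried out by invoking the raw system call with only SIGCHLD in the flags. This is a sketch rather than a recommended way to create processes: it assumes an architecture such as x86-64 or arm64, where flags is the first argument of the raw call, and fork_via_raw_clone() is a name made up for the example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Create a child with the raw clone system call and nothing but SIGCHLD in
 * the flags, which behaves like fork(). Returns the child's exit code as
 * seen by the parent, or -1 on error.
 */
int fork_via_raw_clone(int child_exit_code)
{
    int status;
    long pid;

    /* Stack, TID pointers and TLS are all 0: the child continues on a
     * copy-on-write copy of the parent's stack, exactly as after fork(). */
    pid = syscall(SYS_clone, (unsigned long)SIGCHLD, 0UL, 0UL, 0UL, 0UL);
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(child_exit_code);   /* child: leave immediately */

    if (waitpid((pid_t)pid, &status, 0) != (pid_t)pid)
        return -1;
    assert(WIFEXITED(status));
    return WEXITSTATUS(status);
}
```

As with fork(), the call returns twice: 0 in the child and the child's PID in the parent. Real programs should keep using fork(), since the C library wrapper also runs its own bookkeeping around the call.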

clone() Flags and Their Meanings

The clone() system call is incredibly flexible due to the various flags defined in <linux/sched.h>. Here’s a concise reference table outlining what each flag does:

CLONE_FILES: Share open file descriptors.
CLONE_FS: Share filesystem information.
CLONE_IDLETASK: Set PID to zero (used only by the idle tasks; a historical, kernel-internal flag).
CLONE_NEWNS: Create a new mount namespace.
CLONE_PARENT: Child inherits the same parent as the caller.
CLONE_PTRACE: If the caller is being traced, trace the child as well.
CLONE_SETTID: Write thread ID (TID) to user-space.
CLONE_SETTLS: Set up Thread Local Storage (TLS) for the child.
CLONE_SIGHAND: Share the table of signal handlers (the set of blocked signals remains per-task).
CLONE_SYSVSEM: Share System V SEM_UNDO semantics.
CLONE_THREAD: Make the child part of the same thread group.
CLONE_VFORK: Block parent until child calls exec() or exit().
CLONE_UNTRACED: Prevent a tracing process from forcing CLONE_PTRACE on the child.
CLONE_STOP: Start the child in a stopped state (removed from later kernels).
CLONE_CHILD_CLEARTID: Clear the TID stored in the child's memory when the child exits.
CLONE_CHILD_SETTID: Store the child's TID in the child's memory.
CLONE_PARENT_SETTID: Store the child's TID in the parent's memory.
CLONE_VM: Share memory space (address space).

Each of these flags allows developers to build very specific relationships between tasks, making Linux extremely versatile for managing both processes and threads.
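To make one of these flags concrete, the sketch below shows what CLONE_FILES changes. A child closes a descriptor, and the parent then checks whether its own view of that descriptor is still valid. It uses glibc's clone() wrapper; fd_survives_child_close() and close_fd_in_child() are hypothetical names chosen for this illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define CHILD_STACK_SIZE (256 * 1024)

/* Runs in the child: close the descriptor number passed in by the parent. */
static int close_fd_in_child(void *arg)
{
    return close(*(int *)arg) == 0 ? 0 : 1;
}

/*
 * Open a descriptor, let a clone()d child close it, and report whether the
 * parent's descriptor is still open: 1 = open, 0 = closed, -1 = error.
 */
int fd_survives_child_close(int flags)
{
    int fd, status, still_open;
    pid_t pid;
    char *stack;

    fd = open("/dev/null", O_RDONLY);
    if (fd == -1)
        return -1;
    stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL) {
        close(fd);
        return -1;
    }

    pid = clone(close_fd_in_child, stack + CHILD_STACK_SIZE,
                flags | SIGCHLD, &fd);
    if (pid == -1 || waitpid(pid, &status, 0) != pid) {
        free(stack);
        close(fd);
        return -1;
    }
    free(stack);
    assert(WIFEXITED(status) && WEXITSTATUS(status) == 0);

    /* F_GETFD fails with EBADF if the descriptor no longer exists. */
    still_open = (fcntl(fd, F_GETFD) != -1);
    if (still_open)
        close(fd);
    return still_open;
}
```

With CLONE_FILES the two tasks share one descriptor table, so the child's close() also removes the descriptor for the parent and the function returns 0. Without the flag the child closes only its own copy and the function returns 1.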

Kernel Threads

Purpose

Kernel threads are specialized threads used internally by the Linux kernel to perform essential background operations. Unlike user-space processes, these threads operate exclusively within kernel space and are crucial for managing core system functions.

They are implemented as standard processes but differ in how they interact with memory and execution contexts. Their primary role is to execute kernel-level tasks without needing any interaction with user-space applications.

Differences from Normal Processes

Kernel threads have several unique characteristics that set them apart from normal (user-space) processes:

No Address Space:
Kernel threads do not have a user-space address space. Their mm pointer, which normally points to the memory descriptor structure for user-space processes, is set to NULL.
Kernel-Space Only:
These threads never switch context to user-space. Their execution remains strictly within the kernel, ensuring direct access to kernel resources and data structures.
Schedulable and Preemptable:
Like regular processes, kernel threads are managed by the scheduler, meaning they can be preempted or rescheduled like any other task. This enables efficient multitasking even within the kernel.

Kernel Thread Usage

The Linux kernel utilizes kernel threads for various internal operations that must occur independently of user-space processes. Some common examples include:

Flush Operations:
Writing cached data from memory to disk or other storage devices.
ksoftirqd:
Handles software interrupts that are deferred from the immediate hardware interrupt context.

To inspect currently running kernel threads, you can use:

ps -ef

This command lists all processes, including kernel threads, which are usually identified by square brackets around their names.

Kernel Thread Creation

Kernel threads are created during system boot and throughout the system’s runtime as needed. Notably:

Only kernel threads can spawn new kernel threads.
All newly created kernel threads are forked by a special kernel process known as kthreadd.

Using `kthread_create()`

The kthread_create() function is the primary method for creating a new kernel thread. It is declared in the <linux/kthread.h> header.

struct task_struct *kthread_create(int (*threadfn)(void *data),
                                   void *data,
                                   const char namefmt[],
                                   ...);

threadfn: A pointer to the function that the new kernel thread will execute.
data: Argument passed to threadfn.
namefmt: A format string (similar to printf) used to give the kernel thread a name.

Note: The newly created thread is initially in an unrunnable state. You must call wake_up_process() to start its execution.

Using `kthread_run()`

To simplify the creation and starting of a kernel thread, Linux provides the kthread_run() macro. This macro wraps the kthread_create() and wake_up_process() calls in one convenient step.

#define kthread_run(threadfn, data, namefmt, ...)                        \
    ({                                                                   \
        struct task_struct *k;                                           \
        k = kthread_create(threadfn, data, namefmt, ## __VA_ARGS__);     \
        if (!IS_ERR(k))                                                  \
            wake_up_process(k);                                          \
        k;                                                               \
    })

This macro is preferred when you want to both create and immediately run a kernel thread.

Kernel Thread Termination

Kernel threads can be terminated in two primary ways:

Self-Termination:
A kernel thread can call do_exit() to terminate itself gracefully.
External Termination:
Another part of the kernel can stop a kernel thread by using the kthread_stop() function.

int kthread_stop(struct task_struct *k);

The k parameter is the task_struct pointer returned from kthread_create().

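When this function is called, the target kernel thread is asked to exit (it sees kthread_should_stop() return true), and the caller blocks until the thread has finished. The return value is whatever the thread function returned.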

Important Points

Kernel threads are fundamental for handling kernel-only background tasks.
They do not operate in user space and have no associated address space.
The kernel provides kthread_create() and kthread_run() to spawn them.

Process Termination in Linux

In Linux, processes are not meant to run indefinitely. At some point, every process must terminate either by choice or due to external intervention. When a process ends, the kernel takes responsibility for cleaning up all associated resources and informing the parent process about the termination. This cleanup ensures that system resources like memory, files, and semaphores are not leaked, and the system remains stable.

Methods of Process Termination

1. Self-Induced Termination

Processes can voluntarily terminate themselves using either of the following methods:

Explicit Termination:
The process directly invokes the exit() system call. This call tells the kernel to terminate the process and handle the necessary cleanup tasks.
Implicit Termination:
When the main() function returns, the C runtime’s startup code calls exit() with main()’s return value. This ensures the process terminates properly even if exit() was not explicitly called by the programmer.

2. Involuntary Termination

Processes can also be terminated by the system or due to runtime faults:

Signal-Based Termination:
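If a process receives a signal whose default action is to terminate it (such as SIGTERM) and neither handles nor ignores that signal, the kernel terminates the process. SIGKILL is a special case: it cannot be caught, blocked, or ignored, so it always terminates the target.
Exception-Based Termination:
If a process triggers a fault it cannot recover from (e.g., an invalid memory access, reported as SIGSEGV, or an integer divide-by-zero, reported as SIGFPE), the kernel delivers the corresponding signal. Unless the process handles it, the process is terminated to prevent unpredictable behavior or security issues.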
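Both kinds of termination leave a different trace in the status that the parent collects with waitpid(). The sketch below forks a child that either exits voluntarily or is killed by a signal it sends to itself; status_of_child() is a hypothetical helper written for this illustration.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that terminates in the requested way and return the raw wait
 * status collected by the parent (-1 on error).
 *   sig == 0: the child exits voluntarily with exit_code.
 *   sig != 0: the child sends itself that signal.
 */
int status_of_child(int exit_code, int sig)
{
    int status;
    pid_t pid;

    assert(exit_code >= 0 && exit_code <= 255);   /* only 8 bits reach the parent */

    pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        if (sig != 0)
            raise(sig);      /* involuntary path: a fatal signal ends the child here */
        _exit(exit_code);    /* voluntary path */
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return status;
}
```

For a voluntary exit, WIFEXITED(status) is true and WEXITSTATUS(status) yields the exit code. For a child killed by SIGKILL, WIFSIGNALED(status) is true and WTERMSIG(status) reports the signal number.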

do_exit(): The Termination Handler

At the heart of Linux process termination lies the do_exit() function, located in kernel/exit.c. This function is responsible for systematically dismantling the process, ensuring that no resources are left hanging and that the termination is safely communicated across the system.

Steps in `do_exit()` Execution

Each of the following steps contributes to a clean and complete process termination:

1. Marking the Process for Termination

The process is flagged with PF_EXITING in its task_struct.
This flag indicates to the kernel and other subsystems that the process is in the middle of exiting. The task keeps running the rest of do_exit() until the final schedule() call.

2. Removing Kernel Timers

The function del_timer_sync() is used to remove any active timers associated with the process.
This prevents timer callbacks from running after the process has exited, which could lead to kernel crashes or undefined behavior.

3. Accounting and Logging

If BSD-style accounting is enabled, the function acct_update_integrals() is called.
This logs information such as CPU time, memory usage, and other statistics, which can be useful for auditing or resource monitoring.

4. Releasing Address Space (`exit_mm()`)

The function exit_mm() is invoked to handle the memory cleanup.
If the memory space (mm_struct) is not shared with other processes, it is completely deallocated.
If it is shared, the exiting process detaches from it without affecting the other users of that memory space.

5. Releasing IPC Semaphores (`exit_sem()`)

The function exit_sem() ensures that the process is removed from all IPC semaphore queues.
This prevents deadlocks or resource contention for processes that continue running after the exiting process is gone.

6. Releasing Open Files and Filesystem Data (`exit_files()` and `exit_fs()`)

These functions decrement the reference count for file descriptors and filesystem structures.
When the count drops to zero, the kernel deallocates the respective resources.

7. Storing Exit Code

The process’s exit status is stored in the exit_code field of its task_struct.
This allows the parent process to later retrieve the status using system calls like wait() or waitpid().

8. Notifying the Parent and Reparenting Children (`exit_notify()`)

The kernel notifies the parent process that this process has exited.
If the exiting process has any child processes, they are reparented, either to another thread in the same thread group or to the init process (PID 1), so that every child still has a parent to collect its exit status.
The exiting process’s state is changed to EXIT_ZOMBIE, meaning it has terminated but still holds some information (like the exit code) until the parent collects it.

9. Final Scheduling Call (`schedule()`)

Finally, the process calls schedule() to relinquish control of the CPU.
This ensures that the process is never scheduled to run again, and its thread of execution is effectively dead.

NOTE: The process termination flow in Linux is carefully structured to guarantee resource cleanup, system stability, and proper notification. Whether a process terminates on its own or is forced to stop, the kernel, via do_exit(), ensures that every aspect of its lifecycle is cleanly wrapped up.

Table for reference: Linux Process Termination Overview

Aspect	Description
Termination Types	Self-induced: the process calls `exit()` or returns from `main()`. Involuntary: a fatal signal (e.g., `SIGKILL`) or an exception (e.g., segmentation fault).
Main Termination Function	`do_exit()` in `kernel/exit.c`. Handles all cleanup operations and final steps of termination.
do_exit() Steps	Action
1. Mark as Exiting	Set `PF_EXITING` in `task_struct`. Flags the process for termination.
2. Remove Timers	Call `del_timer_sync()` to cancel process timers.
3. Accounting	Use `acct_update_integrals()` (if enabled) to log usage stats.
4. Release Memory	`exit_mm()`: Free or detach from `mm_struct`.
5. Release IPC Semaphores	`exit_sem()`: Remove process from semaphore queues.
6. Release Filesystem Data	`exit_files()` & `exit_fs()`: Decrement ref counts, deallocate if zero.
7. Store Exit Code	Save exit status in `task_struct.exit_code` for parent retrieval.
8. Notify Parent & Reparent Children	`exit_notify()`: Notify parent, reparent children (to `init` usually), mark state as `EXIT_ZOMBIE`.
9. Final Scheduling	`schedule()`: Yield CPU, never scheduled again.

Diagram: Linux Process Termination Flow

  +----------------------------+
  | Process is Terminating     |
  +------------+---------------+
               |
               v
  +----------------------------+
  | 1. Set PF_EXITING Flag     |
  +----------------------------+
               |
               v
  +----------------------------+
  | 2. Cancel Timers           |
  | del_timer_sync()           |
  +----------------------------+
               |
               v
  +----------------------------+
  | 3. Log Resource Usage      |
  | acct_update_integrals()    |
  +----------------------------+
               |
               v
  +----------------------------+
  | 4. Release Memory Space    |
  | exit_mm()                  |
  +----------------------------+
               |
               v
  +----------------------------+
  | 5. Exit IPC Semaphores     |
  | exit_sem()                 |
  +----------------------------+
               |
               v
  +----------------------------+
  | 6. Close Open Files        |
  | exit_files(), exit_fs()    |
  +----------------------------+
               |
               v
  +----------------------------+
  | 7. Store Exit Code         |
  | in task_struct.exit_code   |
  +----------------------------+
               |
               v
  +----------------------------+
  | 8. Notify Parent           |
  | & Reparent Children        |
  | exit_notify()              |
  +----------------------------+
               |
               v
  +----------------------------+
  | 9. Set to EXIT_ZOMBIE      |
  | and call schedule()        |
  +----------------------------+
               |
               v
  +----------------------------+
  | Process is Terminated      |
  +----------------------------+

Zombie Processes

When a process finishes execution and the do_exit() function completes, the process transitions into a zombie state. Although the process is technically no longer active, it continues to exist in a minimal form.

A zombie process is non-runnable and does not consume CPU cycles.
Its sole purpose is to provide termination information (like exit code) to the parent process.
It retains certain kernel structures such as task_struct, thread_info, and the kernel stack until the parent explicitly collects its exit status.

The zombie state persists until the parent process calls wait4() or any related wait() system call. This allows the parent to retrieve the exit details and finalize the child’s termination.
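The zombie state is easy to observe from user space. The sketch below forks a child that exits immediately, waits for the exit without reaping the child (waitid() with WNOWAIT), and reads the state letter from /proc/<pid>/stat before and after the real wait. It assumes /proc is mounted; observe_zombie() and proc_state() are names invented for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Return the state letter (third field) from /proc/<pid>/stat, or 0 if the
 * entry does not exist or cannot be parsed. */
static char proc_state(pid_t pid)
{
    char path[64], line[1024], state = 0;
    const char *p;
    FILE *f;

    snprintf(path, sizeof path, "/proc/%d/stat", (int)pid);
    f = fopen(path, "r");
    if (f == NULL)
        return 0;
    if (fgets(line, sizeof line, f) != NULL) {
        p = strrchr(line, ')');          /* skip past the comm field */
        if (p != NULL)
            sscanf(p + 1, " %c", &state);
    }
    fclose(f);
    return state;
}

/*
 * Fork a child that exits at once, then sample its state before and after
 * the parent reaps it. Returns 0 on success, -1 on error.
 */
int observe_zombie(char *before_wait, char *after_wait)
{
    siginfo_t info;
    pid_t pid;

    assert(before_wait != NULL && after_wait != NULL);

    pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(0);

    /* WNOWAIT blocks until the child has exited but leaves it unreaped. */
    if (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) == -1)
        return -1;
    *before_wait = proc_state(pid);      /* the child is a zombie here */

    if (waitpid(pid, NULL, 0) != pid)    /* reap: the descriptor is released */
        return -1;
    *after_wait = proc_state(pid);       /* the /proc entry is gone */
    return 0;
}
```

Before the wait, the state letter is 'Z'. After the wait, the /proc entry has disappeared and the helper returns 0.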

Removing the Process Descriptor

The actual deletion of the process descriptor (task_struct) happens after the process terminates. It is the parent’s responsibility to invoke wait4() to acknowledge the child’s termination and trigger the final cleanup.

Steps in `release_task()` Execution

Once the parent has acknowledged the child process’s exit, the system executes a series of cleanup steps in the release_task() function.

1. Unhashing the Process

Functions like __exit_signal(), __unhash_process(), and detach_pid() remove the process from:
- The PID hash table
- The task list in the kernel

2. Releasing Remaining Resources

__exit_signal() handles:
- Releasing system resources
- Updating process statistics
- Managing signal-related cleanup

3. Notifying the Thread Group

If the exiting task was the last member of its thread group and the group leader is already a zombie, the kernel:
- Notifies the zombie leader’s parent so that the leader can be reaped as well

4. Freeing Memory

put_task_struct() is called to deallocate memory, including:
- The kernel stack
- thread_info
- The task_struct itself

The Dilemma of Orphaned Processes

In some cases, a parent process terminates before its child processes. These children are then considered orphaned, and must be reparented to avoid becoming permanent zombies.

How Reparenting Works

1. Finding a New Parent (`find_new_reaper()`)

The kernel first tries to reassign the orphan to another thread in the same thread group.
If no such thread is available, the init process (PID 1) becomes the new parent to guarantee cleanup.

2. Updating Parent Pointers (`reparent_thread()`)

The orphan’s real_parent and parent pointers are updated to point to the new parent.

3. Handling ptrace-Traced Processes (`exit_ptrace()`)

A task that is being debugged with ptrace is temporarily reparented to the debugger. When that task’s real parent exits, the traced task still has to be reparented along with its siblings.
To avoid scanning every process in the system for such children, the kernel maintains separate lists for:
- Normal children
- ptraced children

This separation optimizes and simplifies the reparenting logic.
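Reparenting can be watched from user space by building a three-generation chain and letting the middle process exit first. In the sketch below, the grandchild polls getppid() until it changes and reports the new value through a pipe. The function name observe_reparenting() is made up for this example. On systems that use child subreapers (a feature of newer kernels), the new parent may be a subreaper rather than PID 1, so the example only checks that the parent changed.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Create parent -> child -> grandchild, let the child exit first, and report
 * which process the orphaned grandchild ends up with as its parent.
 * Returns 0 on success, -1 on error.
 */
int observe_reparenting(pid_t *old_parent, pid_t *new_parent)
{
    int fds[2];
    pid_t child;
    ssize_t n;

    assert(old_parent != NULL && new_parent != NULL);
    if (pipe(fds) == -1)
        return -1;

    child = fork();
    if (child == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (child == 0) {
        pid_t me = getpid();
        pid_t grandchild = fork();

        if (grandchild == 0) {
            const struct timespec tick = { 0, 1000000 };   /* 1 ms */
            pid_t now;

            while ((now = getppid()) == me)   /* original parent still alive */
                nanosleep(&tick, NULL);
            if (write(fds[1], &now, sizeof now) != (ssize_t)sizeof now)
                _exit(1);
            _exit(0);
        }
        _exit(grandchild == -1 ? 1 : 0);   /* exiting here orphans the grandchild */
    }

    close(fds[1]);
    n = read(fds[0], new_parent, sizeof *new_parent);   /* blocks until reported */
    close(fds[0]);
    if (waitpid(child, NULL, 0) != child)
        return -1;
    *old_parent = child;
    return n == (ssize_t)sizeof *new_parent ? 0 : -1;
}
```

After a successful call, *old_parent holds the PID of the process that exited and *new_parent holds the PID of whichever process the kernel chose as the new parent.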

The Role of the `init` Process in Cleanup

The init process (PID 1) plays a crucial role in maintaining system hygiene by preventing zombie accumulation.

It periodically calls wait() to collect exit codes from any orphaned children.
This ensures that zombies assigned to it are properly reaped.
Without this mechanism, orphaned zombies could accumulate, leading to resource leaks and system instability.
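The reaping that init performs boils down to calling a wait function until no exited children remain. The following is a simplified sketch of such a loop, not the code of any real init implementation. It blocks until every child has exited, whereas a real init would typically react to SIGCHLD and use non-blocking waits. Both function names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Start n children that exit immediately; returns how many were started. */
int spawn_short_lived_children(int n)
{
    int started = 0;

    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();

        if (pid == 0)
            _exit(0);        /* child: becomes a zombie until reaped */
        if (pid > 0)
            started++;
    }
    return started;
}

/*
 * Reap every child of the calling process, blocking until none remain.
 * Returns the number of children reaped.
 */
int reap_all_children(void)
{
    int reaped = 0;

    for (;;) {
        pid_t pid = waitpid(-1, NULL, 0);   /* -1: any child */

        if (pid > 0) {
            reaped++;
            continue;
        }
        if (pid == -1 && errno == EINTR)
            continue;                       /* interrupted by a signal: retry */
        break;                              /* ECHILD: no children left */
    }
    return reaped;
}
```

Each successful waitpid() call releases one zombie's process descriptor, which is exactly the cleanup that would never happen if nobody waited for the orphan.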

Important Points

-> Process termination involves releasing resources and informing the parent.

-> do_exit() is the core function handling this termination.

-> Zombie processes store exit info until collected by the parent.

-> release_task() handles the final removal of the process descriptor.

-> Orphaned processes are reparented to avoid resource leaks.

-> The init process ensures the system stays free of lingering zombies.

Conclusion

Linux’s process termination and cleanup mechanism is robust and systematic, ensuring that:

Resources are efficiently released
Zombie processes are temporary
Orphans are properly reparented
The system remains stable and performant

This layered design maintains process hygiene and prevents zombie buildup, even in edge cases like orphaned or debugged processes.
