Java Multithreading Fundamentals | Part 0

Published on July 14, 2024 · 12 min read

Tags: Java

Concurrency and Parallelism

Concurrency and parallelism are often used interchangeably, but they refer to different concepts in computing. To illustrate, let’s use a circus juggler as an analogy.

Serial Execution

In serial execution, tasks are handled one after the other on the CPU. Each task runs from start to finish before the next begins. Imagine a juggler who can only juggle one ball at a time. This scenario is not very dynamic!

Concurrency

A concurrent system can handle multiple tasks by allowing them to overlap in their execution. This doesn't mean tasks run simultaneously but that they make progress in an interleaved manner. For example, a single-core machine running a browser must be responsive to user inputs while rendering HTML. The system appears to handle tasks concurrently by switching between them rapidly.

In our juggler analogy, a concurrent juggler can juggle several balls, but only one ball is in hand at any time. The juggler switches between balls, giving each a moment to be caught and thrown. This represents how concurrent systems manage multiple processes by giving each a time slice to make progress.

Parallelism

Parallel systems, in contrast, execute multiple tasks simultaneously. This is facilitated by hardware like multicore processors or computing clusters. A problem must be divisible into independent parts that can be executed in parallel without affecting the outcome.

Consider our juggler analogy again: a parallel system is like having multiple jugglers, each juggling several balls. A multicore machine running an operating system can execute multiple tasks at once, making the execution truly parallel.

Concurrency vs Parallelism

It’s crucial to understand that:

  • Concurrency refers to dealing with multiple tasks by interleaving their execution. A system can be concurrent without being parallel.
  • Parallelism involves executing multiple tasks simultaneously. A parallel system is inherently concurrent.

Preemptive Multitasking and Cooperative Multitasking

Preemptive Multitasking

In preemptive multitasking, the operating system takes control over task scheduling. The scheduler decides which program or thread runs on the CPU and for how long. Programs cannot control when they use the CPU; the operating system manages this aspect.

This model ensures that no single program can monopolize the CPU. If a program enters an infinite loop, it impacts only itself, not other tasks. This approach also relieves programmers from managing CPU time directly. However, it introduces unpredictability in task scheduling, as the timing of task execution is managed by the OS.

Cooperative Multitasking

Cooperative multitasking requires programs to voluntarily yield control to the scheduler. A program or thread gives up the CPU periodically or when it becomes idle or logically blocked. Unlike preemptive multitasking, the scheduler has no control over how long a program runs.

In cooperative multitasking, a malicious or poorly designed program can halt the entire system by refusing to relinquish control. This model depends heavily on the cooperation of all programs to function correctly. The process scheduler in such systems is known as a cooperative scheduler, which relies on the active participation of programs.

Cooperative vs. Preemptive

Early versions of Windows and Mac OS used cooperative multitasking. Both later adopted preemptive multitasking, Windows starting with Windows NT 3.1 and Apple with Mac OS X. Preemptive multitasking is a fundamental feature of Unix-based systems.


Synchronous vs. Asynchronous Execution

Synchronous Execution

Synchronous execution means that code runs line by line. When a function is called, the program halts until that function completes. Each method call blocks the execution of subsequent lines of code, ensuring that tasks are completed in the exact sequence they appear. This approach is also known as serial execution.

For example, if you have a series of function calls, each call waits for the previous one to finish before continuing. This method is straightforward but can be inefficient, especially for tasks that involve waiting, such as network requests or file operations.

Asynchronous Execution

Asynchronous execution allows the program to continue running while a task is being performed in the background. Instead of waiting for a function to complete, the program moves on to execute the next lines of code.

In asynchronous programming, tasks are executed out of sequence. When a function is called, it may return a future or promise representing the result of the computation. This allows the program to check the status of the operation or handle results through a callback function once the task is finished.
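
In Java, this idea maps naturally onto CompletableFuture. The sketch below is illustrative only; the slow fetchPrice method is a made-up stand-in for a network call:

import java.util.concurrent.CompletableFuture;

public class AsyncExample {

    // Hypothetical slow operation (e.g. a network call), used only for illustration.
    static int fetchPrice() {
        try {
            Thread.sleep(1000);
        } catch (InterruptedException ignored) {
        }
        return 42;
    }

    public static void main(String[] args) {
        // supplyAsync returns immediately with a future; the work runs on a background thread.
        CompletableFuture<Integer> future = CompletableFuture.supplyAsync(AsyncExample::fetchPrice);

        // The main thread is free to continue while the price is being fetched.
        System.out.println("Request submitted, doing other work...");

        // A callback handles the result once the task completes.
        CompletableFuture<Void> done =
                future.thenAccept(price -> System.out.println("Price received: " + price));

        // Block only at the very end so the JVM does not exit before the callback runs.
        done.join();
    }
}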

Asynchronous execution is ideal for applications with extensive network or disk I/O operations, as it minimizes waiting time. For example, JavaScript uses AJAX for asynchronous method calls, allowing web pages to load data without blocking user interactions.

In non-threaded environments, asynchronous programming provides a way to achieve concurrency without using multiple threads, fitting into the cooperative multitasking model.


I/O Bound vs. CPU Bound Programs

CPU Bound

A CPU bound program is one that relies heavily on CPU resources. These programs are compute-intensive and drive CPU utilization close to 100%. Examples include data crunching, image processing, and matrix operations.

To speed up a CPU bound program, you can use a faster CPU or spread the work across multiple cores with multithreading. For instance, summing the numbers from 1 to 1 million can be split between two threads, each running on a separate core. In theory this roughly halves the execution time, although creating the threads and merging the partial results adds some overhead.
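
A minimal sketch of this split in Java (the range and the two-way split are chosen only for illustration):

public class ParallelSum {

    public static void main(String[] args) throws InterruptedException {
        final long[] partial = new long[2];

        // First thread sums 1..500,000, second sums 500,001..1,000,000.
        Thread t1 = new Thread(() -> {
            for (long i = 1; i <= 500_000; i++) {
                partial[0] += i;
            }
        });
        Thread t2 = new Thread(() -> {
            for (long i = 500_001; i <= 1_000_000; i++) {
                partial[1] += i;
            }
        });

        t1.start();
        t2.start();
        t1.join();
        t2.join();

        // Merge the partial results after both threads finish.
        System.out.println("Sum = " + (partial[0] + partial[1]));
    }
}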

I/O Bound

I/O bound programs spend a significant amount of time waiting for I/O operations to complete, such as reading from or writing to disk, or network interfaces. These operations can cause the CPU to remain idle while data moves between memory and peripherals. Consequently, I/O bound programs show lower CPU utilization compared to CPU bound programs.

Both CPU and I/O bound programs benefit from concurrent architectures. CPU bound programs can use multiple processors and threads for parallel execution. I/O bound programs can use threads that yield control while waiting for I/O, allowing other threads to utilize the CPU.

Different programming languages support multithreading to varying extents. Java offers full multithreading capabilities, while Python’s Global Interpreter Lock (GIL) limits effective multithreading. JavaScript, being single-threaded, uses asynchronous programming for concurrency.


Throughput vs. Latency

Throughput

Throughput measures the rate at which work is completed or the amount of work done per unit of time. For example, in the context of Instagram, throughput could be the number of images downloaded per second.

Latency

Latency refers to the time taken to complete a single task or produce a result. Also known as response time, in the Instagram example it would be the time taken to download one image from the internet.


Understanding Critical Sections and Race Conditions

Critical Section

A critical section is a segment of code where shared data or resources are accessed by multiple threads. Proper synchronization is crucial to prevent data inconsistency.

Race Condition

Race conditions occur when multiple threads access a critical section without proper synchronization. This "race" can lead to inconsistent data if threads read or modify shared resources simultaneously.

For example, consider a thread that checks a condition and then acts based on that condition. If another thread changes the state between the check and the action, the result can be erroneous.

Example of a Race Condition

In the code below, two threads interact with a shared variable, randInt. One thread updates randInt, while the other prints its value if divisible by 5. Due to the lack of synchronization, some values printed are not divisible by 5, illustrating a race condition.

import java.util.Random;

public class RaceCondition {

    int randInt;
    Random random = new Random(System.currentTimeMillis());

    void printer() {
        int i = 1000000;
        while (i != 0) {
            // Check-then-act on shared state without synchronization:
            // randInt may change between the check and the print.
            if (randInt % 5 == 0) {
                System.out.println(randInt);
            }
            i--;
        }
    }

    void modifier() {
        int i = 1000000;
        while (i != 0) {
            randInt = random.nextInt(100);
            i--;
        }
    }

    public static void runTest() throws InterruptedException {
        final RaceCondition rc = new RaceCondition();
        Thread thread1 = new Thread(rc::printer);
        Thread thread2 = new Thread(rc::modifier);

        thread1.start();
        thread2.start();

        thread1.join();
        thread2.join();
    }

    public static void main(String[] args) throws InterruptedException {
        runTest();
    }
}
Fixing the Race Condition

To address this, the thread that reads randInt and the thread that writes it must synchronize on the same lock, so that the check and the print happen atomically with respect to updates. The printer becomes:

void printer() {
    int i = 1000000;
    while (i != 0) {
        synchronized(this) {
            if (randInt % 5 == 0) {
                System.out.println(randInt);
            }
        }
        i--;
    }
}
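
Synchronizing only the printer is not enough: unless modifier acquires the same lock, it can still change randInt between the check and the print. A matching version of modifier, synchronizing on the same object, would look like this:

void modifier() {
    int i = 1000000;
    while (i != 0) {
        synchronized(this) {
            randInt = random.nextInt(100);
        }
        i--;
    }
}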

Deadlock, Live-Lock, and Starvation: Key Concepts in Concurrency

Deadlock

A deadlock occurs when two or more threads are stuck waiting for resources held by each other. Each thread holds a resource the other needs, causing a cycle of dependencies with no progress.
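
As an illustration, here is a minimal Java sketch (the lock objects and the sleep are arbitrary and exist only to make the bad interleaving likely): each thread grabs one lock and then waits forever for the lock the other holds.

public class DeadlockDemo {

    // Hypothetical locks used only for illustration.
    private final Object lockA = new Object();
    private final Object lockB = new Object();

    void threadOneWork() {
        synchronized (lockA) {            // holds lockA...
            sleepQuietly(100);
            synchronized (lockB) {        // ...and waits for lockB
                System.out.println("thread 1 acquired both locks");
            }
        }
    }

    void threadTwoWork() {
        synchronized (lockB) {            // holds lockB...
            sleepQuietly(100);
            synchronized (lockA) {        // ...and waits for lockA
                System.out.println("thread 2 acquired both locks");
            }
        }
    }

    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ignored) {
        }
    }

    public static void main(String[] args) {
        DeadlockDemo demo = new DeadlockDemo();
        new Thread(demo::threadOneWork).start();
        new Thread(demo::threadTwoWork).start();
        // Both threads typically block forever, each waiting for the lock the other holds.
    }
}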

Live-Lock

A live-lock happens when threads continuously change states in response to each other but make no progress. It’s like two people in a hallway trying to let each other pass, but neither succeeds.

Starvation

Starvation occurs when a thread is perpetually denied access to resources due to other threads consuming them. The starving thread is unable to proceed because other threads keep hogging the shared resources.


Mutex vs Semaphore

Mutex

A Mutex (short for mutual exclusion) is designed to control access to shared resources, such as data structures or primitive types. It ensures that only one thread can access the critical section of code at a time.

How It Works
  1. Acquiring a Mutex: When a thread acquires a mutex, other threads trying to acquire the same mutex are blocked.
  2. Releasing a Mutex: Once the mutex is released by the thread, one of the waiting threads is chosen to acquire the mutex and proceed.
Example

In a scenario where two threads are trying to access shared data, a mutex ensures that only one thread can modify the data at a time, preventing corruption.
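
Java has no class literally called Mutex; a synchronized block or a java.util.concurrent.locks.ReentrantLock plays that role. A minimal sketch of two threads updating a shared counter under a lock:

import java.util.concurrent.locks.ReentrantLock;

public class Counter {

    private final ReentrantLock mutex = new ReentrantLock();
    private int count = 0;

    // Only one thread at a time can execute the critical section below.
    void increment() {
        mutex.lock();
        try {
            count++;                  // shared data is modified under the lock
        } finally {
            mutex.unlock();           // always release, even if the body throws
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Counter counter = new Counter();
        Runnable task = () -> {
            for (int i = 0; i < 100_000; i++) {
                counter.increment();
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println(counter.count);   // always 200000 with the lock in place
    }
}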

Semaphore

A Semaphore is used to manage access to a pool of resources. It operates with a set number of permits, allowing a limited number of threads to access resources concurrently.

Key Points
  1. Permits: If all permits are in use, additional threads will be blocked until a permit is released.
  2. Binary Semaphore: A semaphore with a single permit can act like a mutex, but the semantics differ: a mutex has a notion of ownership and must be released by the thread that acquired it, whereas a semaphore can be released (signaled) by any thread.
Example

Consider a database connection pool with a fixed number of connections. A semaphore can control how many threads can access the pool at once.
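
A rough sketch with java.util.concurrent.Semaphore, assuming a pool of 3 connections and 10 competing threads (both numbers are arbitrary):

import java.util.concurrent.Semaphore;

public class ConnectionPool {

    // Assumed pool size of 3 connections, for illustration only.
    private final Semaphore permits = new Semaphore(3);

    void query(int clientId) throws InterruptedException {
        permits.acquire();            // blocks while all 3 permits are taken
        try {
            System.out.println("Client " + clientId + " got a connection");
            Thread.sleep(200);        // stands in for real database work
        } finally {
            permits.release();        // hand the permit back so another thread can proceed
        }
    }

    public static void main(String[] args) {
        ConnectionPool pool = new ConnectionPool();
        for (int i = 1; i <= 10; i++) {
            final int id = i;
            new Thread(() -> {
                try {
                    pool.query(id);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }).start();
        }
    }
}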


Understanding Monitors in Concurrency

A monitor combines a mutex with condition variables, providing a more comprehensive synchronization mechanism. It can manage multiple condition variables but is always associated with a single mutex.

How Monitors Work
  1. Entry Set: Threads waiting to acquire the monitor.
  2. Wait Set: Threads that entered the monitor and then called wait(), releasing the monitor while they wait for a condition to be signaled.

When a thread enters a monitor, it acquires ownership and blocks other threads from entering the critical section. Threads can call wait() to be placed in the wait set and signal() (notify() in Java) to wake waiting threads.

Example in Java

In Java, every object is implicitly a monitor, featuring built-in locking and condition variables. This simplifies synchronization by allowing objects to manage mutual exclusion and condition signaling directly.

A condition variable works with a mutex to provide a mechanism for waiting and signaling between threads. It solves the problem of spin waiting by allowing a thread to wait until a condition changes.
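
A minimal sketch of this wait-and-signal pattern in Java, using an ordinary object as the monitor and a boolean flag as the condition (the names and the sleep are illustrative only):

public class MonitorExample {

    private final Object lock = new Object();   // any Java object can serve as a monitor
    private boolean dataReady = false;

    // Consumer: waits inside the monitor until the condition becomes true.
    void awaitData() throws InterruptedException {
        synchronized (lock) {
            while (!dataReady) {      // loop guards against spurious wakeups
                lock.wait();          // releases the lock and joins the wait set
            }
            System.out.println("Data is ready, proceeding.");
        }
    }

    // Producer: changes the condition and signals waiting threads.
    void publishData() {
        synchronized (lock) {
            dataReady = true;
            lock.notifyAll();         // Java's counterpart to signal()
        }
    }

    public static void main(String[] args) throws InterruptedException {
        MonitorExample example = new MonitorExample();
        Thread consumer = new Thread(() -> {
            try {
                example.awaitData();
            } catch (InterruptedException ignored) {
            }
        });
        consumer.start();
        Thread.sleep(100);            // let the consumer start waiting first
        example.publishData();
        consumer.join();
    }
}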


Understanding Amdahl's Law

In the realm of concurrency, Amdahl's Law is crucial for understanding the limits of parallelism. It defines the maximum theoretical speedup achievable when parallelizing a program.

The Concept of Amdahl's Law

Amdahl's Law highlights the limitation of parallelism by illustrating that not all parts of a program can be parallelized. Some parts must run serially and cannot benefit from multiple processors.

The Basic Idea

Imagine a poultry farm where 100 hens lay eggs each day. Regardless of how many workers you hire, you still have to wait a full day for all the eggs to be laid. Similarly, certain parts of a software program must execute serially and cannot be sped up by adding more processors.

The Law in Practice

The simplified equation of Amdahl's Law is:

\[ S(n) = \frac{1}{(1 - P) + \frac{P}{n}} \]
 
  • S(n): Speed-up achieved with n cores or threads.
  • P: Fraction of the program that is parallelizable.
  • (1 - P): Fraction that must be executed serially.
Example Calculation

For a program with 90% parallelizable code (P = 0.9):

  • 1 Processor: Speed-up = 1
  • 2 Processors: Speed-up ≈ 1.82
  • 5 Processors: Speed-up ≈ 3.57
  • 10 Processors: Speed-up ≈ 5.26
  • 100 Processors: Speed-up ≈ 9.17
  • 1000 Processors: Speed-up ≈ 9.91
  • Infinite Processors: Speed-up = 10
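
For instance, the two-processor figure follows directly from the formula:

\[ S(2) = \frac{1}{(1 - 0.9) + \frac{0.9}{2}} = \frac{1}{0.55} \approx 1.82 \]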

Even with an infinite number of processors, the speed-up is capped at 10x, because the 10% of the program that must run serially always takes the same amount of time.


The Relevance of Moore's Law in Modern Computing

What is Moore's Law?

Gordon Moore, co-founder of Intel, observed that the number of transistors on a chip doubles approximately every two years. This doubling effect leads to increased processing power and reduced cost per transistor. While originally an observation rather than a strict scientific law, Moore’s insight has guided the tech industry since the 1970s.