Concurrency & Multithreading in iOS

Prayash Thapa, Former Developer

Article Category: #Code

Posted on February 25, 2020

C o n c u r r e n c y a n d m u l t i t h r e a d i n g a r e a c o r e p a r t o f i O S d e v e l o p m e n t . L e t ' s d i v e i n t o w h a t m a k e s t h e m s o p o w e r f u l , a n d h o w w e c a n l e v e r a g e t h e m i n o u r o w n C o c o a T o u c h a p p l i c a t i o n s .

Concurrency is the notion of multiple things happening at the same time. This is generally achieved either via time-slicing, or truly in parallel if multiple CPU cores are available to the host operating system. We've all experienced a lack of concurrency, most likely in the form of an app freezing up when running a heavy task. UI freezes don't necessarily occur due to the absence of concurrency — they could just be symptoms of buggy software — but software that doesn't take advantage of all the computational power at its disposal is going to create these freezes whenever it needs to do something resource-intensive. If you've profiled an app hanging in this way, you'll probably see a report that looks like this:

Anything related to file I/O, data processing, or networking usually warrants a background task (unless you have a very compelling excuse to halt the entire program). There aren't many reasons that these tasks should block your user from interacting with the rest of your application. Consider how much better the user experience of your app could be if instead, the profiler reported something like this:

Analyzing an image, processing a document or a piece of audio, or writing a sizeable chunk of data to disk are examples of tasks that could benefit greatly from being delegated to background threads. Let's dig into how we can enforce such behavior into our iOS applications.

A Brief History #

In the olden days, the maximum amount of work per CPU cycle that a computer could perform was determined by the clock speed. As processor designs became more compact, heat and physical constraints started becoming limiting factors for higher clock speeds. Consequentially, chip manufacturers started adding additional processor cores on each chip in order to increase total performance. By increasing the number of cores, a single chip could execute more CPU instructions per cycle without increasing its speed, size, or thermal output. There's just one problem...

How can we take advantage of these extra cores? Multithreading. #

Multithreading is an implementation handled by the host operating system to allow the creation and usage of n amount of threads. Its main purpose is to provide simultaneous execution of two or more parts of a program to utilize all available CPU time. Multithreading is a powerful technique to have in a programmer's toolbelt, but it comes with its own set of responsibilities. A common misconception is that multithreading requires a multi-core processor, but this isn't the case — single-core CPUs are perfectly capable of working on many threads, but we'll take a look in a bit as to why threading is a problem in the first place. Before we dive in, let's look at the nuances of what concurrency and parallelism mean using a simple diagram:

In the first situation presented above, we observe that tasks can run concurrently, but not in parallel. This is similar to having multiple conversations in a chatroom, and interleaving (context-switching) between them, but never truly conversing with two people at the same time. This is what we call concurrency. It is the illusion of multiple things happening at the same time when in reality, they're switching very quickly. Concurrency is about dealing with lots of things at the same time. Contrast this with the parallelism model, in which both tasks run simultaneously. Both execution models exhibit multithreading, which is the involvement of multiple threads working towards one common goal. Multithreading is a generalized technique for introducing a combination of concurrency and parallelism into your program.

The Burden of Threads #

A modern multitasking operating system like iOS has hundreds of programs (or processes) running at any given moment. However, most of these programs are either system daemons or background processes that have very low memory footprint, so what is really needed is a way for individual applications to make use of the extra cores available. An application (process) can have many threads (sub-processes) operating on shared memory. Our goal is to be able to control these threads and use them to our advantage.

Historically, introducing concurrency to an app has required the creation of one or more threads. Threads are low-level constructs that need to be managed manually. A quick skim through Apple's Threaded Programming Guide is all it takes to see how much complexity threaded code adds to a codebase. In addition to building an app, the developer has to:

Responsibly create new threads, adjusting that number dynamically as system conditions change
Manage them carefully, deallocating them from memory once they have finished executing
Leverage synchronization mechanisms like mutexes, locks, and semaphores to orchestrate resource access between threads, adding even more overhead to application code
Mitigate risks associated with coding an application that assumes most of the costs associated with creating and maintaining any threads it uses, and not the host OS

This is unfortunate, as it adds enormous levels of complexity and risk without any guarantees of improved performance.

Grand Central Dispatch #

iOS takes an asynchronous approach to solving the concurrency problem of managing threads. Asynchronous functions are common in most programming environments, and are often used to initiate tasks that might take a long time, like reading a file from the disk, or downloading a file from the web. When invoked, an asynchronous function executes some work behind the scenes to start a background task, but returns immediately, regardless of how long the original task might takes to actually complete.

A core technology that iOS provides for starting tasks asynchronously is Grand Central Dispatch (or GCD for short). GCD abstracts away thread management code and moves it down to the system level, exposing a light API to define tasks and execute them on an appropriate dispatch queue. GCD takes care of all thread management and scheduling, providing a holistic approach to task management and execution, while also providing better efficiency than traditional threads.

Let's take a look at the main components of GCD:

What've we got here? Let's start from the left:

DispatchQueue.main: The main thread, or the UI thread, is backed by a single serial queue. All tasks are executed in succession, so it is guaranteed that the order of execution is preserved. It is crucial that you ensure all UI updates are designated to this queue, and that you never run any blocking tasks on it. We want to ensure that the app's run loop (called CFRunLoop) is never blocked in order to maintain the highest framerate. Subsequently, the main queue has the highest priority, and any tasks pushed onto this queue will get executed immediately.
DispatchQueue.global: A set of global concurrent queues, each of which manage their own pool of threads. Depending on the priority of your task, you can specify which specific queue to execute your task on, although you should resort to using default most of the time. Because tasks on these queues are executed concurrently, it doesn't guarantee preservation of the order in which tasks were queued.

Notice how we're not dealing with individual threads anymore? We're dealing with queues which manage a pool of threads internally, and you will shortly see why queues are a much more sustainable approach to multhreading.

Serial Queues: The Main Thread #

As an exercise, let's look at a snippet of code below, which gets fired when the user presses a button in the app. The expensive compute function can be anything. Let's pretend it is post-processing an image stored on the device.

import UIKit

class ViewController: UIViewController {
    @IBAction func handleTap(_ sender: Any) {
        compute()
    }

    private func compute() -> Void {
        // Pretending to post-process a large image.
        var counter = 0
        for _ in 0..<9999999 {
            counter += 1
        }
    }
}

At first glance, this may look harmless, but if you run this inside of a real app, the UI will freeze completely until the loop is terminated, which will take... a while. We can prove it by profiling this task in Instruments. You can fire up the Time Profiler module of Instruments by going to Xcode > Open Developer Tool > Instruments in Xcode's menu options. Let's look at the Threads module of the profiler and see where the CPU usage is highest.

We can see that the Main Thread is clearly at 100% capacity for almost 5 seconds. That's a non-trivial amount of time to block the UI. Looking at the call tree below the chart, we can see that the Main Thread is at 99.9% capacity for 4.43 seconds! Given that a serial queue works in a FIFO manner, tasks will always complete in the order in which they were inserted. Clearly the compute() method is the culprit here. Can you imagine clicking a button just to have the UI freeze up on you for that long?

Background Threads #

How can we make this better? DispatchQueue.global() to the rescue! This is where background threads come in. Referring to the GCD architecture diagram above, we can see that anything that is not the Main Thread is a background thread in iOS. They can run alongside the Main Thread, leaving it fully unoccupied and ready to handle other UI events like scrolling, responding to user events, animating etc. Let's make a small change to our button click handler above:

class ViewController: UIViewController {
    @IBAction func handleTap(_ sender: Any) {
        DispatchQueue.global(qos: .userInitiated).async { [unowned self] in
            self.compute()
        }
    }

    private func compute() -> Void {
        // Pretending to post-process a large image.
        var counter = 0
        for _ in 0..<9999999 {
            counter += 1
        }
    }
}

Unless specified, a snippet of code will usually default to execute on the Main Queue, so in order to force it to execute on a different thread, we'll wrap our compute call inside of an asynchronous closure that gets submitted to the DispatchQueue.global queue. Keep in mind that we aren't really managing threads here. We're submitting tasks (in the form of closures or blocks) to the desired queue with the assumption that it is guaranteed to execute at some point in time. The queue decides which thread to allocate the task to, and it does all the hard work of assessing system requirements and managing the actual threads. This is the magic of Grand Central Dispatch. As the old adage goes, you can't improve what you can't measure. So we measured our truly terrible button click handler, and now that we've improved it, we'll measure it once again to get some concrete data with regards to performance.

Looking at the profiler again, it's quite clear to us that this is a huge improvement. The task takes an identical amount of time, but this time, it's happening in the background without locking up the UI. Even though our app is doing the same amount of work, the perceived performance is much better because the user will be free to do other things while the app is processing.

You may have noticed that we accessed a global queue of .userInitiated priority. This is an attribute we can use to give our tasks a sense of urgency. If we run the same task on a global queue of and pass it a qos attribute of background , iOS will think it's a utility task, and thus allocate fewer resources to execute it. So, while we don't have control over when our tasks get executed, we do have control over their priority.

A Note on Main Thread vs. Main Queue #

You might be wondering why the Profiler shows "Main Thread" and why we're referring to it as the "Main Queue". If you refer back to the GCD architecture we described above, the Main Queue is solely responsible for managing the Main Thread. The Dispatch Queues section in the Concurrency Programming Guide says that "the main dispatch queue is a globally available serial queue that executes tasks on the application’s main thread. Because it runs on your application’s main thread, the main queue is often used as a key synchronization point for an application."

The terms "execute on the Main Thread" and "execute on the Main Queue" can be used interchangeably.

Concurrent Queues #

So far, our tasks have been executed exclusively in a serial manner. DispatchQueue.main is by default a serial queue, and DispatchQueue.global gives you four concurrent dispatch queues depending on the priority parameter you pass in.

Let's say we want to take five images, and have our app process them all in parallel on background threads. How would we go about doing that? We can spin up a custom concurrent queue with an identifier of our choosing, and allocate those tasks there. All that's required is the .concurrent attribute during the construction of the queue.

class ViewController: UIViewController {
    let queue = DispatchQueue(label: "com.app.concurrentQueue", attributes: .concurrent)
    let images: [UIImage] = [UIImage].init(repeating: UIImage(), count: 5)

    @IBAction func handleTap(_ sender: Any) {
        for img in images {
            queue.async { [unowned self] in
                self.compute(img)
            }
        }
    }

    private func compute(_ img: UIImage) -> Void {
        // Pretending to post-process a large image.
        var counter = 0
        for _ in 0..<9999999 {
            counter += 1
        }
    }
}

Running that through the profiler, we can see that the app is now spinning up 5 discrete threads to parallelize a for-loop.

Parallelization of N Tasks #

So far, we've looked at pushing computationally expensive task(s) onto background threads without clogging up the UI thread. But what about executing parallel tasks with some restrictions? How can Spotify download multiple songs in parallel, while limiting the maximum number up to 3? We can go about this in a few ways, but this is a good time to explore another important construct in multithreaded programming: semaphores.

Semaphores are signaling mechanisms. They are commonly used to control access to a shared resource. Imagine a scenario where a thread can lock access to a certain section of the code while it executes it, and unlocks after it's done to let other threads execute the said section of the code. You would see this type of behavior in database writes and reads, for example. What if you want only one thread writing to a database and preventing any reads during that time? This is a common concern in thread-safety called Readers-writer lock. Semaphores can be used to control concurrency in our app by allowing us to lock n number of threads.

let kMaxConcurrent = 3 // Or 1 if you want strictly ordered downloads!
let semaphore = DispatchSemaphore(value: kMaxConcurrent)
let downloadQueue = DispatchQueue(label: "com.app.downloadQueue", attributes: .concurrent)

class ViewController: UIViewController {
    @IBAction func handleTap(_ sender: Any) {
        for i in 0..<15 {
            downloadQueue.async { [unowned self] in
                // Lock shared resource access
                semaphore.wait()

                // Expensive task
                self.download(i + 1)

                // Update the UI on the main thread, always!
                DispatchQueue.main.async {
                    tableView.reloadData()

                    // Release the lock
                    semaphore.signal()
                }
            }
        }
    }

    func download(_ songId: Int) -> Void {
        var counter = 0

        // Simulate semi-random download times.
        for _ in 0..<Int.random(in: 999999...10000000) {
            counter += songId
        }
    }
}

Notice how we've effectively restricted our download system to limit itself to k number of downloads. The moment one download finishes (or thread is done executing), it decrements the semaphore, allowing the managing queue to spawn another thread and start downloading another song. You can apply a similar pattern to database transactions when dealing with concurrent reads and writes.

Semaphores usually aren't necessary for code like the one in our example, but they become more powerful when you need to enforce synchronous behavior whille consuming an asynchronous API. The above could would work just as well with a custom NSOperationQueue with a maxConcurrentOperationCount, but it's a worthwhile tangent regardless.

Finer Control with OperationQueue #

GCD is great when you want to dispatch one-off tasks or closures into a queue in a 'set-it-and-forget-it' fashion, and it provides a very lightweight way of doing so. But what if we want to create a repeatable, structured, long-running task that produces associated state or data? And what if we want to model this chain of operations such that they can be cancelled, suspended and tracked, while still working with a closure-friendly API? Imagine an operation like this:

This would be quite cumbersome to achieve with GCD. We want a more modular way of defining a group of tasks while maintaining readability and also exposing a greater amount of control. In this case, we can use Operation objects and queue them onto an OperationQueue, which is a high-level wrapper around DispatchQueue. Let's look at some of the benefits of using these abstractions and what they offer in comparison to the lower-level GCI API:

You may want to create dependencies between tasks, and while you could do this via GCD, you're better off defining them concretely as Operation objects, or units of work, and pushing them onto your own queue. This would allow for maximum reusability since you may use the same pattern elsewhere in an application.
The Operation and OperationQueue classes have a number of properties that can be observed, using KVO (Key Value Observing). This is another important benefit if you want to monitor the state of an operation or operation queue.
Operations can be paused, resumed, and cancelled. Once you dispatch a task using Grand Central Dispatch, you no longer have control or insight into the execution of that task. The Operation API is more flexible in that respect, giving the developer control over the operation's life cycle.
OperationQueue allows you to specify the maximum number of queued operations that can run simultaneously, giving you a finer degree of control over the concurrency aspects.

The usage of Operation and OperationQueue could fill an entire blog post, but let's look at a quick example of what modeling dependencies looks like. (GCD can also create dependencies, but you're better off dividing up large tasks into a series of composable sub-tasks.) In order to create a chain of operations that depend on one another, we could do something like this:

class ViewController: UIViewController {
    var queue = OperationQueue()
    var rawImage = UIImage? = nil
    let imageUrl = URL(string: "https://example.com/portrait.jpg")!
    @IBOutlet weak var imageView: UIImageView!

    let downloadOperation = BlockOperation {
        let image = Downloader.downloadImageWithURL(url: imageUrl)
        OperationQueue.main.async {
            self.rawImage = image
        }
    }

    let filterOperation = BlockOperation {
        let filteredImage = ImgProcessor.addGaussianBlur(self.rawImage)
        OperationQueue.main.async {
            self.imageView = filteredImage
        }
    }

    filterOperation.addDependency(downloadOperation)

    [downloadOperation, filterOperation].forEach {
        queue.addOperation($0)
     }
}

So why not opt for a higher level abstraction and avoid using GCD entirely? While GCD is ideal for inline asynchronous processing, Operation provides a more comprehensive, object-oriented model of computation for encapsulating all of the data around structured, repeatable tasks in an application. Developers should use the highest level of abstraction possible for any given problem, and for scheduling consistent, repeated work, that abstraction is Operation. Other times, it makes more sense to sprinkle in some GCD for one-off tasks or closures that we want to fire. We can mix both OperationQueue and GCD to get the best of both worlds.

The Cost of Concurrency #

DispatchQueue and friends are meant to make it easier for the application developer to execute code concurrently. However, these technologies do not guarantee improvements to the efficiency or responsiveness in an application. It is up to you to use queues in a manner that is both effective and does not impose an undue burden on other resources. For example, it's totally viable to create 10,000 tasks and submit them to a queue, but doing so would allocate a nontrivial amount of memory and introduce a lot of overhead for the allocation and deallocation of operation blocks. This is the opposite of what you want! It's best to profile your app thoroughly to ensure that concurrency is enhancing your app's performance and not degrading it.

We've talked about how concurrency comes at a cost in terms of complexity and allocation of system resources, but introducing concurrency also brings a host of other risks like:

Deadlock: A situation where a thread locks a critical portion of the code and can halt the application's run loop entirely. In the context of GCD, you should be very careful when using the dispatchQueue.sync { } calls as you could easily get yourself in situations where two synchronous operations can get stuck waiting for each other.
Priority Inversion: A condition where a lower priority task blocks a high priority task from executing, which effectively inverts their priorities. GCD allows for different levels of priority on its background queues, so this is quite easily a possibility.
Producer-Consumer Problem: A race condition where one thread is creating a data resource while another thread is accessing it. This is a synchronization problem, and can be solved using locks, semaphores, serial queues, or a barrier dispatch if you're using concurrent queues in GCD.
...and many other sorts of locking and data-race conditions that are hard to debug! Thread safety is of the utmost concern when dealing with concurrency.

Parting Thoughts + Further Reading #

If you've made it this far, I applaud you. Hopefully this article gives you a lay of the land when it comes to multithreading techniques on iOS, and how you can use some of them in your app. We didn't get to cover many of the lower-level constructs like locks, mutexes and how they help us achieve synchronization, nor did we get to dive into concrete examples of how concurrency can hurt your app. We'll save those for another day, but you can dig into some additional reading and videos if you're eager to dive deeper.

Finding Balance With AI and Digital Teams

Beautiful, Dynamic Charts