Processes vs. Threads in Cocoa software architectures
- Subject: Processes vs. Threads in Cocoa software architectures
- From: Erik Buck <email@hidden>
- Date: Wed, 6 Sep 2006 14:12:57 -0700 (PDT)
Every modern Cocoa application is already multi-threaded. The NSUIHeartBeat is a separate thread that handles the pulsing of buttons, the movement of progress bars, and timers. It does this by messaging the main thread. Fortunately, Cocoa programmers seldom if ever need to deal with NSUIHeartBeat directly.
Are there any other threads created by default in Cocoa applications ? I don't know. If there are, they are well hidden within the frameworks and therefore not a problem for Cocoa developers.
When designing Cocoa software to solve specific problems, under what circumstances are additional threads preferred over multiple processes and what specific approaches are applicable for common Cocoa development tasks ?
----------------------------------------------------------------
Before I give my answer, step into the way-back machine for a moment. Mac OS 9 and its predecessors were widely derided for not providing memory protection. Memory protection was considered an essential capability for any modern operating system in 1990, and many people would argue it was required in 1960 too. Memory protection means that one application (a.k.a. a Process, a.k.a. a Task) cannot corrupt the memory used by another application or by the operating system. In other words, a poorly written application cannot prevent the rest of the system from working. It also commonly means that when an application exits, either normally or because of an error, all memory resources used by the application are automatically returned to the system and not leaked. This is important because applications sometimes crash, and you wouldn't want to have to reboot to recover memory lost due to an application crash.
About the same time memory protection became a must-have for desktop operating systems, threads (a.k.a. lightweight processes) were introduced to desktop and workstation operating systems. A lot of applications perform parallel operations. For example, a database application may perform multiple queries simultaneously. A networking application may accept many client connections simultaneously. A number crunching application may divide a problem into multiple chunks and solve each chunk simultaneously. You get the idea.
For parallelizable applications (a.k.a. processes), multiple threads of execution can be a huge win. Each thread of execution is a sequence of computer instructions to be processed by a processor. Each application (a.k.a. process) is composed of one or more threads of execution. When a system has multiple processors available, multiple threads of execution can run concurrently, thus maximizing utilization of the available processors.
Curiously, in operating systems that provide protected memory, the principal difference between a thread and a process is that a process's memory is protected from all other processes, but all threads within a process share the same memory. In multi-threaded applications, one thread can corrupt the memory used by another thread within the same application.
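To make the distinction concrete, here is a minimal sketch at the POSIX layer rather than in Cocoa. It assumes a Unix-like system with fork() and pthreads; the global shared_value and both function names are hypothetical and exist only for illustration. A write made by a second thread is visible to the caller, while the same write made by a child process touches only the child's private copy.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical global used only by this demonstration. */
static int shared_value = 0;

static void *thread_body(void *unused) {
    (void)unused;
    shared_value = 42;   /* same address space as the creating thread */
    return NULL;
}

/* Value the caller sees after another thread writes the global. */
int value_after_thread_write(void) {
    pthread_t thread;
    shared_value = 0;
    if (pthread_create(&thread, NULL, thread_body, NULL) != 0) return -1;
    if (pthread_join(thread, NULL) != 0) return -1;
    return shared_value;
}

/* Value the caller sees after a child process writes the global. */
int value_after_child_write(void) {
    shared_value = 0;
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        shared_value = 42;   /* changes only the child's private copy */
        _exit(0);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    assert(WIFEXITED(status));   /* the child always exits normally here */
    return shared_value;
}
```

Called from any main(), value_after_thread_write() should return 42 and value_after_child_write() should return 0, which is the whole difference in one line each.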
----------------------------------------------------------------
Multiple threads have a role in applications that have Graphical User Interfaces. Users prefer user interfaces that are responsive. If a user presses a button to start a lengthy number crunching operation, the user expects that user interface to continue operating even while the number crunching goes on. This is particularly important if the user wants to be able to press a Cancel button to stop the number crunching. With Cocoa applications, the symptom of an unresponsive application user interface is the appearance of the notorious spinning beach ball. One approach to keeping the user interface responsive and avoiding the beach ball is to respond to user input in one thread and do whatever else needs to be done in one or more other threads.
Sadly, all is not well. The very same reasons that processes need to be protected from each other apply to threads as well. In fact, using multiple threads in an application magnifies all of the problems that used to afflict applications before memory protection was available. There exists a whole host of subtle but profound errors that only occur when resources such as memory are shared by multiple threads: race conditions, priority inversion, deadlock, data corruption, coherent shutdown issues, a variation of the halting problem, order of operations issues, data spew/overload situations, and so on.
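As one illustration of the care that shared memory demands, the sketch below uses POSIX threads to increment a shared counter. It is an assumption-laden example, not anything from Cocoa: counter, counter_lock, count_with_workers, and the two constants are hypothetical names. The mutex around counter++ is what makes the result predictable; without it the increments from different threads can interleave and the total is no longer guaranteed.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_WORKERS 8
#define INCREMENTS_PER_WORKER 100000

/* Hypothetical shared state: one counter guarded by one mutex. */
static long counter;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

static void *increment_with_lock(void *unused) {
    (void)unused;
    for (int i = 0; i < INCREMENTS_PER_WORKER; i++) {
        pthread_mutex_lock(&counter_lock);
        counter++;   /* read-modify-write on shared memory: the race lives here */
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Runs `workers` threads to completion; returns the final count or -1 on failure. */
long count_with_workers(int workers) {
    assert(workers > 0 && workers <= MAX_WORKERS);
    pthread_t threads[MAX_WORKERS];
    int started = 0;
    counter = 0;
    while (started < workers &&
           pthread_create(&threads[started], NULL, increment_with_lock, NULL) == 0) {
        started++;
    }
    /* Join every thread that actually started, even if a later create failed. */
    for (int i = 0; i < started; i++) pthread_join(threads[i], NULL);
    return started == workers ? counter : -1;
}
```

With four workers the function returns 400000. Every piece of shared state in a multi-threaded program needs this kind of discipline, and forgetting it once is enough to produce a bug that shows up only occasionally.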
With all the serious problems using multiple threads within an application introduces, why were threads ever invented ?
Before there were threads, parallelizable applications were implemented using multiple processes (e.g. helper programs). For example, when a database application received a query, the application would start a helper program that performed the query, returned the result to the database application, and then quit. Multiple queries could be executed simultaneously by running multiple instances of the helper program simultaneously. However, that architecture had a number of bottlenecks that limited performance. It would take time to start each helper program and send the information about the query to the helper program. Similarly, it would take time to return the results to the main application and shut down the helper application. If you wanted to perform 1,000 queries per second on a 486 DX2 running at 50 MHz, the time needed to start, shut down, and communicate with helper applications was too much.
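The helper-program pattern described above can be sketched with fork() and a pipe. This is only an illustration under stated assumptions: run_query is a hypothetical stand-in for real database work (it just sums 1..n), query_in_helper is an invented name, and a real helper would more likely be a separate executable started with exec. The start, communicate, and shut down steps that cost so much on old hardware are all visible.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a database query: the sum 1 + 2 + ... + n. */
static int64_t run_query(int64_t n) { return n * (n + 1) / 2; }

/* Runs run_query(n) in a short-lived helper process.
   Returns the answer, or -1 if the helper could not be run. */
int64_t query_in_helper(int64_t n) {
    assert(n >= 0);
    int fds[2];
    if (pipe(fds) != 0) return -1;

    pid_t pid = fork();                       /* start the helper */
    if (pid < 0) { close(fds[0]); close(fds[1]); return -1; }
    if (pid == 0) {
        close(fds[0]);
        int64_t answer = run_query(n);        /* do the work in isolation */
        ssize_t written = write(fds[1], &answer, sizeof answer);
        _exit(written == (ssize_t)sizeof answer ? 0 : 1);   /* then quit */
    }

    close(fds[1]);
    int64_t answer = -1;
    ssize_t got = read(fds[0], &answer, sizeof answer);     /* collect the result */
    close(fds[0]);

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;            /* reap the helper */
    if (got != (ssize_t)sizeof answer) return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) return -1;
    return answer;
}
```

For example, query_in_helper(100) returns 5050. If the helper crashes, the caller just gets -1 back and its own memory is untouched.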
Threads were invented to be lightweight, meaning that a thread could be started or shut down much more quickly than a helper program. Similarly, communication overhead could be minimized between multiple threads running in one process. All you had to give up to use multiple threads was the memory protection needed to consider an operating system more than a toy. Ironic, huh ?
The good news today is that the slowest new Mac you can buy is at least 40 times faster than the canonical 486 DX2, and operating systems are more efficient too. For most types of application, none of the performance issues that prompted the invention of threads are relevant anymore. To answer the first part of the question, "When designing Cocoa software to solve specific problems, under what circumstances are multiple threads preferred over multiple processes ... ?", I now reply, "Damn Few!"
Look, just don't use multiple threads, OK ? You either value memory protection or you don't. If you value it, use it, and that means use multiple processes instead of multiple threads.
The more interesting question is, "what specific approaches are applicable for common Cocoa development tasks ?"
Whether you are communicating between multiple threads or multiple processes, I highly recommend Cocoa's Distributed Objects. Distributed objects provide a relatively seamless mechanism for one thread or process to send ordinary Objective-C messages to objects in another thread or process. It requires a little effort to set up distributed objects initially, but it is a lot less hassle than trying to debug even one race condition. Trust me.
Distributed objects are a heavy-weight approach, but they are flexible, convenient, and really cool. http://developer.apple.com/cgi-bin/search.pl?q=distributed+objects&num=10&site=default_collection
Another Cocoa approach is the NSPort class. An NSPort encapsulates an input/output source for use by the Cocoa NSRunLoop or AppKit event loop. http://developer.apple.com/documentation/Cocoa/Reference/Foundation/Classes/NSPort_Class/Reference/Reference.html
For one-way communication between processes, consider NSPipe. http://developer.apple.com/documentation/Cocoa/Reference/Foundation/Classes/NSPipe_Class/Reference/Reference.html You can even use two pipes to provide bi-directional communication. This is light-weight, fast, and very convenient.
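The two-pipe arrangement can be sketched with the plain POSIX pipe() call rather than NSPipe itself. Everything named here is hypothetical (uppercase_via_child, to_child, to_parent), and the sketch assumes a short message that fits in the pipe buffer, so the parent can finish writing before it starts reading. One pipe carries the request to the child and the second carries the reply back.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Sends `text` to a child over one pipe and reads the upper-cased reply
   over a second pipe. Returns the reply length, or -1 on error.
   Assumes `text` is short enough to fit in the pipe buffer. */
ssize_t uppercase_via_child(const char *text, char *reply, size_t reply_size) {
    assert(text != NULL && reply != NULL && reply_size > 0);
    int to_child[2], to_parent[2];
    if (pipe(to_child) != 0) return -1;
    if (pipe(to_parent) != 0) {
        close(to_child[0]); close(to_child[1]);
        return -1;
    }
    pid_t pid = fork();
    if (pid < 0) {
        close(to_child[0]); close(to_child[1]);
        close(to_parent[0]); close(to_parent[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: read requests from one pipe, answer on the other. */
        close(to_child[1]);
        close(to_parent[0]);
        char buf[256];
        ssize_t n;
        while ((n = read(to_child[0], buf, sizeof buf)) > 0) {
            for (ssize_t i = 0; i < n; i++)
                buf[i] = (char)toupper((unsigned char)buf[i]);
            if (write(to_parent[1], buf, (size_t)n) != n) _exit(1);
        }
        _exit(n == 0 ? 0 : 1);
    }
    /* Parent: keep only the ends it uses. */
    close(to_child[0]);
    close(to_parent[1]);
    size_t length = strlen(text);
    ssize_t sent = write(to_child[1], text, length);
    close(to_child[1]);   /* end-of-file tells the child the request is complete */

    size_t total = 0;
    ssize_t n = 0;
    while (total < reply_size - 1 &&
           (n = read(to_parent[0], reply + total, reply_size - 1 - total)) > 0) {
        total += (size_t)n;
    }
    reply[total] = '\0';
    close(to_parent[0]);

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    if (sent != (ssize_t)length || n < 0) return -1;
    return (ssize_t)total;
}
```

Passing "cocoa" with a 32-byte reply buffer yields a return value of 5 and the reply "COCOA". Each pipe stays strictly one-way, which is why bi-directional traffic needs two of them.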
Cocoa's NSDistributedNotificationCenter is convenient for infrequent communication between processes. http://developer.apple.com/documentation/Cocoa/Reference/Foundation/Classes/NSDistributedNotificationCenter_Class/Reference/Reference.html
At the Unix layer of Mac OS X, there are pipes, named pipes (a.k.a. FIFOs), message queues, and sockets for inter-process communication.
Finally, at the XNU/Darwin layer, there are Mach Messages conveniently accessible via http://developer.apple.com/documentation/Darwin/Reference/DarwinNotify/notify/CompositePage.html
Apple addresses inter-application communication in, of all places, http://developer.apple.com/documentation/Cocoa/InterapplicationCommunication-date.html
Cocoa-dev mailing list (email@hidden)