In this book you will learn not only what the pthread calls are, but when it is a good idea to use threads and how to make them efficient, which is the whole reason for using threads in the first place. The author delves into performance issues, comparing threads to processes, contrasting kernel threads to user threads, and showing how to measure speed.
He also describes in a simple, clear manner what all the advanced features are for, and how threads interact with the rest of the UNIX system. Topics include: basic design techniques; mutexes, conditions, and specialized synchronization techniques; scheduling, priorities, and other real-time issues; cancellation; UNIX libraries and re-entrant routines; signals; debugging tips; measuring performance; and special considerations for the Distributed Computing Environment (DCE).

As the first undergraduate text to directly address compiling and running parallel programs on multi-core and cluster architectures, this second edition carries forward its clear explanations for designing, debugging, and evaluating the performance of distributed and shared-memory programs while adding coverage of accelerators via new content on GPU programming and heterogeneous programming.
New and improved user-friendly exercises teach students how to compile, run and modify example programs.
The book takes a tutorial approach, starting with small programming examples and building progressively to more challenging examples; explains how to develop parallel programs using the MPI, Pthreads, and OpenMP programming models; and offers a robust package of online ancillaries for instructors and students, including lecture slides, a solutions manual, downloadable source code, and an image bank. New to this edition: new chapters on GPU programming and heterogeneous programming, plus new examples and exercises related to parallel algorithms.
Master the essentials of concurrent programming, including testing and debugging. This textbook examines languages and libraries for multithreaded programming. Moreover, the textbook sets itself apart from other comparable works by helping readers to become proficient in key testing and debugging techniques. The authors have developed and fine-tuned this book through the concurrent programming courses they have taught for the past twenty years. The material, which emphasizes practical tools and techniques to solve concurrent programming problems, includes original results from the authors' research.
These libraries and the testing techniques they support can be used to assess student-written programs. Each chapter includes exercises that build skills in program writing and help ensure that readers have mastered the chapter's key concepts.
The source code for all the listings in the text and for the synchronization libraries is also provided, as well as startup files and test cases for the exercises. This textbook is designed for upper-level undergraduates and graduate students in computer science.
With its abundance of practical material and inclusion of working code, coupled with an emphasis on testing and debugging, it is also a highly useful reference for practicing programmers.
Shared Memory Application Programming presents the key concepts and applications of parallel programming, in an accessible and engaging style applicable to developers across many domains. Multithreaded programming is today a core technology, underlying software development projects in every branch of applied computer science. The focus progressively shifts from traditional thread parallelism to the task parallelism deployed by modern programming environments.
Joining and Detaching Threads
4. Stack Management
5. Miscellaneous Routines
6. Exercise 1
7. Mutex Variables
   1. Mutex Variables Overview
   2. Creating and Destroying Mutexes
   3. Locking and Unlocking Mutexes
8. Condition Variables
   1. Condition Variables Overview
   2. Creating and Destroying Condition Variables
   3. Waiting and Signaling on Condition Variables
9. Topics Not Covered
Exercise 2
References and More Information

Historically, hardware vendors have implemented their own proprietary versions of threads, making portability a concern for software developers. The tutorial begins with an introduction to concepts, motivations, and design considerations for using Pthreads. Example codes are used throughout to demonstrate how to use most of the Pthreads routines needed by a new Pthreads programmer.
A lab exercise, with numerous example codes (in C), is also included. It is ideal for those who are new to parallel programming with threads. A basic understanding of parallel programming in C is required. For those who are unfamiliar with parallel programming in general, the material covered in the EC Introduction To Parallel Computing tutorial would be helpful.

Pthreads Overview: What is a Thread? Technically, a thread is defined as an independent stream of instructions that can be scheduled to run as such by the operating system.
But what does this mean? To the software developer, the concept of a 'procedure' that runs independently from its main program may best describe a thread. Imagine a main program that contains a number of procedures, all of which can be scheduled to run simultaneously or independently by the operating system; that would describe a 'multi-threaded' program. How is this accomplished? Before understanding a thread, one first needs to understand a UNIX process. A process is created by the operating system, and requires a fair amount of 'overhead'. Processes contain information about program resources and execution state, including: program instructions, registers, stack, heap, file descriptors, signal actions, shared libraries, and inter-process communication tools such as message queues, pipes, semaphores, or shared memory.
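To make the thread idea concrete before going further, here is a minimal sketch (my own illustration, not one of the tutorial's example codes; the function name greet is mine) in which main() starts a second, independent stream of instructions within the same process, and then waits for it to finish:

    #include <pthread.h>
    #include <stdio.h>

    /* The "procedure that runs independently from its main program". */
    static void *greet(void *arg)
    {
        printf("hello from the thread: %s\n", (const char *)arg);
        return NULL;
    }

    int main(void)
    {
        pthread_t tid;

        pthread_create(&tid, NULL, greet, "arg"); /* start a second instruction stream */
        printf("hello from main\n");              /* both streams now run independently */
        pthread_join(tid, NULL);                  /* wait for the thread to finish */
        return 0;
    }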
Section 1. There is nothing all that unusual in the brief story—but hereafter you will understand when I talk about programmers and buckets, which, otherwise, might seem mildly odd. The most important of these concepts deserves a special introduction, which will also serve to demonstrate the convention with which various particularly important points shall be emphasized throughout this book: Asynchronous: Any two operations are "asynchronous" when they can proceed independently of each other.
Threads are, to some extent, just one more way to make applications asynchronous, but threads have some advantages over other models that have been used to build asynchronous applications. You will get to see "threads in action" right away, with a brief description of the few Pthreads interfaces needed to build this simple application.
Armed, now, with a basic understanding of what threads are all about, you can go on to Section 1. Although there are a lot of excellent reasons to use threads, there is a price to be paid. What it boils down to, though, is simply that you need to learn how the model works, and then apply it carefully. It is not as hard as some folks would have you believe. You have seen some of the fundamental benefits and costs.
It may be obvious that you do not want to rush out and put threads into every application or library you write. You will know at that point what threads are, what they do, and when to use them. Aside from brief examples, you haven't yet seen any detailed information about the particular programming interfaces (APIs) that compose Pthreads.
The most important part of this section is 1. They sail quite some distance from shore, enjoying the sun and sea breeze, allowing the wind to carry them. The sky darkens, and a storm strikes. The small boat is tossed violently about, and when the storm abates the programmers are missing their boat's sail and most of the mast. The boat has sprung a small leak, and there is no land in sight.
The boat is equipped with food, water, oars, and a bailing bucket, and the programmers set to work. One programmer rows, and monitors the accumulating water in the bottom of the boat. The other programmers alternately sleep, watch the water level, or scan the horizon for sight of land or another ship.
An idle programmer may notice rising water in the boat, and begin bailing. If the rower decides that bailing is required while both his companions sleep, a nudge is usually sufficient to awaken a programmer, allowing the other to continue sleeping. But if the rower is in a bad mood, he may resort to a loud yell, awakening both sleeping programmers.
While one programmer assumes the necessary duty, the other can try to fall asleep again. When the rower tires, he can signal one of the other programmers to take over the task, and immediately fall into a deep sleep waiting to be signaled in turn. In this way, they journey on for some time. So, just what do the Bailing Programmers have to do with threads? I'm glad you asked! The elements of the story represent analogies that apply to the Pthreads programming model. We'll explore some additional analogies in later sections, and even expand the story a little, but for now consider a few basics: A programmer is an entity that is capable of independent activity.
Our programmers represent threads. A thread is not really much like a programmer, who, as we all know, is a fascinatingly sophisticated mixture of engineer, mathematician, and artist that no computer can match. Still, as a representation of the "active element" in our programming model, it will be sufficient. The bailing bucket and the oars are "tokens" that can be held by only one individual at a time. They can be thought of as shared data, or as synchronization objects.
The primary Pthreads synchronization object, by the way, is called a mutex. Nudges and shouts are communication mechanisms associated with a synchronization object, on which individuals wait for some condition. Pthreads provides condition variables, which may be signaled or broadcast to indicate changes in shared data state.
Even if you are familiar with them, some of the terms have seen assorted and even contradictory uses within research and industry, and that is clearly not going to help communication. We need to begin by coming to a mutual agreement regarding the meaning of these terms, and, since I am writing the book, we will agree to use my definitions. Thank you. Life is asynchronous.
The dependencies are supplied by nature, and events that are not dependent on one another can occur simultaneously. A programmer cannot row without the oars, or bail effectively without the bucket--but a programmer with oars can row while another programmer with a bucket bails. The greatest complication of "asynchrony" has been that there's little advantage to being asynchronous unless you can have more than one activity going at a time.
If you can start an asynchronous operation, but then you can do nothing but wait for it, you're not getting much benefit from the asynchrony. Concurrency describes the behavior of threads or processes on a uniprocessor system. The definition of concurrent execution in POSIX requires that "functions that suspend the execution of the calling thread shall not cause the execution of other threads to be indefinitely suspended."
Nevertheless, concurrency allows applications to take advantage of asynchronous capabilities, and "do work" while independent operations are proceeding.
Most programs have asynchronous aspects that may not be immediately obvious. Users, for example, prefer asynchronous interfaces. They expect to be able to issue a command while they're thinking about it, even before the program has finished with the last one.
And when a windowing interface provides separate windows, don't you intuitively expect those windows to act asynchronously? Nobody likes a "busy" cursor. Pthreads provides you with both concurrency and asynchrony, and the combination is exactly what you need to easily write responsive and efficient programs. By uniprocessor, I mean a computer with a single programmer-visible execution unit (processor).
By multiprocessor, I mean a computer with more than one processor sharing a common instruction set and access to the same physical memory. While the processors need not have equal access to all physical memory, it should be possible for any processor to gain access to most memory. A "massively parallel processor" (MPP) may or may not qualify as a multiprocessor for the purposes of this book.
Many MPP systems do qualify, because they provide access to all physical memory from every processor, even though the access times may vary widely. In other words, software "parallelism" is the same as English "concurrency" and different from software "concurrency."
True parallelism can occur only on a multiprocessor system, but concurrency can occur on both uniprocessor and multiprocessor systems. Concurrency can occur on a uniprocessor because concurrency is, essentially, the illusion of parallelism. While parallelism requires that a program be able to perform two computations at once, concurrency requires only that the programmer be able to pretend that two things can happen at once.
Thread safety does not require that the code run efficiently in multiple threads, only that it can operate safely in multiple threads. Most existing functions can be made thread-safe using tools provided by Pthreads--mutexes, condition variables, and thread-specific data.
Functions that don't require persistent context can be made thread-safe by serializing the entire function, for example, by locking a mutex on entry to the function, and unlocking the mutex before returning.
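A minimal sketch of that pattern follows. The function names lookup_threadsafe and unsafe_lookup are hypothetical stand-ins; the mutex calls are the real Pthreads interfaces:

    #include <pthread.h>

    static pthread_mutex_t lookup_mutex = PTHREAD_MUTEX_INITIALIZER;

    /* unsafe_lookup() stands in for existing code that is not thread-safe. */
    extern int unsafe_lookup(const char *key);

    int lookup_threadsafe(const char *key)
    {
        int result;

        pthread_mutex_lock(&lookup_mutex);    /* serialize: one thread at a time */
        result = unsafe_lookup(key);
        pthread_mutex_unlock(&lookup_mutex);  /* unlock before returning */
        return result;
    }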
Functions made thread-safe by serializing the entire function can be called in multiple threads--but only one thread can truly perform the function at a time. More usefully, thread-safe functions can be broken down into smaller critical sections. That allows more than one thread to execute within the function, although not within the same part. Even better, the code can be redesigned to protect critical data rather than critical code, which may allow fully parallel execution of the code, when the threads don't need to use the same data at the same time.
Consider the standard putchar function, for example. To be thread-safe, putchar might lock a "putchar mutex," write the character, and then unlock the putchar mutex. You could call putchar from two threads, and no data would be corrupted--it would be thread-safe. However, only one thread could write its character at a time, and the others would wait, even if they were writing to different stdio streams. The correct solution is to associate the mutex with the stream, protecting the data rather than the code.
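Here is a sketch of that solution, using a hypothetical stream wrapper of my own (locked_stream_t and stream_putchar are illustrations, not a real stdio interface, though real implementations do something similar internally):

    #include <pthread.h>
    #include <stdio.h>

    typedef struct {
        FILE            *fp;
        pthread_mutex_t  lock;    /* one mutex per stream, not per function */
    } locked_stream_t;

    int stream_putchar(locked_stream_t *s, int c)
    {
        int r;

        pthread_mutex_lock(&s->lock);    /* blocks only writers of this stream */
        r = putc(c, s->fp);
        pthread_mutex_unlock(&s->lock);
        return r;
    }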
Now your threads, as long as they are writing to different streams, can execute putchar in parallel. More importantly, all functions that access a stream can use the same mutex to safely coordinate their access to that stream. The term "reentrant" is sometimes used to mean "efficiently thread-safe." Although existing code can usually be made thread-safe by adding mutexes and thread-specific data, it is often necessary to change the interface to make a function reentrant.
Reentrant code should avoid relying on static data and, ideally, should avoid reliance on any form of synchronization between threads. Often, a function can avoid internal synchronization by saving state in a "context structure" that is controlled by the caller.
The caller is then responsible for any necessary synchronization of the data. The UNIX readdir function, for example, returns each directory entry in sequence.
To make readdir thread-safe, you might add a mutex that readdir locked each time it was called, and unlocked before it returned to the caller.
But remember that only the caller knows how the data will be used. If only one thread uses this particular directory context, for example, then no synchronization is needed. Even when the data is shared between threads, the caller may be able to supply more efficient synchronization, for example, if the context can be protected using a mutex that the application also uses for other data.
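The standard C library offers a familiar instance of this pattern: strtok keeps its parse position in hidden static data, while the POSIX strtok_r makes the caller supply that context, so each thread (or each concurrent parse) owns its own state. A short usage sketch:

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        char line[] = "10 open the barn door";
        char *save;                             /* caller-owned context */
        char *tok = strtok_r(line, " ", &save); /* reentrant: no hidden state */

        while (tok != NULL) {
            printf("token: %s\n", tok);
            tok = strtok_r(NULL, " ", &save);
        }
        return 0;
    }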
Here are three essential facilities, or aspects, of any concurrent system:

1. Execution context is the state of a concurrent entity. A concurrent system must provide a way to create and delete execution contexts, and maintain their state independently. It must be able to save the state of one context and dispatch to another at various times, for example, when one needs to wait for an external event. It must be able to continue a context from the point where it last executed, with the same register contents, at a later time.

2. Scheduling determines which context or set of contexts should execute at any given point in time, and switches between contexts when necessary.

3. Synchronization provides mechanisms for concurrent execution contexts to coordinate their use of shared resources.

We use this last term in a way that is nearly the opposite of the standard dictionary meaning.
You'll find a definition much like "cause to occur at the same time," whereas we usually mean something that might better be expressed as "prevent from occurring at the same time."
This book will use the term "synchronization," though, because that is what you'll see used, almost universally. There are many ways to provide each of these facilities--but they are always present in some form. The particular choices presented in this book are dictated by the book's subject--Pthreads. Table 1 gives examples of each facility in three familiar settings:

                     Execution context   Scheduling                 Synchronization
Real traffic         automobile          traffic lights and signs   turn signals and brake lights
UNIX before threads  process             priority (nice)            wait and pipes
Pthreads             thread              policy, priority           condition variables and mutexes

TABLE 1.
It may provide time-slicing, where each thread is forced to periodically yield so that other threads may run ("round-robin"). It may provide various scheduling policies that allow the application to control how each thread is scheduled according to that thread's function.
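As a taste of what that looks like in Pthreads (policies and priorities are covered in detail later; support and required privileges vary by system), a thread can be created with an explicit policy through its attributes. The helper name make_rr_attr is mine; the calls are standard POSIX:

    #include <pthread.h>
    #include <sched.h>

    /* Build attributes requesting round-robin time-slicing; returns 0 on success. */
    int make_rr_attr(pthread_attr_t *attr)
    {
        struct sched_param param;

        pthread_attr_init(attr);
        pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED);
        pthread_attr_setschedpolicy(attr, SCHED_RR);
        param.sched_priority = sched_get_priority_min(SCHED_RR);
        return pthread_attr_setschedparam(attr, &param);
    }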
Synchronization may be provided using a wide variety of mechanisms. Some of the most common forms are mutexes, condition variables, semaphores, and events. You may also use message passing mechanisms, such as UNIX pipes, sockets, POSIX message queues, or other protocols for communicating between asynchronous processes--on the same system or across a network.
Any form of communication protocol contains some form of synchronization, because passing data around with no synchronization results in chaos, not communication.
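A small sketch of that point: a pipe both carries the data and supplies the synchronization, because the read cannot complete until the writer has produced something. This is my own illustration, with error checking omitted:

    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        int fds[2];
        char buf[6];

        pipe(fds);
        if (fork() == 0) {            /* child: consumer */
            read(fds[0], buf, 5);     /* blocks until the parent writes */
            buf[5] = '\0';
            printf("child read: %s\n", buf);
            _exit(0);
        }
        write(fds[1], "hello", 5);    /* parent: producer */
        wait(NULL);                   /* reap the child */
        return 0;
    }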
The terms thread, mutex, and condition variable are the main topics of this book. For now, it is enough to know that a thread represents an "executable thing" on your computer. A mutex provides a mechanism to prevent threads from colliding unexpectedly, and a condition variable allows a thread, once it has avoided such a collision, to wait until it is safe to proceed.
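A minimal sketch of how the two work together (the predicate and the function names wait_until_ready and announce_ready are mine):

    #include <pthread.h>

    static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  cond  = PTHREAD_COND_INITIALIZER;
    static int ready = 0;                  /* the shared state */

    void wait_until_ready(void)
    {
        pthread_mutex_lock(&mutex);        /* avoid colliding over 'ready' */
        while (!ready)                     /* always re-test the predicate */
            pthread_cond_wait(&cond, &mutex);
        pthread_mutex_unlock(&mutex);
    }

    void announce_ready(void)
    {
        pthread_mutex_lock(&mutex);
        ready = 1;
        pthread_cond_signal(&cond);        /* wake one waiter (broadcast wakes all) */
        pthread_mutex_unlock(&mutex);
    }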
Both mutexes and condition variables are used to synchronize the operation of threads. But you've probably been using asynchronous programming techniques all along.
You've also been using asynchronous "programming" techniques in real life since you were born. Most people understand asynchronous behavior much more thoroughly than they expect, once they get past the complications of formal and restricted definitions. Yes, until recently it was difficult to write individual programs for UNIX that behaved asynchronously--but UNIX has always made it fairly easy for you to behave asynchronously.
When you type a command to a shell, you are really starting an independent program--if you run the program in the background, it runs asynchronously with the shell. When you pipe the output of one command to another you are starting several independent programs, which synchronize between themselves using the pipe.
Time is a synchronization mechanism. In many cases you provide synchronization between a series of processes yourself, maybe without even thinking about it. For example, you run the compiler after you've finished editing the source files. It wouldn't occur to you to compile them first, or even at the same time.
That's elementary real-life synchronization. UNIX pipes and files can be synchronization mechanisms. In other cases you may use more complicated software synchronization mechanisms. For example, when you type "ls | more", the shell starts both commands right away, but the more command can't generate any output until it receives input from ls through the pipe. Both commands proceed concurrently, or even in parallel on a multiprocessor, with ls supplying data and more processing that data, independently of each other.
If the pipe buffer is big enough, ls could complete before more ever started; but more can't ever get ahead of ls. Some UNIX commands perform synchronization internally. For example, the command "cc -o thread thread.c" might involve a number of separate programs. The cc command might be a "front end" to the C language environment, which runs a filter to expand preprocessor commands like #include and #if, a compiler to translate the program into an intermediate form, an optimizer to reorder the translation, an assembler to translate the intermediate form into object language, and a loader to translate that into an executable binary file.
As with ls | more, all these programs may be running at the same time, with synchronization provided by pipes, or by access to temporary files. UNIX processes can operate asynchronously because each process includes all the information needed to execute code. The operating system can save the state of one process and switch to another without affecting the operation of either. Any general-purpose asynchronous "entity" needs enough state to enable the operating system to switch between them arbitrarily.
But a UNIX process includes additional state that is not directly related to "execution context," such as an address space and file descriptors. A thread is the part of a process that's necessary to execute code. On most computers that means each thread has a pointer to the thread's current instruction (often called a "PC" or "program counter"), a pointer to the top of the thread's stack (SP), general registers, and floating-point or address registers if they are kept separate.
A thread may have other things, such as processor status and coprocessor control registers. A thread does not include most of the rest of the state associated with a process; for example, threads do not have their own file descriptors or address space.
All threads within a process share all of the files and memory, including the program text and data segments. Threads are "simpler" than processes. You can think of a thread as a sort of "stripped down" process, lean and mean and ready to go. The system can switch between two threads within a process much faster than it can switch between processes. A large part of this advantage comes from the fact that threads within a process share the address space--code, data, stack, everything.
When a processor switches between two processes, all of the hardware state for that process becomes invalid. Some may need to be changed as part of the context switch procedure--data cache and virtual memory translation entries may be flushed, for example. Even when they do not need to be flushed immediately, however, the data is not useful to the new process.
Each process has a separate virtual memory address space, but threads running within the same process share the virtual address space and all other process data.
Threads can make high-bandwidth communication easier between independent parts of your program. You don't have to worry about message passing mechanisms like pipes or about keeping shared memory region address references consistent between several different address spaces.
Synchronization is faster, and programming is much more natural. If you create or open a file, all threads can use it. If you allocate a dynamic data structure with malloc, you can pass the address to other threads and they can reference it.
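For example (a sketch of my own, not one of the book's listings), a pointer returned by malloc in one thread is valid in every other thread of the process:

    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>

    static void *worker(void *arg)
    {
        int *shared = arg;                 /* the very address main() allocated */
        printf("worker sees %d\n", *shared);
        free(shared);
        return NULL;
    }

    int main(void)
    {
        pthread_t tid;
        int *data = malloc(sizeof *data);

        *data = 42;
        pthread_create(&tid, NULL, worker, data);  /* just pass the pointer */
        pthread_join(tid, NULL);
        return 0;
    }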
Threads make it easy to take advantage of concurrency. Start by getting over the unnatural expectation that everything will happen serially unless you do something "unusual."
You can go out for a cup of coffee, leaving your computer compiling some code and fully expecting that it will proceed without you. Parallelism happens everywhere in the real world, and you expect it. A row of cashiers in a store serve customers in parallel; the customers in each line generally wait their turn.
You can improve throughput by opening more lines, as long as there are registers and cashiers to serve them, and enough customers to be served by them. Creating two lines for the same register may avoid confusion by keeping lines shorter--but nobody will get served faster. Opening three registers to serve two customers may look good, but it is just a waste of resources.
In an assembly line, workers perform various parts of the complete job in parallel, passing work down the line. Adding a station to the line may improve performance if it parallels or subdivides a step in the assembly that was so complicated that the operator at the next station spent a lot of time waiting for each piece.
Beware of improving one step so much that it generates more work than the next step on the assembly line can handle. In an office, each project may be assigned to a "specialist." Each specialist handles her project independently on behalf of the customer or some other specialist, reporting back in some fashion when done. Assigning a second specialist to some task, or defining narrower specialties (for example, assigning an engineer or manager permanently to one product), may improve performance as long as there's enough work to keep her busy.
If not, some specialists play games while others' in-baskets overflow. Motor vehicles move in parallel on a highway. They can move at different speeds, pass each other, and enter and exit the highway independently. The drivers must agree to certain conventions in order to avoid collisions. Despite speed limits and traffic signs, compliance with the "rules of the road" is mostly voluntary. Similarly, threads must be coded to "agree" to rules that protect the program, or risk ending up undergoing emergency debugging at the thread hospital.
Software can apply parallelism in the same ways you might use it in real life, and for the same reasons. When you have more than one "thing" capable of doing work, you naturally expect them to all do work at the same time. A multiprocessor system can perform multiple computations, and any time-sharing system can perform computations while waiting for an external device to respond.
Software parallelism is subject to all of the complications and problems that we have seen in real life--and the solutions may not be as easy to see or to apply.
You need enough threads, but not too many; enough communication, but not too much. A key to good threaded programming is learning how to judge the proper balance for each situation. Each thread can process similar parts of a problem, just like supermarket cashiers handling customers. Each thread can perform a specific operation on each data item in turn, just like the workers on an assembly line. Each thread can specialize in some specific operation and perform that operation repeatedly on behalf of other threads.
You can combine these basic models in all sorts of ways; for example, in parallel assembly lines with some steps performed by a pool of servers. As you read this book you'll be introduced to concepts that may seem unfamiliar: mutexes, condition variables, race conditions, deadlocks, and priority inversions. Threaded programming may feel daunting and unnatural. Threads and all this other stuff are formalized and restricted representations of things you already understand.
If you find yourself thinking that someone shouldn't interrupt you because you have the conversation mutex locked, you've begun to develop an intuitive understanding of threaded programming. If something wouldn't make sense in real life, you probably shouldn't try it in a program either.
All of these programs do something, but many do not do anything of any particular importance. The purpose of the examples is to demonstrate thread management and synchronization techniques, which are mere overhead in most real programs.
They would be less effective at revealing the details if that "overhead" was buried within large programs that "did something." The source code is separated from the surrounding text by a header and trailer block which include the file name and, if the example comprises more than one section, a section number and the name of the function.
Each line of the source code has a line number at the left margin. Major functional blocks of each section are described in specially formatted paragraphs preceding the source code.
These paragraphs are marked by line numbers outside the left margin of the paragraph, denoting the line numbers in the source listing to which the paragraph refers.
Here's an example: these lines show the header files included in most of the examples. Throughout the examples, I check for errors on each function call.
As long as you code carefully, this isn't necessary, and some experts recommend testing only for errors that can result from insufficient resources or other problems beyond your control. I disagree, unless of course you're the sort of programmer who never makes a mistake.
Checking for errors is not that tedious, and may save you a lot of trouble during debugging. A Makefile is provided to build all of the examples, though it requires modifications for various platforms. On Solaris 2.x systems, the examples also call the Solaris thr_setconcurrency function; this function causes Solaris to provide the process with additional concurrency.
In a few cases, the example will not operate at all without this call, and in other cases, the example would fail to demonstrate some behavior. But then, this book is about threads, not user interfaces, and the code that I need to show takes up quite enough space already. The program prompts for input lines in a loop until it receives an error or end of file on stdin. On each line, the first nonblank token is interpreted as the number of seconds to wait, and the rest of the line (up to 64 characters) is a message that will be printed when the wait completes.
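The synchronous version might look something like this sketch (my reconstruction of the behavior just described, not the book's listing, with error checking omitted for brevity):

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(void)
    {
        int  seconds;
        char line[128], message[64];

        while (1) {
            printf("Alarm> ");
            if (fgets(line, sizeof line, stdin) == NULL)  /* error or EOF */
                exit(0);
            if (sscanf(line, "%d %63[^\n]", &seconds, message) < 2) {
                fprintf(stderr, "Bad command\n");
            } else {
                sleep(seconds);                /* blocks the whole program */
                printf("(%d) %s\n", seconds, message);
            }
        }
    }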
I will offer two additional versions--one using multiple processes, and one using multiple threads. We'll use the three examples to compare the approaches. We don't use the error reporting macros in this particular example, but consistency is nice, sometimes. Most of main is a loop, which processes simple commands until fgets returns NULL (error or end of file).
If you set an alarm to remind you to do something in 10 minutes (600 seconds), you can't decide to have it remind you of something else in 5 minutes. The program is doing something synchronously that you would probably like to be asynchronous. The new version is asynchronous--you can enter commands at any time, and they will be carried out independently. It isn't much more complicated than the original, which is nice.
If the program fails to do this, the system will save them all until the program terminates. The normal way to reap terminated child processes is to call one of the wait functions. The function will immediately reap one child process if any have terminated, or will immediately return with a process ID (pid) of 0. The parent process continues to reap terminated child processes until there are no more to reap. When the loop terminates, main loops back to line 13 to read a new command.
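In code, that reaping loop might look like this sketch (the helper name reap_children is mine; waitpid with WNOHANG is the real call that matches the behavior described, reaping one terminated child or returning a pid of 0 immediately):

    #include <sys/wait.h>

    /* Reap every child that has already terminated, without blocking. */
    void reap_children(void)
    {
        pid_t pid;

        do {
            pid = waitpid(-1, NULL, WNOHANG);  /* 0: children remain, none done */
        } while (pid > 0);                     /* > 0: reaped one; try again */
    }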
Finally, the thread frees the control packet and returns.