iSGTW - International Science Grid This Week

Back to Basics - What makes parallel programming hard?


 BY DARIN OHASHI
Darin Ohashi is a senior kernel developer at Maplesoft. His background is in mathematics and computer science, with a focus on algorithm and data structure design and analysis. For the last few years he has been focused on developing tools to enable parallel programming in Maple, a well-known scientific software package.

Dual- and even quad-processor computers may be increasingly common, but software designed to run in parallel is another story entirely. In this column, Darin Ohashi takes us back to basics to explain why designing a program to run in parallel is a whole different ball game.

In my previous column, I tried to convince you that going parallel is necessary for high performance applications. In this week’s column, I will show you what makes parallel programming different and why that makes it harder.

Glossary of terms:
  • process: a running instance of a program. A process's memory is usually protected from access by other processes.
  • thread: within a single process, each thread represents an independent, concurrent path of execution. Threads within a process share memory.

We think of a function as a series of discrete steps. Let’s say a function f is composed of steps f1, f2, f3... fn, and another function g is composed of steps g1, g2, g3... gn.

A single-threaded application can run f and g in only two ways:

f(); g(); or g(); f();

That would lead to two possible orders for the steps: f1, f2, f3... fn, g1, g2, g3... gn or g1, g2, g3... gn, f1, f2, f3... fn. The order is determined by the order in which the application calls f() and g(). And that, in turn, is determined by the programmer who wrote the application.

Now let's consider what happens if f and g are called in parallel – that is, at the same time. Although f2 will still execute before f3, there is no control over when f2 executes relative to g2. Thus, the order could be g1, g2, f1, g3, f2, f3... or f1, g1, f2, f3, g2, g3... or any other valid interleaving. Each time the application calls these functions in parallel, the steps may execute in a different order, and given enough executions every possible order will eventually occur.
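To make this concrete, here is a minimal sketch in C++ (not from the original column; the names f and g and the printed step labels are just stand-ins for the steps described above). Called one after the other, the steps always come out in the same order; called on two threads, the interleaving is up to the scheduler and can change from run to run.

    #include <cstdio>
    #include <thread>

    // Toy versions of f and g: each "step" just announces itself.
    void f() { for (int i = 1; i <= 3; ++i) std::printf("f%d ", i); }
    void g() { for (int i = 1; i <= 3; ++i) std::printf("g%d ", i); }

    int main() {
        // Sequential: the programmer fixes the order, so the output is
        // always f1 f2 f3 g1 g2 g3.
        f();
        g();
        std::printf("\n");

        // Parallel: f's steps still run in order, and so do g's, but how
        // they interleave is up to the scheduler and can differ on every
        // run, e.g. f1 g1 g2 f2 f3 g3.
        std::thread tf(f);
        std::thread tg(g);
        tf.join();
        tg.join();
        std::printf("\n");
        return 0;
    }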

Now suppose that some step fi must execute before some step gj for the code to be correct. Executing f and g in parallel will then occasionally fail, even if fi is the first step of f and gj is the last step of g, because every possible ordering – including the bad one – will eventually occur. Therefore, to write correct parallel code, the programmer must guarantee that the code is correct for all possible orderings. Bugs caused by the order in which threads execute are often referred to as race conditions: you can imagine two or more threads racing towards a critical point, with correctness depending on who wins the race.

Consider a function f that accepts a single integer n as an argument and increments a global variable g, n times in a loop. If we run f twice in series, passing 1000 each time, the final value of g is 2000. What happens if we run f in two threads in parallel, passing 1000 to each call? We might expect to get 2000, as in the single-threaded case, but we probably won't; instead we will get some number between 1000 and 2000, because some of the increments are lost. This happens when the two threads attempt to update g at the same time: only one of the updates is stored, and the other is overwritten. Moreover, executing the example repeatedly produces a different answer each time, because the threads are scheduled differently on each run and therefore collide a different number of times.

There are orderings in which the total is 2000, and in fact every value between 1000 and 2000 can occur. So even this simple example has roughly a thousand possible outcomes, and vastly more possible orderings.
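Here is a minimal C++ sketch of this counter example (the names f, g and n follow the prose above; the iteration count is raised well beyond 1000, an assumption made purely so that lost updates show up reliably on a real machine). The missing synchronization is deliberate: it is exactly the race being described.

    #include <iostream>
    #include <thread>

    // The shared global counter from the example. volatile only stops the
    // compiler from collapsing the loop into one addition; it does NOT make
    // the increment atomic.
    volatile long g = 0;

    // f increments g, n times. Each increment is a read-add-write, so two
    // threads can read the same old value and overwrite each other's update.
    void f(int n) {
        for (int i = 0; i < n; ++i)
            g = g + 1;
    }

    int main() {
        const int n = 1000000;   // much larger than the column's 1000, so
                                 // the race shows up reliably in practice
        std::thread t1(f, n);
        std::thread t2(f, n);
        t1.join();
        t2.join();

        // In series this would always print 2 * n; in parallel it usually
        // prints something smaller, and a different value on every run.
        std::cout << g << std::endl;
        return 0;
    }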

 

Summary:

The order in which single-threaded code executes is determined by the programmer, which makes the execution of a single-threaded application relatively predictable. When code runs in parallel, the order in which it executes is not predictable, and that makes it much harder to guarantee that the code will execute correctly.
