Archive for June, 2006

Understanding Data Races 1: the Role of Data Race Detection Tools

June 30th, 2006

This is the first in a series of posts on understanding data races.


With the release of Sun Studio Express (June 2006 Build), we are offering a run-time data race detection tool (DRDT) for developers on Sun’s platforms for free. It complements the other data race detection tools Sun already offers.

If you have been bugged by data race problems in the past, you should give it a try. Go here (scroll to ‘How to get started’) to download it. And here is the page dedicated to the DRDT project.


I would like to start the series by looking at the role of data race detection tools.

Many multithreaded (MT) programs have race conditions, which make them very hard to debug. One class of race condition is the data race condition, or data race. (The difference between a general race condition and a data race condition will be explained in another post.)

A data race is a condition that occurs in a program. People often think a data race is always a bug. This is not true. A data race could be the root cause of a bug; it could be caused by a bug; or it could be there because the programmer wants it there.

If a data race is the root cause of a bug, we want to find it. If a data race is caused by a bug, showing where the data race is can help the programmer locate the real bug. If a data race is there by design, we want to make sure it is there and we also want to make sure there is no unexpected data race.

The role of a data race detection tool is to check whether a program contains data races and, if it does, to pinpoint their locations.
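To make the term concrete, here is a minimal sketch of the kind of code such a tool is meant to flag, next to one possible fix. It is my own illustration, not DRDT output or an example from the DRDT documentation; the function names count_racy and count_atomic are hypothetical, and the code assumes an OpenMP-enabled C++ compiler (without OpenMP the pragmas are ignored and both loops simply run sequentially).

```cpp
#include <cassert>

// Racy version: every thread performs an unsynchronized
// read-modify-write on the shared variable `hits`.
long count_racy(int n)
{
    assert(n >= 0);
    long hits = 0;
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        hits++;              // data race: concurrent unsynchronized updates
    }
    return hits;             // may be smaller than n when several threads run
}

// One possible fix: the same loop, with the update made atomic.
long count_atomic(int n)
{
    assert(n >= 0);
    long hits = 0;
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        #pragma omp atomic
        hits++;              // updates are now serialized, no data race
    }
    return hits;             // always n
}
```

In the racy version the data race is the root cause of the bug (lost updates). A detection tool would be expected to point at the line that updates hits, which is exactly where the fix goes.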

There are many ways to use a data race detection tool. Some use it as a debugging tool: run it when there is a bug in the program. Some use it as a sanity-checking tool: run it as part of regression tests. And some use it as a programming aid when parallelizing sequential programs: find thread-unsafe routines and global variables that should be private to threads.

Categories: Parallel Programming Tags:

The idea behind environment variable “SUNW_MP_MAX_POOL_THREADS”

June 29th, 2006

Sun’s OpenMP implementation supports true nested parallel regions - when nested parallelism is enabled, the inner parallel region can be executed by multiple threads concurrently.

We provide an environment variable called SUNW_MP_MAX_POOL_THREADS for users to control the total number of OpenMP slave threads in a process.

For example, if you want a maximum of 16 threads to be used for a nest of parallel regions in your program, you can set SUNW_MP_MAX_POOL_THREADS to 15. That’s 15 slave threads (some of them may become masters in inner parallel regions) plus one user thread, which is the master thread for the outermost parallel region.
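The arithmetic behind the 15 can be sketched with a two-level nest. This is only an illustration under stated assumptions: the function name count_inner_threads is hypothetical, the team sizes of 4 are chosen to reach 16 threads, and nested parallelism is assumed to be enabled (for example through the standard OMP_NESTED environment variable).

```cpp
#include <cassert>

// Counts how many threads execute the innermost parallel region.
int count_inner_threads()
{
    int workers = 0;
    // Outer team: the user thread (master) plus 3 slave threads from the pool.
    #pragma omp parallel num_threads(4)
    {
        // Each of the 4 outer threads becomes the master of an inner team
        // and asks the same pool for 3 more slave threads.
        #pragma omp parallel num_threads(4)
        {
            #pragma omp atomic
            workers++;
        }
    }
    // At most 4 x 4 threads; fewer if nesting is off or the pool is smaller.
    assert(workers >= 1 && workers <= 16);
    return workers;
}
```

With nesting enabled and enough slave threads available, the nest needs 3 + 4 x 3 = 15 slave threads in addition to the user thread, which is where the value 15 for 16 threads comes from. With a smaller pool, I would expect some inner teams to end up with fewer threads and the function to return a smaller number.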

Why didn’t we design an environment variable like SUNW_MP_MAX_NUM_THREADS, so that a user could set it to 16 in the above example? Intel’s implementation has KMP_ALL_THREADS and KMP_MAX_THREADS, which do that.

Well, we were trying to have a scheme that works in more general cases, not just pure OpenMP code. In particular, we think our scheme works better than others for code that mixes pthreads and OpenMP threads. The pool defines a set of threads that can be used as OpenMP slave threads. If the program has two pthreads and both create a team, then both will try to grab slave threads from the same pool. The env var SUNW_MP_MAX_POOL_THREADS was NOT designed for users to control the total number of threads in a process. We cannot control that, because the program may create pthreads on its own. The env var is designed for users to control the total number of OpenMP slave threads.

The env var SUNW_MP_MAX_POOL_THREADS is documented here. We also have a short article “How Many Threads Does It Take?” if you want to understand it better.

Categories: Parallel Programming Tags:

Common Mistakes in Using OpenMP 4: Orphaned Worksharing Constructs

June 11th, 2006

More precisely, this mistake should be classified as a common misunderstanding of OpenMP.

When a worksharing construct, such as omp for or omp sections, is encountered outside any explicit parallel region, the resulting worksharing region is called an orphaned worksharing region. A common misunderstanding is that in this case the worksharing construct is simply ignored and the region is executed sequentially.

Orphaned worksharing constructs are not ignored. All the data sharing attribute clauses are honored. The worksharing region is executed as if a team of only one thread is executing the region.

For example, in the following C++ code,

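     #include <cstdio>

     int main()
     {
         class_type_1  a;   // class_type_1 is assumed to be defined elsewhere
         int i;
         // orphaned worksharing construct: no enclosing parallel region
         #pragma omp for private(a) schedule(dynamic)
         for (i=1; i<100; i++) {
             printf("%d\n", i);
         }
     }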

the default constructor for class_type_1 will be called to create the private copy of a, and a conforming implementation is not forced to execute the loop in the order of 1, 2, 3, …, 99.
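Orphaned constructs are also what lets the same routine be called from both sequential and parallel code. The following is a small sketch of that usage; the names fill_squares, squares_sequential_caller and squares_parallel_caller are hypothetical and chosen only for this illustration.

```cpp
#include <cassert>
#include <vector>

// Contains an orphaned worksharing construct: no parallel region
// is lexically visible inside this function.
void fill_squares(std::vector<int>& v)
{
    const int n = static_cast<int>(v.size());
    #pragma omp for schedule(dynamic)
    for (int i = 0; i < n; i++) {
        v[i] = i * i;        // each iteration writes a distinct element
    }
}

// Called outside any parallel region: the worksharing region is
// executed as if by a team of one thread.
std::vector<int> squares_sequential_caller(int n)
{
    assert(n >= 0);
    std::vector<int> v(n);
    fill_squares(v);
    return v;
}

// Called by every thread of a team: the construct binds to the
// enclosing parallel region and the iterations are divided among the team.
std::vector<int> squares_parallel_caller(int n)
{
    assert(n >= 0);
    std::vector<int> v(n);
    #pragma omp parallel
    fill_squares(v);
    return v;
}
```

Both callers produce the same vector. The only difference is how many threads share the iterations of the loop.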

Categories: Parallel Programming Tags:

Concurrency vs Parallelism, Concurrent Programming vs Parallel Programming

June 11th, 2006

At the risk of hairsplitting, …

Concurrency and parallelism are NOT the same thing. Two tasks T1 and T2 are concurrent if the order in which they are executed in time is not predetermined:

  • T1 may be executed and finished before T2,
  • T2 may be executed and finished before T1,
  • T1 and T2 may be executed simultaneously at the same instant of time (parallelism),
  • T1 and T2 may be executed alternately.

If two concurrent threads are scheduled by the OS to run on a single-core, non-SMT, non-CMP processor, you may get concurrency but not parallelism. Parallelism is possible on multi-core, multi-processor, or distributed systems.
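As a small illustration of the definition, the sketch below declares two concurrent tasks, T1 and T2, and records the order in which they finish. The function name run_two_tasks is hypothetical, and OpenMP sections are just one convenient way of expressing the two tasks.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Runs two concurrent tasks and returns their names in completion order.
std::vector<std::string> run_two_tasks()
{
    std::vector<std::string> finished;
    #pragma omp parallel sections
    {
        #pragma omp section
        {
            // task T1
            #pragma omp critical
            finished.push_back("T1");
        }
        #pragma omp section
        {
            // task T2
            #pragma omp critical
            finished.push_back("T2");
        }
    }
    // Both tasks always complete; only their relative order is open.
    assert(finished.size() == 2);
    return finished;
}
```

The program only says that T1 and T2 are concurrent. Whether the result is T1 then T2 or T2 then T1, and whether the two tasks ever ran at the same instant, depends on the number of threads and on the hardware. On a single-core processor the code is still concurrent, but nothing runs in parallel.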

Concurrency is often referred to as a property of a program, and is a concept more general than parallelism.

Interestingly, we cannot say the same thing for concurrent programming and parallel programming. They overlap, but neither is a superset of the other. The difference comes from the sets of topics the two areas cover. For example, concurrent programming includes topics like signal handling, while parallel programming includes topics like memory consistency models. The difference reflects the different original hardware and software backgrounds of the two programming practices.

Update: More on Concurrency vs Parallelism

Categories: Parallel Programming Tags:

Common Mistakes in Using OpenMP 3: Fifteen Cases from an IWOMP 2006 paper by Michael Süß and Claudia Leopold

June 7th, 2006

The upcoming International Workshop on OpenMP (IWOMP 2006) has a paper titled “Common Mistakes in OpenMP and How to Avoid Them” written by Michael Süß and Claudia Leopold (University of Kassel, Germany).

The results are based on a survey of two undergraduate courses. The authors of the paper kindly allowed me to list the 15 common mistakes presented in their paper here:

  1. (Correctness) Access to shared variables not protected
  2. (Correctness) Use of locks without flush
  3. (Correctness) Read of shared variable without flush
  4. (Correctness) Forget to mark private variables as such
  5. (Correctness) Use of ordered clause without ordered construct
  6. (Correctness) Declare loop variable in #pragma omp parallel for as shared
  7. (Correctness) Forget to put down for in #pragma omp parallel for
  8. (Correctness) Try to change the number of threads in a parallel region after the start of the region
  9. (Correctness) omp_unset_lock() called from non-owner thread
  10. (Correctness) Attempt to change loop variable while in #pragma omp for
  11. (Performance) Use of critical when atomic would be sufficient
  12. (Performance) Put too much work inside critical region
  13. (Performance) Use of orphaned construct outside parallel region
  14. (Performance) Use of unnecessary flush
  15. (Performance) Use of unnecessary critical
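As an illustration of one item, here is a sketch of mistake 7 (forgetting the for in #pragma omp parallel for). It is my own example and is not taken from the paper; the function names iterations_without_for and iterations_with_for are hypothetical.

```cpp
#include <cassert>

// Mistake 7: `for` is missing, so the loop is not a worksharing loop
// and every thread of the team executes all n iterations.
long iterations_without_for(int n)
{
    assert(n >= 0);
    long executed = 0;
    #pragma omp parallel
    for (int i = 0; i < n; i++) {
        #pragma omp atomic
        executed++;
    }
    return executed;         // n times the number of threads in the team
}

// Intended version: the n iterations are divided among the team.
long iterations_with_for(int n)
{
    assert(n >= 0);
    long executed = 0;
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        #pragma omp atomic
        executed++;
    }
    return executed;         // exactly n
}
```

The first version still compiles and runs, which is what makes the mistake easy to miss: the work is simply repeated by every thread instead of being shared.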

For details, please read the full paper.

Categories: Parallel Programming Tags:

Read: “The Rise and Fall of CORBA”

June 4th, 2006

The June 2006 issue (Vol 4, No 5) of ACM Queue features an article by Michi Henning of ZeroC on the rise and fall of CORBA.

Both technical issues and procedural issues contributed to the fall of CORBA, and the procedural problems are the root cause of the technical problems. Many of the issues the article points out are alarmingly familiar!

The following is a list of lessons learned on how to have a better standards process:

  • Standards consortia need iron-clad rules to ensure that they standardize existing best practice.
  • No standard should be approved without a reference implementation.
  • No standard should be approved without having been used to implement a few projects of realistic complexity.
  • Open source innovation usually is subject to a Darwinian selection process.
  • To create quality software, the ability to say “no” is usually far more important than the ability to say “yes”.

Read the whole article.

Categories: Parallel Programming Tags: