The following examples show how to use several OpenMP* features.
A Simple Difference Operator
This example shows a simple parallel loop where the amount of work varies from iteration to iteration. Dynamic scheduling is used to improve load balancing.
The omp for construct has a nowait clause because the barrier it would otherwise imply is redundant: the implicit barrier at the end of the parallel region already synchronizes the threads.
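A minimal sketch of the loop being described, assuming the same triangular indexing as the two-loop version that follows (the function name for1 is illustrative, not taken from the original text):

```c
void for1(float a[], float b[], int n)
{
  int i, j;
  #pragma omp parallel shared(a,b,n) private(i,j)
  {
    /* Iteration i performs i inner iterations, so the work per
       iteration grows; schedule(dynamic,1) balances the load.
       nowait: the parallel region's closing barrier suffices. */
    #pragma omp for schedule(dynamic,1) nowait
    for (i = 1; i < n; i++)
      for (j = 0; j < i; j++)
        b[j + n*i] = ( a[j + n*i] + a[j + n*(i-1)] ) / 2.0;
  }
}
```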
Two Difference Operators: for Loop Version
This example fuses two parallel regions into one, so both loops share a single fork/join and the associated overhead is paid only once. The first omp for pragma has a nowait clause because all the data used in the second loop (c and d) is independent of all the data used in the first loop (a and b).
Example:

```c
void for2(float a[], float b[], float c[], float d[], int n, int m)
{
  int i, j;
  #pragma omp parallel shared(a,b,c,d,n,m) private(i,j)
  {
    /* Triangular loop: iteration i does i units of inner work,
       so schedule(dynamic,1) balances the load. */
    #pragma omp for schedule(dynamic,1) nowait
    for (i = 1; i < n; i++)
      for (j = 0; j < i; j++)
        b[j + n*i] = ( a[j + n*i] + a[j + n*(i-1)] ) / 2.0;

    /* nowait is safe here: this loop reads c and writes d,
       which are independent of the a and b used above. */
    #pragma omp for schedule(dynamic,1) nowait
    for (i = 1; i < m; i++)
      for (j = 0; j < i; j++)
        d[j + m*i] = ( c[j + m*i] + c[j + m*(i-1)] ) / 2.0;
  }
}
```
Two Difference Operators: sections Version
This example demonstrates the omp sections pragma. The logic is identical to the preceding omp for example, but it uses omp sections instead of omp for. Here the speedup is limited to two because there are only two units of work, whereas in the example above there are (n-1) + (m-1) units of work.
Example:

```c
void sections1(float a[], float b[], float c[], float d[], int n, int m)
{
  int i, j;
  #pragma omp parallel shared(a,b,c,d,n,m) private(i,j)
  {
    #pragma omp sections nowait
    {
      /* Each section is executed by a single thread, so at most
         two threads can do useful work in this region. */
      #pragma omp section
      for (i = 1; i < n; i++)
        for (j = 0; j < i; j++)
          b[j + n*i] = ( a[j + n*i] + a[j + n*(i-1)] ) / 2.0;

      #pragma omp section
      for (i = 1; i < m; i++)
        for (j = 0; j < i; j++)
          d[j + m*i] = ( c[j + m*i] + c[j + m*(i-1)] ) / 2.0;
    }
  }
}
```
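For completeness, a minimal driver is sketched below. The array sizes, the zero-initialized inputs, and the cc -fopenmp compile line are assumptions for illustration, not part of the original example; the indexing in sections1 stays below n*n and m*m, so square buffers of those sizes are sufficient.

```c
#include <stdlib.h>

void sections1(float a[], float b[], float c[], float d[], int n, int m);

/* Build with an OpenMP-enabled compiler, e.g. cc -fopenmp driver.c sections1.c */
int main(void)
{
  int n = 64, m = 48;                        /* illustrative sizes */
  float *a = calloc((size_t)n * n, sizeof *a);
  float *b = calloc((size_t)n * n, sizeof *b);
  float *c = calloc((size_t)m * m, sizeof *c);
  float *d = calloc((size_t)m * m, sizeof *d);
  if (a && b && c && d)
    sections1(a, b, c, d, n, m);
  free(a); free(b); free(c); free(d);
  return 0;
}
```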