Notes for Parallel Scientific Computing, Feb 20, 2003

Reading advice: Study Part I of Gregory R. Andrews, Foundations of
Multithreaded, Parallel and Distributed Programming, Addison-Wesley,
2000 (on course reserve).  This text has been the basis for our
discussion of shared memory concurrent programming, but there is much
more in the Andrews text than was said in class.

If you need to brush up your C language programming skills then I
advise: Brian W. Kernighan and Dennis M. Ritchie: The C Programming
Language, 2nd ed. (ANSI C), 1989.

Handout, February 20: One simple sequential C program that solves an
instance of the Laplace equation on a square mesh by point relaxation,
and one incomplete C program that uses Pthreads to do the same in
parallel, with one thread per interior mesh point.  The latter code is
missing all synchronization statements.  Please complete according to
taste.  You can uses mutexes, condition variables and/or semaphores.

This is truly the crudest discretization of the Laplace equation in
the simplest context.  We have a mesh of N*N cells, with the function
discretized by point values at the (N+1)*(N+1) vertices.  We have
fixed the values at the boundary vertices.  (Dirichlet boundary
conditions.)  At the interior (i,j) vertex the discretized Laplace
equation is

  a[i,j-1]+a[i,j+1]+a[i-1,j]+a[i+1,j]-4*a[i,j] = 0

The solution procedure employs "simultaneous relaxation".  First, at
each interior point the quantity da[i,j] is evaluated such that after
the assignment a[i,j] = a[i,j]+da[i,j] the discrete Laplace equation
would be satisfied at the (i,j) point, were it not for changes to the
neighbouring values.  We iterate until convergence.  With the
convergence test used, the code terminates after about 100 iterations.


#include <stdio.h> #include <math.h> /* Solve a Laplace equation by simple point relaxation. One single problem instance; trivial code structure. */ #define N 6 #define tol 1.0e-8 /* N: size of the problem instance. The mesh has (N+1)*(N+1) points. */ /* tol: convergence tolerance. Iteration will stop when the maximum absolute change does not exceed tol. */ main() { int i, j, iters; double a[N+1][N+1], da[N+1][N+1], t0, maxda; /* iters: iteration counter */ /* a[0:N][0:N]: Initially arbitrary; the interior values a[1:N-1][1:N-1] evolve by point relaxation towards a solution of the Laplace equation with the given boundary values. */ /* da[0:N][0:N]: used to hold the change to a[0:N][0:N] */ /* maxda: maximum absolute value of da[1:N-1][1:N-1] */ /* Initialize */ for (i=0; i<=N; i++) { for (j=0; j<=N; j++) { if (i==0||i==N||j==0||j==N) a[i][j]=0.0; else a[i][j]=1.0; }; }; maxda=HUGE_VAL; iters=0; /* The problem instance just defined has boundary values identically 0 and initial interior values all equal to 1. The iterative process should converge towards the solution that has a[0:N][0:N] all 0. */ /* Iterate */ while (tol<maxda) { t0=0.0; for (i=1; i<N; i++) { for (j=1; j<N; j++) { da[i][j]=(a[i+1][j]+a[i-1][j]+a[i][j+1]+a[i][j-1]-4*a[i][j])/4; if (t0<fabs(da[i][j])) t0=fabs(da[i][j]); }; }; maxda=t0; for (i=1; i<N; i++) { for (j=1; j<N; j++) { a[i][j]=a[i][j]+da[i][j]; }; }; iters++; }; /* Print */ printf("%7d\n",iters); for (i=0; i<=N; i++) { for (j=0; j<=N; j++) { printf("%10.1e",a[i][j]); }; printf("\n"); }; }
#include <stdio.h> #include <math.h> #include <pthread.h> #include <semaphore.h> /* Solve a Laplace equation by simple point relaxation. One single problem instance. Uses one thread per mesh point. */ #define N 6 #define tol 1.0e-8 /* N: size of the problem instance. The mesh has (N+1)*(N+1) points. */ /* tol: convergence tolerance. Iteration will stop when the maximum absolute change does not exceed tol. */ double a[N+1][N+1], maxda, newmaxda; struct point { int i; int j; }; void *rx(void *); /* advance declaration, see below */ [ ... declare here variables for synchronization ... ] main() { int i, j, iters; struct point p[N+1][N+1]; pthread_t thread_rx[N+1][N+1]; /* we've declared an array of threads, one for each mesh point. In fact, only the interior mesh points will have a thread. In the present problem the boundary values are fixed. */ /* Initialize data */ for (i=0; i<=N; i++) { for (j=0; j<=N; j++) { if (i==0||i==N||j==0||j==N) a[i][j]=0.0; else a[i][j]=1.0; }; }; maxda=HUGE_VAL; newmaxda=0.0; iters=0; /* Initialize synchronization variables */ [ ... ] /* Create the subtasks */ for (i=1; i<N; i++) { for (j=1; j<N; j++) { p[i][j].i=i; p[i][j].j=j; pthread_create(&thread_rx[i][j], NULL, rx, (void *) &p[i][j]); }; }; /* Iterate */ while (tol<maxda) { [ ... synchronization will be needed in this loop ... ] printf("%7d %10.1e\n",iters,maxda); maxda=newmaxda; newmaxda=0.0; iters++; }; /* Collect all threads */ for (i=1; i<N; i++) { for (j=1; j<N; j++) { pthread_join(thread_rx[i][j],NULL); }; }; /* Print */ printf("%7d\n",iters); for (i=0; i<=N; i++) { for (j=0; j<=N; j++) { printf("%10.1e",a[i][j]); }; printf("\n"); }; } /* relaxation on one mesh point */ void *rx(void *arg) { struct point *p = arg; int i, j; double da; i=(*p).i; j=(*p).j; while (tol<maxda) { [ ... synchronization will be needed in this loop ... ] da=(a[i+1][j]+a[i-1][j]+a[i][j+1]+a[i][j-1]-4*a[i][j])/4; if (newmaxda<fabs(da)) newmaxda=fabs(da); a[i][j]=a[i][j]+da; }; return NULL; }