Notes for Parallel Scientific Computing, Feb 20, 2003 Reading advice: Study Part I of Gregory R. Andrews, Foundations of Multithreaded, Parallel and Distributed Programming, Addison-Wesley, 2000 (on course reserve). This text has been the basis for our discussion of shared memory concurrent programming, but there is much more in the Andrews text than was said in class. If you need to brush up your C language programming skills then I advise: Brian W. Kernighan and Dennis M. Ritchie: The C Programming Language, 2nd ed. (ANSI C), 1989. Handout, February 20: One simple sequential C program that solves an instance of the Laplace equation on a square mesh by point relaxation, and one incomplete C program that uses Pthreads to do the same in parallel, with one thread per interior mesh point. The latter code is missing all synchronization statements. Please complete according to taste. You can uses mutexes, condition variables and/or semaphores. This is truly the crudest discretization of the Laplace equation in the simplest context. We have a mesh of N*N cells, with the function discretized by point values at the (N+1)*(N+1) vertices. We have fixed the values at the boundary vertices. (Dirichlet boundary conditions.) At the interior (i,j) vertex the discretized Laplace equation is a[i,j-1]+a[i,j+1]+a[i-1,j]+a[i+1,j]-4*a[i,j] = 0 The solution procedure employs "simultaneous relaxation". First, at each interior point the quantity da[i,j] is evaluated such that after the assignment a[i,j] = a[i,j]+da[i,j] the discrete Laplace equation would be satisfied at the (i,j) point, were it not for changes to the neighbouring values. We iterate until convergence. With the convergence test used, the code terminates after about 100 iterations.

#include <stdio.h> #include <math.h> /* Solve a Laplace equation by simple point relaxation. One single problem instance; trivial code structure. */ #define N 6 #define tol 1.0e-8 /* N: size of the problem instance. The mesh has (N+1)*(N+1) points. */ /* tol: convergence tolerance. Iteration will stop when the maximum absolute change does not exceed tol. */ main() { int i, j, iters; double a[N+1][N+1], da[N+1][N+1], t0, maxda; /* iters: iteration counter */ /* a[0:N][0:N]: Initially arbitrary; the interior values a[1:N-1][1:N-1] evolve by point relaxation towards a solution of the Laplace equation with the given boundary values. */ /* da[0:N][0:N]: used to hold the change to a[0:N][0:N] */ /* maxda: maximum absolute value of da[1:N-1][1:N-1] */ /* Initialize */ for (i=0; i<=N; i++) { for (j=0; j<=N; j++) { if (i==0||i==N||j==0||j==N) a[i][j]=0.0; else a[i][j]=1.0; }; }; maxda=HUGE_VAL; iters=0; /* The problem instance just defined has boundary values identically 0 and initial interior values all equal to 1. The iterative process should converge towards the solution that has a[0:N][0:N] all 0. */ /* Iterate */ while (tol<maxda) { t0=0.0; for (i=1; i<N; i++) { for (j=1; j<N; j++) { da[i][j]=(a[i+1][j]+a[i-1][j]+a[i][j+1]+a[i][j-1]-4*a[i][j])/4; if (t0<fabs(da[i][j])) t0=fabs(da[i][j]); }; }; maxda=t0; for (i=1; i<N; i++) { for (j=1; j<N; j++) { a[i][j]=a[i][j]+da[i][j]; }; }; iters++; }; /* Print */ printf("%7d\n",iters); for (i=0; i<=N; i++) { for (j=0; j<=N; j++) { printf("%10.1e",a[i][j]); }; printf("\n"); }; }

#include <stdio.h> #include <math.h> #include <pthread.h> #include <semaphore.h> /* Solve a Laplace equation by simple point relaxation. One single problem instance. Uses one thread per mesh point. */ #define N 6 #define tol 1.0e-8 /* N: size of the problem instance. The mesh has (N+1)*(N+1) points. */ /* tol: convergence tolerance. Iteration will stop when the maximum absolute change does not exceed tol. */ double a[N+1][N+1], maxda, newmaxda; struct point { int i; int j; }; void *rx(void *); /* advance declaration, see below */ [ ... declare here variables for synchronization ... ] main() { int i, j, iters; struct point p[N+1][N+1]; pthread_t thread_rx[N+1][N+1]; /* we've declared an array of threads, one for each mesh point. In fact, only the interior mesh points will have a thread. In the present problem the boundary values are fixed. */ /* Initialize data */ for (i=0; i<=N; i++) { for (j=0; j<=N; j++) { if (i==0||i==N||j==0||j==N) a[i][j]=0.0; else a[i][j]=1.0; }; }; maxda=HUGE_VAL; newmaxda=0.0; iters=0; /* Initialize synchronization variables */ [ ... ] /* Create the subtasks */ for (i=1; i<N; i++) { for (j=1; j<N; j++) { p[i][j].i=i; p[i][j].j=j; pthread_create(&thread_rx[i][j], NULL, rx, (void *) &p[i][j]); }; }; /* Iterate */ while (tol<maxda) { [ ... synchronization will be needed in this loop ... ] printf("%7d %10.1e\n",iters,maxda); maxda=newmaxda; newmaxda=0.0; iters++; }; /* Collect all threads */ for (i=1; i<N; i++) { for (j=1; j<N; j++) { pthread_join(thread_rx[i][j],NULL); }; }; /* Print */ printf("%7d\n",iters); for (i=0; i<=N; i++) { for (j=0; j<=N; j++) { printf("%10.1e",a[i][j]); }; printf("\n"); }; } /* relaxation on one mesh point */ void *rx(void *arg) { struct point *p = arg; int i, j; double da; i=(*p).i; j=(*p).j; while (tol<maxda) { [ ... synchronization will be needed in this loop ... ] da=(a[i+1][j]+a[i-1][j]+a[i][j+1]+a[i][j-1]-4*a[i][j])/4; if (newmaxda<fabs(da)) newmaxda=fabs(da); a[i][j]=a[i][j]+da; }; return NULL; }