by implementing an MPI-based domain decomposition.
The PDE is discretized on a regular finite-difference grid with fixed (Dirichlet) boundary conditions:
$$
\begin{align}
u(0,y) &= 0 \\
The decoupling of the regions is achieved by introducing a *ghost layer* of grid points which surrounds each region.
The values in the ghost layer of a region are not updated during an iteration.
Instead, after an iteration is finished the updated values for the ghost layer are received from the neighboring regions, and the boundary layer is sent to the neighboring regions (see Figure below).
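For illustration, a minimal sketch of such a ghost-layer exchange for a one-dimensional (row-wise) decomposition is shown below. It assumes each process stores its region row-major with two extra ghost rows and uses `MPI_Sendrecv` to swap boundary and ghost rows with its two neighbors; all names (`u`, `local_rows`, `N`, `up`, `down`) are placeholders for this sketch and not part of the provided code.

```cpp
// Sketch: exchange ghost rows after a Jacobi iteration (1D row-wise decomposition).
// Assumptions: the local region is stored row-major in `u` with `local_rows + 2`
// rows of `N` points each; row 0 and row `local_rows + 1` are the ghost rows;
// `up`/`down` are the ranks of the neighboring regions, or MPI_PROC_NULL at the
// physical boundary (which turns the corresponding transfer into a no-op).
#include <mpi.h>
#include <vector>

void exchange_ghost_layers(std::vector<double>& u, int local_rows, int N,
                           int up, int down, MPI_Comm comm) {
  // send the first interior row to the upper neighbor,
  // receive the lower ghost row from the lower neighbor
  MPI_Sendrecv(&u[1 * N], N, MPI_DOUBLE, up, 0,
               &u[(local_rows + 1) * N], N, MPI_DOUBLE, down, 0,
               comm, MPI_STATUS_IGNORE);
  // send the last interior row to the lower neighbor,
  // receive the upper ghost row from the upper neighbor
  MPI_Sendrecv(&u[local_rows * N], N, MPI_DOUBLE, down, 1,
               &u[0 * N], N, MPI_DOUBLE, up, 1,
               comm, MPI_STATUS_IGNORE);
}
```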
- `resolution`: number of grid points along each dimension of the unit square; the grid spacing is $`h = 1.0/(\text{resolution}-1)`$
- `iterations`: number of Jacobi iterations to perform
More specifically, your program should
- use $`\bar{u}_h=\mathbf{0}`$ as initial approximation to $`u`$, and (after finishing all iterations)
- print the Euclidean ($`\parallel \cdot \parallel_2`$) and maximum ($`\parallel \cdot \parallel_{\infty}`$) norm of the residual $`\parallel A_h\bar{u}_h-b_h \parallel`$ and of the total error $`\parallel \bar{u}_h-u_p \parallel`$ to the console (one way to combine the per-process contributions into global norms is sketched after this list),
- print the average runtime per iteration to the console, and
- produce the same results as a serial run.
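Since each process only holds part of the grid, the norms have to be reduced across all processes. A minimal sketch of one way to do this with `MPI_Allreduce` is given below; `local_sq_sum` (the per-process sum of squared entries) and `local_max` (the per-process maximum absolute value) are assumed to have been computed over the non-ghost points beforehand.

```cpp
// Sketch: combine per-process partial results into global norms.
#include <mpi.h>
#include <cmath>

void global_norms(double local_sq_sum, double local_max,
                  double& norm2, double& norm_inf, MPI_Comm comm) {
  double global_sq_sum = 0.0;
  double global_max = 0.0;
  MPI_Allreduce(&local_sq_sum, &global_sq_sum, 1, MPI_DOUBLE, MPI_SUM, comm);
  MPI_Allreduce(&local_max, &global_max, 1, MPI_DOUBLE, MPI_MAX, comm);
  norm2 = std::sqrt(global_sq_sum);   // Euclidean norm
  norm_inf = global_max;              // maximum norm
}
```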
Finally, benchmark the parallel performance of your program `jacobiMPI` on 2 nodes of the IUE-Cluster for 4 different `resolution` values $`\{125,250,1000,4000\}`$, using between 1 and 80 MPI processes (`NUMMPIPROC`).
- the new parameter `DIM` has two valid values, `1D` and `2D`, and switches between one-dimensional and two-dimensional decomposition (a possible setup of the 2D process grid using MPI's Cartesian topology routines is sketched below).
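One possible way to set up the process grid for the `2D` case is via MPI's Cartesian topology routines; the sketch below assumes a non-periodic grid (consistent with the fixed Dirichlet boundaries) and lets `MPI_Dims_create` choose balanced dimensions. Function and parameter names are placeholders for this sketch.

```cpp
// Sketch: build a (non-periodic) 2D process grid and query the four neighbors.
#include <mpi.h>

void make_cart_2d(MPI_Comm comm, MPI_Comm* cart,
                  int* left, int* right, int* down, int* up) {
  int size;
  MPI_Comm_size(comm, &size);
  int dims[2] = {0, 0};             // 0 lets MPI_Dims_create choose a balanced factorization
  int periods[2] = {0, 0};          // fixed (Dirichlet) boundaries: no wrap-around
  MPI_Dims_create(size, 2, dims);
  MPI_Cart_create(comm, 2, dims, periods, 0, cart);
  // neighbors in each dimension; MPI_PROC_NULL is returned at the domain boundary
  MPI_Cart_shift(*cart, 0, 1, left, right);
  MPI_Cart_shift(*cart, 1, 1, down, up);
}
```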
Ensure a correct implementation by comparing your results to a serial run. Benchmarking on the cluster is **not**
required.
**Notes:**
- Your login credentials will be provided via email.
- You need to enable a "TU Wien VPN" connection.
- You can log in to the cluster using `ssh` and your credentials.
- You will be asked to change your initial password upon first login.
**File Transfer**
- The cluster has a *login node* (the one you `ssh` to, details will be announced in the email with the credentials)
- This login node must only be used to compile your project and **never** to perform any benchmarks or MPI-runs (besides minimal lightweight tests of the MPI configuration)
- All other nodes of the cluster are used to run the "jobs" you submit.
- To support cluster users, a set of *environment modules* is made available (only the "MPI" module is relevant for us). You can list all modules using `module avail`.
- Note that you also need to load the modules you require in your job submission scripts (see example provided in this repo).