Commit 824eb40b, authored 6 years ago by Timo Koch

[handbook] Update parallel

Parent: 5cb67f60

Showing 1 changed file: doc/handbook/5_parallel.tex (40 additions, 71 deletions)
\section{Parallel Computation}
\label{sec:parallelcomputation}
Multicore processors are standard nowadays and parallel programming is the key to gaining
performance from modern computers. This section explains how \Dumux can be used
on multicore / multinode systems, ranging from the user's desktop computer to
high-performance computing clusters.

There are different concepts and methods for parallel programming, which are
often grouped into \textit{shared-memory} and \textit{distributed-memory}
approaches. The parallelization in \Dumux is based on the model supported by Dune,
which is currently the \textit{Message Passing Interface} (MPI)
(distributed-memory approach), usually referred to as MPI parallelization.
It allows the user to run \Dumux applications in parallel on a desktop computer,
the user's laptop, or large high-performance clusters. However, the chosen \Dumux
model must support parallel computations. This is the case for most \Dumux
applications, except for the multidomain and free-flow models.
The main idea behind the MPI parallelization is the concept of \textit{domain
decomposition}. For parallel simulations, the computational domain is split into
subdomains and one process (\textit{rank}) is used to solve the local problem of each
subdomain. During the global solution process, some data exchange between the
ranks/subdomains is needed. MPI is used to send data to other ranks and to receive
data from other ranks. The domain decomposition in Dune is handled by the grid managers:
the grid is partitioned and distributed over several nodes. Most grid managers contain
their own domain decomposition methods to split the computational domain into subdomains.
Some grid managers also support external tools like METIS, ParMETIS, PTScotch or ZOLTAN
for partitioning.
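To illustrate the rank concept, the following minimal sketch (not part of the handbook text;
the file layout is only for illustration) shows how a typical main function initializes MPI
through Dune's helper class and queries its own rank:
\begin{lstlisting}[style=DumuxCode]
#include <iostream>
#include <dune/common/parallel/mpihelper.hh>

int main(int argc, char** argv)
{
    // initialize MPI (a no-op if the program is run sequentially)
    const auto& mpiHelper = Dune::MPIHelper::instance(argc, argv);

    // every process owns one subdomain and is identified by its rank
    std::cout << "Hello from rank " << mpiHelper.rank()
              << " of " << mpiHelper.size() << std::endl;

    return 0;
}
\end{lstlisting}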
On the other hand, linear algebra types such as matrices and vectors
do not know that they are in a parallel environment. Communication is then handled by the
components of the parallel solvers. Currently, the only parallel solver backend is
\texttt{Dumux::AMGBackend}, a parallel AMG-preconditioned BiCGSTAB solver.
In order for a \Dumux simulation to run in parallel, an MPI library implementation
(e.g. OpenMPI, MPICH or IntelMPI) must be installed on the system, and all \Dune
modules and \Dumux must be compiled with MPI support (i.e. recompiled after the
MPI library has been installed).
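As a minimal sketch of this step (assuming a Debian-based system and the usual
\texttt{dunecontrol} workflow; package names and the location of the options file
may differ on your system):
\begin{lstlisting}[style=Bash]
# install an MPI implementation, e.g. OpenMPI (package names are examples)
sudo apt install libopenmpi-dev openmpi-bin
# reconfigure and rebuild all Dune modules and DuMux so that MPI is detected
./dune-common/bin/dunecontrol --opts=dumux/cmake.opts all
\end{lstlisting}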
\subsection{Prepare a Parallel Application}
Not all parts of \Dumux can be used in parallel. One example are the linear solvers
of the sequential backend. However, with \texttt{Dumux::AMGBackend}, \Dumux provides
a parallel solver backend based on the algebraic multigrid (AMG) method. If an
application does not already use this backend, it must be switched to
\texttt{Dumux::AMGBackend} in order to run in parallel.
First, include the header file of the parallel AMG backend
\begin{lstlisting}[style=DumuxCode]
#include <dumux/linear/amgbackend.hh>
\end{lstlisting}
so that the parallel backend can be used. The header file of the sequential solver
backend, \texttt{dumux/linear/seqsolverbackend.hh}, can then be removed.
Second, the linear solver must be switched to the AMG backend. The parallel
\texttt{Dumux::AMGBackend} instance has to be constructed with a
\texttt{Dune::GridView} object and a mapper, in order to construct the
parallel index set needed for communication.
\begin{lstlisting}[style=DumuxCode]
using LinearSolver = Dumux::AMGBackend<TypeTag>;
auto linearSolver = std::make_shared<LinearSolver>(leafGridView, fvGridGeometry->dofMapper());
\end{lstlisting}
Afterwards, the application must be recompiled.
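The parallel backend is then used like any other linear solver, for example inside the
Newton solver. The following is only a sketch under the assumption of a typical \Dumux
main file in which an \texttt{assembler} and a solution vector \texttt{x} already exist:
\begin{lstlisting}[style=DumuxCode]
#include <dumux/nonlinear/newtonsolver.hh>

// couple the parallel linear solver with the Newton method
using NewtonSolver = Dumux::NewtonSolver<Assembler, LinearSolver>;
NewtonSolver nonLinearSolver(assembler, linearSolver);

// solve the system; the communication between the ranks is
// handled internally by the parallel solver backend
nonLinearSolver.solve(x);
\end{lstlisting}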
\subsection{Run a Parallel Application}
The starting procedure for parallel simulations depends on the chosen MPI library.
Most MPI implementations use the \textbf{mpirun} command
\begin{lstlisting}[style=Bash]
mpirun -np <n_cores> <executable_name>
\end{lstlisting}
where \textit{-np} sets the number of cores (\texttt{n\_cores}) that should be used for the
computation. On a cluster you usually have to use a queuing system (e.g. SLURM) to
submit a job. Check with your cluster administrator how to run parallel applications
on the cluster.
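As an illustration only, a minimal SLURM batch script could look as follows (the resource
requests, module name and input file are assumptions and depend on the specific cluster
and application):
\begin{lstlisting}[style=Bash]
#!/bin/bash
#SBATCH --job-name=dumux-parallel
#SBATCH --ntasks=32          # number of MPI ranks
#SBATCH --time=02:00:00      # wall-clock time limit

# load the cluster's MPI module (the name is an example)
module load openmpi

# start one process per requested task
srun ./<executable_name> params.input
\end{lstlisting}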
\subsection{Handling Parallel Results}
For most models, the results should not differ between parallel and serial
runs. However, parallel computations are not inherently deterministic.
A typical case where deterministic behavior cannot be assumed are models in which
small differences in the solution can cause large differences in the results
(e.g. some turbulent flow problems). Nevertheless, it is a useful default expectation that
the simulation results do not depend on the number of cores. If they do, you should double-check
whether the model is really non-deterministic. Typical reasons for erroneous non-deterministic
behavior are errors in the parallel computation of boundary conditions or missing/reduced
data exchange in higher-order gradient approximations. Also keep in mind that,
for iterative solvers, small differences in the solution can occur due to the error threshold.
For serial computations, \Dumux produces single vtu-files as the default output format.
During a simulation, one vtu-file is written for every output step.
In the parallel case, one vtu-file for each step and processor is created.
For parallel computations, an additional variable \texttt{"process rank"} is written
into the file. The process rank allows the user to inspect the subdomains
after the computation. The parallel vtu-files are combined in a single pvd file,
as in sequential simulations, which can be opened with e.g. ParaView.
\subsection{MPI scaling}
For parallel computations, the number of cores must be chosen
carefully. Using too many cores will not always lead to more performance, but
can instead lead to inefficiency. One reason is that for small subdomains, the
communication between the subdomains becomes the limiting factor for parallel computations.
The user should test the MPI scaling (the relation between the number of cores and the
computation time) for each specific application to ensure a fast and efficient use of the
given resources.
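A simple way to get a first impression of the scaling behavior is to run the same problem
with an increasing number of cores and compare the wall-clock times, e.g. with a small
shell loop (a sketch only; executable and input file names are placeholders):
\begin{lstlisting}[style=Bash]
# strong-scaling test: same problem, increasing number of cores
for n in 1 2 4 8 16; do
    echo "running with $n cores"
    # GNU time reports the elapsed wall-clock time in seconds
    /usr/bin/time -f "%e s" mpirun -np $n ./<executable_name> params.input > log_$n.txt
done
\end{lstlisting}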
Timo Koch (@timok) mentioned in commit a47c325a10eca6ea83d1b420584b0925f9a4de2f (6 years ago).
Timo Koch (@timok) mentioned in merge request !1428 (merged) (6 years ago).