This program was contributed by Magdalena Schreter-Fleischhacker and Peter Munch. Many ideas presented here are the result of common code development with Maximilian Bergbauer, Marco Feder, Niklas Fehn, Johannes Heinz, Luca Heltai, Martin Kronbichler, Nils Much, and Judith Pauen.
This tutorial is loosely based on chapter 3.4 of the submitted PhD thesis "Matrix-free finite element computations at extreme scale and for
challenging applications" by Peter Munch. Magdalena Schreter-Fleischhacker is funded by the Austrian Science Fund (FWF) Schrödinger Fellowship (J4577).
Note
If you use this program as a basis for your own work, please consider citing it in your list of references. The initial version of this work was contributed to the deal.II project by the authors listed in the following citation:
Introduction
This tutorial presents the advanced point-evaluation functionalities of deal.II, specifically useful for evaluating finite element solutions at arbitrary points. The underlying finite element mesh can be distributed among processes, which makes the operations more involved due to communication. In the examples discussed in this tutorial, we focus on point evaluation for MPI-parallel computations, like parallel::distributed::Triangulation. Nevertheless, the application to non-distributed meshes is also possible.
Point evaluation
In the context of the finite element method (FEM), it is a common task to query the solution at an arbitrary point in the domain of interest
by evaluating the shape functions at this point together with the corresponding solution coefficients . After identifying the cell where the point resides, the transformation between and the corresponding coordinates in the reference cell is obtained by the mapping . In this setting, the evaluation of the solution at an arbitrary point boils down to a cell-local evaluation
with being the shape functions defined on the reference cell and the solution coefficients restricted to the cell .
Alternatively to point evaluation, evaluating weak-form (integration) operations of the type
is possible, with being quadrature points at arbitrary positions. After the values at the quadrature points have been multiplied by the integration weights, this operation can be interpreted as the transpose of the evaluation. Not surprisingly, such an operation can be also implemented as a cell loop.
Setup and communication
To determine the cell and the reference position within the cell for a given point on distributed meshes, deal.II performs a two-level-search approach. First, all processes whose portion of the global mesh might contain the point are determined ("coarse search"). For this purpose, e.g., a distributed tree based on bounding boxes around locally owned domains using "ArborX" [lebrun2020arborx] is applied. After the potentially owning processes have been determined and the points have been sent to them as a request, one can start to find the cells that surround the points among locally owned cells ("fine search"). In order to accelerate this search, an R-tree from "boost::geometry" built around the vertices of the mesh is used.
Once the cell that surrounds point has been found, the reference position is obtained by performing the minimization:
With the determined pieces of information, the desired evaluation can be performed by the process that owns the cell. The result can now be communicated to the requesting process.
In summary, the coarse search determines, for each point, a list of processes that might own it. The subsequent fine search by each process determines whether the processes actually own these points by the sequence of request ("Does the process own the point?") and answer ("Yes."/"No."). Processes might post any number of point requests and communicate with any process. We propose to collect the point requests to a process to use the dynamic, sparse, scalable consensus-based communication algorithms [hoefler2010scalable], and to consider the obtained information to set up point-to-point communication patterns.
The algorithm described above is implemented in Utilities::MPI::RemotePointEvaluation (short: "rpe") and related classes/functions. In this section, basic functionalities are briefly summarized. Their advanced capabilities will be shown subsequently based on concrete application cases.
The following code snippet shows the setup steps for the communication pattern:
std::vector<Point<dim>> points; // ... (filling of points not shown)
evaluate the values and gradients of a solution defined by DoFHandler and a vector at the requested points. Internally, a lambda function is passed to Utilities::MPI::RemotePointEvaluation. Additionally it handles the special case if points belong to multiple cells by taking, e.g., the average, the minimum, or the maximum via an optional argument of type EvaluationFlags::EvaluationFlags. This occurs when a point lies on a cell boundary or within a small tolerance around it and might be relevant for discontinuous solution quantities, such as values of discontinuous Galerkin methods or gradients in continuous finite element methods.
Motivation: two-phase flow
The minimal code examples (short "mini examples") presented in this tutorial are motivated by the application of two-phase-flow simulations formulated in a one-fluid setting using a Eulerian framework. In diffuse interface methods, the two phases may be implicitly described by a level-set function, here chosen as a signed distance function in and illustrated for a popular benchmark case of a rising bubble in the following figure.
The discrete interface is represented implicitly through a certain isosurface of the level-set function e.g. for the signed-distance function . We would like to note that deal.II provides a set of analytical signed distance functions for simple geometries in the Functions::SignedDistance namespace. Those can be combined via Boolean operations to describe more complex geometries [burman2015cutfem]. The temporal evolution of the level-set field is obtained by the transport equation
with the transport velocity at the interface , which might be approximated by the local fluid velocity . To reobtain the signed-distance property of the level-set field throughout the numerical solution procedure, PDE-based or, alternatively, also geometric reinitialization methods are used. For the latter, an algorithm for computing the distance from the support points to the discrete interface, e.g., via closest-point point projection [henri2022geometrical], is needed. This will be part of one of the mini examples, where we describe how to obtain the closest point to the interface for an arbitrary point . For the simplest case, the former can be computed from the following equation
assuming that the interface normal vector and represent exact quantities. Typically, this projection is only performed for points located within a narrow band region around the interface, indicated in the right panel of the figure above.
Alternatively to the implicit representation of the interface, in sharp interface methods, e.g., via front tracking, the interface is explicitly represented by a surface mesh. The latter is immersed into a background mesh, from which the local velocity at the support points of the surface mesh is extracted and leads to a movement of the support points of the immersed mesh as
which considers an explicit Euler time integration scheme from time step to with time step-size .
For a two-phase-flow model considering the incompressible Navier-Stokes equations, the two phases are usually coupled by a singular surface-tension force , which results, together with the difference in fluid properties, in discontinuities across the interface:
Here represents the surface-tension coefficient, the interface normal vector and the interface mean curvature field. The singularity at the interface is imposed by the Dirac delta function
with support on the interface . In a finite element context, the weak form of the surface-tension force is needed. The latter can be applied as a sharp surface-tension force model
exploiting the property of the Dirac delta function for any smooth function , i.e., . For front-tracking methods, the curvature and the normal vector are directly computed from the surface mesh.
Alternatively, in regularized surface-tension-force models [brackbill1992continuum][olsson2005conservative][kronbichler2018fast], the Dirac delta function is approximated by a smooth ansatz
considering the absolute value of the gradient of a regularized indicator function , which is related to the level-set field. In such models, the interface normal vector
and the interface curvature field
are derived from the level-set function.
Overview
In the following, we present three simple use cases of Utilities::MPI::RemotePointEvaluation. We start with discussing a serial code in mini example 0. In the subsequent mini examples, advanced problems are solved on distributed meshes:
mini example 1: we evaluate values and user quantities along a line;
mini example 2: we perform a closest-point projection within a narrow band, based on a level-set function, use the information to update the distance and to perform an extrapolation from the interface;
mini example 3: we compute the surface-tension term sharply with the interface given by an codim-1 mesh, which is advected by the velocity from the background mesh (front tracking; solution transfer between a background mesh and an immersed surface mesh).
The commented program
Include files
The program starts with including all the relevant header files.
The following header file provides the class FEPointEvaluation, which allows us to evaluate values of a local solution vector at arbitrary unit points of a cell.
In the following, we declare utility functions that are used in the mini examples below. You find the definitions at the end of the tutorial.
The minimum requirement of this tutorial is MPI. If deal.II is built with p4est, we use parallel::distributed::Triangulation as distributed mesh. The class parallel::shared::Triangulation is used if deal.II is built without p4est or if the dimension of the triangulation is 1D, e.g., in the case of codim-1 meshes.
#ifdef DEAL_II_WITH_P4EST
template <int dim, int spacedim = dim>
using DistributedTriangulation = typename std::conditional_t<
A given list of points and the corresponding values values_0 and values_1 (optional) are printed column-wise to a file file_name. In addition, the first column represents the distance of the points from the first point.
From the provided Mappingmapping and the DoFHandlerdof_handler collect the global DoF indices and corresponding support points within a narrow band around the zero-level-set isosurface. Thereto, the value of the finite element function signed_distance corresponding to the DoFHandlerdof_handler_support_points is evaluated at each support point. A support point is only collected if the absolute value is below the value for the narrow_band_threshold.
Mini example 0: Evaluation at given points for a serial mesh
In this introductory example, we demonstrate basic functionalities available in deal.II to evaluate solution quantities at arbitrary points on a serial mesh. The same functionalities are used directly or indirecly in the distributed case to evaluate solution on locally owned cells. This, however, needs to be augmented by communication, as presented in following examples.
We first create the typical objects needed for a finite element discretization (defined by mapping, triangulation, and finite element) and a vector containing finite element solution coefficients.
Now, we loop over all evaluation points. In the first step, we determine via GridTools::find_active_cell_around_point() the cell that surrounds the point and translate the given real coordinate to the corresponding coordinate on the unit cell according to the provided mapping. The resulting information is printed to the screen.
Having determined and , we can perform the evaluation of the finite element solution at this point. In the following, we show three approaches for this purpose. In the first approach, we follow a traditional technique by using FEValues based on a cell-specific quadrature rule consisting of the unit point.
The second approach considers FEPointEvaluation, which directly takes a list of unit points for the subsequent evaluation. The class FEPointEvaluation is a class optimized for the evaluation on cell level at arbitrary points and should be favored for such tasks.
Finally, in the third approach, the function VectorTools::point_value() is considered. It performs both the search of the surrounding cell and the evaluation at the requested point. However, its application is limited to a serial run of the code.
Obviously, the code above cannot work for distributed meshes, since the search (which might require communication) is called within a for-loop with loop bounds possibly different on each process. In the following code examples, we present the usage of arbitrary point evaluation in a parallel computation.
Mini example 1: Evaluation at given points on a distributed mesh
Just like in the introductory example, we evaluate the solution along a line, however, on a distributed mesh. We again start with setting up the objects needed for a finite element discretization.
We determine a finite element solution representing implicitly the geometry of a sphere with a radius of and the center at via a signed distance function.
static constexpr Point< dim, Number > unit_vector(const unsigned int i)
We create a list of arbitrary (evaluation) points along a horizontal line, which intersects the center of the sphere. We do this only on the root rank, since we intend to output the results to a CSV file by the root rank.
Now, we can evaluate the results, e.g., for the signed distance function at all evaluation points in one go. First, we create a modifiable Utilities::MPI::RemotePointEvaluation object. We use VectorTools::point_values() by specifying the number of components of the solution vector (1 for the present example) as a template parameter. Within this function, the provided object for Utilities::MPI::RemotePointEvaluation is automatically reinitialized with the given points (profile). The ghost values of the solution vector need to be updated from the user.
for (unsignedint q = 0; q < unit_points.size(); ++q)
local_value[q] = fe_point.get_value(q);
}
};
The lambda function is passed to Utilities::MPI::RemotePointEvaluation::evaluate_and_process(), where the values are processed accordingly and stored within the created output vector. Again, the ghost values of the vector to be read need to be updated by the user.
Finally, we output all results: the mesh as a VTU file and the results along the line as a CSV file. You can import the CSV file in ParaView and compare the output with the native line plot of ParaView based on the VTU file.
Mini example 2: Closest-point evaluation on a distributed mesh
In this mini example, we perform a closest-point projection for each support point of a mesh within a narrow band by iteratively solving for
Once the closest point is determined, we can compute the distance and extrapolate the values from the interface. Note that the demonstrated algorithm does not guarantee that the closest points are collinear (see discussion in [coquerelle2016fourth]). For the latter, one might also need to perform a tangential correction, which we omit here to keep the discussion concise.
We start with creating the objects for the finite element representation of the background mesh.
We compute finite element solution vector, based on an arbitrary function. In addition, a finite element function computed from a signed distance function represents the geometry of a sphere implicitly.
In the next step, we collect the points in the narrow band around the zero-level-set isosurface for which we would like to perform a closest point projection. To this end, we loop over all support points and collect the coordinates and the DoF indices of those with a maximum distance of 0.1 from the zero-level-set isosurface.
For the iterative solution procedure of the closest-point projection, the maximum number of iterations and the tolerance for the maximum absolute acceptable change in the distance in one iteration are set.
pcout << " - determine closest point iteratively" << std::endl;
constexprint max_iter = 30;
constexprdouble tol_distance = 1e-6;
Now, we are ready to perform the algorithm by setting an initial guess for the projection points simply corresponding to the collected support points. We collect the global indices of the support points and the total number of points that need to be processed and do not fulfill the required tolerance. Those will be gradually reduced upon the iterative process.
T sum(const T &t, const MPI_Comm mpi_communicator)
Now, we create a Utilities::MPI::RemotePointEvaluation cache object and start the loop for the fix-point iteration. We update the list of points that still need to be processed and subsequently pass this information to the Utilities::MPI::RemotePointEvaluation object. For the sake of illustration, we export the coordinates of the points to be updated for each iteration to a CSV file. Next, we can evaluate the signed distance function and the gradient at those points to update the current solution for the closest points. We perform the update only if the signed distance of the closest point is not already within the tolerance and register those points that still need to be processed.
We print a warning message if we exceed the maximum number of allowed iterations and if there are still projection points with a distance value exceeding the tolerance.
if (n_unmatched_points > 0)
pcout << "WARNING: The tolerance of " << n_unmatched_points
<< " points is not yet attained." << std::endl;
As a result, we obtain a list of support points and corresponding closest points at the zero-isosurface level set. This information can be used to update the signed distance function, i.e., the reinitialization the values of the level-set function to maintain the signed distance property [henri2022geometrical].
for (unsignedint i = 0; i < closest_points.size(); ++i)
solution_distance[support_points_idx[i]] =
support_points[i].distance(closest_points[i]);
In addition, we use the information of the closest point to extrapolate values from the interface, i.e., the zero-level set isosurface, to the support points within the narrow band. This might be helpful to improve accuracy, e.g., for diffuse interface fluxes where certain quantities are only accurately determined at the interface (e.g. curvature for surface tension [coquerelle2016fourth]).
Mini example 3: Sharp interface method on the example of surface tension for front tracking
The final mini example presents a basic implementation of front tracking [peskin1977numerical], [unverdi1992front] of a surface mesh immersed in a Eulerian background fluid mesh .
We assume that the immersed surface is transported according to a prescribed velocity field from the background mesh. Subsequently, we perform a sharp computation of the surface-tension force:
We decompose this operation into two steps. In the first step, we evaluate the force contributions at the quadrature points defined on the immersed mesh and multiply them with the mapped quadrature weight :
In the second step, we compute the discretized weak form by multiplying with test functions on the background mesh:
We start with setting the parameters consisting of the polynomial degree of the shape functions, the dimension of the background mesh, the time-step size to be considered for transporting the surface mesh and the number of time steps.
void example_3()
{
constexprunsignedint degree = 3;
constexprunsignedint dim = 2;
constdouble dt = 0.01;
constunsignedint n_time_steps = 200;
This program is intended to be executed in 2D or 3D.
static_assert(dim == 2 || dim == 3, "Only implemented for 2D or 3D.");
and, similarly, for the immersed surface mesh. We use a sphere with radius which is placed in the center of the top half of the cubic background domain.
Two different mappings are considered for the immersed surface mesh: one valid for the initial configuration and one that is updated in every time step according to the nodal displacements. Two types of finite elements are used to represent scalar and vector-valued DoF values.
We fill a DoF vector on the background mesh with an analytical velocity field considering the Rayleigh-Kothe vortex flow and initialize a DoF vector for the weak form of the surface-tension force.
We initialize a Utilities::MPI::RemotePointEvaluation object and start the time loop. For any other step than the initial one, we first move the support points of the surface mesh according to the fluid velocity of the background mesh. Thereto, we first update the time of the velocity function. Then, we update the internal data structures of the Utilities::MPI::RemotePointEvaluation object with the collected support points of the immersed mesh. We throw an exception if one of the points cannot be found within the domain of the background mesh. Next, we evaluate the velocity at the surface-mesh support points and compute the resulting update of the coordinates. Finally, we update the mapping of the immersed surface mesh to the current position.
Next, we loop over all locally owned cells of the immersed mesh and its quadrature points to compute the value for the local surface tension force contribution . We store the real coordinates of the quadrature points and the corresponding force contributions in two individual vectors. For computation of the latter, the normal vector can be directly extracted from the surface mesh via FEValues and, for the curvature, we use the following approximation:
which we can apply since the immersed mesh is consistently orientated. The surface tension coefficient is set to 1 for the sake of demonstration.
pcout << " - compute to be tested values (immersed mesh)" << std::endl;
Before we evaluate the weak form of the surface-tension force, the communication pattern of Utilities::MPI::RemotePointEvaluation is set up from the quadrature points of the immersed mesh, determining the surrounding cells on the background mesh.
pcout << " - test values (background mesh)" << std::endl;
In preparation for utilizing Utilities::MPI::RemotePointEvaluation::process_and_evaluate that performs the multiplication with the test function, we set up a callback function that contains the operation on the intersected cells of the background mesh. Within this function, we initialize a FEPointEvaluation object that allows us to integrate values at arbitrary points within a cell. We loop over the cells that surround quadrature points of the immersed mesh – provided by the callback function. From the provided CellData object, we retrieve the unit points, i.e., the quadrature points of the immersed mesh that lie within the current cell and a pointer to the stored values on the current cell (local surface-tension force) for convenience. We reinitialize the data structure of FEPointEvaluation on every cell according to the unit points. Next, we loop over the quadrature points attributed to the cell and submit the local surface-tension force to the FEPointEvaluation object. Via FEPointEvaluation::test_and_sum(), the submitted values are multiplied by the values of the test function and a summation over all given points is performed. Subsequently, the contributions are assembled into the global vector containing the weak form of the surface-tension force.
The callback function is passed together with the vector holding the surface-tension force contribution at each quadrature point of the immersed mesh to Utilities::MPI::RemotePointEvaluation::process_and_evaluate. The only missing step is to compress the distributed force vector.
After every tenth step or at the beginning/end of the time loop, we output the force vector and the velocity of the background mesh to a VTU file. In addition, we also export the geometry of the (deformed) immersed surface mesh to a separate VTU file.
We present a part of the terminal output. It shows, for each point, the determined cell and reference position. Also, one can see that the values evaluated with FEValues, FEPointEvaluation, and VectorTools::point_values() are identical, as expected.
Running: example 0
- Found point with real coordinates: 0 0.5
- in cell with vertices: (0 0.4) (0.2 0.4) (0 0.6) (0.2 0.6)
- with coordinates on the unit cell: (0 0.5)
- Values at point:
- 0.25002 (w. FEValues)
- 0.25002 (w. FEPointEvaluation)
- 0.25002 (w. VectorTools::point_value())
- Found point with real coordinates: 0.05 0.5
- in cell with vertices: (0 0.4) (0.2 0.4) (0 0.6) (0.2 0.6)
- with coordinates on the unit cell: (0.25 0.5)
- Values at point:
- 0.20003 (w. FEValues)
- 0.20003 (w. FEPointEvaluation)
- 0.20003 (w. VectorTools::point_value())
...
- Found point with real coordinates: 1 0.5
- in cell with vertices: (0.8 0.4) (1 0.4) (0.8 0.6) (1 0.6)
- with coordinates on the unit cell: (1 0.5)
- Values at point:
- 0.25002 (w. FEValues)
- 0.25002 (w. FEPointEvaluation)
- 0.25002 (w. VectorTools::point_value())
- writing csv file
The CSV output is as follows and contains, in the first column, the distances with respect to the first point, the second and the third column represent the coordinates of the points and the fourth column the evaluated solution values at those points.
The CSV output is as follows and identical to the results of the serial case presented in mini example 0. The fifth column shows the user quantity evaluated additionally in this mini example.
Running: example 2
- create system
- determine narrow band
- determine closest point iteratively
- iteration 0: 7076 -> 7076
- iteration 1: 7076 -> 104
- iteration 2: 104 -> 0
- determine distance in narrow band
- perform extrapolation in narrow band
- output results
The following three plots, representing the performed iterations of the closest-point projection, show the current position of the closest points exceeding the required tolerance of the discrete interface of the circle and still need to be corrected. It can be seen that the numbers of points to be processed decrease from iteration to iteration.
The output visualized in Paraview looks like the following: On the left, the original distance function is shown as the light gray surface. In addition, the contour values refer to the distance values determined from calculation of the distance to the closest points at the interface in the narrow band. It can be seen that the two functions coincide. Similarly, on the right, the original solution and the extrapolated solution from the interface is shown.
Mini example 3
We show a shortened version of the terminal output.
Running: example 3
- creating background mesh
- creating immersed mesh
time: 0
- compute to be tested values (immersed mesh)
- test values (background mesh)
- write data (background mesh)
- write mesh (immersed mesh)
time: 0.01
- move support points (immersed mesh)
- compute to be tested values (immersed mesh)
- test values (background mesh)
time: 0.02
- move support points (immersed mesh)
- compute to be tested values (immersed mesh)
- test values (background mesh)
...
time: 2
- move support points (immersed mesh)
- compute to be tested values (immersed mesh)
- test values (background mesh)
- write data (background mesh)
- write mesh (immersed mesh)
The output visualized in Paraview looks like the following: The deformation of the immersed mesh by the reversible vortex flow can be seen. Due to discretization errors, the shape is not exactly circular at the end, illustrated in the right figure. The sharp nature of the surface-tension force vector, shown as vector plots, can be seen by its restriction to cells that are intersected by the immersed mesh.
Possibilities for extension
This program highlights some of the main capabilities of the distributed evaluation routines in deal.II. However, there are many related topics worth mentioning:
Performing a distributed search is an expensive step. That is why we suggest to provide hints to Utilities::MPI::RemotePointEvaluation and to reuse Utilities::MPI::RemotePointEvaluation instances in the case that the communication pattern has not changed. Furthermore, there are instances where no search is needed and the points are already sorted into the right cells. This is the case if the points are generated on the cell level (see step-85; CutFEM) or the points are automatically sorted into the correct (neighboring) cell (see step-68; PIC with Particles::ParticleHandler). Having said that, the Particles::ParticleHandler::insert_global_particles() function uses the described infrastructure to perform the initial sorting of particles into cells.
We concentrated on parallelization aspects in this tutorial. However, we would like to point out the need for fast evaluation on cell level. The task for this in deal.II is FEPointEvaluation. It exploits the structure of
to derive fast implementations, e.g., for tensor-product elements
Since only 1D shape functions are queried and are re-used in re-occurring terms, this formulation is faster than without exploitation of the structure.
Utilities::MPI::RemotePointEvaluation is used in multiple places in deal.II. The class DataOutResample allows to output results on a different mesh than the computational mesh. This is useful if one wants to output the results on a coarser mesh or one does not want to output 3D results but instead 2D slices. In addition, MGTwoLevelTransferNonNested allows to prolongate solutions and restrict residuals between two independent meshes. By passing a sequence of such two-level transfer operators to MGTransferMF and, finally, to Multigrid, non-nested multigrid can be computed.
Utilities::MPI::RemotePointEvaluation can be used to couple non-matching grids via surfaces (example: fluid-structure interaction, independently created grids). The evaluation points can come from any side (pointwise interpolation) or are defined on intersected meshes (Nitsche-type mortaring [heinz2022high]). Concerning the creation of such intersected meshes and the corresponding evaluation points, see GridTools::internal::distributed_compute_intersection_locations().
Alternatively to the coupling via Utilities::MPI::RemotePointEvaluation, preCICE [bungartz2016precice][chourdakis2021precice] can be used. The code-gallery program "Laplace equation coupled to an external simulation
program" shows how to use this library with deal.II.