Posts Tagged ‘fem’

Understanding mass matrix lumping in terms of functions spaces

Monday, November 14th, 2016

$$\newcommand{\u}{\mathbf{u}}$$ $$\newcommand{\f}{\mathbf{f}}$$

Mass matrix lumping is the process of using a diagonal mass matrix instead of the “full” mass matrix resulting from Galerkin Finite Element Method. Lumping is often recommended because a diagonal mass matrix easy to invert (just invert the entries along the diagonal). As the name implies, diagonalizing the mass matrix is often justified by arguing rather hand-wavily that we’re just “lumping all of the mass or integration” at the vertices of the mesh rather than integrating over the elements. This sounds intuitive but it leaves me wanting a more formal justification.

Here’s an attempt to justify mass lumping in terms of mixed Finite Element approach to discretizing problems of the form: minimize,

$$E(u) = \frac{1}{2} \int_\Omega (u-f)^2 + \nabla u \cdot \nabla u\ dA$$

If we discretize this irreducible form directly using, for example, piecewise-linear basis functions over a mesh (e.g., \(u \approx \sum_i \phi_i u_i\), and we use \(\u\) to also mean the vector of coefficients \(u_i\)), then we get:

$$E(\u) = \frac{1}{2} (\u-\f)^TM(\u-\f) + \frac{1}{2} \u^TK\u$$

where we call \(M\) the mass matrix so that \(M_{ij} = \int_\Omega \phi_i \phi_j\ dA\). Then when we “mass lump” we are replacing \(M\) with a diagonal matrix \(D\) such that \(D_{ii} = \sum_j M_{ij}\) or \(D_{ii} = \) the area associated with vertex \(i\). This “associated area” might be the Voronoi area.

In physical dynamics, if \(\u=\u^t\) are displacements of some dynamic system and \(\f=\u^{t-\Delta t}\) are the mesh displacements of the previous time step then this gives us a physical formula for kinetic and potential energy:

$$E(\u^t) = \frac{1}{2 \Delta t^2} (\u^t-\u^{t-\Delta t})^T M (\u^t-\u^{t-\Delta t}) + \frac{1}{2} {\u^t}^TK\u^t$$

where the stiffness matrix \(K\) now might measure, for example, linear elasticity (I’m mostly going to ignore what’s going on with \(K\)).

When we write this physical energy in the continuous form we’ll introduce a variable to keep track of velocities: \(\dot{u} = \partial u / \partial t\),

$$E(u,\dot{u}) = \frac{1}{2} \int_\Omega \dot{u}^2 + (\nabla u+\nabla u^T) \cdot (\nabla u + \nabla u^T)\ dA$$

$$\dot{u} = \partial u / \partial t$$

Now we have the potential energy in terms of velocity \(\dot{u}\) and the kinetic energy in terms of displacements \(u\). This is a very natural mixed form. If we discretize both \(u\) and \(\dot{u}\) using piecewise-linear basis functions over the same mesh then we’ll have done a lot of re-shuffling to get the identical discretization above. In particular, we’ll just get the same non-diagonal mass matrix \(M\).

Instead, we’ll discretize only displacements \(u\) using piecewise-linear basis functions, and for velocities \(\dot{u}\) we’ll use piecewise-constant basis functions defined on the dual mesh. A standard choice for the dual mesh is the Voronoi dual mesh, where each zero dimensional vertex becomes zero-codimensional “cell” whose area/volume is the Voronoi area/volume of that vertex as a seed amongst all other vertices as fellow seeds. One can construct this dual mesh by connected circumcenters of all of the triangles in the primary mesh. An alternative dual mesh, we’ll call the barycentric dual, will connect all barycenters of the triangles in the primary mesh. All this to say that we approximate the velocities as sum over piecewise constant basis functions located at primary mesh vertices (aka dual mesh cells): \(\dot{u}^t \approx \sum_i \psi_i \dot{u}^t_i\). The basis functions \(\psi_i\) and \(\phi_i\) are in correspondence so that \(\dot{\u}^t\) collects elements \(\dot{u}^t_i\) in the same number and order as \(\u\).

Now we can discretize the energy as:

$$E(\dot{\u}^t,\u^t) = \frac{1}{2} \dot{\u}^T D \dot{\u} + \frac{1}{2} {\u^t}^T K \u^t,$$

and voilà the diagonal mass matrix $D$ appears, where

$$D_{ij} = \int_\Omega \psi_i \psi_j\ dA = \begin{cases} \text{area}_i & i=j \ 0 & i\neq j \end{cases}$$

As a side note, if we use the barycentric dual mesh then areai \(= \sum_jM_{ij}\) , if we use the Voronoi dual mesh then \(\text{area}_i\) is the Voronoi area which may become negative for non-Delaunay meshes (and we might want to use the “fix” described in “Discrete Differential-Geometry Operators for Triangulated 2-Manifolds” [Mark Meyer et al. 2002]).

Our work is not quite done since we haven’t coupled our discretization of velocity and displacements together. We haven’t discretized the constraint: \(\dot{u} = \partial u / \partial t\). We could just quietly say that they “need to agree” at mesh vertices so that \(\dot{u}^t_i = (u^t_i – u_i^{t – \Delta t})/\Delta t\). If we do then we can immediately substitute-out all of the velocity variables and arrive at an “flattened” form of the energy only in terms of displacements $\u$:

$$E(\u^t) = \frac{1}{2 \Delta t^2} (\u^t-\u^{t-\Delta t})^T D (\u^t-\u^{t-\Delta t}) + \frac{1}{2} {\u^t}^T K \u^t.$$

Deeper down the rabbit hole: This is a bit unsettling because it seems like we just pushed the “lumping magic” deeper into the constraints between velocities and displacements. Indeed, if we try to enforce this constraint by the Lagrange multiplier method:

$$L(u,\dot{u},\lambda) = E(u,\dot{u}) + \frac{1}{2} \int_\Omega \lambda \left(\dot{u} – \frac{\partial u}{\partial t}\right) \ dA$$

(which admittedly reads a bit like nonsense because before the only difference between \(\dot{u}\) and \(\partial u / \partial t\) was notation, but think of \(\dot{u}\) as a first-class variable function).

then we have to pick a discretization for the Lagrange multiplier function \(\lambda\). Surprisingly, if we use either piecewise-linear on the primary mesh or piecewise-constant basis functions on the dual mesh then we’ll get a non-diagonal matrix coupling \(\dot{u}\) and \(\u\) which means that we can’t simply substitute-out $\dot{\u}$.

To tie things up, I found one (potentially dubious) discretization of \(\lambda\) that makes things work: Dirac basis functions

$$\delta_i(x) = \int_\Omega \delta(x-x_i) \ dA$$

Where \(\delta(x)\) is Dirac’s delta function so that \(\delta(x) = \begin{cases} \infty & x=0 \ 1 & x\neq 0 \end{cases}\) and \(\int \delta(x) \ dA = 1\).

Assuming I have permission for the Discretization God’s, then I let \(\lambda \approx \sum_i \delta_i \lambda_i\) where (by abuse of notation/LaTeX’s inability to typeset bold Greek letters) \(\lambda\) is a vector of coefficients \(\lambda_i\) in corresponding number and order with \(\u\) and \(\dot{\u}\).

Since (crossing fingers) \(\int_\Omega \delta_i \psi_j \ dA = \begin{cases} 1 & i=j \ 0 & i\neq j\end{cases}\) and \(\int_\Omega \delta_i \phi_j \ dA = \begin{cases} 1 & i=j \ 0 & i\neq j\end{cases}\). Our discretized Lagrangian is:

$$L(\u^t,\dot{\u}^t,\lambda) = \frac{1}{2} \dot{\u}^T D \dot{\u} + \frac{1}{2} {\u^t}^T K \u^t + \frac{1}{2} \lambda I \left(\u^t – \frac{\u^t – \u^{t-\Delta t}}{\Delta t} \right),$$

where \(I\) truly madly deeply is the identity matrix. This implies that a direct substitution of \(\dot{\u}^t\) with \((\u^t – \u^{t-\Delta t})/\Delta t\) is valid.

Update: Some papers making similar claims:

“A note on mass lumping and related processes” [Hinton et al. 1976]
(“Analysis Of Structural Vibration And Response” [Clough 1971] (cited in [Hinton et al. 1976] but I haven’t found a copy yet)

How does Galerkin multigrid scale for irregular grids?

Tuesday, August 4th, 2015

There are different flavors of the multigrid method. Let’s consider the geometric subclass and within that the “Galerkin” or projection-style multigrid method. This method is favorable because it only requires discretization at the finest level and prolongation and restriction between levels. This means, given a problem A1 x1 = B1 on the finest mesh, we don’t need to know how to create an analogous problem A2 x2 = B2 on the second finest mesh and so on. Instead, we define A2 = R A1 P where R is the #coarse vertices by #fine vertices restriction operator taking fine mesh values to coarse mesh values and P is the #fine vertices by #coarse vertices prolongation operator. A Galerkin multigrid V-cycle looks something like this:

  • relax current guess x1 according to A1 x1 = B1
  • compute residual r1 = B1 - A1 x1
  • restrict residual to coarse mesh r2 = R r1
  • restrict system matrix A2 = R A1 P
  • solve (recursively) for update A2 u2 = r2
  • prolong update and add to guess u1 = P u2, x1 += u1
  • relax current guess x1 according to A1 x1 = B1

Often (for good reasons), we take the restriction operator to be the transpose of the prolongation operator R = P'. When we’re working with nested meshes, it’s natural to use the linear interpolation operator as P.

This V-cycle if A1 x1 = B1 is coming from a quadratic energy minimization problem:

min  ½ x1' A1 x1 - x1' B1
 x1

Suppose we freeze our current guess x1 and look for the best update u1, we’d change the problem to

min  ½ (x1 + u1)' A1 (x1 + u1) - (x1+u1)' B1
 u1

Rearranging terms according to u1 this is

min  ½ u1' A1 u1 - u1' (B1 - A1 x1) + constants
 u1

Now we recognize our residual r1 = B1 - A1 x1. Let’s substitute that in:

min  ½ u1' A1 u1 - u1' r1
 u1

If we were to optimize this for u1 we’d get a system of equations:

A1 u1 = r1

Instead, let’s make use of our multires hierarchy. The coarse mesh spans a smaller space of functions that the finer mesh. We can take any function living on the coarse mesh u2 and interpolate it to give us a function on the fine mesh u1 = P u2, thus we could consider restricting the optimization search space of u1 functions to those we can create via P u2:

min  ½ (P u2)' A1 (P u2) - (P u2)' r1
 u2

rearranging terms and substituting R = P' we get:

min  ½ u2' R A1 P u2 - u2' R r1
 u2

finally optimizing this for u2 we get a system of equations:

R A1 P u2 = R r1

or

A2 u2 = r2

This Galerkin-style multigrid is great because we don’t need to worry about re-discretizing the system matrix (e.g. using FEM on the coarse mesh) or worry about redefining boundary conditions on the coarse mesh. The projection takes care of everything.

But there’s a catch.

For regular grids with carefully chosen decimation patterns (#vertices on a side is a power of 2 plus 1) then the projection matrix will be well behaved. Each row corresponding to a fine mesh vertex will at most 1 vertex if perfectly align atop a coarse mesh vertex and will involve the nearest 2 vertices on the coarse mesh if lying on a coarse mesh edge. Then examining a row of A2 = P' A1 P we can read off the change in the stencil. Let’s say A1 is a one-ring Laplacian on the fine mesh. Hitting it on the left with P' will form the matrix P' A1, a row of which corresponds to a coarse mesh vertex and the columns to the fine mesh vertices it “touches”. Since coarse vertices lie exactly at fine mesh vertices, this will simply connect each coarse mesh to its “half-radius-one-ring”, i.e. the fine mesh vertices lying on coarse mesh edges around it. Finally hitting this matrix with P on the right creates A2 = P' A1 P connecting coarse to coarse vertices. Since each coarse vertex was connected to its “half-radius-one-ring” via the fine mesh, now in P' A P` all coarse mesh vertices are connected to its true one-ring neighbors: just like if we built the Laplacian directly on the coarse mesh. The sparsity pattern is maintained.

For irregular grids this is unfortunately not the case. Each row in P will contain, in general, d+1 entries. When we examine P' A1, we see that we smear the stencil around the coarse mesh to include the “half-radius-one-rings” from d+1 nearest fine-mesh vertices. This effect is repeated when finally creating P' A1 P. We see the sparsity worsening with each level.

Here’s a plot of the number non-zeros per row of the projected system matrix for a multires hierarchy with 7 levels.

galerkin multigrid number of non-zeros per row irregular

The orange lines shows the average number of non-zeros per row for the cotangent Laplacian projected down different sequences of random irregular meshes of the unit square. You can see a rough 2x increase in the number of non-zeros in the beginning and then things taper-off as the meshes get very coarse: meshes range from roughly 2562 to 42. The blue lines show the average number of non-zeros per row for the cotangent Laplacian rediscretized (aka rebuilt from scratch) on each mesh: this is and should be roughly 7 for any triangle mesh (the slight decrease is due to the increase effect of the boundary on the average).

multigrid irregular meshes

This doesn’t look so bad. Even for the third level (562 vertices) the number of non-zeros per row is 25: still _very sparse).

Things get worse, much worse in 3D.

Already the average number of non-zeros per row of a the 3D tet-mesh Laplacian is not neatly bounded to 7, though it tends to be around 16 for a “reasonable mesh”.
multigrid 3D irregular grids

In this plot we see the number of non-zeros in the projected system matrix increasing by around a factor of 3 each level, peaking around 112 on the 173 mesh. Notice on the 93=729 vertex grid the average number of non-zeros is 100. That’s a sparsity ratio of 13% or 729,000 non-zeros: not very sparse for such a small mesh. Especially considering that the rediscretization system matrix would have 12 non-zeros per row, a sparsity ratio of 1.6%, or only 8,700 total non-zeros.

multigrid number of non-zeros per row

I didn’t have the patience to let this run on a larger initial fine-mesh, but you can imagine that the situation gets drastically worse. Galerkin multigrid fails to improve performance and gets choked by memory consumption on irregular 3D tetrahedral meshes using piecewise linear prolongation operators.

$E = mc^2$