Posts Tagged ‘linear’

Understanding mass matrix lumping in terms of functions spaces

Monday, November 14th, 2016

$$\newcommand{\u}{\mathbf{u}}$$ $$\newcommand{\f}{\mathbf{f}}$$

Mass matrix lumping is the process of using a diagonal mass matrix instead of the “full” mass matrix resulting from Galerkin Finite Element Method. Lumping is often recommended because a diagonal mass matrix easy to invert (just invert the entries along the diagonal). As the name implies, diagonalizing the mass matrix is often justified by arguing rather hand-wavily that we’re just “lumping all of the mass or integration” at the vertices of the mesh rather than integrating over the elements. This sounds intuitive but it leaves me wanting a more formal justification.

Here’s an attempt to justify mass lumping in terms of mixed Finite Element approach to discretizing problems of the form: minimize,

$$E(u) = \frac{1}{2} \int_\Omega (u-f)^2 + \nabla u \cdot \nabla u\ dA$$

If we discretize this irreducible form directly using, for example, piecewise-linear basis functions over a mesh (e.g., \(u \approx \sum_i \phi_i u_i\), and we use \(\u\) to also mean the vector of coefficients \(u_i\)), then we get:

$$E(\u) = \frac{1}{2} (\u-\f)^TM(\u-\f) + \frac{1}{2} \u^TK\u$$

where we call \(M\) the mass matrix so that \(M_{ij} = \int_\Omega \phi_i \phi_j\ dA\). Then when we “mass lump” we are replacing \(M\) with a diagonal matrix \(D\) such that \(D_{ii} = \sum_j M_{ij}\) or \(D_{ii} = \) the area associated with vertex \(i\). This “associated area” might be the Voronoi area.

In physical dynamics, if \(\u=\u^t\) are displacements of some dynamic system and \(\f=\u^{t-\Delta t}\) are the mesh displacements of the previous time step then this gives us a physical formula for kinetic and potential energy:

$$E(\u^t) = \frac{1}{2 \Delta t^2} (\u^t-\u^{t-\Delta t})^T M (\u^t-\u^{t-\Delta t}) + \frac{1}{2} {\u^t}^TK\u^t$$

where the stiffness matrix \(K\) now might measure, for example, linear elasticity (I’m mostly going to ignore what’s going on with \(K\)).

When we write this physical energy in the continuous form we’ll introduce a variable to keep track of velocities: \(\dot{u} = \partial u / \partial t\),

$$E(u,\dot{u}) = \frac{1}{2} \int_\Omega \dot{u}^2 + (\nabla u+\nabla u^T) \cdot (\nabla u + \nabla u^T)\ dA$$

$$\dot{u} = \partial u / \partial t$$

Now we have the potential energy in terms of velocity \(\dot{u}\) and the kinetic energy in terms of displacements \(u\). This is a very natural mixed form. If we discretize both \(u\) and \(\dot{u}\) using piecewise-linear basis functions over the same mesh then we’ll have done a lot of re-shuffling to get the identical discretization above. In particular, we’ll just get the same non-diagonal mass matrix \(M\).

Instead, we’ll discretize only displacements \(u\) using piecewise-linear basis functions, and for velocities \(\dot{u}\) we’ll use piecewise-constant basis functions defined on the dual mesh. A standard choice for the dual mesh is the Voronoi dual mesh, where each zero dimensional vertex becomes zero-codimensional “cell” whose area/volume is the Voronoi area/volume of that vertex as a seed amongst all other vertices as fellow seeds. One can construct this dual mesh by connected circumcenters of all of the triangles in the primary mesh. An alternative dual mesh, we’ll call the barycentric dual, will connect all barycenters of the triangles in the primary mesh. All this to say that we approximate the velocities as sum over piecewise constant basis functions located at primary mesh vertices (aka dual mesh cells): \(\dot{u}^t \approx \sum_i \psi_i \dot{u}^t_i\). The basis functions \(\psi_i\) and \(\phi_i\) are in correspondence so that \(\dot{\u}^t\) collects elements \(\dot{u}^t_i\) in the same number and order as \(\u\).

Now we can discretize the energy as:

$$E(\dot{\u}^t,\u^t) = \frac{1}{2} \dot{\u}^T D \dot{\u} + \frac{1}{2} {\u^t}^T K \u^t,$$

and voilà the diagonal mass matrix $D$ appears, where

$$D_{ij} = \int_\Omega \psi_i \psi_j\ dA = \begin{cases} \text{area}_i & i=j \ 0 & i\neq j \end{cases}$$

As a side note, if we use the barycentric dual mesh then areai \(= \sum_jM_{ij}\) , if we use the Voronoi dual mesh then \(\text{area}_i\) is the Voronoi area which may become negative for non-Delaunay meshes (and we might want to use the “fix” described in “Discrete Differential-Geometry Operators for Triangulated 2-Manifolds” [Mark Meyer et al. 2002]).

Our work is not quite done since we haven’t coupled our discretization of velocity and displacements together. We haven’t discretized the constraint: \(\dot{u} = \partial u / \partial t\). We could just quietly say that they “need to agree” at mesh vertices so that \(\dot{u}^t_i = (u^t_i – u_i^{t – \Delta t})/\Delta t\). If we do then we can immediately substitute-out all of the velocity variables and arrive at an “flattened” form of the energy only in terms of displacements $\u$:

$$E(\u^t) = \frac{1}{2 \Delta t^2} (\u^t-\u^{t-\Delta t})^T D (\u^t-\u^{t-\Delta t}) + \frac{1}{2} {\u^t}^T K \u^t.$$

Deeper down the rabbit hole: This is a bit unsettling because it seems like we just pushed the “lumping magic” deeper into the constraints between velocities and displacements. Indeed, if we try to enforce this constraint by the Lagrange multiplier method:

$$L(u,\dot{u},\lambda) = E(u,\dot{u}) + \frac{1}{2} \int_\Omega \lambda \left(\dot{u} – \frac{\partial u}{\partial t}\right) \ dA$$

(which admittedly reads a bit like nonsense because before the only difference between \(\dot{u}\) and \(\partial u / \partial t\) was notation, but think of \(\dot{u}\) as a first-class variable function).

then we have to pick a discretization for the Lagrange multiplier function \(\lambda\). Surprisingly, if we use either piecewise-linear on the primary mesh or piecewise-constant basis functions on the dual mesh then we’ll get a non-diagonal matrix coupling \(\dot{u}\) and \(\u\) which means that we can’t simply substitute-out $\dot{\u}$.

To tie things up, I found one (potentially dubious) discretization of \(\lambda\) that makes things work: Dirac basis functions

$$\delta_i(x) = \int_\Omega \delta(x-x_i) \ dA$$

Where \(\delta(x)\) is Dirac’s delta function so that \(\delta(x) = \begin{cases} \infty & x=0 \ 1 & x\neq 0 \end{cases}\) and \(\int \delta(x) \ dA = 1\).

Assuming I have permission for the Discretization God’s, then I let \(\lambda \approx \sum_i \delta_i \lambda_i\) where (by abuse of notation/LaTeX’s inability to typeset bold Greek letters) \(\lambda\) is a vector of coefficients \(\lambda_i\) in corresponding number and order with \(\u\) and \(\dot{\u}\).

Since (crossing fingers) \(\int_\Omega \delta_i \psi_j \ dA = \begin{cases} 1 & i=j \ 0 & i\neq j\end{cases}\) and \(\int_\Omega \delta_i \phi_j \ dA = \begin{cases} 1 & i=j \ 0 & i\neq j\end{cases}\). Our discretized Lagrangian is:

$$L(\u^t,\dot{\u}^t,\lambda) = \frac{1}{2} \dot{\u}^T D \dot{\u} + \frac{1}{2} {\u^t}^T K \u^t + \frac{1}{2} \lambda I \left(\u^t – \frac{\u^t – \u^{t-\Delta t}}{\Delta t} \right),$$

where \(I\) truly madly deeply is the identity matrix. This implies that a direct substitution of \(\dot{\u}^t\) with \((\u^t – \u^{t-\Delta t})/\Delta t\) is valid.

Update: Some papers making similar claims:

“A note on mass lumping and related processes” [Hinton et al. 1976]
(“Analysis Of Structural Vibration And Response” [Clough 1971] (cited in [Hinton et al. 1976] but I haven’t found a copy yet)

Plot piecewise constant function over 2D triangle mesh as height field

Tuesday, October 8th, 2013

I typically deal with piecewise linear functions defined over a triangle mesh (values defined per vertex). I can plot this over a 2d triangle mesh domain using the trisurf function:

trisurf(F,V(:,1),V(:,2),S);

I wrap this into my own function tsurf to make it easier to call since I call them so often:

tsurf(F,[V S]);

piecewise linear height-field over triangle mesh

Sometimes I deal with piecewise constant functions (values defined per triangle). To plot this I use:

tsurf(bsxfun(@plus,size(F,1)*(0:size(F,2)-1),(1:size(F,1))'),[V(F(:),:) repmat(S,size(F,2),1)]);

Or broken down:

% positions of all "Corners"
C = V(F(:),:);
% Scalar field defined at corners
SC = repmat(S,size(F,2),1);
% new disjoint triangle indices into C for each original triangle
FC = bsxfun(@plus,size(F,1)*(0:size(F,2)-1),(1:size(F,1))');
% plot each triangle as piecewise linear function (each of which happens to be constant)
tsurf(FC,[C SC]);

piecewise constant height-field over triangle mesh

Minimizing quadratic energies with constant constraints

Tuesday, June 21st, 2011

There are a many different ways to minimize a quadratic energy with constant constraints. I wrote out the exact steps for three such ways:

  • Separate the energy into knowns and unknowns, then solve for the unknowns by setting gradient with respect to those unknowns to zero.
  • Add additional term to energy that punishes not satisfying the known values of your constant constraints in the least squares sense, the solve for all variables by setting gradient with respect to all variables to zero.
  • Add additional variables called lagrange multipliers to energy so that the problem is no longer to find a minimum but a stationary point in the new energy. Such a stationary point guarantees that our constraints are met. Find the stationary point by the usual method of setting the gradient to zero, this time the gradient is with respect to all original variables and these new variables.

The pretty print type-up with MATLAB pseudocode.
A plain text, unicode version:


Quadratic energy
E = ZT * Q * Z + ZT * L + C

We want to fix some of the values of Z, we can rearrange the Z's that are still
free and the ones we want to fix so that Z = [X;Y], where X are the free values
and Y are the fixed values;

Split energy into knowns and unknowns
Z = [X;Y]
or 
Z(unknown) = X and Z(known) = Y
E = [X;Y]T * Q * [X;Y] + [X;Y]T * L + C
E = [X;0]T * Q * [X;Y] + [0;Y]T * Q * [X;Y] + [X;0] * L + [X;Y]T * L + C
E = [X;0]T * Q * [X;0] + 
    [X;0]T * Q * [0;Y] + 
    [0;Y]T * Q * [X;0] + 
    [0;Y]T * Q * [0;Y] + 
    [X;0]T * L + 
    [0;Y]T * L + 
    C
E = [X;0]T * [Q(unknown,unknown) 0; 0 0] * [X;0] + 
    [X;0]T * [0 Q(unknown,known); 0 0] * [0;Y] + 
    [0;Y]T * [0 0; Q(known,unknown) 0] * [X;0] + 
    [0;Y]T * [0 0; 0 Q(known,known)] * [0;Y] + 
    [X;0]T * L + 
    [0;Y]T * L + 
    C
E = XT * Q(unknown,unknown) * X +
  XT * Q(unknown,known) * Y +
  YT * Q(known,unknown) * X +
  YT * Q(known,known) * Y + 
  XT * L(unknown) +
  YT * L(known) +
  C

E = XT * Q(unknown,unknown) * X +
  XT * Q(unknown,known) * Y +
  (YT * Q(known,unknown) * X)T +
  YT * Q(known,known) * Y + 
  XT * L(unknown) +
  YT * L(known) +
  C

E = XT * Q(unknown,unknown) * X +
  XT * Q(unknown,known) * Y +
  XT * Q(known,unknown)T * Y +
  YT * Q(known,known) * Y + 
  XT * L(unknown) +
  YT * L(known) +
  C

E = XT * NQ * X + XT * NL + NC
NQ = Q(unknown,unknown)
NL = Q(unknown,known) * Y + Q(known,unknown)T * Y + L(unknown)
NC = YT * Q(known,known) * Y + YT * L(known) + C

∂E/∂XT = 2 * NQ * X + NL

Solve for X with:
X = (- 2 * NQ)^-1 * NL

Enforce fixed values via soft contraints with high weights
E = ZT * Q * Z + ZT * L + C 
Add new energy term punishing being far from fixed variables
NE = ZT * Q * Z + ZT * L + C + w * (Z(known) - Y)T * I * (Z(known) - Y)
where w is a large number, e.g. 10000
NE = ZT * Q * Z + ZT * L + C + w * (Z - [0;Y])T * [0 0;0 I(known,known)] * (Z - [0;Y])
where W is a diagonal matrix with Wii = 0 if i∈unknown and Wii = wii if j∈known
NE = ZT * Q * Z + ZT * L + C + (Z - [0;Y])T * W * (Z - [0;Y])
NE = ZT * Q * Z + ZT * L + C + ZT * W * Z - 2 * ZT * W * [0;Y] + [0;Y]T * W * [0;Y]
NE = ZT * NQ * Z + ZT * NL + NC
NQ = Q + W
NL = L - 2W * [0;Y]
NC = C + [0;Y]T * W * [0;Y]

Differentiate with respect to ZT
∂E/∂ZT = 2 * NQ * Z + NL

Solve for Z with:
Z = -0.5 * NQ^-1 * NL
or re-expanded
Z = -0.5 * (Q + W)^-1 * (L - 2 * W * [0;Y])

Discard known parts of Z to find X
X = Z(unknown)
But be careful to look at Z(known) - Y to be sure your fixed values are being
met, if not then w is too low. If the X values look like garbage then perhaps
you're getting numerical error because w is too high.

Lagrange multipliers
We want to minimize
E = ZT * Q * Z + ZT * L + C 
subject to the constant constraints:
Z(known) = Y
Find stationary point of new energy:
NE = E + ∑λi *(Zi - Yi)
        i∈known
NE = E + [Z;λ]T * QC * [Z;λ] - [Z;λ]T * [0;Y]
where QC = [0 0 0;0 0 I(known,known);0 I(known,known) 0]
NE = [Z;λ]T * NQ * [Z;λ] + [Z;λ]T * NL + C 
NQ = [Q 0;0 0] + QC =  [         Q        0             ]
                       [                  I(known,known)]
                       [ 0 I(known,known) 0             ]
NL = [L;0] + [0;Y];
Differentiate with respect to all variables, including lagrange multipliers
∂E/∂[Z;λ]T  = 2 * NQ * Z + NL
Solve with
[Z;λ] = -0.5 * NQ^-1 * NL

Discard fixed values and langrange multipliers.
X = Z(known)
The value of λi is the "force" of the constraint or how hard we had to pull the
energery minimum to meet the constraints. See
http://www.slimy.com/~steuard/teaching/tutorials/Lagrange.html#Meaning