Solid mechanics miniapp¶
This example is located in the subdirectory examples/solids.
It solves the steady-state, static momentum balance equations using unstructured high-order finite/spectral element spatial discretizations.
As for the Compressible Navier-Stokes miniapp case, the solid mechanics elasticity example has been developed using PETSc, so that the pointwise physics (defined at quadrature points) is separated from the parallelization and meshing concerns.
In this miniapp, we consider three formulations used in solid mechanics applications: linear elasticity, Neo-Hookean hyperelasticity at small strain, and Neo-Hookean hyperelasticity at finite strain. We provide the strong and weak forms of static balance of linear momentum in the small strain and finite strain regimes. The stress-strain relationship (constitutive law) for each of the material models is provided. Due to the nonlinearity of material models in Neo-Hookean hyperelasticity, the Newton linearization of the material models is provided.
Note
Linear elasticity and small-strain hyperelasticity can both be obtained from the finite-strain hyperelastic formulation by linearization of geometric and constitutive nonlinearities. The effect of these linearizations is sketched in the diagram below, where \(\bm \sigma\) and \(\bm \epsilon\) are stress and strain, respectively, in the small strain regime, while \(\bm S\) and \(\bm E\) are their finite-strain generalizations (second Piola-Kirchhoff tensor and Green-Lagrange strain tensor, respectively) defined in the reference configuration, and \(\mathsf C\) is a linearized constitutive model.
Running the miniapp¶
The elasticity miniapp is controlled via command-line options, the following of which are mandatory.
| Option | Description |
| --- | --- |
| -mesh [filename] | Path to mesh file in any format supported by PETSc. |
| -degree [int] | Polynomial degree of the finite element basis |
| -E [real] | Young’s modulus, \(E > 0\) |
| -nu [real] | Poisson’s ratio, \(\nu < 0.5\) |
| -bc_clamp [int list] | List of face sets on which to displace by -bc_clamp_[facenumber]_translate [x,y,z] and/or -bc_clamp_[facenumber]_rotate [rx,ry,rz,c_0,c_1]. Note: The default for a clamped face is zero displacement. All displacement is with respect to the initial configuration. |
| -bc_traction [int list] | List of face sets on which to set traction boundary conditions with the traction vector -bc_traction_[facenumber] [tx,ty,tz] |
Note
This solver can use any mesh format that PETSc’s DMPlex can read (Exodus, Gmsh, Med, etc.).
Our tests have primarily been using Exodus meshes created using CUBIT; sample meshes used for the example runs suggested here can be found in this repository.
Note that many mesh formats require PETSc to be configured appropriately; e.g., --download-exodusii for Exodus support.
Consider the specific example of the mesh seen below:
With the sidesets defined in the figure, we provide here an example of a minimal set of command line options:
./elasticity -mesh [.exo file] -degree 4 -E 1e6 -nu 0.3 -bc_clamp 998,999 -bc_clamp_998_translate 0,0.5,1
In this example, we set the left boundary, face set \(999\), to zero displacement and the right boundary, face set \(998\), to displace \(0\) in the \(x\) direction, \(0.5\) in the \(y\), and \(1\) in the \(z\).
As an alternative to specifying a mesh with -mesh, the user may use a DMPlex box mesh by specifying -dm_plex_box_faces [int list], -dm_plex_box_upper [real list], and -dm_plex_box_lower [real list].
As an alternative example exploiting -dm_plex_box_faces, we consider a 4 x 4 x 4 mesh where essential (Dirichlet) boundary conditions are placed on all sides. Sides 1 through 6 are rotated around the \(x\)-axis:
./elasticity -problem hyperFS -E 1 -nu 0.3 -num_steps 40 -snes_linesearch_type cp -dm_plex_box_faces 4,4,4 -bc_clamp 1,2,3,4,5,6 -bc_clamp_1_rotate 0,0,1,0,.3 -bc_clamp_2_rotate 0,0,1,0,.3 -bc_clamp_3_rotate 0,0,1,0,.3 -bc_clamp_4_rotate 0,0,1,0,.3 -bc_clamp_5_rotate 0,0,1,0,.3 -bc_clamp_6_rotate 0,0,1,0,.3
Note
If the coordinates for a particular side of a mesh are zero along the axis of rotation, it may appear that that side is clamped with zero displacement.
On each boundary node, the rotation magnitude is computed as theta = (c_0 + c_1 * cx) * loadIncrement, where cx = kx * x + ky * y + kz * z and kx, ky, kz are the normalized components of the rotation axis.
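The rotation magnitude formula above can be sketched in a few lines of Python. This is only an illustration, not the miniapp's implementation: the function name `rotation_angle` is hypothetical, and it assumes that the five rotate arguments are ordered as three axis components followed by c_0 and c_1.

```python
import math


def rotation_angle(axis, c0, c1, point, load_increment):
    """theta = (c_0 + c_1 * cx) * loadIncrement at one boundary node.

    axis  : rotation axis (need not be normalized), assumed argument order
    point : node coordinates (x, y, z)
    """
    norm = math.sqrt(sum(a * a for a in axis))
    kx, ky, kz = (a / norm for a in axis)  # normalized axis components
    x, y, z = point
    cx = kx * x + ky * y + kz * z          # coordinate measured along the axis
    return (c0 + c1 * cx) * load_increment


# Arguments 0,0,1,0,.3 as in the box-mesh command above.
theta_far = rotation_angle((0, 0, 1), 0.0, 0.3, (0.5, 0.5, 1.0), 1.0)
theta_origin = rotation_angle((0, 0, 1), 0.0, 0.3, (0.5, 0.5, 0.0), 1.0)
print(theta_far, theta_origin)
```

With c_0 = 0, every node whose coordinate along the axis is zero gets theta = 0, which is why such a side can look as if it were clamped with zero displacement.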
The command line options just shown are the minimum requirements to run the miniapp, but additional options may also be set as follows:

| Option | Description | Default value |
| --- | --- | --- |
|  | CEED resource specifier |  |
|  | Number of extra quadrature points |  |
|  | Run in test mode |  |
| -problem | Problem to solve (e.g., hyperFS) |  |
| -forcing | Forcing term option (e.g., mms) |  |
|  | Forcing vector |  |
|  | Multigrid coarsening to use |  |
|  | Poisson’s ratio for multigrid smoothers, \(\nu < 0.5\) |  |
| -num_steps | Number of load increments for continuation method |  |
| -view_soln | Output solution at each load increment for viewing |  |
| -view_final_soln | Output solution at final load increment for viewing |  |
|  | View PETSc |  |
|  | View PETSc performance log |  |
|  | View comprehensive information about runtime options |  |
To verify the convergence of the linear elasticity formulation on a given mesh with the method of manufactured solutions, run:
./elasticity -mesh [mesh] -degree [degree] -nu [nu] -E [E] -forcing mms
This option attempts to recover a known solution from an analytically computed forcing term.
On algebraic solvers¶
This miniapp is configured to use the following Newton-Krylov-Multigrid method by default.
Newton-type methods for the nonlinear solve, with the hyperelasticity models globalized using load increments.
Preconditioned conjugate gradients to solve the symmetric positive definite linear systems arising at each Newton step.
Preconditioning via \(p\)-version multigrid coarsening to linear elements, with algebraic multigrid (PETSc’s GAMG) for the coarse solve. The default smoother uses degree 3 Chebyshev with Jacobi preconditioning. (Lower degree is often faster, albeit less robust; try -outer_mg_levels_ksp_max_it 2, for example.) Application of the linear operators for all levels with degree \(p > 1\) is performed matrix-free using analytic Newton linearization, while the lowest order \(p = 1\) operators are assembled explicitly (using coloring at present).
Many related solvers can be implemented by composing PETSc command-line options.
Nondimensionalization¶
Quantities such as the Young’s modulus vary over many orders of magnitude, and thus can lead to poorly scaled equations. One can nondimensionalize the model by choosing an alternate system of units, such that displacements and residuals are of reasonable scales.
| Option | Description | Default value |
| --- | --- | --- |
| -units_meter | 1 meter in scaled length units |  |
| -units_second | 1 second in scaled time units |  |
| -units_kilogram | 1 kilogram in scaled mass units |  |
For example, consider a problem involving metals subject to gravity.
| Quantity | Typical value in SI units |
| --- | --- |
| Displacement, \(\bm u\) | \(1 \,\mathrm{cm} = 10^{-2} \,\mathrm m\) |
| Young’s modulus, \(E\) | \(10^{11} \,\mathrm{Pa} = 10^{11} \,\mathrm{kg}\, \mathrm{m}^{-1}\, \mathrm s^{-2}\) |
| Body force (gravity) on volume, \(\int \rho \bm g\) | \(5 \cdot 10^4 \,\mathrm{kg}\, \mathrm m^{-2} \, \mathrm s^{-2} \cdot (\text{volume} \, \mathrm m^3)\) |
One can choose units of displacement independently (e.g., -units_meter 100 to measure displacement in centimeters), but \(E\) and \(\int \rho \bm g\) have the same dependence on mass and time, so cannot both be made of order 1.
This reflects the fact that both quantities are not equally significant for a given displacement size; the relative significance of gravity increases as the domain size grows.
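To make the scaling argument concrete, the following minimal sketch converts the two typical SI values from the table into a scaled unit system. The helper `scaled` and its argument names are hypothetical; each argument gives the size of 1 meter, 1 second, or 1 kilogram in scaled units, mirroring the option descriptions above.

```python
E_SI = 1e11      # Young's modulus, kg m^-1 s^-2 (typical value from the table)
RHO_G_SI = 5e4   # body force per unit volume, kg m^-2 s^-2 (from the table)


def scaled(meter, second, kilogram):
    """Return (E, rho_g) expressed in the scaled unit system."""
    E = E_SI * kilogram / (meter * second**2)
    rho_g = RHO_G_SI * kilogram / (meter**2 * second**2)
    return E, rho_g


for units in [(1, 1, 1), (100, 1, 1), (100, 1, 1e-9), (100, 1e3, 1e-9)]:
    E, rho_g = scaled(*units)
    # The ratio E / rho_g depends only on the length scale.
    print(units, E, rho_g, E / rho_g)
```

Changing the mass or time scale multiplies both quantities by the same factor, so their ratio is fixed once the length unit is chosen.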
Diagnostic Quantities¶
Diagnostic quantities for viewing are provided when the command line options for visualization output, -view_soln or -view_final_soln, are used.
The diagnostic quantities include displacement in the \(x\) direction, displacement in the \(y\) direction, displacement in the \(z\) direction, pressure, \(\operatorname{trace} \bm{E}\), \(\operatorname{trace} \bm{E}^2\), \(\lvert J \rvert\), and strain energy density.
The table below summarizes the formulations of each of these quantities for each problem type.
| Quantity | Linear Elasticity | Hyperelasticity, Small Strain | Hyperelasticity, Finite Strain |
| --- | --- | --- | --- |
| Pressure | \(\lambda \operatorname{trace} \bm{\epsilon}\) | \(\lambda \log \operatorname{trace} \bm{\epsilon}\) | \(\lambda \log J\) |
| Volumetric Strain | \(\operatorname{trace} \bm{\epsilon}\) | \(\operatorname{trace} \bm{\epsilon}\) | \(\operatorname{trace} \bm{E}\) |
| \(\operatorname{trace} \bm{E}^2\) | \(\operatorname{trace} \bm{\epsilon}^2\) | \(\operatorname{trace} \bm{\epsilon}^2\) | \(\operatorname{trace} \bm{E}^2\) |
| \(\lvert J \rvert\) | \(1 + \operatorname{trace} \bm{\epsilon}\) | \(1 + \operatorname{trace} \bm{\epsilon}\) | \(\lvert J \rvert\) |
| Strain Energy Density | \(\frac{\lambda}{2} (\operatorname{trace} \bm{\epsilon})^2 + \mu \bm{\epsilon} : \bm{\epsilon}\) | \(\lambda (1 + \operatorname{trace} \bm{\epsilon}) (\log(1 + \operatorname{trace} \bm{\epsilon} ) - 1) + \mu \bm{\epsilon} : \bm{\epsilon}\) | \(\frac{\lambda}{2}(\log J)^2 + \mu \operatorname{trace} \bm{E} - \mu \log J\) |
Linear Elasticity¶
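The strong form of the static balance of linear momentum at small strain for the three-dimensional linear elasticity problem is given by [Hug12]:
\[ \nabla \cdot \bm{\sigma} + \bm{g} = \bm{0} \tag{24} \]
where \(\bm{\sigma}\) and \(\bm{g}\) are stress and forcing functions, respectively. We multiply (24) by a test function \(\bm v\) and integrate the divergence term by parts to arrive at the weak form: find \(\bm u \in \mathcal V \subset H^1(\Omega)\) such that
\[ \int_{\Omega} \nabla \bm{v} : \bm{\sigma} \, dV - \int_{\partial \Omega} \bm{v} \cdot \left( \bm{\sigma} \cdot \hat{\bm{n}} \right) dS - \int_{\Omega} \bm{v} \cdot \bm{g} \, dV = 0, \quad \forall \bm v \in \mathcal V, \tag{25} \]
where \(\bm{\sigma} \cdot \hat{\bm{n}}_{\partial \Omega}\) is replaced by an applied force/traction boundary condition written in terms of the reference configuration.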
Constitutive modeling¶
In their most general form, constitutive models define \(\bm \sigma\) in terms of state variables. In the model considered in the present miniapp, the state variables are the vector displacement field \(\bm u\) and its gradient \(\nabla \bm u\). We begin by defining the symmetric (small/infinitesimal) strain tensor as
\[ \bm{\epsilon} = \frac{1}{2} \left( \nabla \bm{u} + \nabla \bm{u}^T \right). \tag{26} \]
This constitutive model \(\bm \sigma(\bm \epsilon)\) is a linear tensor-valued function of a tensor-valued input, but we will consider the more general nonlinear case in other models below. In these cases, an arbitrary choice of such a function will generally not be invariant under orthogonal transformations and thus will not be admissible, since a physical model must not depend on the coordinate system chosen to express it. In particular, given an orthogonal transformation \(Q\), we desire
\[ Q \bm{\sigma}(\bm{\epsilon}) Q^T = \bm{\sigma}(Q \bm{\epsilon} Q^T), \tag{27} \]
which means that we can change our reference frame before or after computing \(\bm \sigma\), and get the same result either way. Constitutive relations in which \(\bm \sigma\) is uniquely determined by \(\bm \epsilon\) while satisfying the invariance property (27) are known as Cauchy elastic materials. Here, we define a strain energy density functional \(\Phi(\bm \epsilon) \in \mathbb R\) and obtain the stress from its gradient,
\[ \bm{\sigma}(\bm{\epsilon}) = \frac{\partial \Phi}{\partial \bm{\epsilon}}. \tag{28} \]
Note
The strain energy density functional cannot be an arbitrary function \(\Phi(\bm \epsilon)\); it can only depend on invariants, scalar-valued functions \(\gamma\) satisfying
\[ \gamma(\bm \epsilon) = \gamma(Q \bm \epsilon Q^T) \]
for all orthogonal matrices \(Q\).
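For the linear elasticity model, the strain energy density is given by
\[ \Phi = \frac{\lambda}{2} (\operatorname{trace} \bm{\epsilon})^2 + \mu \bm{\epsilon} : \bm{\epsilon}. \]
The constitutive law (stress-strain relationship) is therefore given by its gradient,
\[ \bm{\sigma} = \lambda (\operatorname{trace} \bm{\epsilon}) \bm{I}_3 + 2 \mu \bm{\epsilon}, \tag{29} \]
where \(\bm I_3\) is the \(3 \times 3\) identity matrix, the colon represents a double contraction (over both indices of \(\bm \epsilon\)), and the Lamé parameters are given by
\[ \lambda = \frac{E \nu}{(1 + \nu)(1 - 2 \nu)}, \qquad \mu = \frac{E}{2(1 + \nu)}. \]
The constitutive law (stress-strain relationship) can also be written as
\[ \bm{\sigma} = \mathsf{C} : \bm{\epsilon}. \]
For notational convenience, we express the symmetric second order tensors \(\bm \sigma\) and \(\bm \epsilon\) as vectors of length 6 using the Voigt notation. Hence, the fourth order elasticity tensor \(\mathsf C\) (also known as elastic moduli tensor or material stiffness tensor) can be represented as
\[ \mathsf C = \frac{E}{(1+\nu)(1-2\nu)} \begin{pmatrix} 1-\nu & \nu & \nu & & & \\ \nu & 1-\nu & \nu & & & \\ \nu & \nu & 1-\nu & & & \\ & & & \frac{1-2\nu}{2} & & \\ & & & & \frac{1-2\nu}{2} & \\ & & & & & \frac{1-2\nu}{2} \end{pmatrix}. \tag{30} \]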
Note that the incompressible limit \(\nu \to \frac 1 2\) causes \(\lambda \to \infty\), and thus \(\mathsf C\) becomes singular.
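As a small illustration of the linear constitutive law, the sketch below evaluates the Lamé parameters, the stress \(\lambda (\operatorname{trace} \bm\epsilon) \bm I_3 + 2 \mu \bm\epsilon\), and the equivalent Voigt product. The function names are hypothetical, and the Voigt ordering (11, 22, 33, 23, 13, 12) with doubled shear strains is an assumption of this sketch, not taken from the miniapp source.

```python
def lame_parameters(E, nu):
    """Lamé parameters from Young's modulus E and Poisson's ratio nu < 0.5."""
    lam = E * nu / ((1 + nu) * (1 - 2 * nu))
    mu = E / (2 * (1 + nu))
    return lam, mu


def linear_stress(eps, E, nu):
    """sigma = lam * trace(eps) * I_3 + 2 * mu * eps for a 3x3 strain eps."""
    lam, mu = lame_parameters(E, nu)
    tr = eps[0][0] + eps[1][1] + eps[2][2]
    return [[lam * tr * (i == j) + 2 * mu * eps[i][j] for j in range(3)]
            for i in range(3)]


def voigt_stiffness(E, nu):
    """6x6 matrix acting on (e11, e22, e33, 2*e23, 2*e13, 2*e12)."""
    c = E / ((1 + nu) * (1 - 2 * nu))
    C = [[0.0] * 6 for _ in range(6)]
    for i in range(3):
        for j in range(3):
            C[i][j] = c * ((1 - nu) if i == j else nu)
        C[i + 3][i + 3] = c * (1 - 2 * nu) / 2
    return C


E, nu = 1e6, 0.3  # same values as in the first example command
eps = [[1e-3, 2e-4, 0.0], [2e-4, -5e-4, 1e-4], [0.0, 1e-4, 3e-4]]
sigma = linear_stress(eps, E, nu)

# Same stress through the Voigt form; shear strains enter doubled.
eps_v = [eps[0][0], eps[1][1], eps[2][2],
         2 * eps[1][2], 2 * eps[0][2], 2 * eps[0][1]]
C = voigt_stiffness(E, nu)
sigma_v = [sum(C[i][j] * eps_v[j] for j in range(6)) for i in range(6)]

# lambda grows without bound as nu approaches 1/2.
for nu_i in (0.3, 0.49, 0.4999):
    print(nu_i, lame_parameters(E, nu_i)[0])
```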
Hyperelasticity at Small Strain¶
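The strong and weak forms given above, in (24) and (25), are valid for Neo-Hookean hyperelasticity at small strain. However, the strain energy density differs and is given by
\[ \Phi = \lambda (1 + \operatorname{trace} \bm{\epsilon}) (\log(1 + \operatorname{trace} \bm{\epsilon}) - 1) + \mu \bm{\epsilon} : \bm{\epsilon}. \]
As above, we have the corresponding constitutive law given by
\[ \bm{\sigma} = \lambda \log(1 + \operatorname{trace} \bm{\epsilon}) \bm{I}_3 + 2 \mu \bm{\epsilon}, \tag{31} \]
where \(\bm{\epsilon}\) is defined as in (26).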
Newton linearization¶
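Due to nonlinearity in the constitutive law, we require a Newton linearization of (31). To derive the Newton linearization, we begin by expressing the derivative,
\[ \diff \bm{\sigma} = \frac{\partial \bm{\sigma}}{\partial \bm{\epsilon}} : \diff \bm{\epsilon}, \]
where
\[ \diff \bm{\epsilon} = \frac{1}{2} \left( \nabla \diff \bm{u} + \nabla \diff \bm{u}^T \right) \]
and
\[ \diff \nabla \bm{u} = \nabla \diff \bm{u}. \]
Therefore,
\[ \diff \bm{\sigma} = \bar{\lambda} \operatorname{trace}(\diff \bm{\epsilon}) \bm{I}_3 + 2 \mu \diff \bm{\epsilon}, \tag{32} \]
where we have introduced the symbol
\[ \bar{\lambda} = \frac{\lambda}{1 + \epsilon_v}, \]
where volumetric strain is given by \(\epsilon_v = \sum_i \epsilon_{ii}\).
Equation (32) can be written in Voigt matrix notation as follows:
\[ \begin{pmatrix} \diff \sigma_{11} \\ \diff \sigma_{22} \\ \diff \sigma_{33} \\ \diff \sigma_{23} \\ \diff \sigma_{13} \\ \diff \sigma_{12} \end{pmatrix} = \begin{pmatrix} 2\mu + \bar\lambda & \bar\lambda & \bar\lambda & & & \\ \bar\lambda & 2\mu + \bar\lambda & \bar\lambda & & & \\ \bar\lambda & \bar\lambda & 2\mu + \bar\lambda & & & \\ & & & \mu & & \\ & & & & \mu & \\ & & & & & \mu \end{pmatrix} \begin{pmatrix} \diff \epsilon_{11} \\ \diff \epsilon_{22} \\ \diff \epsilon_{33} \\ 2 \diff \epsilon_{23} \\ 2 \diff \epsilon_{13} \\ 2 \diff \epsilon_{12} \end{pmatrix}. \tag{33} \]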
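The linearization can be checked numerically. The following sketch, with hypothetical function names and arbitrary test values, implements the small-strain Neo-Hookean stress \(\lambda \log(1 + \operatorname{trace} \bm\epsilon) \bm I_3 + 2 \mu \bm\epsilon\) and its increment with \(\bar\lambda = \lambda / (1 + \epsilon_v)\), then compares the increment against a central finite difference.

```python
import math


def trace(A):
    return A[0][0] + A[1][1] + A[2][2]


def stress(eps, lam, mu):
    """lam * log(1 + trace(eps)) * I_3 + 2 * mu * eps."""
    p = lam * math.log1p(trace(eps))
    return [[p * (i == j) + 2 * mu * eps[i][j] for j in range(3)]
            for i in range(3)]


def dstress(eps, deps, lam, mu):
    """lam_bar * trace(deps) * I_3 + 2 * mu * deps, lam_bar = lam / (1 + eps_v)."""
    lam_bar = lam / (1 + trace(eps))
    q = lam_bar * trace(deps)
    return [[q * (i == j) + 2 * mu * deps[i][j] for j in range(3)]
            for i in range(3)]


def fd_error(eps, deps, lam, mu, h=1e-6):
    """Largest entrywise gap between dstress and a central difference of stress."""
    shift = lambda s: [[eps[i][j] + s * deps[i][j] for j in range(3)]
                       for i in range(3)]
    plus, minus = stress(shift(h), lam, mu), stress(shift(-h), lam, mu)
    exact = dstress(eps, deps, lam, mu)
    return max(abs((plus[i][j] - minus[i][j]) / (2 * h) - exact[i][j])
               for i in range(3) for j in range(3))


eps = [[0.02, 0.01, 0.0], [0.01, -0.01, 0.005], [0.0, 0.005, 0.03]]
deps = [[1.0, 0.5, 0.0], [0.5, -2.0, 0.25], [0.0, 0.25, 0.5]]
print(fd_error(eps, deps, lam=1.5, mu=1.0))
```

A gap near the finite-difference noise floor indicates that the analytic increment matches the derivative of the stress.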
Hyperelasticity at Finite Strain¶
In the total Lagrangian approach for the Neo-Hookean hyperelasticity problem, the discrete equations are formulated with respect to the reference configuration. In this formulation, we solve for displacement \(\bm u(\bm X)\) in the reference frame \(\bm X\). The notation for elasticity at finite strain is inspired by [Hol00] to distinguish between the current and reference configurations. As explained in the Common notation section, we denote by capital letters the reference frame and by small letters the current one.
The strong form of the static balance of linear momentum at finite strain (total Lagrangian) is given by:
\[ - \nabla_X \cdot \bm{P} - \rho_0 \bm{g} = \bm{0}, \tag{34} \]
where the \(_X\) in \(\nabla_X\) indicates that the gradient is calculated with respect to the reference configuration in the finite strain regime. \(\bm{P}\) and \(\bm{g}\) are the first Piola-Kirchhoff stress tensor and the prescribed forcing function, respectively. \(\rho_0\) is known as the reference mass density. The tensor \(\bm P\) is not symmetric, living in the current configuration on the left and the reference configuration on the right.
\(\bm{P}\) can be decomposed as
\[ \bm{P} = \bm{F} \, \bm{S}, \tag{35} \]
where \(\bm S\) is the second Piola-Kirchhoff stress tensor, a symmetric tensor defined entirely in the reference configuration, and \(\bm{F} = \bm I_3 + \nabla_X \bm u\) is the deformation gradient. Different constitutive models can define \(\bm S\).
Constitutive modeling¶
For the constitutive modeling of hyperelasticity at finite strain, we begin by defining two symmetric tensors in the reference configuration, the right Cauchy-Green tensor
\[ \bm C = \bm F^T \bm F \]
and the Green-Lagrange strain tensor
\[ \bm E = \frac{1}{2} (\bm C - \bm I_3) = \frac{1}{2} \left( \nabla_X \bm u + (\nabla_X \bm u)^T + (\nabla_X \bm u)^T \nabla_X \bm u \right), \tag{36} \]
the latter of which converges to the linear strain tensor \(\bm \epsilon\) in the small-deformation limit. The constitutive models considered, appropriate for large deformations, express \(\bm S\) as a function of \(\bm E\), similar to the linear case, shown in equation (29), which expresses the relationship between \(\bm\sigma\) and \(\bm\epsilon\).
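The small-deformation limit can be observed numerically. This sketch uses the standard definitions \(\bm E = \frac 1 2 (\bm H + \bm H^T + \bm H^T \bm H)\) and \(\bm \epsilon = \frac 1 2 (\bm H + \bm H^T)\) with \(\bm H = \nabla_X \bm u\); the function names and the sample gradient are purely illustrative.

```python
def green_lagrange(H):
    """E = (H + H^T + H^T H) / 2 for a 3x3 displacement gradient H."""
    return [[0.5 * (H[i][j] + H[j][i]
                    + sum(H[k][i] * H[k][j] for k in range(3)))
             for j in range(3)] for i in range(3)]


def small_strain(H):
    """epsilon = (H + H^T) / 2."""
    return [[0.5 * (H[i][j] + H[j][i]) for j in range(3)] for i in range(3)]


def max_gap(H):
    """Largest entrywise difference between E and epsilon."""
    E, eps = green_lagrange(H), small_strain(H)
    return max(abs(E[i][j] - eps[i][j]) for i in range(3) for j in range(3))


H0 = [[0.3, -0.1, 0.2], [0.0, 0.4, 0.1], [-0.2, 0.1, 0.5]]
for s in (1.0, 1e-3, 1e-6):
    H = [[s * h for h in row] for row in H0]
    print(s, max_gap(H))  # the gap shrinks like s**2
```

Because E is formed directly from the displacement gradient here, no subtraction of nearly equal numbers (such as \(\bm C - \bm I_3\)) is needed.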
Recall that the strain energy density functional can only depend upon invariants. We will assume without loss of generality that \(\bm E\) is diagonal and take its set of eigenvalues as the invariants. It is clear that there can be only three invariants, and there are many alternate choices, such as \(\operatorname{trace}(\bm E), \operatorname{trace}(\bm E^2), \lvert \bm E \rvert\), and combinations thereof. It is common in the literature for invariants to be taken from \(\bm C = \bm I_3 + 2 \bm E\) instead of \(\bm E\).
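For example, if we take the compressible Neo-Hookean model,
\[ \Phi(\bm E) = \frac{\lambda}{2} (\log J)^2 + \frac{\mu}{2} (\operatorname{trace} \bm C - 3) - \mu \log J = \frac{\lambda}{2} (\log J)^2 + \mu \operatorname{trace} \bm E - \mu \log J, \tag{37} \]
where \(J = \lvert \bm F \rvert = \sqrt{\lvert \bm C \rvert}\) is the determinant of deformation (i.e., volume change) and \(\lambda\) and \(\mu\) are the Lamé parameters in the infinitesimal strain limit.
To evaluate (28), we make use of
\[ \frac{\partial J}{\partial \bm E} = \frac{\partial \sqrt{\lvert \bm C \rvert}}{\partial \bm E} = \lvert \bm C \rvert^{-1/2} \lvert \bm C \rvert \bm C^{-1} = J \bm C^{-1}, \]
where the factor of \(\frac 1 2\) has been absorbed due to \(\bm C = \bm I_3 + 2 \bm E.\) Carrying through the differentiation (28) for the model (37), we arrive at
\[ \bm S = \lambda \log J \, \bm C^{-1} + \mu (\bm I_3 - \bm C^{-1}). \tag{38} \]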
Tip
An equivalent form of (38) is
\[ \bm S = \lambda \log J \, \bm C^{-1} + 2 \mu \bm C^{-1} \bm E, \]
which is more numerically stable for small \(\bm E\), and thus preferred for computation. Note that the product \(\bm C^{-1} \bm E\) is also symmetric, and that \(\bm E\) should be computed using (36).
Similarly, it is preferable to compute \(\log J\) using log1p, especially in case of nearly incompressible materials.
To sketch this idea, suppose we have the \(2\times 2\) symmetric matrix \(C = \left( \begin{smallmatrix} 1 + e_{00} & e_{01} \\ e_{01} & 1 + e_{11} \end{smallmatrix} \right)\).
Then we compute
\[ \log J = \log \sqrt{\lvert C \rvert} = \frac 1 2 \mathtt{log1p}(e_{00} + e_{11} + e_{00} e_{11} - e_{01}^2), \]
which gives accurate results even in the limit when the entries \(e_{ij}\) are very small. For example, if \(e_{ij} \sim 10^{-8}\), then naive computation of \(\bm I_3 - \bm C^{-1}\) and \(\log J\) will have a relative accuracy of order \(10^{-8}\) in double precision and no correct digits in single precision. When using the stable choices above, these quantities retain full \(\varepsilon_{\text{machine}}\) relative accuracy.
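The \(2 \times 2\) case can be tried directly in double precision. The sketch below contrasts a naive evaluation of \(\log J = \frac 1 2 \log \lvert C \rvert\), which first forms \(1 + e_{00}\) and \(1 + e_{11}\), with the log1p form applied to \(e_{00} + e_{11} + e_{00} e_{11} - e_{01}^2\); the function names are hypothetical.

```python
import math


def log_J_naive(e00, e01, e11):
    """Form the determinant of C explicitly, then take the logarithm."""
    det_C = (1 + e00) * (1 + e11) - e01 * e01
    return 0.5 * math.log(det_C)


def log_J_stable(e00, e01, e11):
    """Apply log1p to det(C) - 1, which never adds the small entries to 1."""
    return 0.5 * math.log1p(e00 + e11 + e00 * e11 - e01 * e01)


for e in (1e-4, 1e-8, 1e-12, 1e-17):
    print(e, log_J_naive(e, 0.0, e), log_J_stable(e, 0.0, e))
```

For entries below machine epsilon the naive form rounds \(1 + e\) to exactly 1 and returns zero, while the log1p form still resolves the value.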
Note
One can linearize (38) around \(\bm E = 0\), for which \(\bm C = \bm I_3 + 2 \bm E \to \bm I_3\) and \(J \to 1 + \operatorname{trace} \bm E\), therefore (38) reduces to
\[ \bm S = \lambda (\operatorname{trace} \bm E) \bm I_3 + 2 \mu \bm E, \tag{39} \]
which is the St. Venant-Kirchhoff model (constitutive linearization without geometric linearization; see (23)).
This model can be used for geometrically nonlinear mechanics (e.g., snap-through of thin structures), but is inappropriate for large strain.
Alternatively, one can drop geometric nonlinearities, \(\bm E \to \bm \epsilon\) and \(\bm C \to \bm I_3\), while retaining the nonlinear dependence on \(J \to 1 + \operatorname{trace} \bm \epsilon\), thereby yielding (31) (see (23)).
Weak form¶
We multiply (34) by a test function \(\bm v\) and integrate by parts to obtain the weak form for finite-strain hyperelasticity: find \(\bm u \in \mathcal V \subset H^1(\Omega_0)\) such that
\[ \int_{\Omega_0} \nabla_X \bm{v} : \bm{P} \, dV - \int_{\Omega_0} \bm{v} \cdot \rho_0 \bm{g} \, dV - \int_{\partial \Omega_0} \bm{v} \cdot (\bm{P} \cdot \hat{\bm{N}}) \, dS = 0, \quad \forall \bm v \in \mathcal V, \tag{40} \]
where \(\bm{P} \cdot \hat{\bm{N}}_{\partial\Omega}\) is replaced by any prescribed force/traction boundary condition written in terms of the reference configuration. This equation contains material/constitutive nonlinearities in defining \(\bm S(\bm E)\), as well as geometric nonlinearities through \(\bm P = \bm F\, \bm S\), \(\bm E(\bm F)\), and the body force \(\bm g\), which must be pulled back from the current configuration to the reference configuration. Discretization of (40) produces a finite-dimensional system of nonlinear algebraic equations, which we solve using Newton-Raphson methods. One attractive feature of Galerkin discretization is that we can arrive at the same linear system by discretizing the Newton linearization of the continuous form; that is, discretization and differentiation (Newton linearization) commute.
Newton linearization¶
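To derive a Newton linearization of (40), we begin by expressing the derivative of (35) in incremental form,
\[ \diff \bm P = \frac{\partial \bm P}{\partial \bm F} : \diff \bm F = \diff \bm F \, \bm S + \bm F \left( \frac{\partial \bm S}{\partial \bm E} : \diff \bm E \right), \tag{41} \]
where
\[ \diff \bm E = \frac{\partial \bm E}{\partial \bm F} : \diff \bm F = \frac{1}{2} \left( \diff \bm F^T \bm F + \bm F^T \diff \bm F \right). \]
The quantity \({\partial \bm S} / {\partial \bm E}\) is known as the incremental elasticity tensor, and is analogous to the linear elasticity tensor \(\mathsf C\) of (30). We now evaluate \(\diff \bm S\) for the Neo-Hookean model (38),
\[ \diff \bm S = \frac{\partial \bm S}{\partial \bm E} : \diff \bm E = \lambda (\bm C^{-1} : \diff \bm E) \bm C^{-1} + 2 (\mu - \lambda \log J) \bm C^{-1} \diff \bm E \, \bm C^{-1}, \tag{42} \]
where we have used
\[ \diff \bm C^{-1} = \frac{\partial \bm C^{-1}}{\partial \bm E} : \diff \bm E = -2 \bm C^{-1} \diff \bm E \, \bm C^{-1}. \]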
Note
In the small-strain limit, \(\bm C \to \bm I_3\) and \(\log J \to 0\), thereby reducing (42) to the St. Venant-Kirchhoff model (39).
Note
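Some cancellation is possible (at the expense of symmetry) if we substitute (42) into (41),
\[ \begin{aligned} \diff \bm P &= \diff \bm F \, \bm S + \lambda (\bm C^{-1} : \diff \bm E) \bm F^{-T} + 2 (\mu - \lambda \log J) \bm F^{-T} \diff \bm E \, \bm C^{-1} \\ &= \diff \bm F \, \bm S + \lambda (\bm F^{-T} : \diff \bm F) \bm F^{-T} + (\mu - \lambda \log J) \bm F^{-T} (\bm F^T \diff \bm F + \diff \bm F^T \bm F) \bm C^{-1} \\ &= \diff \bm F \, \bm S + \lambda (\bm F^{-T} : \diff \bm F) \bm F^{-T} + (\mu - \lambda \log J) \left( \diff \bm F + \bm F^{-T} \diff \bm F^T \bm F \right) \bm C^{-1}, \end{aligned} \tag{43} \]
where we have exploited \(\bm F \bm C^{-1} = \bm F^{-T}\) and
\[ \bm C^{-1} : \diff \bm E = \bm C^{-1} : \bm F^T \diff \bm F = (\bm F \bm C^{-1}) : \diff \bm F = \bm F^{-T} : \diff \bm F. \]
We prefer to compute with (42) because (43) is more expensive, requiring access to (non-symmetric) \(\bm F^{-1}\) in addition to (symmetric) \(\bm C^{-1} = \bm F^{-1} \bm F^{-T}\), having fewer symmetries to exploit in contractions, and being less numerically stable.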
It is sometimes useful to express (42) in index notation,
\[ \begin{aligned} \diff \bm S_{IJ} &= \frac{\partial \bm S_{IJ}}{\partial \bm E_{KL}} \diff \bm E_{KL} \\ &= \lambda (\bm C^{-1}_{KL} \diff \bm E_{KL}) \bm C^{-1}_{IJ} + 2 (\mu - \lambda \log J) \bm C^{-1}_{IK} \diff \bm E_{KL} \bm C^{-1}_{LJ} \\ &= \underbrace{\left( \lambda \bm C^{-1}_{IJ} \bm C^{-1}_{KL} + 2 (\mu - \lambda \log J) \bm C^{-1}_{IK} \bm C^{-1}_{JL} \right)}_{\mathsf C_{IJKL}} \diff \bm E_{KL}, \end{aligned} \tag{44} \]
where we have identified the effective elasticity tensor \(\mathsf C = \mathsf C_{IJKL}\). It is generally not desirable to store \(\mathsf C\), but rather to use the earlier expressions so that only \(3\times 3\) tensors (most of which are symmetric) must be manipulated. That is, given the linearization point \(\bm F\) and solution increment \(\diff \bm F = \nabla_X (\diff \bm u)\) (which we are solving for in the Newton step), we compute \(\diff \bm P\) via
recover \(\bm C^{-1}\) and \(\log J\) (either stored at quadrature points or recomputed),
proceed with \(3\times 3\) matrix products as in (42) or the second line of (44) to compute \(\diff \bm S\) while avoiding computation or storage of higher order tensors, and
conclude by (41), where \(\bm S\) is either stored or recomputed from its definition exactly as in the nonlinear residual evaluation.
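The three steps above can be sketched with plain \(3 \times 3\) operations. This is an illustrative sketch under stated assumptions, not the miniapp's kernel: all helper names are hypothetical, \(\bm S = \lambda \log J \, \bm C^{-1} + \mu (\bm I_3 - \bm C^{-1})\) is evaluated in its direct (not the numerically stable) form for brevity, and the increment is \(\diff \bm S = \lambda (\bm C^{-1} : \diff \bm E) \bm C^{-1} + 2 (\mu - \lambda \log J) \bm C^{-1} \diff \bm E \, \bm C^{-1}\) followed by \(\diff \bm P = \diff \bm F \, \bm S + \bm F \, \diff \bm S\).

```python
import math

R = range(3)
I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in R) for j in R] for i in R]


def T(A):
    return [[A[j][i] for j in R] for i in R]


def lin(a, A, b, B):
    """Entrywise a * A + b * B."""
    return [[a * A[i][j] + b * B[i][j] for j in R] for i in R]


def cof(A, i, j):
    """Signed cofactor of a 3x3 matrix via cyclic indices."""
    return (A[(i + 1) % 3][(j + 1) % 3] * A[(i + 2) % 3][(j + 2) % 3]
            - A[(i + 1) % 3][(j + 2) % 3] * A[(i + 2) % 3][(j + 1) % 3])


def det(A):
    return sum(A[0][j] * cof(A, 0, j) for j in R)


def inv(A):
    d = det(A)
    return [[cof(A, j, i) / d for j in R] for i in R]


def state(F, lam, mu):
    """Step 1: S, C^{-1} and log J at the linearization point F."""
    C_inv = inv(mul(T(F), F))
    log_J = math.log(det(F))
    S = lin(lam * log_J - mu, C_inv, mu, I3)
    return S, C_inv, log_J


def first_pk(F, lam, mu):
    return mul(F, state(F, lam, mu)[0])


def d_first_pk(F, dF, lam, mu):
    S, C_inv, log_J = state(F, lam, mu)
    dE = lin(0.5, mul(T(dF), F), 0.5, mul(T(F), dF))
    contraction = sum(C_inv[i][j] * dE[i][j] for i in R for j in R)
    # Step 2: dS from 3x3 products only, no fourth-order tensor is formed.
    dS = lin(lam * contraction, C_inv,
             2 * (mu - lam * log_J), mul(mul(C_inv, dE), C_inv))
    # Step 3: dP = dF S + F dS.
    return lin(1.0, mul(dF, S), 1.0, mul(F, dS))


def fd_gap(F, dF, lam, mu, h=1e-6):
    """Largest entrywise gap between d_first_pk and a central difference."""
    Pp = first_pk(lin(1.0, F, h, dF), lam, mu)
    Pm = first_pk(lin(1.0, F, -h, dF), lam, mu)
    dP = d_first_pk(F, dF, lam, mu)
    return max(abs((Pp[i][j] - Pm[i][j]) / (2 * h) - dP[i][j])
               for i in R for j in R)


F = [[1.1, 0.2, 0.0], [0.05, 0.9, 0.1], [0.0, 0.1, 1.2]]
dF = [[0.3, -0.2, 0.1], [0.4, 0.1, -0.3], [0.2, 0.5, -0.1]]
print(fd_gap(F, dF, lam=1.5, mu=1.0))
```

The finite-difference comparison is only a consistency check of the linearization; in practice \(\bm S\), \(\bm C^{-1}\) and \(\log J\) may be stored at quadrature points rather than recomputed.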
Note
The decision of whether to recompute or store functions of the current state \(\bm F\) depends on a roofline analysis [WWP09,Brown10] of the computation and the cost of the constitutive model. For low-order elements where flops tend to be in surplus relative to memory bandwidth, recomputation is likely to be preferable, whereas the opposite may be true for high-order elements. Similarly, analysis with a simple constitutive model may see better performance while storing little or nothing, while an expensive model such as Arruda-Boyce [AB93], which contains many special functions, may be faster when using more storage to avoid recomputation. In the case where complete linearization is preferred, note the symmetry \(\mathsf C_{IJKL} = \mathsf C_{KLIJ}\) evident in (44), thus \(\mathsf C\) can be stored as a symmetric \(6\times 6\) matrix, which has 21 unique entries. Along with 6 entries for \(\bm S\), this totals 27 entries of overhead compared to computing everything from \(\bm F\). This compares with 13 entries of overhead for direct storage of \(\{ \bm S, \bm C^{-1}, \log J \}\), which is sufficient for the Neo-Hookean model to avoid all but matrix products.