Lie group integrators for mechanical systems

Abstract

Since they were introduced in the 1990s, Lie group integrators have become a method of choice in many application areas. These include multibody dynamics, shape analysis, data science, image registration and biophysical simulations. Two important classes of intrinsic Lie group integrators are the Runge–Kutta–Munthe–Kaas methods and the commutator-free Lie group integrators. We give a short introduction to these classes of methods. The Hamiltonian framework is attractive for many mechanical problems, and in particular we shall consider Lie group integrators for problems on cotangent bundles of Lie groups where a number of different formulations are possible. There is a natural symplectic structure on such manifolds and through variational principles one may derive symplectic Lie group integrators. We also consider the practical aspects of the implementation of Lie group integrators, such as adaptive time stepping. The theory is illustrated by applying the methods to two nontrivial applications in mechanics. The first is the N-fold spherical pendulum, where we introduce the restriction of the adjoint action of the group $SE(3)$ to $TS^{2}$, the tangent bundle of the two-dimensional sphere. The second shows how Lie group integrators can be applied to model the controlled path of a payload being transported by two rotors. This problem is modelled on $\mathbb{R}^{6} \times (SO(3) \times \mathfrak{so}(3))^{2} \times (TS^{2})^{2}$ and put in a format where Lie group integrators can be applied.

1. Introduction

In many physical problems, including multi-body dynamics, the configuration space is not a linear space, but rather consists of a collection of rotations and translations. A simple example is the free rigid body, whose configuration space consists of rotations in 3D. A more advanced example is the simplified model of the human body, where the skeleton at a given time is described as a system of interacting rods and joints. Mathematically, the structure of such problems is usually best described as a manifold. Since manifolds by definition can be equipped with local coordinates, one can always describe and simulate such systems locally as if they were linear spaces. There are of course many choices of local coordinates; for rotations, some well-known ones are Euler angles, the Tait–Bryan angles commonly used in aerospace applications, the unit quaternions and the exponentiated skew-symmetric $3 \times 3$ matrices. Lie group integrators represent a somewhat different strategy. Rather than specifying a choice of local coordinates from the outset, in this approach the model and the numerical integrator are expressed entirely in terms of a Lie group and its action on the phase space. This often leads to a more abstract and simpler formulation of the mechanical system and of the numerical schemes, deferring further details to the implementation phase.

In the literature, one can find many different types and formats of Lie group integrators. Some of these are completely general and intrinsic, meaning that they only make use of inherent properties of Lie groups and manifolds as was suggested in [Citation6,Citation11,Citation41]. But many numerical methods have been suggested that add structure or utilize properties which are specific to a particular Lie group or manifold. Notable examples of this are the methods based on canonical coordinates of the second kind [Citation45], and the methods based on the Cayley transformation [Citation13,Citation32], applicable, e.g. to the rotation groups and Euclidean groups. In some applications, e.g. in multi-body systems, it may be useful to formulate the problem as a mix between Lie groups and kinematic constraints, introducing for instance Lagrange multipliers. Sometimes this may lead to more practical implementations where a basic general setup involving Lie groups can be further equipped with different choices of constraints depending on the particular application. Such constrained formulations are outside the scope of the present paper. It should also be noted that the Lie group integrators devised here do not make any a-priori assumptions about how the manifold is represented.

The applications of Lie group integrators for mechanical problems also have a long history; two of the early important contributions were the Newmark methods of Simo and Vu–Quoc [Citation50] and the symplectic and energy-momentum methods by Lewis and Simo [Citation32]. Mechanical systems are often described as Euler–Lagrange equations or as Hamiltonian systems on manifolds, with or without external forces [Citation28]. Important ideas for the discretization of mechanical systems also originated from the work of Moser and Veselov [Citation39,Citation51] on discrete integrable systems. This work served as motivation for further developments in the field of geometric mechanics and for the theory of (Lie group) discrete variational integrators [Citation20,Citation27,Citation30]. The majority of Lie group methods found in the literature are one-step type generalizations of classical methods, such as Runge–Kutta type formulas. In mechanical engineering, the classical BDF methods have played an important role and were recently generalized [Citation54] to Lie groups. Similarly, the celebrated α-method for linear spaces proposed by Hilber, Hughes and Taylor [Citation22] has been popular for solving problems in multibody dynamics, and in [Citation1,Citation2,Citation4] this method is generalized to a Lie group integrator.

The literature on Lie group integrators is rich and diverse; the interested reader may consult the surveys [Citation7,Citation10,Citation26,Citation46] and Chapter 4 of the monograph [Citation18] for further details.

In this paper, we discuss different ways of applying Lie group integrators to simulating the dynamics of mechanical multi-body systems. Our point of departure is the formulation of the models as differential equations on manifolds. Given either a Lie group acting transitively on the manifold $M$ or a set of frame vector fields on $M$, we use them to describe the mechanical system and further to build the numerical integrator. We shall here mostly consider schemes of the types commonly known as Crouch–Grossman methods [Citation11], Runge–Kutta–Munthe–Kaas methods [Citation40,Citation41] and commutator-free Lie group methods [Citation6].

The choice of Lie group action is often not unique and thus the same mechanical system can be described in different equivalent ways. Under numerical discretization, different formulations can lead to the conservation of different geometric properties of the mechanical system. In particular, we explore the effect of these different formulations on a selection of examples in multi-body dynamics. Lie group integrators have been successfully applied for the simulation of mechanical systems, and in problems of control, bio-mechanics and other engineering applications, see, e.g. [Citation9,Citation25,Citation27,Citation47]. The present work is motivated by applications in modelling and simulation of slender structures like Cosserat rods and beams [Citation50], and one of the examples presented here is the application to a chain of pendula. Another example considers an application for the controlled dynamics of a multibody system.

In Section 2, we give a review of the methods using only the essential intrinsic tools of Lie group integrators. The algorithms are simple and amenable to a coordinate-free description suited to object-oriented implementations. In Section 3, we discuss Hamiltonian systems on Lie groups, and we present three different Lie group formulations of the heavy top equations. These systems (and their Lagrangian counterpart) often arise in applications as building blocks of more realistic systems which also comprise damping and control forces. In Section 4, we discuss some ways of adapting the integration step size in time. In Section 5, we consider the application to a chain of pendula, and in Section 6 we consider a multi-body system of interest in the simulation and control of drone dynamics.

2. Lie group integrators

2.1. The formulation of differential equations on manifolds

Lie group integrators solve differential equations whose solutions evolve on a manifold $M$. For ease of notation, we restrict the discussion to the case of autonomous vector fields, although explicit $t$-dependence could easily be included. This means that we seek a curve $y(t) \in M$ whose tangent at any point coincides with a vector field $F \in \mathcal{X}(M)$ and which passes through a designated initial value $y_{0}$ at $t = t_{0}$, $\dot{y}(t) = F|_{y(t)}, \quad y(t_{0}) = y_{0}.$ (1) Before addressing numerical methods for solving (1), it is necessary to introduce a convenient way of representing the vector field $F$. There are different ways of doing this. One is to furnish $M$ with a transitive action $\psi : G \times M \to M$ by some Lie group $G$ of dimension $d \geq \dim M$. We denote the action of $g$ on $m$ as $g \cdot m$, i.e. $g \cdot m = \psi(g, m)$. Let $\mathfrak{g}$ be the Lie algebra of $G$, and denote by $\exp : \mathfrak{g} \to G$ the exponential map. We define $\psi_{*} : \mathfrak{g} \to \mathcal{X}(M)$ to be the infinitesimal generator of the action, i.e. $F_{\xi}|_{m} = \psi_{*}(\xi)|_{m} = \left.\frac{d}{dt}\right|_{t=0} \psi(\exp(t\xi), m).$ (2) The transitivity of the action now ensures that $\psi_{*}(\mathfrak{g})|_{m} = T_{m}M$ for any $m \in M$, such that any tangent vector $v_{m} \in T_{m}M$ can be represented as $v_{m} = \psi_{*}(\xi_{v})|_{m}$ for some $\xi_{v} \in \mathfrak{g}$ ($\xi_{v}$ may not be unique). Consequently, for any vector field $F \in \mathcal{X}(M)$ there exists a map $f : M \to \mathfrak{g}$ ¹ such that $F|_{m} = \psi_{*}(f(m))|_{m} \quad \text{for all } m \in M.$ (3) This is the original tool [Citation41] for representing a vector field on a manifold with a group action.
Another approach was used in [Citation11], where a set of frame vector fields $E_{1}, \dots, E_{d}$ in $\mathcal{X}(M)$ was introduced, assuming that for every $m \in M$, $\operatorname{span}\{E_{1}|_{m}, \dots, E_{d}|_{m}\} = T_{m}M.$ Then, for any vector field $F \in \mathcal{X}(M)$ there are, in general non-unique, functions $f_{i} : M \to \mathbb{R}$, which can be chosen with the same regularity as $F$, such that $F|_{m} = \sum_{i=1}^{d} f_{i}(m)\, E_{i}|_{m}.$ A fixed vector $\xi \in \mathbb{R}^{d}$ will define a vector field $F_{\xi}$ on $M$ similar to (2), $F_{\xi}|_{m} = \sum_{i=1}^{d} \xi_{i}\, E_{i}|_{m}.$ (4) If $\xi_{i} = f_{i}(p)$ for some $p \in M$, the corresponding $F_{\xi}$ will be a vector field in the linear span of the frame which coincides with $F$ at the point $p$. Crouch and Grossman [Citation11] called such a vector field the vector field frozen at $p$.

The two formulations just presented are in many cases connected and can then be used in an equivalent manner. Suppose that $e_{1}, \dots, e_{d}$ is a basis of the Lie algebra $\mathfrak{g}$; then we can simply define frame vector fields as $E_{i} = \psi_{*}(e_{i})$, and the vector field we aim to describe is $F|_{m} = \psi_{*}(f(m))|_{m} = \psi_{*}\big(\sum_{i} f_{i}(m)\, e_{i}\big)\big|_{m} = \sum_{i} f_{i}(m)\, E_{i}|_{m}.$ As mentioned above, there is a non-uniqueness issue when defining a vector field by means of a group action or a frame. A more fundamental description can be obtained using the machinery of connections. The assumption is that the simply connected manifold $M$ is equipped with a connection which is flat and has constant torsion. Then $F_{p}$, the frozen vector field of $F$ at $p$ defined above, can be defined as the unique element $F_{p} \in \mathcal{X}(M)$ satisfying

$F_{p}|_{p} = F|_{p}$
$\nabla_{X} F_{p} = 0$ for any $X \in \mathcal{X}(M)$.

That is, $F_{p}$ is the vector field that coincides with $F$ at $p$ and is parallel transported to any other point on $M$ by the connection $\nabla$. Since the connection is flat, the parallel transport from the point $p$ to another point $m \in M$ does not depend on the chosen path between the two points. For further details, see, e.g. Lundervold and Munthe-Kaas [Citation33].

Example 2.1

For mechanical systems on Lie groups, two important constructions are the adjoint and coadjoint representations. For every $g \in G$, there is an automorphism $\operatorname{Ad}_{g} : \mathfrak{g} \to \mathfrak{g}$ defined as $\operatorname{Ad}_{g}(\xi) = TL_{g} \circ TR_{g^{-1}}(\xi),$ where $L_{g}$ and $R_{g}$ are the left and right multiplications respectively, $L_{g}(h) = gh$ and $R_{g}(h) = hg$. Since $\operatorname{Ad}$ is a representation, i.e. $\operatorname{Ad}_{gh} = \operatorname{Ad}_{g} \circ \operatorname{Ad}_{h}$, it also defines a left Lie group action by $G$ on $\mathfrak{g}$. From this definition and a duality pairing $\langle \cdot, \cdot \rangle$ between $\mathfrak{g}$ and $\mathfrak{g}^{*}$, we can also derive a representation on $\mathfrak{g}^{*}$ denoted $\operatorname{Ad}_{g}^{*}$, simply by $\langle \operatorname{Ad}_{g}^{*}(\mu), \xi \rangle = \langle \mu, \operatorname{Ad}_{g}(\xi) \rangle, \quad \xi \in \mathfrak{g},\ \mu \in \mathfrak{g}^{*}.$ The action $g \cdot \mu = \operatorname{Ad}_{g^{-1}}^{*}(\mu)$ has infinitesimal generator given as $\psi_{*}(\xi)|_{\mu} = -\operatorname{ad}_{\xi}^{*}\mu.$ Following [Citation35], for a Hamiltonian $H : T^{*}G \to \mathbb{R}$, define $H^{-}$ to be its restriction to $\mathfrak{g}^{*}$. Then the Lie–Poisson reduction of the dynamical system is defined on $\mathfrak{g}^{*}$ as $\dot{\mu} = -\operatorname{ad}_{\frac{\partial H^{-}}{\partial \mu}}^{*}\mu,$ and this vector field is precisely of the form (3) with $f(\mu) = \frac{\partial H^{-}}{\partial \mu}(\mu)$. A consequence of this is that the integral curves of these Lie–Poisson systems preserve coadjoint orbits, making the coadjoint action an attractive choice for Lie group integrators.

Let us now detail the situation for the very simple case where $G = SO(3)$. The Lie algebra $\mathfrak{so}(3)$ can be modelled as $3 \times 3$ skew-symmetric matrices, and via the standard basis we identify each such matrix $\hat{\xi}$ with a vector $\xi \in \mathbb{R}^{3}$; this identification is known as the hat map, $\hat{\xi} = \begin{bmatrix} 0 & -\xi_{3} & \xi_{2} \\ \xi_{3} & 0 & -\xi_{1} \\ -\xi_{2} & \xi_{1} & 0 \end{bmatrix}.$ (5) Now, we also write the elements of $\mathfrak{so}(3)^{*}$ as vectors in $\mathbb{R}^{3}$ with duality pairing $\langle \mu, \xi \rangle = \mu^{T}\xi$. With these representations, we find that the coadjoint action can be expressed as $g \cdot \mu = \psi(g, \mu) = \operatorname{Ad}_{g^{-1}}^{*}\mu = g\mu,$ the rightmost expression being a simple matrix-vector multiplication. Since $g$ is orthogonal, it follows that the coadjoint orbits foliate 3-space into spherical shells, and the coadjoint action is transitive on each of these orbits. The free rigid body can be cast as a problem on $T^{*}SO(3)$ with a left invariant Hamiltonian which reduces to the function $H^{-}(\mu) = \frac{1}{2}\langle \mu, I^{-1}\mu \rangle$ on $\mathfrak{so}(3)^{*}$, where $I : \mathfrak{so}(3) \to \mathfrak{so}(3)^{*}$ is the inertia tensor. From this, we can now set $f(\mu) = \partial H^{-}/\partial \mu = I^{-1}\mu$. We then recover the Euler free rigid body equation as $\dot{\mu} = \psi_{*}(f(\mu))|_{\mu} = -\operatorname{ad}_{I^{-1}\mu}^{*}\mu = -I^{-1}\mu \times \mu,$ where the last expression involves the cross product of vectors in $\mathbb{R}^{3}$.
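The identifications in this example can be checked numerically. The following Python sketch assumes NumPy; the helper names `hat`, `expm_so3` and `euler_rhs`, the inertia values and the test vectors are illustrative choices of ours and not part of the original text. It implements the hat map (5), evaluates the exponential with Rodrigues' formula, and applies the coadjoint action $g \cdot \mu = g\mu$.

```python
import numpy as np

def hat(xi):
    """Hat map (5): identify xi in R^3 with a skew-symmetric 3x3 matrix."""
    x1, x2, x3 = xi
    return np.array([[0.0, -x3, x2],
                     [x3, 0.0, -x1],
                     [-x2, x1, 0.0]])

def expm_so3(xi):
    """exp(hat(xi)) in SO(3), computed with Rodrigues' formula."""
    theta = np.linalg.norm(xi)
    K = hat(xi)
    if theta < 1e-8:
        # Taylor fallback near the identity to avoid dividing by theta.
        return np.eye(3) + K + 0.5 * (K @ K)
    return (np.eye(3)
            + (np.sin(theta) / theta) * K
            + ((1.0 - np.cos(theta)) / theta**2) * (K @ K))

def euler_rhs(mu, inertia):
    """Right-hand side -I^{-1} mu x mu of the Euler equation (diagonal I)."""
    return -np.cross(mu / inertia, mu)

inertia = np.array([1.0, 2.0, 3.0])   # hypothetical principal moments of inertia
xi = np.array([0.3, -0.2, 0.5])
mu = np.array([1.0, 2.0, -1.0])

g = expm_so3(xi)
mu_moved = g @ mu                      # coadjoint action g . mu = g mu
```

In this sketch `hat(xi) @ mu` agrees with `np.cross(xi, mu)`, the matrix `g` is orthogonal, and `mu_moved` has the same Euclidean norm as `mu`, which is the statement that the action stays on a spherical coadjoint orbit.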

2.2. Two classes of Lie group integrators

The simplest numerical integrator for linear spaces is the explicit Euler method. Given an initial value problem $\dot{y} = F(y)$, $y(0) = y_{0}$, the method is defined as $y_{n+1} = y_{n} + hF(y_{n})$ for some stepsize $h$. In the spirit of the previous section, one could think of the Euler method as the $h$-flow of the constant vector field $F_{y_{n}}(y) = F(y_{n})$, that is, $y_{n+1} = \exp(hF_{y_{n}})\, y_{n}.$ This definition of the Euler method makes sense also when $F$ is replaced by a vector field on some manifold. In this general situation, it is known as the Lie–Euler method.
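A minimal sketch of the Lie–Euler method for the free rigid body of Example 2.1 is given below, assuming NumPy. The names `rotate`, `lie_euler`, `explicit_euler`, the inertia values and the initial value are hypothetical. As a sign convention for the sketch we take the generator of the action $g \cdot \mu = g\mu$ to be $\xi \times \mu$ and therefore use $f(\mu) = -I^{-1}\mu$, so that the frozen vector field reproduces the right-hand side $-I^{-1}\mu \times \mu$.

```python
import numpy as np

def rotate(xi, v):
    """Apply exp(hat(xi)) to v using Rodrigues' formula (cross products only)."""
    theta = np.linalg.norm(xi)
    if theta < 1e-12:
        return v + np.cross(xi, v)
    k = np.cross(xi, v)
    return (v + (np.sin(theta) / theta) * k
            + ((1.0 - np.cos(theta)) / theta**2) * np.cross(xi, k))

INERTIA = np.array([1.0, 2.0, 3.0])   # hypothetical principal moments of inertia

def f(mu):
    """Map M -> so(3) ~ R^3 such that mu' = f(mu) x mu = -I^{-1} mu x mu."""
    return -mu / INERTIA

def lie_euler(mu0, h, n_steps):
    """Lie-Euler: follow the h-flow of the vector field frozen at mu_n."""
    mu = np.array(mu0, dtype=float)
    for _ in range(n_steps):
        mu = rotate(h * f(mu), mu)
    return mu

def explicit_euler(mu0, h, n_steps):
    """Classical explicit Euler in R^3, for comparison."""
    mu = np.array(mu0, dtype=float)
    for _ in range(n_steps):
        mu = mu + h * np.cross(f(mu), mu)
    return mu

mu0 = np.array([1.0, 0.5, -0.2])
mu_lie = lie_euler(mu0, 0.01, 1000)
mu_euler = explicit_euler(mu0, 0.01, 1000)
```

Each Lie–Euler step is a rotation, so `mu_lie` stays on the sphere through `mu0` up to rounding, whereas every explicit Euler step adds a tangential increment and the norm of `mu_euler` drifts upwards.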

We shall here consider the two classes of methods known as Runge–Kutta–Munthe–Kaas (RKMK) methods and Commutator-free Lie group methods.

For RKMK methods, the underlying idea is to transform the problem from the manifold $M$ to the Lie algebra $\mathfrak{g}$, take a time step, and map the result back to $M$. The transformation we use is $y(t) = \exp(\sigma(t)) \cdot y_{0}, \quad \sigma(0) = 0.$ The transformed differential equation for $\sigma(t)$ makes use of the derivative of the exponential mapping; the reader should consult [Citation41] for details about the derivation. We give the final result, $\dot{\sigma}(t) = \operatorname{dexp}_{\sigma(t)}^{-1}\big(f(\exp(\sigma(t)) \cdot y_{0})\big).$ (6) The map $v \mapsto \operatorname{dexp}_{u}(v)$ is linear and invertible when $u$ belongs to some sufficiently small neighbourhood of $0 \in \mathfrak{g}$. It has an expansion in nested Lie brackets [Citation21]. Using the operator $\operatorname{ad}_{u}(v) = [u, v]$ and its powers $\operatorname{ad}_{u}^{2} v = [u, [u, v]]$, etc., one can write $\operatorname{dexp}_{u}(v) = \left.\frac{e^{z} - 1}{z}\right|_{z = \operatorname{ad}_{u}}(v) = v + \frac{1}{2}[u, v] + \frac{1}{6}[u, [u, v]] + \dots$ (7) and the inverse is $\operatorname{dexp}_{u}^{-1}(v) = \left.\frac{z}{e^{z} - 1}\right|_{z = \operatorname{ad}_{u}}(v) = v - \frac{1}{2}[u, v] + \frac{1}{12}[u, [u, v]] + \dots$ (8) The RKMK methods are now obtained simply by applying some standard Runge–Kutta method to the transformed Equation (6) with a time step $h$, using initial value $\sigma(0) = 0$. This leads to an output $\sigma_{1} \in \mathfrak{g}$ and one simply sets $y_{1} = \exp(\sigma_{1}) \cdot y_{0}$. Then one repeats the procedure replacing $y_{0}$ by $y_{1}$ in the next step, etc. While solving (6), one needs to evaluate $\operatorname{dexp}_{u}^{-1}(v)$ as a part of the process.
This can be done by truncating the series (8), since $\sigma(0) = 0$ implies that we always evaluate $\operatorname{dexp}_{u}^{-1}$ with $u = O(h)$, and thus the $k$th iterated commutator satisfies $\operatorname{ad}_{u}^{k} = O(h^{k})$. For a given Runge–Kutta method, there are some clever tricks that can be done to minimize the total number of commutators to be included from the expansion of $\operatorname{dexp}_{u}^{-1} v$ [Citation5,Citation42]. We give here one concrete example of an RKMK method proposed in [Citation5]: $\begin{aligned} f_{n,1} & = h f(y_{n}), \\ f_{n,2} & = h f\big(\exp(\tfrac{1}{2} f_{n,1}) \cdot y_{n}\big), \\ f_{n,3} & = h f\big(\exp(\tfrac{1}{2} f_{n,2} - \tfrac{1}{8}[f_{n,1}, f_{n,2}]) \cdot y_{n}\big), \\ f_{n,4} & = h f\big(\exp(f_{n,3}) \cdot y_{n}\big), \\ y_{n+1} & = \exp\big(\tfrac{1}{6}(f_{n,1} + 2 f_{n,2} + 2 f_{n,3} + f_{n,4} - \tfrac{1}{2}[f_{n,1}, f_{n,4}])\big) \cdot y_{n}. \end{aligned}$ The other option is to compute the exact expression for $\operatorname{dexp}_{u}^{-1}(v)$ for the particular Lie algebra we use. For instance, it was shown in [Citation8] that for the Lie algebra $\mathfrak{so}(3)$ one has $\operatorname{dexp}_{u}^{-1}(v) = v - \frac{1}{2} u \times v + \alpha^{-2}\big(1 - \frac{\alpha}{2}\cot\frac{\alpha}{2}\big)\, u \times (u \times v),$ with $\alpha = \|u\|_{2}$. We will present the corresponding formula for $\mathfrak{se}(3)$ in Section 2.3.
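The RKMK scheme quoted above can be sketched in a few lines for $G = SO(3)$ acting on $\mathbb{R}^{3}$, where $\mathfrak{so}(3) \cong \mathbb{R}^{3}$ and the Lie bracket becomes the cross product. The code below assumes NumPy and the free rigid body as test problem; the function names, the inertia values, the initial value and the sign convention $f(\mu) = -I^{-1}\mu$ (generator $\xi \times \mu$) are our own assumptions for illustration.

```python
import numpy as np

def rotate(xi, v):
    """Apply exp(hat(xi)) to v using Rodrigues' formula (cross products only)."""
    theta = np.linalg.norm(xi)
    if theta < 1e-12:
        return v + np.cross(xi, v)
    k = np.cross(xi, v)
    return (v + (np.sin(theta) / theta) * k
            + ((1.0 - np.cos(theta)) / theta**2) * np.cross(xi, k))

INERTIA = np.array([1.0, 2.0, 3.0])   # hypothetical principal moments of inertia

def f(mu):
    """Map M -> so(3) ~ R^3 such that mu' = f(mu) x mu = -I^{-1} mu x mu."""
    return -mu / INERTIA

def rkmk4_step(f, y, h):
    """One step of the RKMK scheme above; [u, v] is np.cross(u, v) on so(3)."""
    f1 = h * f(y)
    f2 = h * f(rotate(0.5 * f1, y))
    f3 = h * f(rotate(0.5 * f2 - 0.125 * np.cross(f1, f2), y))
    f4 = h * f(rotate(f3, y))
    sigma = (f1 + 2.0 * f2 + 2.0 * f3 + f4 - 0.5 * np.cross(f1, f4)) / 6.0
    return rotate(sigma, y)

def integrate(step, f, y0, h, n_steps):
    """Apply n_steps steps of the given one-step method with stepsize h."""
    y = np.array(y0, dtype=float)
    for _ in range(n_steps):
        y = step(f, y, h)
    return y

y0 = np.array([1.0, 0.5, -0.2])
y_end = integrate(rkmk4_step, f, y0, 0.05, 20)   # approximate solution at t = 1
```

Since every stage and the update are rotations of `y`, the numerical solution remains on the sphere through `y0`; halving the stepsize should reduce the error at $t = 1$ by roughly a factor of 16 if the scheme behaves as a fourth-order method on this problem.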

The second class of Lie group integrators to be considered here are the commutator-free methods, named this way in [Citation6] to emphasize the contrast to RKMK schemes which usually include commutators in the method format. These schemes include the Crouch–Grossman methods [Citation11] and they have the format $\begin{aligned} Y_{n, r} & = \exp (h \sum_{k} α_{r, J}^{k} f_{n, k}) \dots \exp (h \sum_{k} α_{r, 1}^{k} f_{n, k}) \cdot y_{n} \\ f_{n, r} & = f (Y_{n, r}) \\ y_{n + 1} & = \exp (h \sum_{k} β_{J}^{k} f_{n, k}) \dots \exp (h \sum_{k} β_{1}^{k} f_{n, k}) \cdot y_{n} \end{aligned}$ Here the Runge–Kutta coefficients $α_{r, j}^{k}$ , $β_{j}^{r}$ are related to a classical Runge–Kutta scheme with coefficients $a_{r}^{k}$ , $b_{r}$ in that $a_{r}^{k} = \sum_{j} α_{r, j}^{k}$ and $b_{r} = \sum_{j} β_{j}^{r}$ . The $α_{r, j}^{k}$ , $β_{j}^{r}$ are usually chosen to obtain computationally inexpensive schemes with the highest possible order of convergence. The computational complexity of the above schemes depends on the cost of computing an exponential as well as of evaluating the vector field. Therefore it makes sense to keep the number of exponentials J in each stage as low as possible, and possibly also the number of stages s. A trick proposed in [Citation6] was to select coefficients that make it possible to reuse exponentials from one stage to another. This is perhaps best illustrated through the following example from [Citation6], a generalization of the classical fourth-order Runge–Kutta method. 
$\begin{aligned} Y_{n,1} & = y_{n} \\ Y_{n,2} & = \exp(\tfrac{1}{2} h f_{n,1}) \cdot y_{n} \\ Y_{n,3} & = \exp(\tfrac{1}{2} h f_{n,2}) \cdot y_{n} \\ Y_{n,4} & = \exp(h f_{n,3} - \tfrac{1}{2} h f_{n,1}) \cdot Y_{n,2} \\ y_{n+\frac{1}{2}} & = \exp\big(\tfrac{1}{12} h (3 f_{n,1} + 2 f_{n,2} + 2 f_{n,3} - f_{n,4})\big) \cdot y_{n} \\ y_{n+1} & = \exp\big(\tfrac{1}{12} h (- f_{n,1} + 2 f_{n,2} + 2 f_{n,3} + 3 f_{n,4})\big) \cdot y_{n+\frac{1}{2}} \end{aligned}$ (9) where $f_{n,i} = f(Y_{n,i})$. Here, we see that one exponential is saved in computing $Y_{n,4}$ by making use of $Y_{n,2}$.
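Scheme (9) translates almost line by line into code. The sketch below assumes NumPy and again uses the free rigid body with $SO(3)$ acting on $\mathbb{R}^{3}$ as a test problem; the names `rotate`, `cf4_step`, `integrate`, the inertia values and the sign convention $f(\mu) = -I^{-1}\mu$ are illustrative assumptions of ours.

```python
import numpy as np

def rotate(xi, v):
    """Apply exp(hat(xi)) to v using Rodrigues' formula (cross products only)."""
    theta = np.linalg.norm(xi)
    if theta < 1e-12:
        return v + np.cross(xi, v)
    k = np.cross(xi, v)
    return (v + (np.sin(theta) / theta) * k
            + ((1.0 - np.cos(theta)) / theta**2) * np.cross(xi, k))

INERTIA = np.array([1.0, 2.0, 3.0])   # hypothetical principal moments of inertia

def f(mu):
    """Map M -> so(3) ~ R^3 such that mu' = f(mu) x mu = -I^{-1} mu x mu."""
    return -mu / INERTIA

def cf4_step(f, y, h):
    """One step of the commutator-free scheme (9)."""
    f1 = f(y)                                   # Y_{n,1} = y_n
    Y2 = rotate(0.5 * h * f1, y)
    f2 = f(Y2)
    Y3 = rotate(0.5 * h * f2, y)
    f3 = f(Y3)
    Y4 = rotate(h * f3 - 0.5 * h * f1, Y2)      # reuses Y2: one exponential saved
    f4 = f(Y4)
    y_half = rotate(h / 12.0 * (3.0 * f1 + 2.0 * f2 + 2.0 * f3 - f4), y)
    return rotate(h / 12.0 * (-f1 + 2.0 * f2 + 2.0 * f3 + 3.0 * f4), y_half)

def integrate(step, f, y0, h, n_steps):
    """Apply n_steps steps of the given one-step method with stepsize h."""
    y = np.array(y0, dtype=float)
    for _ in range(n_steps):
        y = step(f, y, h)
    return y

y0 = np.array([1.0, 0.5, -0.2])
y_end = integrate(cf4_step, f, y0, 0.05, 20)    # approximate solution at t = 1
```

No commutators appear in `cf4_step`; the price is the composition of two exponentials in the final update, and the reuse of `Y2` in the fourth stage keeps the total at five exponentials per step in this sketch.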

2.3. An exact expression for $\operatorname{dexp}_{u}^{-1}(v)$ in $\mathfrak{se}(3)$

As an alternative to using a truncated version of the infinite series (8) for $\operatorname{dexp}_{u}^{-1}$, one can consider exact expressions obtained for certain Lie algebras. Since $\mathfrak{se}(3)$ is particularly important in applications to mechanics, we give here its exact expression. For this, we represent elements of $\mathfrak{se}(3)$ as a pair $(A, a) \in \mathbb{R}^{3} \times \mathbb{R}^{3} \cong \mathbb{R}^{6}$, the first component corresponding to a skew-symmetric matrix $\hat{A}$ via (5), and $a$ is the translational part. Now, let $\varphi(z)$ be a real analytic function at $z = 0$. We define $\varphi_{+}(z) = \frac{\varphi(iz) + \varphi(-iz)}{2}, \quad \varphi_{-}(z) = \frac{\varphi(iz) - \varphi(-iz)}{2i}.$ We next define the four functions $g_{1}(z) = \frac{\varphi_{-}(z)}{z}, \quad \tilde{g}_{1}(z) = \frac{g_{1}'(z)}{z}, \quad g_{2}(z) = \frac{\varphi(0) - \varphi_{+}(z)}{z^{2}}, \quad \tilde{g}_{2}(z) = \frac{g_{2}'(z)}{z}$ and the two scalars $\rho = A^{T}a$, $\alpha = \|A\|_{2}$.
One can show that for any $(A, a)$ and $(B, b)$ in $\mathfrak{se}(3)$, it holds that $\varphi(\operatorname{ad}_{(A, a)})(B, b) = (C, c)$ where $\begin{aligned} C & = \varphi(0) B + g_{1}(\alpha)\, A \times B + g_{2}(\alpha)\, A \times (A \times B), \\ c & = \varphi(0) b + g_{1}(\alpha)\, (a \times B + A \times b) + \rho\, \tilde{g}_{1}(\alpha)\, A \times B + \rho\, \tilde{g}_{2}(\alpha)\, A \times (A \times B) \\ & \quad + g_{2}(\alpha)\, \big(a \times (A \times B) + A \times (a \times B) + A \times (A \times b)\big). \end{aligned}$ Considering for instance (8), we may now use $\varphi(z) = \frac{z}{e^{z} - 1}$ to calculate $g_{1}(z) = -\frac{1}{2}, \quad \tilde{g}_{1}(z) = 0, \quad g_{2}(z) = \frac{1 - \frac{z}{2}\cot\frac{z}{2}}{z^{2}}, \quad \tilde{g}_{2}(z) = \frac{1}{z}\frac{d}{dz} g_{2}(z), \quad \varphi(0) = 1,$ and thereby obtain an expression for $\operatorname{dexp}_{(A, a)}^{-1}(B, b)$ with the formula above.
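The closed form for $\operatorname{dexp}_{(A,a)}^{-1}(B,b)$ can be compared against the truncated series (8). The following sketch assumes NumPy; all function names are hypothetical, the derivative of $g_{2}$ was worked out by hand for this illustration, and the Taylor branch near $\alpha = 0$ (added to avoid cancellation) uses the Bernoulli-number expansion of $\frac{z}{2}\cot\frac{z}{2}$.

```python
import numpy as np

def ad_se3(u, v):
    """Bracket [(A, a), (B, b)] = (A x B, A x b + a x B) on se(3) ~ R^3 x R^3."""
    A, a = u
    B, b = v
    return np.cross(A, B), np.cross(A, b) + np.cross(a, B)

def g2_and_g2_tilde(z):
    """g_2(z) and g2-tilde(z) = g_2'(z)/z for phi(z) = z / (e^z - 1)."""
    if abs(z) < 1e-2:
        # Taylor expansions, used to avoid cancellation near z = 0.
        g2 = 1.0 / 12 + z**2 / 720 + z**4 / 30240
        g2t = 1.0 / 360 + z**2 / 7560 + z**4 / 201600
        return g2, g2t
    half = 0.5 * z
    cot = np.cos(half) / np.sin(half)
    h = half * cot                                   # (z/2) cot(z/2)
    dh = 0.5 * cot - 0.25 * z / np.sin(half)**2      # derivative of h
    g2 = (1.0 - h) / z**2
    g2t = (-dh / z**2 - 2.0 * (1.0 - h) / z**3) / z
    return g2, g2t

def dexpinv_se3(u, v):
    """Closed-form dexp^{-1}_{(A,a)}(B,b), following the formula in the text."""
    A, a = u
    B, b = v
    alpha = np.linalg.norm(A)
    rho = A @ a
    g1 = -0.5                                        # g1-tilde vanishes
    g2, g2t = g2_and_g2_tilde(alpha)
    AxB = np.cross(A, B)
    AxAxB = np.cross(A, AxB)
    C = B + g1 * AxB + g2 * AxAxB
    c = (b + g1 * (np.cross(a, B) + np.cross(A, b))
         + rho * g2t * AxAxB
         + g2 * (np.cross(a, AxB) + np.cross(A, np.cross(a, B))
                 + np.cross(A, np.cross(A, b))))
    return C, c

def dexpinv_series(u, v):
    """Series (8) truncated after the ad^8 term (coefficients B_k / k!)."""
    coeffs = {1: -1.0 / 2, 2: 1.0 / 12, 4: -1.0 / 720,
              6: 1.0 / 30240, 8: -1.0 / 1209600}
    C, c = v
    term = v
    for k in range(1, 9):
        term = ad_se3(u, term)
        w = coeffs.get(k, 0.0)
        C = C + w * term[0]
        c = c + w * term[1]
    return C, c

u = (np.array([0.1, -0.2, 0.15]), np.array([0.3, 0.1, -0.2]))
v = (np.array([0.5, 0.4, -0.3]), np.array([-0.2, 0.6, 0.1]))
C_exact, c_exact = dexpinv_se3(u, v)
C_series, c_series = dexpinv_series(u, v)
```

For arguments of moderate size such as the `u` chosen here, the two evaluations should agree to many digits; the closed form needs a fixed handful of cross products, while the series needs one bracket per retained term.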

Similar types of formulas are known for computing the matrix exponential as well as functions of the $\operatorname{ad}$-operator for several other Lie groups of small and medium dimension. For instance in [Citation34], a variety of coordinate mappings for rigid body motions are discussed. For Lie algebras of larger dimension, both the exponential mapping and $\operatorname{dexp}_{u}^{-1}$ may become computationally infeasible. For these cases, one may benefit from replacing the exponential by some other coordinate map for the Lie group, $\phi : \mathfrak{g} \to G$. One option is to use canonical coordinates of the second kind [Citation45]. Moreover, for some Lie groups such as the orthogonal, unitary and symplectic groups, there exist other maps that can be used and which are computationally less expensive. A popular choice is the Cayley transformation [Citation13].

3. Hamiltonian systems on Lie groups

In this section, we consider Hamiltonian systems on Lie groups. These systems (and their Lagrangian counterpart) often appear in mechanics applications as building blocks for more realistic systems with additional damping and control forces. We consider canonical systems on the cotangent bundle of a Lie group and Lie–Poisson systems which can arise by symmetry reduction or otherwise. We illustrate various cases with different formulations of the heavy top system.

3.1. Semi-direct products

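The coadjoint action by $G$ on $\mathfrak{g}^{*}$ is denoted $\operatorname{Ad}_{g}^{*}$ and is defined for any $g \in G$ by $\langle \operatorname{Ad}_{g}^{*}\mu, \xi \rangle = \langle \mu, \operatorname{Ad}_{g}\xi \rangle, \quad \forall \xi \in \mathfrak{g},$ (10) where $\operatorname{Ad}_{g} : \mathfrak{g} \to \mathfrak{g}$ is the adjoint representation and $\langle \cdot, \cdot \rangle$ is a duality pairing between $\mathfrak{g}^{*}$ and $\mathfrak{g}$.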

We consider the cotangent bundle $T^{*}G$ of a Lie group $G$ and identify it with $G \times \mathfrak{g}^{*}$ using the right multiplication $R_{g} : G \to G$ and its tangent mapping $R_{g*} := TR_{g}$. The cartesian product $G \times \mathfrak{g}^{*}$ can be given a semi-direct product structure that turns it into a Lie group $\mathbf{G} := G \ltimes \mathfrak{g}^{*}$, where the group multiplication is $(g_{1}, \mu_{1}) \cdot (g_{2}, \mu_{2}) = (g_{1} \cdot g_{2},\ \mu_{1} + \operatorname{Ad}_{g_{1}^{-1}}^{*}\mu_{2}).$ (11) Acting by left multiplication, any vector field $F \in \mathcal{X}(\mathbf{G})$ is expressed by means of a map $f : \mathbf{G} \to T_{e}\mathbf{G}$, $F(g, \mu) = T_{e}R_{(g, \mu)} f(g, \mu) = (R_{g*} f_{1},\ f_{2} - \operatorname{ad}_{f_{1}}^{*}\mu),$ (12) where $f_{1} = f_{1}(g, \mu) \in \mathfrak{g}$ and $f_{2} = f_{2}(g, \mu) \in \mathfrak{g}^{*}$ are the two components of $f$.
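To make the group law (11) concrete, the sketch below specializes it to $G = SO(3)$, where, with the identifications of Example 2.1, $\operatorname{Ad}_{g^{-1}}^{*}\mu = g\mu$. NumPy is assumed; the names `multiply`, `inverse`, `IDENTITY` and the sample elements are hypothetical and serve only as an illustration.

```python
import numpy as np

def hat(xi):
    """Hat map: identify xi in R^3 with a skew-symmetric 3x3 matrix."""
    x1, x2, x3 = xi
    return np.array([[0.0, -x3, x2],
                     [x3, 0.0, -x1],
                     [-x2, x1, 0.0]])

def expm_so3(xi):
    """exp(hat(xi)) in SO(3), computed with Rodrigues' formula."""
    theta = np.linalg.norm(xi)
    K = hat(xi)
    if theta < 1e-8:
        return np.eye(3) + K + 0.5 * (K @ K)
    return (np.eye(3)
            + (np.sin(theta) / theta) * K
            + ((1.0 - np.cos(theta)) / theta**2) * (K @ K))

def multiply(x, y):
    """Group law (11) on SO(3) x so(3)^*, using Ad*_{g^{-1}} mu = g mu."""
    g1, mu1 = x
    g2, mu2 = y
    return g1 @ g2, mu1 + g1 @ mu2

def inverse(x):
    """Inverse element: (g, mu)^{-1} = (g^T, -g^T mu)."""
    g, mu = x
    return g.T, -(g.T @ mu)

IDENTITY = (np.eye(3), np.zeros(3))

x = (expm_so3(np.array([0.2, 0.1, -0.3])), np.array([1.0, 0.0, 2.0]))
y = (expm_so3(np.array([-0.4, 0.5, 0.1])), np.array([0.5, -1.0, 0.3]))
z = (expm_so3(np.array([0.7, -0.2, 0.6])), np.array([-0.3, 0.8, 1.1]))

left = multiply(multiply(x, y), z)    # (x y) z
right = multiply(x, multiply(y, z))   # x (y z)
```

The two products `left` and `right` coincide up to rounding, and `multiply(x, inverse(x))` returns `IDENTITY`, as expected of a group law. For $SO(3)$ the multiplication has the same form as the one on $SE(3)$.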

3.2. Symplectic form and Hamiltonian vector fields

The right trivialized² symplectic form pulled back to $\mathbf{G} = G \ltimes \mathfrak{g}^{*}$ reads $\omega_{(g, \mu)}\big((R_{g*}\xi_{1}, \delta\nu_{1}), (R_{g*}\xi_{2}, \delta\nu_{2})\big) = \langle \delta\nu_{2}, \xi_{1} \rangle - \langle \delta\nu_{1}, \xi_{2} \rangle - \langle \mu, [\xi_{1}, \xi_{2}] \rangle, \quad \xi_{1}, \xi_{2} \in \mathfrak{g}.$ (13) See [Citation32] for more details, proofs and for the left trivialized symplectic form. The vector field $F$ is a Hamiltonian vector field if it satisfies $i_{F}\omega = dH$ for some Hamiltonian function $H : T^{*}G \to \mathbb{R}$, where $i_{F}$ is defined as $i_{F}(X) := \omega(F, X)$ for any vector field $X$. This implies that the map $f$ for such a Hamiltonian vector field takes the form $f(g, \mu) = \Big(\frac{\partial H}{\partial \mu}(g, \mu),\ -R_{g}^{*}\frac{\partial H}{\partial g}(g, \mu)\Big).$ (14) The following is a one-parameter family of symplectic Lie group integrators on $T^{*}G$: $M_{\theta} = \operatorname{dexp}_{-\xi}^{*}\big(\mu_{0} + \operatorname{Ad}_{\exp(\theta\xi)}^{*}(\bar{n})\big) - \theta\, \operatorname{dexp}_{-\theta\xi}^{*} \operatorname{Ad}_{\exp(\theta\xi)}^{*}(\bar{n}),$ (15) $(\xi, \bar{n}) = h f\big(\exp(\theta\xi) \cdot g_{0},\ M_{\theta}\big),$ (16) $(g_{1}, \mu_{1}) = \big(\exp(\xi),\ \operatorname{Ad}_{\exp((\theta - 1)\xi)}^{*}\bar{n}\big) \cdot (g_{0}, \mu_{0}).$ (17) For higher order integrators of this type and a complete treatment, see [Citation3].

3.3. Reduced equations: Lie–Poisson systems

A mechanical system formulated on the cotangent bundle $T^{*}G$ with a left or right invariant Hamiltonian can be reduced to a system on $\mathfrak{g}^{*}$ [Citation36]. In fact, for a Hamiltonian $H$ right invariant under the left action of $G$, $\frac{\partial H}{\partial g} = 0$, and from (12) and (14) we get for the second equation $\dot{\mu} = \mp \operatorname{ad}_{\frac{\partial H}{\partial \mu}}^{*}\mu,$ (18) where the positive sign is used in case of left invariance (see, e.g. Section 13.4 in [Citation37]). The solution to this system preserves coadjoint orbits; thus using the Lie group action $g \cdot \mu = \operatorname{Ad}_{g^{-1}}^{*}\mu$ to build a Lie group integrator results in preservation of such coadjoint orbits. Lie group integrators for this interesting case were studied in [Citation15].

The Lagrangian counterpart to these Hamiltonian equations are the Euler–Poincaré equations³ [Citation24].

3.4. Three different formulations of the heavy top equations

The heavy top is a simple test example for illustrating the behaviour of Lie group methods. We will consider three different formulations for this mechanical system. The first formulation is on $T^{*}SO(3)$, where the equations are canonical Hamiltonian; a second point of view is that the system is a Lie–Poisson system on $\mathfrak{se}(3)^{*}$; and finally it is canonical Hamiltonian on a larger group with a quadratic Hamiltonian function. The three different formulations suggest the use of different Lie group integrators.

3.4.1. Heavy top equations on $T^{*}SO(3)$

The heavy top is a rigid body with a fixed point in a gravitational field. The phase space of this mechanical system is $T^{*}SO(3)$, where the equations of the heavy top are in canonical Hamiltonian form. Assuming $(Q, p)$ are coordinates for $T^{*}SO(3)$, $\Pi = (T_{e}L_{Q})^{*}(p)$ is the left trivialized or body momentum. The Hamiltonian of the heavy top is given in terms of $(Q, \Pi)$ as $H : SO(3) \ltimes \mathfrak{so}(3)^{*} \to \mathbb{R}, \quad H(Q, \Pi) = \frac{1}{2}\langle \Pi, I^{-1}\Pi \rangle + Mg\ell\, \Gamma \cdot X, \quad \Gamma = Q^{-1}\Gamma_{0},$ where $I : \mathfrak{so}(3) \to \mathfrak{so}(3)^{*}$ is the inertia tensor, here represented as a diagonal $3 \times 3$ matrix, $\Gamma_{0} \in \mathbb{R}^{3}$ is the axis of the spatial coordinate system parallel to the direction of gravity but pointing upwards, $M$ is the mass of the body, $g$ is the gravitational acceleration, $X$ is the body fixed unit vector of the oriented line segment pointing from the fixed point to the centre of mass of the body, and $\ell$ is the length of this segment. The equations of motion on $SO(3) \ltimes \mathfrak{so}(3)^{*}$ are $\dot{\Pi} = \Pi \times I^{-1}\Pi + Mg\ell\, \Gamma \times X,$ (19) $\dot{Q} = Q\, \widehat{I^{-1}\Pi}.$ (20) The identification of $T^{*}SO(3)$ with $SO(3) \ltimes \mathfrak{so}(3)^{*}$ via right trivialization leads to the spatial momentum variable $\pi = (T_{e}R_{Q})^{*}(p) = Q\Pi$. The equations written in the space variables $(Q, \pi)$ take the form $\dot{\pi} = Mg\ell\, \Gamma_{0} \times QX,$ (21) $\dot{Q} = \hat{\omega} Q, \quad \omega = Q I^{-1} Q^{T}\pi,$ (22) where the first equation states that the component of $\pi$ parallel to $\Gamma_{0}$ is constant in time.
These equations can be obtained from (12) and (14) on the right trivialized $T^{*}SO(3)$, i.e. $SO(3) \ltimes \mathfrak{so}(3)^{*}$, with the heavy top Hamiltonian, and the symplectic Lie group integrators (16)–(17) can be applied in this case. Similar methods were proposed in [Citation32] and [Citation49].
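As a consistency check of the two sets of equations, the sketch below evaluates the body form (19)–(20) and the spatial form (21)–(22) at one sample state and compares them through $\pi = Q\Pi$. NumPy is assumed; the parameter values `INERTIA`, `MGL`, `CHI`, `GAMMA0`, the sample state and all function names are hypothetical choices made for illustration.

```python
import numpy as np

INERTIA = np.array([1.0, 2.0, 3.0])    # hypothetical diagonal inertia tensor I
MGL = 1.5                              # hypothetical value of the product M g l
CHI = np.array([0.0, 0.0, 1.0])        # body-fixed unit vector X
GAMMA0 = np.array([0.0, 0.0, 1.0])     # spatial upward axis Gamma_0

def hat(xi):
    """Hat map: identify xi in R^3 with a skew-symmetric 3x3 matrix."""
    x1, x2, x3 = xi
    return np.array([[0.0, -x3, x2],
                     [x3, 0.0, -x1],
                     [-x2, x1, 0.0]])

def expm_so3(xi):
    """exp(hat(xi)) in SO(3), computed with Rodrigues' formula."""
    theta = np.linalg.norm(xi)
    K = hat(xi)
    if theta < 1e-8:
        return np.eye(3) + K + 0.5 * (K @ K)
    return (np.eye(3)
            + (np.sin(theta) / theta) * K
            + ((1.0 - np.cos(theta)) / theta**2) * (K @ K))

def body_rhs(Q, Pi):
    """Equations (19)-(20): returns (dQ/dt, dPi/dt) in body variables."""
    Omega = Pi / INERTIA
    Gamma = Q.T @ GAMMA0
    dPi = np.cross(Pi, Omega) + MGL * np.cross(Gamma, CHI)
    return Q @ hat(Omega), dPi

def spatial_rhs(Q, pi):
    """Equations (21)-(22): returns (dQ/dt, dpi/dt) in spatial variables."""
    omega = Q @ ((Q.T @ pi) / INERTIA)
    return hat(omega) @ Q, MGL * np.cross(GAMMA0, Q @ CHI)

Q = expm_so3(np.array([0.4, -0.3, 0.2]))
Pi = np.array([0.3, -0.1, 1.0])

dQ_body, dPi = body_rhs(Q, Pi)
dQ_spatial, dpi = spatial_rhs(Q, Q @ Pi)

# Time derivative of H(Q, Pi) along the body vector field.
dH = (Pi / INERTIA) @ dPi + MGL * ((dQ_body.T @ GAMMA0) @ CHI)
```

In this sketch `dQ_body` and `dQ_spatial` agree, `dpi` equals `dQ_body @ Pi + Q @ dPi` (the product rule applied to $\pi = Q\Pi$), `dH` vanishes up to rounding, and `GAMMA0 @ dpi` is zero, which is the conservation of the component of $\pi$ along $\Gamma_{0}$ mentioned above.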

3.4.2. Heavy top equations on $\mathfrak{se}^{*}(3)$

The Hamiltonian of the heavy top is not invariant under the action of $S O (3)$, so Equations (19)–(20) given in Section 3.4.1 cannot be reduced to ${s o}^{*} (3)$. Nevertheless, the heavy top equations are Lie–Poisson on ${s e}^{*} (3)$ [Citation17,Citation48,Citation52].

Observe that the equations of the heavy top on $T^{*} S O (3)$, (19)–(20), can be easily modified by eliminating the variable $Q \in S O (3)$ and replacing it with $Γ \in R^{3}$, $Γ = Q^{- 1} Γ_{0}$, to obtain $\dot{Π} = Π \times I^{- 1} Π + M g ℓ Γ \times X,$ (23) $\dot{Γ} = Γ \times (I^{- 1} Π) .$ (24) We will see that the solutions of these equations evolve on ${s e}^{*} (3)$. In what follows, we consider elements of ${s e}^{*} (3)$ to be pairs of vectors in $R^{3}$, e.g. $(Π, Γ)$. Correspondingly, the elements of $S E (3)$ are represented as pairs $(g, u)$ with $g \in S O (3)$ and $u \in R^{3}$. The group multiplication in $S E (3)$ is then $(g_{1}, u_{1}) \cdot (g_{2}, u_{2}) = (g_{1} g_{2}, g_{1} u_{2} + u_{1}),$ where $g_{1} g_{2}$ is the product in $S O (3)$ and $g_{1} u_{2}$ is the product of a $3 \times 3$ orthogonal matrix with a vector in $R^{3}$. The coadjoint representation and its infinitesimal generator on ${s e}^{*} (3)$ take the form ${A d}_{(g, u)}^{*} (Π, Γ) = (g^{- 1} (Π - u \times Γ), g^{- 1} Γ), \quad {a d}_{(ξ, u)}^{*} (Π, Γ) = (- ξ \times Π - u \times Γ, - ξ \times Γ) .$ Using this expression for ${a d}_{(ξ, u)}^{*}$ with $(ξ = \frac{\partial H}{\partial Π}, u = \frac{\partial H}{\partial Γ})$, it can be easily seen that Equation (18) in this setting reproduces the heavy top Equations (23)–(24). Therefore the equations are Lie–Poisson equations on ${s e}^{*} (3)$.
However, since the heavy top is a rigid body with a fixed point and there are no translations, these equations do not arise from a reduction of $T^{*} S E (3)$. Moreover, the Hamiltonian on $s e (3)^{*}$ is not quadratic and the equations are not geodesic equations. Implicit and explicit Lie group integrators applicable to this formulation of the heavy top equations and preserving coadjoint orbits were discussed in [Citation15]; for a variable stepsize integrator applied to this formulation of the heavy top, see [Citation12].
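The Lie–Poisson structure can be checked numerically. The sketch below inserts $ξ = \partial H / \partial Π$ and $u = \partial H / \partial Γ$ into the formula for ${a d}^{*}$ given above and compares the result with the right-hand sides of (23)–(24), taking the sign in (18) for which $\dot{μ} = {a d}_{\partial H / \partial μ}^{*} μ$. It also evaluates the time derivatives of $| Γ |^{2}$ and $Π \cdot Γ$, the standard Casimir functions on ${s e}^{*} (3)$, and of the energy. Parameter values and function names are illustrative assumptions.

```python
import numpy as np

# Illustrative data (assumed values).
I_diag = np.array([1.0, 2.0, 3.0])   # diagonal of the inertia tensor
c = 5.0                              # lumped constant M * g * ell
X = np.array([0.0, 1.0, 0.0])        # body-fixed unit vector


def grad_hamiltonian(Pi, Gamma):
    """Gradients (dH/dPi, dH/dGamma) of H = <Pi, I^{-1} Pi>/2 + c Gamma.X."""
    return Pi / I_diag, c * X


def ad_star(xi, u, Pi, Gamma):
    """ad*_{(xi, u)} (Pi, Gamma) on se(3)^*, as given in the text."""
    return (-np.cross(xi, Pi) - np.cross(u, Gamma), -np.cross(xi, Gamma))


def heavy_top_rhs(Pi, Gamma):
    """Right-hand sides of (23)-(24)."""
    w = Pi / I_diag
    return np.cross(Pi, w) + c * np.cross(Gamma, X), np.cross(Gamma, w)


Pi = np.array([0.4, 1.5, -0.2])
Gamma = np.array([0.6, 0.0, 0.8])

xi, u = grad_hamiltonian(Pi, Gamma)
lp_Pi, lp_Gamma = ad_star(xi, u, Pi, Gamma)
Pi_dot, Gamma_dot = heavy_top_rhs(Pi, Gamma)

# Agreement between the Lie-Poisson form and (23)-(24).
gap = max(np.linalg.norm(lp_Pi - Pi_dot), np.linalg.norm(lp_Gamma - Gamma_dot))

# Rates of change of the Casimirs |Gamma|^2 and Pi.Gamma, and of the energy.
casimir1_rate = 2.0 * Gamma @ Gamma_dot
casimir2_rate = Pi_dot @ Gamma + Pi @ Gamma_dot
energy_rate = xi @ Pi_dot + u @ Gamma_dot
print(gap, casimir1_rate, casimir2_rate, energy_rate)
```

All four printed numbers should vanish up to rounding, which is the pointwise counterpart of the statement that the flow stays on a coadjoint orbit and conserves $H$.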

3.4.3. Heavy top equations with quadratic Hamiltonian

We rewrite the heavy top equations one more time, considering the constant vector $p = - M g ℓ X$ as a momentum variable conjugate to the position $q \in R^{3}$, where $p = Q^{- 1} Γ_{0} + \dot{q}$, and the Hamiltonian is a quadratic function of $Π$, $Q$, $p$ and $q$: $\begin{aligned} H : T^{*} S O (3) \times {R^{3}}^{*} \times R^{3} \to R, \\ H ((Π, Q), (p, q)) = \frac{1}{2} 〈 Π, I^{- 1} Π 〉 + \frac{1}{2} ∥ p - Q^{- 1} Γ_{0} ∥^{2} - \frac{1}{2} ∥ Q^{- 1} Γ_{0} ∥^{2}, \end{aligned}$ see [Citation23, section 8.5]. This Hamiltonian is invariant under the left action of $S O (3)$. The corresponding equations are canonical on $T^{*} S \equiv S ⋉ s^{*}$, where $S = S O (3) \times R^{3}$ with Lie algebra $s := s o (3) \times R^{3}$, and $T^{*} S$ can be identified with $T^{*} S O (3) \times {R^{3}}^{*} \times R^{3}$. The equations are $\dot{Π} = Π \times I^{- 1} Π - (Q^{- 1} Γ_{0}) \times p,$ (25) $\dot{Q} = Q \hat{I^{- 1} Π},$ (26) $\dot{p} = 0,$ (27) $\dot{q} = p - Q^{- 1} Γ_{0},$ (28) and in the spatial momentum variables $\dot{π} = - Γ_{0} \times Q p,$ (29) $\dot{Q} = \hat{ω} Q, \quad ω = Q I^{- 1} Q^{T} π,$ (30) $\dot{p} = 0,$ (31) $\dot{q} = p - Q^{- 1} Γ_{0} .$ (32) Similar formulations were considered in [Citation31] for the stability analysis of an underwater vehicle. A related but different formulation of the heavy top was considered in [Citation4].
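To see why this rewriting leaves the dynamics unchanged, note that for $p = - M g ℓ X$ the quadratic Hamiltonian differs from the heavy top Hamiltonian of Section 3.4.1 only by the constant $\frac{1}{2} ∥ p ∥^{2}$, and that (25) coincides with (19). The short Python check below illustrates both facts. Here `Gamma` stands for $Q^{- 1} Γ_{0}$, `c` for $M g ℓ$, and all numerical values and names are illustrative assumptions.

```python
import numpy as np

# Illustrative data (assumed values).
I_diag = np.array([1.0, 2.0, 3.0])   # diagonal of the inertia tensor
c = 5.0                              # lumped constant M * g * ell
X = np.array([0.0, 1.0, 0.0])        # body-fixed unit vector
p = -c * X                           # constant momentum p = -M g ell X


def H_top(Pi, Gamma):
    """Heavy top Hamiltonian of Section 3.4.1."""
    return 0.5 * Pi @ (Pi / I_diag) + c * Gamma @ X


def H_quad(Pi, Gamma, p):
    """Quadratic Hamiltonian, with Gamma standing for Q^{-1} Gamma0."""
    return (0.5 * Pi @ (Pi / I_diag)
            + 0.5 * np.sum((p - Gamma) ** 2) - 0.5 * Gamma @ Gamma)


def Pi_dot_top(Pi, Gamma):
    """Right-hand side of (19)."""
    return np.cross(Pi, Pi / I_diag) + c * np.cross(Gamma, X)


def Pi_dot_quad(Pi, Gamma, p):
    """Right-hand side of (25)."""
    return np.cross(Pi, Pi / I_diag) - np.cross(Gamma, p)


states = [(np.array([0.4, 1.5, -0.2]), np.array([0.6, 0.0, 0.8])),
          (np.array([-1.0, 0.3, 2.0]), np.array([0.0, -0.8, 0.6]))]

# The two Hamiltonians differ by the same constant |p|^2 / 2 at every state.
shifts = [H_quad(Pi, Gamma, p) - H_top(Pi, Gamma) for Pi, Gamma in states]
rhs_gap = max(np.linalg.norm(Pi_dot_quad(Pi, Gamma, p) - Pi_dot_top(Pi, Gamma))
              for Pi, Gamma in states)
print(shifts, 0.5 * p @ p, rhs_gap)
```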

3.4.4. Numerical experiments

We apply various implicit Lie group integrators to the heavy top system. The test problem we consider is the same as in [Citation4], where $Q (0) = I$, $ℓ = 2$, $M = 15$, $I = \mathrm{diag} (0.234375, 0.46875, 0.234375)$, $π (0) = I (0, 150, - 4.61538)$, $X = (0, 1, 0)$ and $Γ_{0} = (0, 0, - 9.81)$.

In Figure 2, we report the performance of the symplectic Lie group integrators (15)–(17) applied both to Equations (21)–(22) with $θ = 0$, $θ = \frac{1}{2}$ and $θ = 1$ (SLGI), and to Equations (29)–(32) with $θ = \frac{1}{2}$ (SLGIKK). The methods with $θ = \frac{1}{2}$ attain order 2. In Figure 3, we show the energy error for the symplectic Lie group integrators with $θ = \frac{1}{2}$ and $θ = 0$, integrating with stepsize $h = 0.01$ for 6000 steps.

Figure 1. Illustration of the heavy top, where CM is the centre of mass of the body, O is the fixed point, $\vec{g}$ is the gravitational acceleration vector, and $ℓ, Q, \vec{χ}$ follow the notation introduced in Section 3.4.1.

Figure 2. Symplectic Lie group integrators: integration on the time interval $[0, 1]$. Left: 3D plot of $M ℓ Q^{- 1} Γ_{0}$. Centre: components of $Q X$. The left and centre plots are computed with the same step size. Right: verification of the order of the methods.

Figure 3. Symplectic Lie group integrators, long time integration, h = 0.01, 6000 steps. Top: energy error. Bottom: 3D plot of $M ℓ Q^{- 1} Γ_{0}$.

4. Variable step size

One approach for varying the step size is based on the use of an embedded Runge–Kutta pair. This principle can be carried over from standard Runge–Kutta methods in vector spaces to the present situation with RKMK and commutator-free schemes via minor modifications. We briefly summarize the main principle of embedded pairs before giving more specific details for the case of Lie group integrators. This approach is very well documented in the literature and goes back to Merson [Citation38]; a detailed treatment can be found in [Citation19, pp. 165–168].

An embedded pair consists of a main method used to propagate the numerical solution, together with some auxiliary method that is only used to obtain an estimate of the local error. This local error estimate is in turn used to derive a step size adjustment formula that attempts to keep the local error estimate approximately equal to some user defined tolerance $\mathrm{tol}$ in every step. Suppose the main method is of order $p$ and the auxiliary method is of order $\tilde{p} \neq p$.⁴ Both methods are applied to the input value $y_{n}$ and yield approximations $y_{n + 1}$ and ${\tilde{y}}_{n + 1}$ respectively, using the same step size $h_{n + 1}$. Now, some distance measure⁵ between $y_{n + 1}$ and ${\tilde{y}}_{n + 1}$ provides an estimate $e_{n + 1}$ for the size of the local truncation error. Thus $e_{n + 1} = C h_{n + 1}^{\tilde{p} + 1} + O (h^{\tilde{p} + 2})$. Aiming at $e_{n + 1} \approx \mathrm{tol}$ in every step, one may use a formula of the type $h_{n + 1} = θ {(\frac{\mathrm{tol}}{e_{n + 1}})}^{\frac{1}{\tilde{p} + 1}} h_{n}$ (33) where $θ$ is a ‘safety factor’, typically chosen between 0.8 and 0.9. In case the step is rejected because $e_{n} > \mathrm{tol}$, we can redo the step with a step size obtained by the same formula. We summarize the approach in the following algorithm.

Here we have used again the safety factor θ, and the parameter α is generally chosen as $α = \frac{1}{1 + min (p, \tilde{p})}$ .
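The step size control loop described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' algorithm listing: the function names, the acceptance test `err <= tol`, the guard against a vanishing error estimate and the clipping of the last step are all choices made for this illustration. For brevity the loop is demonstrated with the classical Heun–Euler pair ($p = 2$, $\tilde{p} = 1$) on the scalar test equation $\dot{y} = - y$; any routine returning the main approximation and an error estimate, including a Lie group pair, could be passed in its place.

```python
import math


def adaptive_integrate(step_pair, y0, t0, t_end, h0, tol, p, p_tilde, theta=0.9):
    """Integrate from t0 to t_end with embedded-pair step size control.

    step_pair(y, h) must return (y_new, err), where y_new is the main
    approximation and err the local error estimate for a step of size h.
    """
    alpha = 1.0 / (1.0 + min(p, p_tilde))
    t, y, h = t0, y0, h0
    accepted, rejected = [], 0
    while t_end - t > 1e-14:
        h = min(h, t_end - t)          # do not step past the end point
        y_new, err = step_pair(y, h)
        err = max(err, 1e-16)          # guard against division by zero
        if err <= tol:                 # accept the step
            t, y = t + h, y_new
            accepted.append((t, h, err))
        else:                          # reject and redo with a smaller step
            rejected += 1
        h = theta * (tol / err) ** alpha * h
    return t, y, accepted, rejected


def heun_euler_pair(f):
    """Embedded Heun (order 2) / Euler (order 1) pair for a scalar ODE."""
    def step(y, h):
        k1 = f(y)
        k2 = f(y + h * k1)
        y_main = y + 0.5 * h * (k1 + k2)
        y_aux = y + h * k1
        return y_main, abs(y_main - y_aux)
    return step


t_final, y_final, accepted, rejected = adaptive_integrate(
    heun_euler_pair(lambda y: -y), y0=1.0, t0=0.0, t_end=1.0,
    h0=0.1, tol=1e-6, p=2, p_tilde=1)
print(t_final, y_final, math.exp(-1.0), len(accepted), rejected)
```

With the deliberately large initial step the first attempt is rejected, after which the controller settles on steps for which the error estimate stays slightly below the tolerance.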

4.1. RKMK methods with variable stepsize

We need to specify how to calculate the quantity $e_{n + 1}$ in each step. For RKMK methods, the situation is simplified by the fact that we are solving the local problem (6) in the linear space $\mathfrak{g}$, where the known theory can be applied directly. So any standard embedded pair of Runge–Kutta methods described by coefficients $(a_{i j}, b_{i}, {\tilde{a}}_{i j}, {\tilde{b}}_{i})$ of orders $(p, \tilde{p})$ can be applied to the full dexpinv-equation (6) to obtain local Lie algebra approximations $σ_{1}$, ${\tilde{σ}}_{1}$, and one uses, e.g. $e_{n + 1} = ∥ σ_{1} - {\tilde{σ}}_{1} ∥$ (note that the equation itself depends on $y_{n}$). For methods which use a truncated version of the series for ${d e x p}_{u}^{- 1}$ one may also try to optimize performance by including commutators that are shared between the main method and the auxiliary scheme.
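As an illustration of measuring the error in the Lie algebra, the sketch below applies the Heun–Euler pair ($p = 2$, $\tilde{p} = 1$) to the dexpinv-equation on $s o (3) ≃ R^{3}$, acting on the unit sphere. The test problem (free rigid body angular momentum, $\dot{y} = y \times I^{- 1} y$), the choice of pair, the closed form used for ${d e x p}^{- 1}$ on $s o (3)$ and all names are assumptions made for this example; the experiments reported later use a Dormand–Prince pair instead.

```python
import numpy as np


def hat(v):
    """Skew-symmetric matrix such that hat(v) @ w equals np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])


def expm_so3(w):
    """Rodrigues formula for exp(hat(w))."""
    theta = np.linalg.norm(w)
    W = hat(w)
    if theta < 1e-6:
        return np.eye(3) + W + 0.5 * W @ W
    return (np.eye(3) + np.sin(theta) / theta * W
            + (1.0 - np.cos(theta)) / theta**2 * W @ W)


def dexpinv_so3(u, v):
    """Closed form of dexp_u^{-1}(v) on so(3), with elements stored in R^3."""
    theta = np.linalg.norm(u)
    if theta < 1e-4:
        coeff = 1.0 / 12.0           # leading term of the series
    else:
        coeff = (1.0 - 0.5 * theta / np.tan(0.5 * theta)) / theta**2
    return v - 0.5 * np.cross(u, v) + coeff * np.cross(u, np.cross(u, v))


I_diag = np.array([1.0, 2.0, 3.0])   # assumed inertia values


def f(y):
    """Lie algebra valued map with hat(f(y)) @ y = y x I^{-1} y."""
    return -y / I_diag


def rkmk_heun_euler_step(f, y0, h):
    """One step of the embedded pair; returns (y1, error estimate)."""
    k1 = f(y0)
    sigma2 = h * k1
    k2 = dexpinv_so3(sigma2, f(expm_so3(sigma2) @ y0))
    sigma_main = 0.5 * h * (k1 + k2)     # order 2 approximation of sigma(h)
    sigma_aux = h * k1                   # order 1 approximation of sigma(h)
    err = np.linalg.norm(sigma_main - sigma_aux)
    return expm_so3(sigma_main) @ y0, err


y0 = np.array([1.0, 1.0, 1.0]) / np.sqrt(3.0)
y1, e_h = rkmk_heun_euler_step(f, y0, 0.01)
_, e_half = rkmk_heun_euler_step(f, y0, 0.005)
print(np.linalg.norm(y1), e_h / e_half)
```

The estimate behaves like $C h^{\tilde{p} + 1} = C h^{2}$, so halving the step should reduce it by a factor close to 4, while the update stays on the sphere.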

4.2. Commutator-free methods with variable stepsize

For the commutator-free methods of Section 2.2, the situation is different since such methods do not have a natural local representation in a linear space. One can still derive embedded pairs, and this can be achieved by studying order conditions [Citation44] as was done in [Citation12]. Now one obtains after each step two approximations $y_{n + 1}$ and ${\tilde{y}}_{n + 1}$ on $M$, both using the same initial value $y_{n}$ and step size $h_{n}$. One must also have access to some metric $d$ to calculate $e_{n + 1} = d (y_{n + 1}, {\tilde{y}}_{n + 1})$. We give a few examples of embedded pairs.

4.2.1. Pairs of order $(p, \tilde{p}) = (3, 2)$

It is possible to obtain embedded pairs of order 3(2) which satisfy the requirements above. We present two examples from [Citation12]. The first one reuses the second stage exponential in the update: $\begin{aligned} Y_{n, 1} & = y_{n} \\ Y_{n, 2} & = \exp (\frac{1}{3} h f_{n, 1}) \cdot y_{n} \\ Y_{n, 3} & = \exp (\frac{2}{3} h f_{n, 2}) \cdot y_{n} \\ y_{n + 1} & = \exp (h (- \frac{1}{12} f_{n, 1} + \frac{3}{4} f_{n, 3})) \cdot Y_{n, 2} \\ {\tilde{y}}_{n + 1} & = \exp (\frac{1}{2} h (f_{n, 2} + f_{n, 3})) \cdot y_{n} \end{aligned}$ One could also have reused the third stage $Y_{n, 3}$ in the update, rather than $Y_{n, 2}$: $\begin{aligned} Y_{n, 1} & = y_{n} \\ Y_{n, 2} & = \exp (\frac{2}{3} h f_{n, 1}) \cdot y_{n} \\ Y_{n, 3} & = \exp (h (\frac{5}{12} f_{n, 1} + \frac{1}{4} f_{n, 2})) \cdot y_{n} \\ y_{n + 1} & = \exp (h (- \frac{1}{6} f_{n, 1} - \frac{1}{2} f_{n, 2} + f_{n, 3})) \cdot Y_{n, 3} \\ {\tilde{y}}_{n + 1} & = \exp (\frac{1}{4} h (f_{n, 1} + 3 f_{n, 3})) \cdot y_{n} \end{aligned}$ It is always understood that the frozen vector fields are $f_{n, i} := f_{Y_{n, i}}$.
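The first 3(2) pair can be tried out on a small example. The following Python sketch implements it for the action of $S O (3)$ on the unit sphere, with frozen vector fields represented by vectors in $R^{3} ≃ s o (3)$. The test problem (free rigid body angular momentum), the use of the Euclidean distance in $R^{3}$ as the metric $d$, and all names are assumptions made for this illustration.

```python
import numpy as np


def hat(v):
    """Skew-symmetric matrix such that hat(v) @ w equals np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])


def expm_so3(w):
    """Rodrigues formula for exp(hat(w))."""
    theta = np.linalg.norm(w)
    W = hat(w)
    if theta < 1e-6:
        return np.eye(3) + W + 0.5 * W @ W
    return (np.eye(3) + np.sin(theta) / theta * W
            + (1.0 - np.cos(theta)) / theta**2 * W @ W)


I_diag = np.array([1.0, 2.0, 3.0])   # assumed inertia values


def f(y):
    """Lie algebra valued map with hat(f(y)) @ y = y x I^{-1} y."""
    return -y / I_diag


def cf32_step(f, y, h):
    """First commutator-free 3(2) pair; returns (y_main, error estimate)."""
    f1 = f(y)
    Y2 = expm_so3(h / 3.0 * f1) @ y
    f2 = f(Y2)
    Y3 = expm_so3(2.0 * h / 3.0 * f2) @ y
    f3 = f(Y3)
    y_main = expm_so3(h * (-f1 / 12.0 + 0.75 * f3)) @ Y2   # reuses Y2
    y_aux = expm_so3(0.5 * h * (f2 + f3)) @ y
    return y_main, np.linalg.norm(y_main - y_aux)


def integrate(f, y0, t_end, n_steps):
    """Constant step size integration with the main (order 3) method."""
    y, h = y0.copy(), t_end / n_steps
    for _ in range(n_steps):
        y, _ = cf32_step(f, y, h)
    return y


y0 = np.array([1.0, 1.0, 1.0]) / np.sqrt(3.0)
y_ref = integrate(f, y0, 1.0, 1280)            # fine-grid reference solution
err_coarse = np.linalg.norm(integrate(f, y0, 1.0, 20) - y_ref)
err_fine = np.linalg.norm(integrate(f, y0, 1.0, 40) - y_ref)
observed_order = np.log2(err_coarse / err_fine)
_, estimate = cf32_step(f, y0, 0.05)
print(observed_order, estimate, np.linalg.norm(y_ref))
```

The observed order of the main method should be close to 3, and every iterate stays on the sphere because each update is a composition of rotations.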

4.2.2. Order $(4, 3)$

The procedure of deriving efficient pairs becomes more complicated as the order increases. In [Citation12], a low cost pair of order $(4, 3)$ was derived, in the sense that one attempted to minimize the number of stages and exponentials in the embedded pair as a whole. This came, however, at the expense of a relatively large error constant. So rather than presenting the method from that paper, we suggest a simpler procedure at the cost of some more computational work per step: we simply furnish the commutator-free method of Section 2 with a third-order auxiliary scheme. It can be described as follows:

Compute $Y_{n, i}$, $i = 1, \dots, 4$, and $y_{n + 1}$ from (9).
Compute an additional stage ${\bar{Y}}_{n, 3}$ and then ${\tilde{y}}_{n + 1}$ as $\begin{aligned} {\bar{Y}}_{n, 3} & = \exp (\frac{3}{4} h f_{n, 2}) \cdot y_{n} \\ {\tilde{y}}_{n + 1} & = \exp (\frac{h}{9} (- f_{n, 1} + 3 f_{n, 2} + 4 {\bar{f}}_{n, 3})) \cdot \exp (\frac{h}{3} f_{n, 1}) \cdot y_{n} \end{aligned}$ (34) where ${\bar{f}}_{n, 3} := f_{{\bar{Y}}_{n, 3}}$.
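The two steps above can be sketched in Python as follows, again for the action of $S O (3)$ on the unit sphere. The stages of the fourth-order commutator-free method are written out as in (9) of Section 2, followed by the extra stage and the auxiliary update (34). The test problem, the Euclidean distance used as error measure and all names are assumptions made for this illustration.

```python
import numpy as np


def hat(v):
    """Skew-symmetric matrix such that hat(v) @ w equals np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])


def expm_so3(w):
    """Rodrigues formula for exp(hat(w))."""
    theta = np.linalg.norm(w)
    W = hat(w)
    if theta < 1e-6:
        return np.eye(3) + W + 0.5 * W @ W
    return (np.eye(3) + np.sin(theta) / theta * W
            + (1.0 - np.cos(theta)) / theta**2 * W @ W)


I_diag = np.array([1.0, 2.0, 3.0])   # assumed inertia values


def f(y):
    """Lie algebra valued map with hat(f(y)) @ y = y x I^{-1} y."""
    return -y / I_diag


def cf43_step(f, y, h):
    """Commutator-free order 4 step (9) with the order 3 auxiliary (34)."""
    f1 = f(y)
    Y2 = expm_so3(0.5 * h * f1) @ y
    f2 = f(Y2)
    Y3 = expm_so3(0.5 * h * f2) @ y
    f3 = f(Y3)
    Y4 = expm_so3(h * f3 - 0.5 * h * f1) @ Y2
    f4 = f(Y4)
    y_half = expm_so3(h / 12.0 * (3 * f1 + 2 * f2 + 2 * f3 - f4)) @ y
    y_main = expm_so3(h / 12.0 * (-f1 + 2 * f2 + 2 * f3 + 3 * f4)) @ y_half
    # Additional stage and auxiliary approximation, Equation (34).
    Y3_bar = expm_so3(0.75 * h * f2) @ y
    f3_bar = f(Y3_bar)
    y_aux = (expm_so3(h / 9.0 * (-f1 + 3 * f2 + 4 * f3_bar))
             @ expm_so3(h / 3.0 * f1) @ y)
    return y_main, np.linalg.norm(y_main - y_aux)


def integrate(f, y0, t_end, n_steps):
    """Constant step size integration with the main (order 4) method."""
    y, h = y0.copy(), t_end / n_steps
    for _ in range(n_steps):
        y, _ = cf43_step(f, y, h)
    return y


y0 = np.array([1.0, 1.0, 1.0]) / np.sqrt(3.0)
y_ref = integrate(f, y0, 1.0, 640)             # fine-grid reference solution
err_coarse = np.linalg.norm(integrate(f, y0, 1.0, 20) - y_ref)
err_fine = np.linalg.norm(integrate(f, y0, 1.0, 40) - y_ref)
observed_order = np.log2(err_coarse / err_fine)
_, estimate = cf43_step(f, y0, 0.05)
print(observed_order, estimate)
```

Compared with the main method alone, the pair costs one extra vector field evaluation and three extra exponentials per step, which is the additional work mentioned above.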

5. The N-fold 3D pendulum

In this section, we present a model for a system of N connected three-dimensional pendulums. The modelling part comes from [Citation28], and here we study the vector field describing the dynamics, in order to re-frame it into the Lie group integrators setting described in the previous sections. The model we use is not completely realistic since, for example, it neglects possible interactions between pendulums, and it assumes ideal spherical joints between them. However, this is still a relevant example from the point of view of geometric numerical integration. More precisely, we show a possible way to work with a configuration manifold which is not a Lie group, applying the theoretical instruments introduced before.

The Lagrangian we consider is a function from $(T S^{2})^{N}$ to $R$. Instead of the coordinates $(q_{1}, \dots, q_{N}, {\dot{q}}_{1}, \dots, {\dot{q}}_{N})$, where ${\dot{q}}_{i} \in T_{q_{i}} S^{2}$, we choose to work with the angular velocities. Precisely, $T_{q_{i}} S^{2} = \{ v \in R^{3} : v^{T} q_{i} = 0 \} = 〈 q_{i} 〉^{⊥} \subset R^{3},$ and hence for any ${\dot{q}}_{i} \in T_{q_{i}} S^{2}$ there exists $ω_{i} \in R^{3}$ such that ${\dot{q}}_{i} = ω_{i} \times q_{i}$, which can be interpreted as the angular velocity of $q_{i}$. So we can assume without loss of generality that $ω_{i}^{T} q_{i} = 0$ (i.e. $ω_{i} \in T_{q_{i}} S^{2}$) and pass to the coordinates $(q_{1}, ω_{1}, q_{2}, ω_{2}, \dots, q_{N}, ω_{N}) \in (T S^{2})^{N}$ to describe the dynamics. In this section, we denote by $m_{1}, \dots, m_{N}$ the masses of the pendulums and by $L_{1}, \dots, L_{N}$ their lengths. Figure 4 shows the case N = 3. We organize the section into three parts:

We define the transitive Lie group action used to integrate this model numerically,
We show a possible way to express the dynamics in terms of the infinitesimal generator of this action, for the general case of N joint pendulums,
We focus on the case N = 2, as a particular example. For this setting, we present some numerical experiments comparing various Lie group integrators and some classical numerical integrators. Then we conclude with numerical experiments on variable step size.

5.1. Transitive group action on $(T S^{2})^{N}$

We characterize a transitive action for $(T S^{2})^{N}$, starting with the case N = 1 and generalizing it to N>1. The action we consider is based on the identification between $s e (3)$, the Lie algebra of $S E (3)$, and $R^{6}$. We start from the Ad-action of $S E (3)$ on $s e (3)$ (see [Citation23]), which reads $\begin{aligned} A d : S E (3) \times s e (3) \to s e (3), \\ A d ((R, r), (u, v)) = (R u, R v + \hat{r} R u) . \end{aligned}$ Since $s e (3) ≃ R^{6}$, the Ad-action allows us to define the following Lie group action on $R^{6}$: $ψ : S E (3) \times R^{6} \to R^{6}, \quad ψ ((R, r), (u, v)) = (R u, R v + \hat{r} R u) .$ We can think of $ψ$ as a Lie group action on $T S^{2}$ since, for any $q \in R^{3}$, it maps points of $T S_{| q |}^{2} := \{ (\tilde{q}, \tilde{ω}) \in R^{3} \times R^{3} : {\tilde{ω}}^{T} \tilde{q} = 0, | \tilde{q} | = | q | \} \subset R^{6}$ into other points of $T S_{| q |}^{2}$. Moreover, with standard arguments (see [Citation43]), it is possible to prove that the orbit of a generic point $m = (q, ω) \in R^{6}$ with $ω^{T} q = 0$ coincides with $\mathrm{Orb} (m) = T S_{| q |}^{2} .$ In particular, when $q \in R^{3}$ is a unit vector (i.e. $q \in S^{2}$), $ψ$ allows us to define a transitive Lie group action on $T S^{2} = T S_{| q | = 1}^{2}$ which reads $\begin{aligned} ψ : S E (3) \times T S^{2} \to T S^{2} \\ ψ ((A, a), (q, ω)) := ψ_{(A, a)} (q, ω) = (A q, A ω + \hat{a} A q) = (\bar{q}, \bar{ω}) . \end{aligned}$ To conclude the description of the action, we report here its infinitesimal generator, which is fundamental in the Lie group integrators setting: ${ψ_{*} ((u, v))|}_{(q, ω)} = (\hat{u} q, \hat{u} ω + \hat{v} q) .$ We can extend this construction to the case N>1 in a natural way, i.e. through the action of a Lie group obtained from cartesian products of $S E (3)$ and equipped with the direct product structure.
More precisely, we consider the group $G = (S E (3))^{N}$, and by direct product structure we mean that for any pair of elements $δ^{(1)} = (δ_{1}^{(1)}, \dots, δ_{N}^{(1)}), δ^{(2)} = (δ_{1}^{(2)}, \dots, δ_{N}^{(2)}) \in G,$ denoting by $*$ the semidirect product of $S E (3)$, we define the product $\circ$ on $G$ as $δ^{(1)} \circ δ^{(2)} := (δ_{1}^{(1)} * δ_{1}^{(2)}, \dots, δ_{N}^{(1)} * δ_{N}^{(2)}) \in G .$ With this group structure defined, we can generalize the action introduced for N = 1 to larger values of N as follows: $\begin{aligned} ψ : (S E (3))^{N} \times (T S^{2})^{N} \to (T S^{2})^{N}, \\ ψ ((A_{1}, a_{1}, \dots, A_{N}, a_{N}), (q_{1}, ω_{1}, \dots, q_{N}, ω_{N})) \\ = (A_{1} q_{1}, A_{1} ω_{1} + {\hat{a}}_{1} A_{1} q_{1}, \dots, A_{N} q_{N}, A_{N} ω_{N} + {\hat{a}}_{N} A_{N} q_{N}), \end{aligned}$ whose infinitesimal generator reads $ψ_{*} (ξ) |_{m} = ({\hat{u}}_{1} q_{1}, {\hat{u}}_{1} ω_{1} + {\hat{v}}_{1} q_{1}, \dots, {\hat{u}}_{N} q_{N}, {\hat{u}}_{N} ω_{N} + {\hat{v}}_{N} q_{N}),$ where $ξ = [u_{1}, v_{1}, \dots, u_{N}, v_{N}] \in s e (3)^{N}$ and $m = (q_{1}, ω_{1}, \dots, q_{N}, ω_{N}) \in (T S^{2})^{N}$. We now have the only group action we need to deal with the N-fold spherical pendulum. In the following part of this section, we work on the vector field describing the dynamics and adapt it to the Lie group integrators setting.
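For N = 1 the action and its infinitesimal generator are easy to experiment with. The Python sketch below implements $ψ$, the generator $ψ_{*}$ and the exponential map of $S E (3)$, and checks numerically that $ψ$ preserves $T S^{2}$, that it is compatible with the group product, and that differentiating $ψ (\exp (t (u, v)), m)$ at $t = 0$ gives the stated generator. The closed form used for the $S E (3)$ exponential is the standard one for the product $(A_{1}, a_{1}) \cdot (A_{2}, a_{2}) = (A_{1} A_{2}, A_{1} a_{2} + a_{1})$; function names and test data are assumptions made for this example.

```python
import numpy as np


def hat(v):
    """Skew-symmetric matrix such that hat(v) @ w equals np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])


def exp_se3(u, v):
    """Exponential map se(3) -> SE(3), returned as a pair (A, a)."""
    theta = np.linalg.norm(u)
    U = hat(u)
    if theta < 1e-6:
        A = np.eye(3) + U + 0.5 * U @ U
        V = np.eye(3) + 0.5 * U + U @ U / 6.0
    else:
        c1 = (1.0 - np.cos(theta)) / theta**2
        c2 = (theta - np.sin(theta)) / theta**3
        A = np.eye(3) + np.sin(theta) / theta * U + c1 * U @ U
        V = np.eye(3) + c1 * U + c2 * U @ U
    return A, V @ v


def psi(A, a, q, w):
    """Action of (A, a) in SE(3) on (q, w) in TS^2."""
    Aq = A @ q
    return Aq, A @ w + np.cross(a, Aq)


def psi_star(u, v, q, w):
    """Infinitesimal generator of psi at (q, w) in the direction (u, v)."""
    return np.cross(u, q), np.cross(u, w) + np.cross(v, q)


q = np.array([np.sqrt(2.0) / 2.0, 0.0, np.sqrt(2.0) / 2.0])
w = np.array([0.0, 1.0, 0.0])
u = np.array([0.2, -0.5, 0.3])
v = np.array([-0.4, 0.1, 0.6])

# 1) The action maps TS^2 into itself.
q_new, w_new = psi(*exp_se3(u, v), q, w)
norm_gap = abs(np.linalg.norm(q_new) - 1.0)
tangency_gap = abs(q_new @ w_new)

# 2) Compatibility with the group product (A1 A2, A1 a2 + a1).
A1, a1 = exp_se3(u, v)
A2, a2 = exp_se3(v, u)
lhs = np.concatenate(psi(A1 @ A2, A1 @ a2 + a1, q, w))
rhs = np.concatenate(psi(A1, a1, *psi(A2, a2, q, w)))
action_gap = np.linalg.norm(lhs - rhs)

# 3) Central finite difference of t -> psi(exp(t (u, v)), (q, w)) at t = 0.
eps = 1e-5
plus = np.concatenate(psi(*exp_se3(eps * u, eps * v), q, w))
minus = np.concatenate(psi(*exp_se3(-eps * u, -eps * v), q, w))
generator_gap = np.linalg.norm((plus - minus) / (2.0 * eps)
                               - np.concatenate(psi_star(u, v, q, w)))
print(norm_gap, tangency_gap, action_gap, generator_gap)
```

The case N>1 amounts to applying the same functions componentwise to each pair $(q_{i}, ω_{i})$.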

Figure 4. Threefold pendulum at a fixed time instant, with fixed point placed at the origin.

5.2. Full chain

We consider the vector field $F \in \mathfrak{X} ((T S^{2})^{N})$, describing the dynamics of the N-fold 3D pendulum, and we express it in terms of the infinitesimal generator of the action defined above. More precisely, we find a function $f : (T S^{2})^{N} \to s e (3)^{N}$ such that $ψ_{*} (f (m)) |_{m} = F |_{m}, \forall m \in (T S^{2})^{N} .$ We omit the derivation of $F$ starting from the Lagrangian of the system, which can be found in the section devoted to mechanical systems on $(S^{2})^{N}$ of [Citation28]. The configuration manifold of the system is $(S^{2})^{N}$, while the Lagrangian, expressed in terms of the variables $(q_{1}, ω_{1}, \dots, q_{N}, ω_{N}) \in (T S^{2})^{N}$, reads $L (q, ω) = T (q, ω) - U (q) = \frac{1}{2} \sum_{i, j = 1}^{N} (M_{i j} ω_{i}^{T} {\hat{q}}_{i}^{T} {\hat{q}}_{j} ω_{j}) - \sum_{i = 1}^{N} (\sum_{j = i}^{N} m_{j}) g L_{i} e_{3}^{T} q_{i},$ where $M_{i j} = (\sum_{k = \max \{ i, j \}}^{N} m_{k}) L_{i} L_{j} I_{3} \in R^{3 \times 3}$ is the inertia matrix of the system, $I_{3}$ is the $3 \times 3$ identity matrix, and $e_{3} = [0, 0, 1]^{T}$. Noticing that when $i = j$ we get $ω_{i}^{T} {\hat{q}}_{i}^{T} {\hat{q}}_{i} ω_{i} = ω_{i}^{T} (I_{3} - q_{i} q_{i}^{T}) ω_{i} = ω_{i}^{T} ω_{i},$ we simplify the notation writing $T (q, ω) = \frac{1}{2} \sum_{i, j = 1}^{N} (ω_{i}^{T} R (q)_{i j} ω_{j})$ where $R (q) \in R^{3 N \times 3 N}$ is a symmetric block matrix defined as $\begin{aligned} R (q)_{i i} & = (\sum_{j = i}^{N} m_{j}) L_{i}^{2} I_{3} \in R^{3 \times 3}, \\ R (q)_{i j} & = (\sum_{k = j}^{N} m_{k}) L_{i} L_{j} {\hat{q}}_{i}^{T} {\hat{q}}_{j} = R (q)_{j i}^{T} \in R^{3 \times 3}, \quad i < j . \end{aligned}$ The vector field on which we need to work defines the following first-order ODE: $\begin{aligned} {\dot{q}}_{i} & = ω_{i} \times q_{i}, \quad i = 1, \dots, N, \\ R (q) \dot{ω} & = {[\sum_{\begin{matrix} j = 1 \\ j \neq i \end{matrix}}^{N} M_{i j} | ω_{j} |^{2} {\hat{q}}_{i} q_{j} - (\sum_{j = i}^{N} m_{j}) g L_{i} {\hat{q}}_{i} e_{3}]}_{i = 1, \dots, N} \in R^{3 N} . \end{aligned}$ By direct computation, it is possible to see that, for any $q = (q_{1}, \dots, q_{N}) \in (S^{2})^{N}$ and $ω \in T_{q_{1}} S^{2} \times \dots \times T_{q_{N}} S^{2}$, we have $(R (q) ω)_{i} \in T_{q_{i}} S^{2} .$ Therefore, there is a well-defined linear map $A_{q} : T_{q_{1}} S^{2} \times \dots \times T_{q_{N}} S^{2} \to T_{q_{1}} S^{2} \times \dots \times T_{q_{N}} S^{2}, \quad A_{q} (ω) := R (q) ω .$ We can even notice that $R (q)$ defines a positive-definite bilinear form on this linear space, since for $ω \neq 0$ and $v_{i} := {\hat{q}}_{i} ω_{i}$ we have $ω^{T} R (q) ω = \sum_{i, j = 1}^{N} ω_{i}^{T} {\hat{q}}_{i}^{T} M_{i j} {\hat{q}}_{j} ω_{j} = \sum_{i, j = 1}^{N} ({\hat{q}}_{i} ω_{i})^{T} M_{i j} ({\hat{q}}_{j} ω_{j}) = v^{T} M v > 0.$ The last inequality holds because $M$ is the inertia matrix of the system and hence it defines a symmetric positive-definite bilinear form on $T_{q_{1}} S^{2} \times \dots \times T_{q_{N}} S^{2}$, see, e.g. [Citation16].⁶ This implies the map $A_{q}$ is invertible and hence we are ready to express the vector field in terms of the infinitesimal generator.
We can rewrite the ODEs for the angular velocities as follows: $\dot{ω} = A_{q}^{- 1} ([g_{1}, \dots, g_{N}]^{T}) = [\begin{matrix} h_{1} (q, ω) \\ \dots \\ h_{N} (q, ω) \end{matrix}] = [\begin{matrix} a_{1} (q, ω) \times q_{1} \\ \dots \\ a_{N} (q, ω) \times q_{N} \end{matrix}]$ where $g_{i} = g_{i} (q, ω) = \sum_{\begin{matrix} j = 1 \\ j \neq i \end{matrix}}^{N} M_{i j} | ω_{j} |^{2} {\hat{q}}_{i} q_{j} - (\sum_{j = i}^{N} m_{j}) g L_{i} {\hat{q}}_{i} e_{3}, \quad i = 1, \dots, N,$ and $a_{1}, \dots, a_{N} : (T S^{2})^{N} \to R^{3}$ are $N$ functions whose existence is guaranteed by the analysis done above. Indeed, we can set $a_{i} (q, ω) := q_{i} \times h_{i} (q, ω)$ and conclude that a mapping $f$ from $(T S^{2})^{N}$ to $(s e (3))^{N}$ such that $ψ_{*} (f (q, ω)) |_{(q, ω)} = F |_{(q, ω)}$ is the following one: $f (q, ω) = [\begin{matrix} ω_{1} \\ q_{1} \times h_{1} \\ \dots \\ \dots \\ ω_{N} \\ q_{N} \times h_{N} \end{matrix}] \in s e (3)^{N} ≃ R^{6 N} .$ We will not go into the Hamiltonian formulation of this problem; however, we remark that a similar approach works even in that situation. Indeed, following the derivation presented in [Citation28], we see that for a mechanical system on $(S^{2})^{N}$ the conjugate momentum reads $T_{q_{1}}^{*} S^{2} \times \dots \times T_{q_{N}}^{*} S^{2} ∋ π = (π_{1}, \dots, π_{N}), \quad \text{where} \quad π_{i} = - {\hat{q}}_{i}^{2} \frac{\partial L}{\partial ω_{i}},$ and its components are still orthogonal to the respective base points $q_{i} \in S^{2}$.
Moreover, Hamilton's equations take the form $\begin{aligned} {\dot{q}}_{i} & = \frac{\partial H (q, π)}{\partial π_{i}} \times q_{i}, \\ {\dot{π}}_{i} & = \frac{\partial H (q, π)}{\partial q_{i}} \times q_{i} + \frac{\partial H (q, π)}{\partial π_{i}} \times π_{i}, \end{aligned}$ which implies that, matching these equations with the generator $({\hat{u}}_{i} q_{i}, {\hat{u}}_{i} π_{i} + {\hat{v}}_{i} q_{i})$, i.e. setting $u_{i} = \partial_{π_{i}} H$ and $v_{i} = \partial_{q_{i}} H$, $f (q, π) = [\begin{matrix} \partial_{π_{1}} H (q, π), & \partial_{q_{1}} H (q, π), & \dots, & \partial_{π_{N}} H (q, π), & \partial_{q_{N}} H (q, π) \end{matrix}]$ we can represent even the Hamiltonian vector field of the N-fold 3D pendulum in terms of this group action.

5.2.1. Case N = 2

We have seen how it is possible to turn the equations of motion of an N-chain of pendulums into the Lie group integrators setting. Now we focus on the example with N = 2 pendulums. The equations of motion read $\begin{aligned} {\dot{q}}_{1} & = {\hat{ω}}_{1} q_{1}, \quad {\dot{q}}_{2} = {\hat{ω}}_{2} q_{2}, \\ R (q) [\begin{matrix} {\dot{ω}}_{1} \\ {\dot{ω}}_{2} \end{matrix}] & = [\begin{matrix} (- m_{2} L_{1} L_{2} | ω_{2} |^{2} {\hat{q}}_{2} + (m_{1} + m_{2}) g L_{1} {\hat{e}}_{3}) q_{1} \\ (- m_{2} L_{1} L_{2} | ω_{1} |^{2} {\hat{q}}_{1} + m_{2} g L_{2} {\hat{e}}_{3}) q_{2} \end{matrix}], \end{aligned}$ (35) where $R (q) = [\begin{matrix} (m_{1} + m_{2}) L_{1}^{2} I_{3} & m_{2} L_{1} L_{2} {\hat{q}}_{1}^{T} {\hat{q}}_{2} \\ m_{2} L_{1} L_{2} {\hat{q}}_{2}^{T} {\hat{q}}_{1} & m_{2} L_{2}^{2} I_{3} \end{matrix}] .$ As presented above, the matrix $R (q)$ defines a linear invertible map of the space $T_{q_{1}} S^{2} \times T_{q_{2}} S^{2}$ onto itself: $A_{(q_{1}, q_{2})} : T_{q_{1}} S^{2} \times T_{q_{2}} S^{2} \to T_{q_{1}} S^{2} \times T_{q_{2}} S^{2}, \quad [ω_{1}, ω_{2}]^{T} \mapsto R (q) [ω_{1}, ω_{2}]^{T} .$ We can easily see that it is well defined since, writing $ω_{i} = {\hat{v}}_{i} q_{i}$, $R (q) [\begin{matrix} ω_{1} \\ ω_{2} \end{matrix}] = [\begin{matrix} (m_{1} + m_{2}) L_{1}^{2} I_{3} & m_{2} L_{1} L_{2} {\hat{q}}_{1}^{T} {\hat{q}}_{2} \\ m_{2} L_{1} L_{2} {\hat{q}}_{2}^{T} {\hat{q}}_{1} & m_{2} L_{2}^{2} I_{3} \end{matrix}] [\begin{matrix} {\hat{v}}_{1} q_{1} \\ {\hat{v}}_{2} q_{2} \end{matrix}] = [\begin{matrix} {\hat{r}}_{1} q_{1} \\ {\hat{r}}_{2} q_{2} \end{matrix}] \in (T S^{2})^{2}$ with $\begin{aligned} r_{1} (q, ω) & := (m_{1} + m_{2}) L_{1}^{2} v_{1} + m_{2} L_{1} L_{2} {\hat{q}}_{2} {\hat{v}}_{2} q_{2}, \\ r_{2} (q, ω) & := m_{2} L_{1} L_{2} {\hat{q}}_{1} {\hat{v}}_{1} q_{1} + m_{2} L_{2}^{2} v_{2} . \end{aligned}$ This map guarantees that if we rewrite the pair of equations for the angular velocities in (35) as $\begin{aligned} \dot{ω} & = R^{- 1} (q) [\begin{matrix} (- m_{2} L_{1} L_{2} | ω_{2} |^{2} {\hat{q}}_{2} + (m_{1} + m_{2}) g L_{1} {\hat{e}}_{3}) q_{1} \\ (- m_{2} L_{1} L_{2} | ω_{1} |^{2} {\hat{q}}_{1} + m_{2} g L_{2} {\hat{e}}_{3}) q_{2} \end{matrix}] = R^{- 1} (q) b \\ & = A_{(q_{1}, q_{2})}^{- 1} (b) = [\begin{matrix} h_{1} \\ h_{2} \end{matrix}] \in T_{q_{1}} S^{2} \times T_{q_{2}} S^{2}, \end{aligned}$ where $b = b (q, ω)$ denotes the right-hand side of the second equation in (35), then we are assured that there exists a pair of functions $a_{1}, a_{2} : T S^{2} \times T S^{2} \to R^{3}$ such that $\dot{ω} = [\begin{matrix} a_{1} (q, ω) \times q_{1} \\ a_{2} (q, ω) \times q_{2} \end{matrix}] = [\begin{matrix} h_{1} (q, ω) \\ h_{2} (q, ω) \end{matrix}] .$ Since we want $a_{i} \times q_{i} = h_{i}$, we just impose $a_{i} = q_{i} \times h_{i}$ and hence the whole vector field can be rewritten as $[\begin{matrix} {\dot{q}}_{1} \\ {\dot{ω}}_{1} \\ {\dot{q}}_{2} \\ {\dot{ω}}_{2} \end{matrix}] = [\begin{matrix} ω_{1} \times q_{1} \\ (q_{1} \times h_{1}) \times q_{1} \\ ω_{2} \times q_{2} \\ (q_{2} \times h_{2}) \times q_{2} \end{matrix}] = F |_{(q, ω)},$ with $h_{i} = h_{i} (q, ω)$ and $[\begin{matrix} h_{1} (q, ω) \\ h_{2} (q, ω) \end{matrix}] = R^{- 1} (q) [\begin{matrix} (- m_{2} L_{1} L_{2} | ω_{2} |^{2} {\hat{q}}_{2} + (m_{1} + m_{2}) g L_{1} {\hat{e}}_{3}) q_{1} \\ (- m_{2} L_{1} L_{2} | ω_{1} |^{2} {\hat{q}}_{1} + m_{2} g L_{2} {\hat{e}}_{3}) q_{2} \end{matrix}] .$ Therefore, we can express the whole vector field in terms of the infinitesimal generator of the action of $S E (3) \times S E (3)$ as $ψ_{*} (f (q, ω)) |_{(q, ω)} = F |_{(q, ω)}$ through the function $f : T S^{2} \times T S^{2} \to s e (3) \times s e (3) ≃ R^{12}, \quad (q, ω) \mapsto (ω_{1}, q_{1} \times h_{1}, ω_{2}, q_{2} \times h_{2}) .$
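The construction of $f$ for N = 2 translates almost literally into code. The Python sketch below assembles $R (q)$ and the right-hand side of (35), solves for $(h_{1}, h_{2})$ and returns $f (q, ω) = (ω_{1}, q_{1} \times h_{1}, ω_{2}, q_{2} \times h_{2})$. It then checks that each $h_{i}$ is tangent to the sphere at $q_{i}$ and that the infinitesimal generator applied to $f$ reproduces $F$. Solving the full $6 \times 6$ linear system with `numpy.linalg.solve`, the unit masses and lengths, the value of $g$, the test state and all function names are assumptions made for this illustration.

```python
import numpy as np


def hat(v):
    """Skew-symmetric matrix such that hat(v) @ w equals np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])


def psi_star(u, v, q, w):
    """Infinitesimal generator of the SE(3) action on TS^2."""
    return np.cross(u, q), np.cross(u, w) + np.cross(v, q)


def double_pendulum_f(q1, w1, q2, w2, m1=1.0, m2=1.0, L1=1.0, L2=1.0, g=9.81):
    """Return f(q, w) in se(3) x se(3) ~ R^12 together with h1 and h2."""
    e3 = np.array([0.0, 0.0, 1.0])
    R = np.block([
        [(m1 + m2) * L1**2 * np.eye(3), m2 * L1 * L2 * hat(q1).T @ hat(q2)],
        [m2 * L1 * L2 * hat(q2).T @ hat(q1), m2 * L2**2 * np.eye(3)],
    ])
    # Right-hand side of the second equation in (35).
    b1 = (-m2 * L1 * L2 * (w2 @ w2) * hat(q2)
          + (m1 + m2) * g * L1 * hat(e3)) @ q1
    b2 = (-m2 * L1 * L2 * (w1 @ w1) * hat(q1)
          + m2 * g * L2 * hat(e3)) @ q2
    h = np.linalg.solve(R, np.concatenate([b1, b2]))
    h1, h2 = h[:3], h[3:]
    xi = np.concatenate([w1, np.cross(q1, h1), w2, np.cross(q2, h2)])
    return xi, h1, h2


# A state in TS^2 x TS^2: unit vectors q_i with w_i orthogonal to q_i.
q1 = np.array([np.sqrt(2.0) / 2.0, 0.0, np.sqrt(2.0) / 2.0])
w1 = np.array([0.0, 1.0, 0.0])
q2 = np.array([0.0, 0.6, 0.8])
w2 = np.array([1.0, 0.0, 0.0])

xi, h1, h2 = double_pendulum_f(q1, w1, q2, w2)
F1 = psi_star(xi[0:3], xi[3:6], q1, w1)
F2 = psi_star(xi[6:9], xi[9:12], q2, w2)

tangency = max(abs(q1 @ h1), abs(q2 @ h2))
generator_gap = max(np.linalg.norm(F1[0] - np.cross(w1, q1)),
                    np.linalg.norm(F1[1] - h1),
                    np.linalg.norm(F2[0] - np.cross(w2, q2)),
                    np.linalg.norm(F2[1] - h2))
print(tangency, generator_gap)
```

A Lie group integrator only needs this map $f$ together with the exponential on $S E (3)$ and the action $ψ$; the constraints on $q_{i}$ and $ω_{i}$ are never imposed explicitly.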

5.3. Numerical experiments

In this section, we present some numerical experiments for the N-chain of pendulums. We start by comparing the various Lie group integrators that we have tested (with the choice N = 2), and conclude by analysing an implementation of variable step size. Lie group integrators keep the evolution of the solution on the correct manifold, which is $T S^{2} \times T S^{2}$ when N = 2. Hence, we briefly report two sets of numerical experiments. In the first one, we show the convergence rate of all the Lie group integrators tested on this model. In the second one, we check how they behave in terms of preserving the two following relations:

$q_{i} (t)^{T} q_{i} (t) = 1$, i.e. $q_{i} (t) \in S^{2}$, $i = 1, 2$,
$q_{i} (t)^{T} ω_{i} (t) = 0$, i.e. $ω_{i} (t) \in T_{q_{i} (t)} S^{2}$, $i = 1, 2$,

completing the analysis with a comparison with the classical Runge–Kutta 4 and with ODE45 of MATLAB. The Lie group integrators used in the following experiments are Lie Euler, Lie Euler Heun, three versions of Runge–Kutta–Munthe–Kaas methods of order 4 and one of order 3. The RKMK4 with two commutators mentioned in the plots is the one presented in Section 2, while the other schemes can be found for example in [Citation7].

Figure 5 presents the plots of the errors, in logarithmic scale, obtained considering as a reference solution the one given by the ODE45 method, with strict tolerance. Here, we used an exact expression for the ${d e x p}_{σ}^{- 1}$ function. However, we could obtain the same results with a truncated version of this function, keeping a sufficiently high number of commutators, or after some clever manipulations of the commutators (as with RKMK4 with 2 commutators, see Section 2.2). The schemes show the right convergence rates, so we can move to the analysis of the time evolution on $T S^{2} \times T S^{2}$.

Figure 5. Convergence rate of the implemented Lie group integrators, based on global error considering as a reference solution the one of ODE45, with strict tolerance.

In Figure 6, we can see the comparison of the time evolution of the 2-norms of $q_{1} (t)$ and $q_{2} (t)$, for $0 \leq t \leq T = 5$. As highlighted above, unlike classical numerical integrators like the one implemented in ODE45 or the Runge–Kutta 4, the Lie group methods preserve the norm of the base components of the solutions, i.e. $| q_{1} (t) | = | q_{2} (t) | = 1$, $\forall t \in [0, T]$. Therefore, as expected, these integrators preserve the configuration manifold. However, to complete this analysis, we show the plots making a similar comparison but with the tangentiality conditions. Indeed, in Figure 7 we compare the time evolutions of the inner products $q_{1} (t)^{T} ω_{1} (t)$ and $q_{2} (t)^{T} ω_{2} (t)$ for $t \in [0, 5]$, i.e. we see if these integrators preserve the geometry of the whole phase space $T S^{2} \times T S^{2}$. As we can see, while for Lie group methods these inner products are of the order of $10^{- 14}$ and $10^{- 15}$, the ones obtained with classical integrators show that the tangentiality conditions are not preserved with the same accuracy.
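The qualitative difference between the two families of methods can be reproduced on a much smaller example. The Python sketch below integrates a single spherical pendulum (N = 1, unit mass and length) on $[0, 5]$ with the Lie Euler method based on the $S E (3)$ action and with the classical forward Euler method, and reports the deviation from $| q | = 1$ and from $q^{T} ω = 0$ at the final time. The reduction to N = 1, the step size, the comparison with forward Euler rather than Runge–Kutta 4 or ODE45, and all names are assumptions made for this illustration.

```python
import numpy as np

g, L = 9.81, 1.0
e3 = np.array([0.0, 0.0, 1.0])


def hat(v):
    """Skew-symmetric matrix such that hat(v) @ w equals np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])


def exp_se3(u, v):
    """Exponential map se(3) -> SE(3), returned as a pair (A, a)."""
    theta = np.linalg.norm(u)
    U = hat(u)
    if theta < 1e-6:
        A = np.eye(3) + U + 0.5 * U @ U
        V = np.eye(3) + 0.5 * U + U @ U / 6.0
    else:
        c1 = (1.0 - np.cos(theta)) / theta**2
        c2 = (theta - np.sin(theta)) / theta**3
        A = np.eye(3) + np.sin(theta) / theta * U + c1 * U @ U
        V = np.eye(3) + c1 * U + c2 * U @ U
    return A, V @ v


def angular_acceleration(q):
    """h(q) with w_dot = h; for N = 1 the equations give h = -(g/L) q x e3."""
    return -(g / L) * np.cross(q, e3)


def lie_euler_step(q, w, dt):
    """y_{n+1} = psi(exp(dt f(y_n)), y_n) with f = (w, q x h)."""
    u, v = w, np.cross(q, angular_acceleration(q))
    A, a = exp_se3(dt * u, dt * v)
    Aq = A @ q
    return Aq, A @ w + np.cross(a, Aq)


def forward_euler_step(q, w, dt):
    """Classical explicit Euler step in the ambient space R^6."""
    return q + dt * np.cross(w, q), w + dt * angular_acceleration(q)


def run(step, n_steps=500, dt=0.01):
    """Integrate on [0, 5] and return (| |q| - 1 |, |q . w|) at the end."""
    q = np.array([np.sqrt(2.0) / 2.0, 0.0, np.sqrt(2.0) / 2.0])
    w = np.array([0.0, 1.0, 0.0])
    for _ in range(n_steps):
        q, w = step(q, w, dt)
    return abs(np.linalg.norm(q) - 1.0), abs(q @ w)


norm_drift_lie, tangency_lie = run(lie_euler_step)
norm_drift_euler, tangency_euler = run(forward_euler_step)
print(norm_drift_lie, tangency_lie, norm_drift_euler, tangency_euler)
```

Even though both methods are only first-order accurate, the Lie Euler iterates satisfy the two constraints up to rounding errors, whereas the forward Euler iterates drift away from $T S^{2}$.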

Figure 6. Visualization of the quantity $1 - q_{i} (t)^{T} q_{i} (t)$ , i = 1, 2, for time $t \in [0, 5]$ . These plots focus on the preservation of the geometry of $S^{2}$ .

We now move to some experiments on variable stepsize. In this last part, we focus on the RKMK pair coming from Dormand–Prince method (DOPRI 5(4) [Citation14]), which we denote with RKMK(5,4). The aim of the plots we show is to compare the same schemes, both with constant and variable stepsize. We start by setting a tolerance and solving the system with the RKMK(5,4) scheme. Using the same number of time steps, we solve it again with RKMK of order 5. These experiments show that, for some tolerance and some initial conditions, the step size's adaptivity improves the numerical approximation accuracy. Since we do not have an available analytical solution to quantify these two schemes' accuracy, we compare them with the solution obtained with a strict tolerance and ODE45. We compute such accuracy, at time T = 3, by means of the Euclidean norm of the ambient space $R^{6 N}$ .

Figure 7. Visualization of the inner product $q_{i} (t)^{T} ω_{i} (t)$ , i = 1, 2, for $t \in [0, 5]$ . These plots focus on the preservation of the geometry of $T_{q_{i} (t)} S^{2}$ .


In Figure 8, we compare the performance of the constant and variable stepsize methods, where the structure of the initial condition is always the same and only the number of connected pendulums changes. The initial condition is $(q_{i}, ω_{i}) = (\sqrt{2} / 2, 0, \sqrt{2} / 2, 0, 1, 0), \forall i = 1, \dots, N$ , and all the masses and lengths are set to 1. These experiments exhibit situations where the variable stepsize method beats the constant stepsize one in terms of accuracy at the final time, such as the case N = 2, which we discuss in more detail below.

Figure 8. Comparison of accuracy at final time (on the left) and step adaptation for the case N = 20 (on the right), with all pendulums of length $L_{i} = 1$ .

The results presented in Figure 8 (left) do not aim to highlight any particular relation between the number of pendulums and the regularity of the solution. Indeed, as we add more pendulums, we keep increasing the total length of the chain, since $\sum_{i = 1}^{N} L_{i} = N$ . Thus here we do not have any appropriate limiting behaviour of the solution as $N \to + \infty$ . The behaviour presented in that figure seems to highlight an improvement in accuracy for the RKMK5 method as N increases. However, this is biased by the fact that when we increase N, we need more time steps in the discretization to achieve the fixed tolerance of $10^{- 6}$ with RKMK(5,4). Thus, this plot does not say that the dynamics becomes more regular as N increases; it suggests that the number of required time steps increases faster than the ‘degree of complexity’ of the dynamics.

Figure 9. In these plots, we represent the six components of the solution describing the dynamics of the first mass (on the left) and of the second mass (on the right), for the case N = 2. We compare the behaviour of the solution obtained with constant stepsize RKMK5, the variable stepsize RKMK(5,4) and ODE45. (a) $(q_{1} (t), ω_{1} (t))$ (b) $(q_{2} (t), ω_{2} (t))$ .

Figure 10. Comparison of accuracy at final time (on the left) and step adaptation for the case N = 20 (on the right), with all pendulums of length $L_{i} = 5 / N$ .

For the case N = 2, we notice a relevant improvement when passing to variable stepsize. In Figures 9 and 11 we can see that, for this choice of the parameters, the solution behaves smoothly in most of the time interval, but there is a peak in the second component of the angular velocities of both masses at $t \approx 2.2$ . We can observe this behaviour both in the plots of Figure 9, where we project the solution on the 12 components, and in Figure 11(c). In the latter, we plot two of the vector field components, i.e. the second components of the angular accelerations ${\dot{ω}}_{i} (t)$ , i = 1, 2. They show an abrupt change in the vector field at $t \approx 2.2$ , where the step size is considerably reduced. To summarize, the gain we see with variable stepsize when N = 2 is explained by the imbalance between the long time intervals with no abrupt changes in the dynamics and the short ones where such changes appear: apart from a neighbourhood of $t \approx 2.2$ , the vector field does not change quickly. For the case N = 20, on the other hand, there is no such imbalance, and thus the adaptivity of the stepsize does not bring relevant improvements in that situation.

Figure 11. On the left, we compare the step size adaptation of RKMK(5,4) with that of ODE45 and with the constant stepsize of RKMK5. In the centre, we plot the second component of the angular velocities $ω_{i}^{(2)}$ , i = 1, 2, and we zoom in on the last time interval $t \in [2.1, 3]$ to see that the variable stepsize version of the method better reproduces the reference solution. On the right, we visualize the rate of change of the second component of the angular velocities. (a) Step adaptation, (b) Zoom at final times, (c) Values of ${\dot{ω}}_{i}^{(2)} (t)$ .

Figure 12. Two quadrotors connected to the mass point $m_{y}$ via massless links of lengths $L_{i}$ .

The motivation behind our choice of this mechanical system is its intuitive relation with a beam model, as highlighted in the introduction of this work. However, for this limiting behaviour to make sense, we should fix the length of the entire chain of pendulums to some L (the length of the beam at rest) and then set the length of each pendulum to $L_{i} = L / N$ . In this case, keeping the same tolerance of $10^{- 6}$ for RKMK(5,4), we get the results presented in Figure 10. We do not investigate this approach in more detail, although it might be relevant for further work. We only highlight that here the step adaptivity improves the results, as expected.

6. Dynamics of two quadrotors transporting a mass point

In this section, we consider a multibody system made of two cooperating quadrotor unmanned aerial vehicles (UAVs) connected to a point mass (suspended load) via rigid links. This model is described in [Citation28,Citation29].

We consider an inertial frame whose third axis goes in the direction of gravity, but opposite orientation, and we denote with $y \in R^{3}$ the mass point and with $y_{1}, y_{2} \in R^{3}$ the two quadrotors. We assume that the links between the two quadrotors and the mass point are of a fixed length $L_{1}, L_{2} \in R^{+}$ . The configuration variables of the system are: the position of the mass point in the inertial frame, $y \in R^{3}$ , the attitude matrices of the two quadrotors, $(R_{1}, R_{2}) \in (S O (3))^{2}$ and the directions of the links which connect the centre of mass of each quadrotor respectively with the mass point, $(q_{1}, q_{2}) \in (S^{2})^{2}$ . The configuration manifold of the system is $Q = R^{3} \times (S O (3))^{2} \times (S^{2})^{2}$ .

In order to present the equations of motion of the system, we start by identifying $T S O (3) ≃ S O (3) \times s o (3)$ via left trivialization. This choice allows us to write the kinematic equations of the system as ${\dot{R}}_{i} = R_{i} {\hat{Ω}}_{i}, \quad {\dot{q}}_{i} = {\hat{ω}}_{i} q_{i}, \quad i = 1, 2,$ (36) where $Ω_{1}, Ω_{2} \in R^{3}$ represent the angular velocities of each quadrotor, respectively, and $ω_{1}, ω_{2}$ express the time derivatives of the orientations $q_{1}, q_{2} \in S^{2}$ , respectively, in terms of angular velocities, expressed with respect to the body-fixed frames. From these equations, we define the trivialized Lagrangian $L (y, \dot{y}, R_{1}, Ω_{1}, R_{2}, Ω_{2}, q_{1}, ω_{1}, q_{2}, ω_{2}) : R^{6} \times {(S O (3) \times s o (3))}^{2} \times (T S^{2})^{2} \to R,$ as the difference of the total kinetic energy of the system and the total potential (gravitational) energy, L = T−U, with: $T = \frac{1}{2} m_{y} ∥ \dot{y} ∥^{2} + \frac{1}{2} \sum_{i = 1}^{2} (m_{i} ∥ \dot{y} - L_{i} {\hat{ω}}_{i} q_{i} ∥^{2} + Ω_{i}^{T} J_{i} Ω_{i}),$ and $U = - m_{y} g e_{3}^{T} y - \sum_{i = 1}^{2} m_{i} g e_{3}^{T} (y - L_{i} q_{i}),$ where $J_{1}, J_{2} \in R^{3 \times 3}$ are the inertia matrices of the two quadrotors and $m_{1}, m_{2} \in R^{+}$ are their respective total masses. In this system, each of the two quadrotors generates a thrust force, which we denote with $u_{i} = - T_{i} R_{i} e_{3} \in R^{3}$ , where $T_{i}$ is the magnitude, while $e_{3}$ is the direction of this vector in the i th body-fixed frame, i = 1, 2. The presence of these forces makes the system nonconservative. Moreover, the rotors of the two quadrotors generate a moment vector, and we denote with $M_{1}, M_{2} \in R^{3}$ the cumulative moment vector of each of the two quadrotors. A possible approach to derive the Euler–Lagrange equations is through the Lagrange–d'Alembert principle, as presented in [Citation28]. 
We write them in matrix form as $A (z) \dot{z} = h (z)$ (37) where $\begin{aligned} z & = [y, v, Ω_{1}, Ω_{2}, ω_{1}, ω_{2}]^{T} \in R^{18}, \\ A (z) & = \begin{bmatrix} I_{3} & 0_{3} & 0_{3} & 0_{3} & 0_{3} & 0_{3} \\ 0_{3} & M_{q} & 0_{3} & 0_{3} & 0_{3} & 0_{3} \\ 0_{3} & 0_{3} & J_{1} & 0_{3} & 0_{3} & 0_{3} \\ 0_{3} & 0_{3} & 0_{3} & J_{2} & 0_{3} & 0_{3} \\ 0_{3} & - \frac{1}{L_{1}} {\hat{q}}_{1} & 0_{3} & 0_{3} & I_{3} & 0_{3} \\ 0_{3} & - \frac{1}{L_{2}} {\hat{q}}_{2} & 0_{3} & 0_{3} & 0_{3} & I_{3} \end{bmatrix}, \\ h (z) & = \begin{bmatrix} h_{1} (z) \\ h_{2} (z) \\ h_{3} (z) \\ h_{4} (z) \\ h_{5} (z) \\ h_{6} (z) \end{bmatrix} = \begin{bmatrix} v \\ - \sum_{i = 1}^{2} m_{i} L_{i} ∥ ω_{i} ∥^{2} q_{i} + M_{q} g e_{3} + \sum_{i = 1}^{2} u_{i}^{∥} \\ - Ω_{1} \times J_{1} Ω_{1} + M_{1} \\ - Ω_{2} \times J_{2} Ω_{2} + M_{2} \\ - \frac{1}{L_{1}} g {\hat{q}}_{1} e_{3} - \frac{1}{m_{1} L_{1}} q_{1} \times u_{1}^{⊥} \\ - \frac{1}{L_{2}} g {\hat{q}}_{2} e_{3} - \frac{1}{m_{2} L_{2}} q_{2} \times u_{2}^{⊥} \end{bmatrix}, \end{aligned}$ where $M_{q} = m_{y} I_{3} + \sum_{i = 1}^{2} m_{i} q_{i} q_{i}^{T},$ and $u_{i}^{∥}, u_{i}^{⊥}$ are respectively the orthogonal projections of $u_{i}$ onto the direction $q_{i}$ and onto the plane $T_{q_{i}} S^{2}$ , i = 1, 2, i.e. $u_{i}^{∥} = q_{i} q_{i}^{T} u_{i}$ , $u_{i}^{⊥} = (I - q_{i} q_{i}^{T}) u_{i}$ . These equations, coupled with the kinematic equations (36), describe the dynamics of a point $P = [y, v, R_{1}, Ω_{1}, R_{2}, Ω_{2}, q_{1}, ω_{1}, q_{2}, ω_{2}] \in M = T Q .$ The matrix $A (z)$ is block lower triangular and its diagonal blocks are invertible ( $M_{q}$ is symmetric positive definite because $m_{y} > 0$ ), so $A (z)$ is invertible and we pass to the following set of equations: $\dot{z} = A^{- 1} (z) h (z) := \tilde{h} (z) := \bar{h} (P) = [{\bar{h}}_{1} (P), \dots, {\bar{h}}_{6} (P)]^{T} .$ (38)
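The structure of $A (z)$ and the splitting of the thrust can be checked numerically. The following is a minimal sketch with NumPy. The helper names (`mass_matrix`, `split_thrust`, `assemble_A`) and all parameter values are hypothetical and chosen only for illustration. It assembles $A (z)$ from the blocks in (37), verifies that $M_{q}$ has no eigenvalue below $m_{y}$, and confirms that the determinant of $A (z)$ is the product of the determinants of its diagonal blocks, as expected for a block lower triangular matrix.

```python
import numpy as np

def hat(w):
    """Skew-symmetric matrix such that hat(w) @ x == np.cross(w, x)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def mass_matrix(m_y, m, q):
    """M_q = m_y I_3 + sum_i m_i q_i q_i^T."""
    M = m_y * np.eye(3)
    for m_i, q_i in zip(m, q):
        M = M + m_i * np.outer(q_i, q_i)
    return M

def split_thrust(q_i, u_i):
    """Components of u_i along q_i and in the tangent plane T_{q_i} S^2."""
    u_par = q_i * (q_i @ u_i)
    return u_par, u_i - u_par

def assemble_A(m_y, m, L, J, q):
    """Block matrix A(z) for z = [y, v, Omega_1, Omega_2, omega_1, omega_2]."""
    A = np.eye(18)
    A[3:6, 3:6] = mass_matrix(m_y, m, q)
    A[6:9, 6:9] = J[0]
    A[9:12, 9:12] = J[1]
    A[12:15, 3:6] = -hat(q[0]) / L[0]
    A[15:18, 3:6] = -hat(q[1]) / L[1]
    return A

# Hypothetical parameter values, for illustration only.
m_y, m, L = 1.0, (0.8, 0.8), (1.0, 1.5)
J = (np.diag([0.05, 0.05, 0.09]), np.diag([0.05, 0.05, 0.09]))
q = (np.array([0.0, 0.6, 0.8]), np.array([0.6, 0.0, 0.8]))  # unit vectors
u_1 = np.array([0.3, -0.2, 5.0])

M_q = mass_matrix(m_y, m, q)
A = assemble_A(m_y, m, L, J, q)
u_par, u_perp = split_thrust(q[0], u_1)

print("smallest eigenvalue of M_q:", np.linalg.eigvalsh(M_q).min())
print("det A(z):", np.linalg.det(A))
print("q_1 . u_perp:", q[0] @ u_perp)
```

In an actual integrator one would not form $A^{- 1} (z)$ explicitly. Solving the linear system, for instance with `np.linalg.solve(A, h)`, is usually preferable.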

6.1. Analysis via transitive group actions

We identify the phase space M with $M ≃ T R^{3} \times (T S O (3))^{2} \times (T S^{2})^{2}$ . The group we consider is $\bar{G} = R^{6} \times (T S O (3))^{2} \times (S E (3))^{2},$ where the groups are combined with a direct-product structure and $R^{6}$ is the additive group. For a group element $g = ((a_{1}, a_{2}), ((B_{1}, b_{1}), (B_{2}, b_{2})), ((C_{1}, c_{1}), (C_{2}, c_{2}))) \in \bar{G}$ and a point $P \in M$ in the manifold, we consider the following left action: $\begin{aligned} ψ_{g} (P) & = [y + a_{1}, v + a_{2}, B_{1} R_{1}, Ω_{1} + b_{1}, B_{2} R_{2}, Ω_{2} + b_{2}, \\ & \quad C_{1} q_{1}, C_{1} ω_{1} + c_{1} \times C_{1} q_{1}, C_{2} q_{2}, C_{2} ω_{2} + c_{2} \times C_{2} q_{2}] . \end{aligned}$ The well-definedness and transitivity of this action follow from standard arguments, see, e.g. [Citation43]. The infinitesimal generator associated with $ξ = [ξ_{1}, ξ_{2}, η_{1}, η_{2}, η_{3}, η_{4}, μ_{1}, μ_{2}, μ_{3}, μ_{4}] \in \bar{g},$ where $\bar{g} = T_{e} \bar{G}$ , reads $\begin{aligned} ψ_{*} (ξ) |_{P} & = [ξ_{1}, ξ_{2}, {\hat{η}}_{1} R_{1}, η_{2}, {\hat{η}}_{3} R_{2}, η_{4}, \\ & \quad {\hat{μ}}_{1} q_{1}, {\hat{μ}}_{1} ω_{1} + {\hat{μ}}_{2} q_{1}, {\hat{μ}}_{3} q_{2}, {\hat{μ}}_{3} ω_{2} + {\hat{μ}}_{4} q_{2}] . \end{aligned}$ We can now focus on the construction of the function $f : M \to \bar{g}$ such that $ψ_{*} (f (P)) |_{P} = F |_{P}$ , where $\begin{aligned} F |_{P} & = [{\bar{h}}_{1} (P), {\bar{h}}_{2} (P), R_{1} {\hat{Ω}}_{1}, {\bar{h}}_{3} (P), R_{2} {\hat{Ω}}_{2}, \\ & \quad {\bar{h}}_{4} (P), {\hat{ω}}_{1} q_{1}, {\bar{h}}_{5} (P), {\hat{ω}}_{2} q_{2}, {\bar{h}}_{6} (P)] \in T_{P} M \end{aligned}$ is the vector field obtained by combining Equations (36) and (38). 
We have $\begin{aligned} f (P) & = [{\bar{h}}_{1} (P), {\bar{h}}_{2} (P), R_{1} Ω_{1}, {\bar{h}}_{3} (P), R_{2} Ω_{2}, {\bar{h}}_{4} (P), \\ & \quad ω_{1}, q_{1} \times {\bar{h}}_{5} (P), ω_{2}, q_{2} \times {\bar{h}}_{6} (P)] \in \bar{g} . \end{aligned}$ We have obtained the local representation of the vector field $F \in X (M)$ in terms of the infinitesimal generator of the transitive group action ψ, hence we can solve for one time step $Δ t$ the IVP $\begin{cases} \dot{σ} (t) = \operatorname{dexp}_{σ (t)}^{- 1} (f (ψ (\exp (σ (t)), P (t)))) \\ σ (0) = 0 \in \bar{g} \end{cases}$ and then update the solution $P (t + Δ t) = ψ (\exp (σ (Δ t)), P (t))$ .
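To make the role of the $S E (3)$ factor concrete, the following minimal sketch implements the action $(C, c) \cdot (q, ω) = (C q, C ω + c \times C q)$ on $T S^{2}$ and one Lie–Euler step $P_{n + 1} = ψ (\exp (Δ t \, f (P_{n})), P_{n})$, i.e. the simplest member of the RKMK family, where $\operatorname{dexp}^{- 1}$ is evaluated only at $σ = 0$ and reduces to the identity. The angular acceleration `w_dot` is a toy stand-in for ${\bar{h}}_{5}$: it keeps only the gravity term $- \frac{g}{L} q \times e_{3}$ of a single uncontrolled pendulum, which is an assumption made for illustration and not the coupled quadrotor system. All function names are hypothetical.

```python
import numpy as np

E3 = np.array([0.0, 0.0, 1.0])

def hat(w):
    """Skew-symmetric matrix such that hat(w) @ x == np.cross(w, x)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_se3(mu1, mu2):
    """Exponential map se(3) -> SE(3), returned as a pair (C, c)."""
    theta = np.linalg.norm(mu1)
    W = hat(mu1)
    if theta < 1e-8:  # truncated series close to the identity
        C = np.eye(3) + W + 0.5 * W @ W
        V = np.eye(3) + 0.5 * W + W @ W / 6.0
    else:
        a = np.sin(theta) / theta
        b = (1.0 - np.cos(theta)) / theta**2
        d = (theta - np.sin(theta)) / theta**3
        C = np.eye(3) + a * W + b * W @ W
        V = np.eye(3) + b * W + d * W @ W
    return C, V @ mu2

def act(C, c, q, w):
    """Action of (C, c) in SE(3) on (q, w) in TS^2."""
    Cq = C @ q
    return Cq, C @ w + np.cross(c, Cq)

def f(q, w, g=9.81, L=1.0):
    """Map P -> se(3) with generator (w x q, w_dot); toy w_dot, see lead-in."""
    w_dot = -(g / L) * np.cross(q, E3)  # orthogonal to q by construction
    return w, np.cross(q, w_dot)

def lie_euler_step(q, w, h):
    mu1, mu2 = f(q, w)
    C, c = exp_se3(h * mu1, h * mu2)
    return act(C, c, q, w)

q = np.array([np.sqrt(2) / 2, 0.0, np.sqrt(2) / 2])
w = np.array([0.0, 1.0, 0.0])  # tangent: q @ w == 0
for _ in range(1000):
    q, w = lie_euler_step(q, w, 1e-3)

norm_error = abs(np.linalg.norm(q) - 1.0)
tangency_error = abs(q @ w)
print(f"| |q| - 1 | = {norm_error:.2e}, |q . w| = {tangency_error:.2e}")
```

Both constraints hold up to rounding errors regardless of the accuracy of the step, because $C$ is a rotation and $c \times C q$ is orthogonal to $C q$.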

The above construction is completely independent of the control functions $\{u_{i}^{∥}, u_{i}^{⊥}, M_{i}\}_{i = 1, 2}$ and hence it is compatible with any choice of these parameters.

6.2. Numerical experiments

We tested Lie group numerical integrators for a load transportation problem presented in [Citation29]. The control inputs $\{u_{i}^{∥}, u_{i}^{⊥}, M_{i}\}_{i = 1, 2}$ are constructed such that the point mass asymptotically follows a given desired trajectory $y_{d} \in R^{3}$ , given by a smooth function of time, and the quadrotors maintain a prescribed formation relative to the point mass. In particular, the parallel components $u_{i}^{∥}$ are designed such that the payload follows the desired trajectory $y_{d}$ (load transportation problem), while the normal components $u_{i}^{⊥}$ are designed such that the directions $q_{i}$ converge to desired directions $q_{i d}$ (tracking problem in $S^{2}$ ). Finally, $M_{i}$ are designed to control the attitude of the quadrotors.

In this experiment, we focus on a simplified dynamics model, i.e. we neglect the construction of the controllers $M_{i}$ for the attitude dynamics of the quadrotors. However, the full dynamics model can also be easily integrated, once the expressions for the attitude controllers are available.

In Figure 13, we show the convergence rate of four different RKMK methods compared with the reference solution obtained with ODE45 in MATLAB.

Figure 13. Convergence rate of the numerical schemes compared with ODE45.

In Figures 14–18, we show results in the tracking of a parabolic trajectory, obtained by integrating the system (37) with an RKMK method of order 4.

Figure 14. Snapshots at $0 \leq t \leq 5$ .

Figure 15. Components of the load position (in blue) and the desired trajectory (in red) as a function of time.

Figure 16. Deviation of the load position from the target trajectory.

Figure 17. Direction error of the links.

Figure 18. Preservation of the norms of $q_{1}, q_{2} \in S^{2}$ .


7. Summary and outlook

In this paper, we have considered Lie group integrators with a particular focus on problems from mechanics. In mathematical terms, this means that the Lie groups and manifolds of particular interest are $S O (n), n = 2, 3$ , $S E (n), n = 2, 3$ as well as the manifolds $S^{2}$ and $T S^{2}$ . The abstract formulations by, e.g. Crouch and Grossman [Citation11], Munthe-Kaas [Citation41] and Celledoni et al. [Citation6] have often been demonstrated on small toy problems in the literature, such as the free rigid body or the heavy top systems. But in papers like [Citation4], hybrid versions of Lie group integrators have been applied to more complex beam and multi-body problems. The present paper is attempting to move in the direction of more relevant examples without causing the numerical solution to depend on how the manifold is embedded in an ambient space, or the choice of local coordinates.

It will be the subject of future work to explore more examples and to aim for a more systematic approach to applying Lie group integrators to mechanical problems. In particular, it is of interest to the authors to consider models of beams that could be seen as a generalization of the N-fold pendulum discussed here.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This work was supported by Marie Sklodowska-Curie [860124].

Notes

1 If the Lie group action is smooth, a map f of the same regularity as F can be found [Citation53].

2 $ω_{(g, μ)}$ is obtained from the natural symplectic form on $T^{*} G$ (which is a differential two-form), defined as $Ω_{(g, p_{g})} ((δ v_{1}, δ π_{1}), (δ v_{2}, δ π_{2})) = 〈 δ π_{2}, δ v_{1} 〉 - 〈 δ π_{1}, δ v_{2} 〉,$ by right trivialization.

3 The Euler–Poincaré equations are Euler–Lagrange equations with respect to a Lagrange–d'Alembert principle obtained taking constraint variations.

4 In this paper, we will assume $\tilde{p} < p$ in which case the local error estimate is relevant for the approximation ${\tilde{y}}_{n + 1}$ .

5 There are many options for how to do this in practice, and the choice may also depend on the application. E.g. a Riemannian metric is a natural and robust alternative here.

6 It follows from the definition of the inertia tensor, i.e. $0 \leq \tilde{T} (q, \dot{q}) = \frac{1}{2} \sum_{i = 1}^{N} (\sum_{j \geq i} m_{j}) L_{i} L_{j} {\dot{q}}_{i}^{T} {\dot{q}}_{j} := \frac{1}{2} {\dot{q}}^{T} M \dot{q} .$ Moreover, in this situation it is even possible to explicitly find the Cholesky factorization of the matrix M with an iterative algorithm.

References