1,697
Views
0
CrossRef citations to date
0
Altmetric
Research Article

MAST-RT0 solution of the incompressible Navier–Stokes equations in 3D complex domains

ORCID Icon, ORCID Icon, ORCID Icon & ORCID Icon
Pages 53-93 | Received 03 Aug 2020, Accepted 01 Dec 2020, Published online: 29 Dec 2020

Abstract

A new numerical methodology to solve the 3D Navier-Stokes equations for incompressible fluids within complex boundaries and unstructured body-fitted tetrahedral mesh is presented and validated with three literature and one real-case tests. We apply a fractional time step procedure where a predictor and a corrector problem are sequentially solved. The predictor step is solved applying the MAST (Marching in Space and Time) procedure, which explicitly handles the non-linear terms in the momentum equations, allowing numerical stability for Courant number greater than one. Correction steps are solved by a Mixed Hybrid Finite Elements discretization that assumes positive distances among tetrahedrons circumcentres. In 3D problems, non-Delaunay meshes are provided by most of the mesh generators. To maintain good matrix properties for non-Delaunay meshes, a continuity equation is integrated over each tetrahedron, but the momentum equations are integrated over clusters of tetrahedrons, such that each external face shared by two clusters belongs to two tetrahedrons whose circumcentres have positive distance. A numerical procedure is proposed to compute the velocities inside clusters with more than one tetrahedron. Model preserves mass balance at the machine error and there is no need to compute pressure at each time iteration, but only at target simulation times.

1. Introduction

The Navier–Stokes Equations (NSEs) govern external and internal flows in many real-life industrial, environmental and biological problems (e.g. aircraft and ship problems, rotomachinery blade applications, hemodynamic and biological flows). The challenging research topics involved in the numerical solution of such mathematical problems mainly concern the choice of the velocity-pressure coupling algorithm and the design of the computational mesh discretizing the physical domain.

In compressible flows, pressure and density are linked by the state algebraic equation. In contrast, in incompressible flows, density is constant and pressure has to be solved along with velocities from all the momentum and continuity equations. With this motivation, the numerical algorithms used to solve the NSEs can generally be divided into density-based and pressure-based solvers (Shyy & Mittal, Citation1998; Tao, Citation2001). Density-based solvers are commonly applied to high-Ma compressible flows, while pressure-based solvers were originally proposed for incompressible flows, and then successfully extended to compressible flows (Tao, Citation2001).

Two approaches are generally used in pressure-based solvers, namely the direct (or coupled) approach and the segregated approach (e.g. Mazhar, Citation2016 and cited references). In the first case, the whole set of momentum and continuity equations is solved simultaneously, resulting in a strong coupling between pressure and velocity. The main drawback is the large amount of required computational effort and computer memory, which makes this approach unsuitable to facing many practical engineering applications (Darwish et al., Citation2015). In the segregated approach, pressure and velocity are solved separately and sequentially, using previously computed values of the other dependent variable. The core of the problem is how to update the pressure field so that the divergence-free velocity field condition is satisfied.

Several numerical procedures can be distinguished in the segregated approach, e.g. the projection method (Chorin, Citation1968; Kim & Moin, Citation1985), the penalty method (Braaten & Shyy, Citation1986; Hughes et al., Citation1979), the artificial compressibility method (Harlow & Welch, Citation1965; Malan et al., Citation2002; Vrahliotis et al., Citation2012), and the pressure-correction method (Ozoe & Tao, Citation2001; Patankar, Citation1980).

The fractional step projection method (Chorin, Citation1968; Kim & Moin, Citation1985) and the SIMPLE (Semi-Implicit Method for Pressure-Linked Equations) pressure-correction method (Patankar, Citation1980, Citation1981) have become very popular. In both methodologies, a pressure correction is introduced to improve the velocities computed from the solution of the momentum equations, and to satisfy the continuity equation. Projection methods, commonly regarded as fractional-step methods, can be classified into pressure-correction methods, velocity-correction methods and consistent splitting methods (Guermond et al., Citation2006). A sequence of decoupled elliptic equations for velocity and pressure has to be solved at each time step, and this represents one of the most attractive features of such procedures, especially for simulations of large-scale problems (Guermond et al., Citation2006). In the projection method, the pressure correction Poisson equation is solved once per time step, while for the SIMPLE method, the momentum and pressure correction equations are solved several times in each time step. For these reasons, projection methods could handle unsteady flow problems more easily than SIMPLE.

The SIMPLE-family algorithms (e.g. SIMPLER, SIMPLEC, SIMPLEX) are extensions of and improvements to the SIMPLE algorithm. Other pressure correction methods can be regarded as further extensions of the SIMPLE-family models, for example PISO and CLEAR (e.g. Aguerre et al., Citation2020 and cited works). Segregated solvers like the SIMPLE-family algorithms have shown poor convergence, especially when used for swirling flow fields (Hanby & Silvester, Citation1996). In these problems, where the coupling between radial and tangential momentum equations is strong, the linearization of the momentum equations leads to a sequential solution of these equations, without accounting for the coupling between the momentum equations and the velocity components (Hanby & Silvester, Citation1996). Even though several ‘ad hoc’ procedures have been presented in order to overcome the above-mentioned convergence problems (e.g. Gosman et al., Citation1976), the need of parameter calibration for convergence of the iterative procedure requires a high computational effort, which limits the application of these algorithms. These reasons, along with the increase of computer performance and memory, have motivated several authors towards new coupled approaches (e.g. Darwish et al., Citation2009 and cited references).

Other options have been proposed in the literature, e.g. the vorticity-stream function methods, where pressure is eliminated from the governing equations and velocity and pressure are replaced by the vorticity and the stream function (e.g. Calhoun, Citation2002 and cited references). Two main reasons have limited their use: the difficulty of handling wall boundary conditions and the difficulties arising in the extension from 2D to 3D problems (Calhoun, Citation2002).

It is widely recognized that spatial discretization of the incompressible flow equations on collocated grids leads to unphysical odd–even coupling of the pressure (i.e. the so-called spurious checkerboard modes) (e.g. Dalal et al., Citation2008; Perron et al., Citation2004). Staggered grids have been one of the most common ways to overcome these problems, where different grid points for velocity and pressure are used (e.g. Harlow & Welch, Citation1965; Perot, Citation2000).

In the last decade, the use of unstructured grids has become popular due to their capacity to discretize real arbitrary domains and to easily get local refinement. Geometric complexity is a drawback for a straightforward extension of the staggered mesh to unstructured grids and most solvers require the addition of terms which could generate unphysical solutions and loss of mass conservations (e.g. Perot, Citation2000). Moreover, the time step limitation required by the Courant condition can become very severe, due to the existence of few computational elements much smaller than the average ones.

In the past few decades, Finite Volume Methods (FVMs) (e.g. Kim & Choi, Citation2000; Mathur & Murthy, Citation1997; Perron et al., Citation2004; Plana Fattori et al., Citation2013; Vidovic et al., Citation2004) and Finite Element Methods (FEMs) (e.g. Bazilevs et al., Citation2013; Fortin, Citation1981; Pai et al., Citation2013; Zienkiewicz et al., Citation2013) have been preferred to other methods (e.g. finite difference methods) in handling irregular boundaries, because they can use unstructured triangular or tetrahedral meshes, which can easily discretize arbitrary geometries. A large multitude of FVMs has been proposed in the last few decades, with different choices of control volumes, both for collocated and staggered grids. Many of these proposed techniques suffer from a ‘non-orthogonality condition’, as pointed out in (Gao et al., Citation2012), in discretization of the second-order partial derivative terms in the momentum equations and in the pressure-correction Poisson equation. Hybrid FV/FEMs (e.g. Busto et al., Citation2018; Gao et al., Citation2012) take advantage of both FVMs and FEMs, discretizing the momentum equations with the FVMs and the Poisson equation with the FEMs. The Discontinuous Galerkin Finite Element Methods (DG-FEMs) (e.g. Bassi & Rebay, Citation1997; Lehrenfeld & Schöberl, Citation2016 and cited references) combine the performing features of the FEMs and FVMs. Like the classical FEMs, the DG-FEMs achieve high-order accuracy using high-order polynomial approximation within an element rather than using wide stencils, as in the case of finite volume methods. The physics of wave propagation is however correctly accounted for by the solution of a Riemann problem (e.g. Toro et al., Citation2020 and cited references) arising from the discontinuous solution at element interfaces, which makes them similar to FVMs (e.g. Toro & Vazquez-Cendon, Citation2012). The main drawback of DG-FEMs, compared to the classical FEMs and FVMs, is that they require solutions of systems of equations with more unknowns for the same grids, and have been recognized to be very demanding in terms of both computational costs and storage requirements.

Recently, mesh-free methods have received increasing attention from researchers, due to their ability to solve flow problems with complex geometries or involving multi-phases fluids. The Smoothed Particle Hydrodynamics (SPH) method is a pure Lagrangian, mesh-free procedure, originally proposed for the solution of astrophysical problems (Gingold & Monaghan, Citation1977; Lucy, Citation1977). The fluid domain is discretized by a set of moving particles, and the governing equations are solved for each of them. Each particle has a support domain with a characteristic spatial distance over which their physical properties (namely pressure and velocity) are ‘smoothed’ by means of a kernel function, generally with a Gaussian-type shape. Thanks to its quite easy implementation, this procedure has largely been applied for the solution of compressible and incompressible NSEs (e.g. Oger et al., Citation2007 and cited references). One of the main drawbacks of the Lagrangian schemes is that the relations among particles have to be updated at each time step, and the topology setting of the matrix system of the Poisson pressure-correction equations has to be performed at each time iteration, as well as its factorization operations.

During the last years, Virtual Elements Methods (VEMs) (e.g. Beirão da Veiga et al., Citation2018, and cited references) have been proposed as a new FEM paradigm to solve Partial Differential Equations. VEMs construct a conforming Galerkin FE scheme, dealing with general polygonal/polyhedral mesh elements, also with non-convex shape. VEMs have been recently applied for the solution of the NSEs in 2D problems (e.g. Beirão da Veiga et al., Citation2018, and cited references), but their extension to 3D problems is still limited to very simple domain geometries and boundary shapes (Liu & Chen, Citation2019).

In the present paper, we propose a new finite volume solver for the solution of 3D incompressible NSEs over unstructured tetrahedral meshes. We apply a fractional time step procedure where a predictor and a corrector problem are sequentially solved. The procedure presents substantial differences compared to the fractional step procedures presented in the literature, based upon the common projection procedures (e.g. Guermond et al., Citation2006; Perron et al., Citation2004). The predictor step (PS) is carried out by applying the Eulerian Finite Volume MAST (Marching in Space and Time) numerical procedure, recently proposed for the solution of shallow waters and groundwater problems (Aricò et al., Citation2007; Aricò et al., Citation2011, Citation2012, Citation2013a, Citation2013b; Aricò & Tucciarelli, Citation2007a, Citation2007b, Citation2009). In this step, all the terms are retained in the momentum equations. The major advantages of MAST are the following: (1) it explicitly handles the non-linear momentum terms in the momentum equations, by means of the sequential solution of a three-variable Ordinary Differential Equations (ODEs) system for each computational cell, with a computational effort which is simply proportional to the number of tetrahedral elements, (2) it provides numerical stability with respect to large Courant–Friedrichs-Lewy (CFL) numbers, that can be much greater than one, at a cost of local accuracy reduction (Aricò et al., Citation2007; Aricò & Tucciarelli, Citation2007a, Citation2007b). The correction step is split into two parts, named CS1 step and CS2 step. In CS1 step, three linear systems are solved for the three velocity components (one system for each velocity component), which update the viscous terms in the momentum equations. In the second corrector step (CS2), a single linear system is solved for the pressure correction unknown. The matrices of the systems of CS1 and CS2 steps are well conditioned, as they are sparse, symmetric and diagonally dominant, and lead to a very fast solution of the associated systems by the use of a preconditioned conjugated gradient solver. The matrix coefficients are constant in time, computed and factorized only once at the beginning of the numerical simulation. This makes it possible to save a lot of computational time, compared to other numerical schemes (e.g. Lagrangian schemes).

Both CS1 and CS2 steps are solved assuming a mass lumping Mixed Hybrid Finite Element (MHFE) (e.g. Auricchio et al., Citation2017, and cited references) discretization inside each tetrahedron, similar to the one proposed by Younes et al. (Citation2006). The mass lumping option has been chosen because it is easy to be used together with tetrahedral elements. To maintain good convergence and accuracy properties, our MHFE scheme assumes the distance among circumcenters to be positive, a condition which is always satisfied in the Delaunay meshes. Unfortunately, in 3D problems either bad-quality Delaunay or non-Delaunay meshes are provided by most of the available mesh generators (Li & Teng, Citation2001). To cope with this problem, and to use non-Delaunay meshes still saving the previously mentioned good matrix properties, in the present procedure, a continuity equation is integrated over each single tetrahedron, but the momentum equations are integrated over clusters of tetrahedrons, such that each external face shared by two different clusters is part of two tetrahedrons whose circumcenters have positive distance. To define the velocities inside each cluster with more than one tetrahedron correctly, the minimum energy condition is finally enforced for velocities inside the clusters.

In the proposed numerical solver, mass balance is always preserved at the error machine precision, and there is no need to update the pressure values at each time iteration, because only the pressure gradient appear in the NSEs. We compute the spatial pressure distribution only at target simulation times.

The paper is organized as follows. In section 2.1, we present the governing equations and the projection-correction formulation of the problem and in section 2.2 the spatial discretization. In sections 3.1, 3.2, and 3.3, we present the numerical details of the solution of respectively the PS, CS1 and CS2 steps as defined in section 2.1. In section 4, we present the extension of the previous algorithms to the real case of 3D non-Delaunay meshes and in section 5, we show the model application to three well-known literature tests and one real-life case, as well as an analysis of the associated computational costs.

2. RT0 spatial discretization of the governing equations

2.1. Governing equations and fractional time step discretization

We solve the 3D Navier–Stokes Equations (NSEs) for a real and incompressible fluid, (1) ut+(u)u+Ψν2u=0(1) (2) u=0(2) where Equations (1) are the momentum conservation equations, Equation (2) is the mass conservation equation, t is time, ν kinematic viscosity, u the velocity vector whose components are u, v, and w, along the x, y and z directions respectively, and ψ is the kinematic pressure p/ρ, where p is the fluid pressure and ρ is the fluid density, constant in space and time for an incompressible fluid. The governing equations are solved for the u and ψ unknowns.

The problem is well-posed if we correctly assign the initial and boundary conditions (ICs and BCs, respectively). With respect to the BCs, we assign either (a) all the velocity components (essential BCs) or (b) the stress vector (natural BCs) or (c) a combination of the previous ones called free-slip BC. Let Ω be the computational domain and Γ its boundary surface, and let us call Γu, Γσ and Γm the three non-overlapping portions, where BCs (a), (b) and (c) respectively apply, such that Γ = Γu + Γσ+ Γm. We formulate the BCs as (3a) u(x)=g(x)xΓut0(3a) (3b) σ(x)=(Ψ2νunn)nν(unτ1+uτ1n)t1ν(unτ2+uτ2n)t2 , xΓσt0(3b) (3c) un(x)=0 and τ(x)=0,xΓmt0(3c) where x is the coordinate position vector, g is the velocity vector assigned at the boundary, n is the unit outward vector normal to the boundary, and t1 and t2 are two other orthogonal unit vectors such that n, t1 and t2 give the reference frame attached to the tetrahedron face, τ is the stress tangent vector component in the t1-t2 plane, un, uτ1 and uτ2 are the corresponding u components. In the following sections, for simplicity’s sake we assume only hydrostatic stress occurring along Γσ, which is equivalent to assuming the stress normal to the boundary plane and to neglecting all the viscous terms in Equation (3b).

The assigned ICs on the system, in the Ω¯ (Ω¯=ΩΓ) domain are (4) u=u0withu0=0andΨ=Ψ0 at t=0(4)

As mentioned in section 1, we apply a predictor–corrector projection procedure, sequentially solving one predictor and two corrector problems. The predictor and the first corrector steps deal with the momentum equations, while in the second corrector step we combine the mass and momentum conservation equations to enforce the divergence-free condition.

In the next sections, time levels tk, tk+1/3, tk+2/3, tk+1 represent the beginning of the generic time iteration, the end of the prediction step, as well as the end of the first and second corrector steps, respectively, and tk+1 also marks the end of the time iteration. Superscripts k, k+1/3, k+2/3 and k+1 mark the values of the variables (i.e. u and ψ) at the corresponding time levels.

In vector-matrix form, we write the momentum Equations (1) as (5a) Ut+F(U)=B(U)+E(U)(5a) (5b) U=(uvw)T,F(U)=(F1F2F3)T,E(U)=(E1E2E3)TB(U)=(xΨyΨzΨ)T(5b) (5c) F1=(uuuvuw)T,F2=(vuvvvw)T,F3=(wuwvww)T(5c) (5d) Ej=(νuxjνvxjνwxj)Tj=1,2,3x1=x,x2=y,x3=z(5d)

In the framework of a fractional time step procedure, we set (6a) E=Ek1/3+(EEk1/3)(6a) (6b) B=Bk+(BBk)(6b) and we split Equation (5a) into (7a) Ut+F=Ek1/3+Bk(7a) (7b) Ut=EEk1/3(7b) (7c) Ut=BBk(7c) where Ek−1/3 marks the matrix E computed at the end of the first correction system of the previous time iteration and Bk marks the pressure gradient term at the beginning of the new time step. Functional analysis easily shows that, due to the stationarity of the pressure gradient term, Equation (7a) form a fully convective system, with only one characteristic line passing through the (x, t) point, while the systems in Equation (7b) and (7c) are fully parabolic (Aricò et al., Citation2007; Aricò & Tucciarelli, Citation2009). Integrating in time, from Equation (7) we get (8a) Uk+1/3Uk+0ΔtFdt=Ek1/3Δt+BkΔt(8a) (8b) Uk+2/3Uk+1/3=Ek+2/3ΔtEk1/3Δt(8b) (8c) Uk+1Uk+2/3=(Bk+1Bk)Δt(8c) where system (8a) is the prediction step (PS), (8b) and (8c) are the first and second correction systems (CS1 and CS2), respectively.

Summing systems (8), the integral of the original system (1) is formally obtained. Further details on time discretization will be given below in section 3.

2.2. RT0 spatial tetrahedral discretization of pressure and velocity

We discretize the computational domain by means of NT non-overlapping tetrahedrons (named also elements) and assume the velocity field in each tetrahedron e, ue(x)Xe, where Xe is the lowest-order Raviart-Thomas (RT0) space (Raviart & Thomas, Citation1977), such that (9) ue(x)=j=1,4ωjeQje with ωje=(xxje)3We(9) where ωje are the space basis functions of Xe, We is the volume of tetrahedron e, xje is the coordinate vector of the jth node of e, and Qje is the volumetric flux crossing the face of e opposite to the jth node. Important properties of the space Xe are that ue is constant over e, uenj is constant over each face j of e, and nj represents the unit outward vector orthogonal to face j of tetrahedron e (Raviart & Thomas, Citation1977). As a result of these properties, each velocity component of the RT0 discretization is piece-wise linear inside each element, and a constant velocity can occur only if the sum of the four fluxes is equal to zero. In this case the divergence is zero inside each element and mass continuity is both locally and globally conserved if the fluxes of two neighboring elements are always opposite one another in the common face.

In PS and CS1, the pressure gradient B is kept constant. In CS2, its correction is computed as the solution of a conservative problem, where the velocity is opposite to the pressure gradient. We will show in the following that, at the end of each step PS, CS1 and CS2, the computed velocity is piece-wise constant in each element, but the fluxes Qje in Equation (9) are opposite at the common face of two neighboring elements only at the end of CS2. Assuming a piece-wise linear pressure as initial condition, because the CS2 velocity correction is conservative with respect to the pressure correction, the pressure gradient remains piece-wise constant in all the steps. It can also be shown that the pressure correction computed in CS2 step by the MHFE is continuous along the element faces only in their circumcenters. Because pressure changes only in CS2 step, the same condition holds also for the pressure at the end of each time step. See in Figure  the pressure contour lines inside the face common to two neighboring tetrahedrons.

Figure 1. RT0 kinematic pressure contour lines in the face common to two tetrahedrons.

Figure 1. RT0 kinematic pressure contour lines in the face common to two tetrahedrons.

The presentation of the solution of each step in Equations (8) is restricted first to the hypothesis of a mesh with an Extended Delaunay Property (EDP), as defined in Aricò et al. (Citation2011), and then generalized in section 4 to the case of totally irregular meshes. In an EDP mesh, the circumsphere of any tetrahedron does not include any other node inside it (Forsyth, Citation1991; Joe, Citation1986; Letniowski, Citation1992) and, with reference to Figure , the following conditions hold for two neighboring tetrahedrons e and ep, (10a) (v2v1)v1<0 if|v1|>|v2|(10a) (10b) (v1v2)v2<0 if|v2|>|v1|(10b) where vectors v1 and v2 are defined as (10c) v1=(xctexcfe,ep),v2=(xctepxcfe,ep)(10c) xcte(ep) is the coordinate vector of the circumcenter of tetrahedron e (ep), xcfe,ep is the coordinate vector of the circumcenter of the face shared by tetrahedrons e and ep. Moreover, the following condition holds for the boundary elements e and the corresponding boundary face (11) vn<0(11) where n is the unit vector normal to the boundary face and oriented in the outward direction, v is defined as (12) v=(xctexcfe)(12) and xcfe is the coordinate vector of the circumcenter of the boundary face. In Figure (a,b), tetrahedrons e and ep satisfy and do not satisfy the EDP, respectively.

Figure 2. (a) The two tetrahedrons e and ep satisfy the EDP. (b) The two tetrahedrons do not satisfy the EDP.

Figure 2. (a) The two tetrahedrons e and ep satisfy the EDP. (b) The two tetrahedrons do not satisfy the EDP.

We will prove in the next sections that in an EDP mesh the system matrix, associated with the solution of the steps (8b) and (8c), is an M-matrix (Letniowski, Citation1992), and this avoids nonphysical local extrema in the computed solution (Forsyth, Citation1991; Letniowski, Citation1992). The same matrix property will be saved in the more general case of non-Delaunay meshes by changing the control volumes of the momentum equations, as will be shown in section 4.

3. MAST-RT0 solution in the case of Delaunay meshes

3.1. Prediction step

The step (8a) is solved by assuming, in each tetrahedral element, constant viscous and pressure gradient forces and by integrating convective inertial terms according to the MArching in Space and Time (MAST) procedure. MAST computes the solution of convective problems, with only one characteristic line passing through each (x, t) point of the domain, in the framework of a fractional time step procedure.

In MAST the three discretized momentum equations are solved in each element, uncoupled from the other ones. To this end, at the beginning of each kth time step, all the elements are marked with an index RekRk, called ‘rank’. The rank of an element e is an integer which is a unit greater than the ranks of all the neighboring ep elements with a common face crossed by a flux entering element e from element ep. All the elements are initialized at the beginning of each new time iteration with rank zero, and vector Rk is computed starting from the boundary elements with all interior fluxes oriented outward, which are set with rank 1. The rank is then computed for the elements that are neighbors to elements with rank greater than zero and have no interior faces crossed by fluxes oriented inward. The procedure is continued until all the elements of the computational domain have rank greater than zero. After computation of Rk, all the elements are sorted according to increasing rank values and solved one after the other.

During the solution of element e, the integral of the velocity momentum is computed and the mean momentum fluxes are added as external forces to the neighboring elements ep with entering flux and higher rank. The MAST solution of the prediction problem along time step k can be viewed as a reduction of the computational domain carried out after the solution of each boundary element, through extraction of the latter. This reduction makes it possible to solve originally internal elements according to the information carried on by characteristic lines rooted at the boundary of internal elements that behave like domain boundaries. In the example of Figure , cell 1 is solved using the information at x0 carried on by the first characteristic line. After interpolation in time of the solution at the cell boundary in x1, the second cell is solved using information carried on by the second characteristic line rooted between tk and tk+1 at x1. In this way, the sought after solution is not subject to the Courant restriction on the maximum size of the time step.

Figure 3. Sketch of MAST algorithm in the 1D case.

Figure 3. Sketch of MAST algorithm in the 1D case.

In Aricò et al. (Citation2013b) it has been shown that the basic requirement for the application of the MAST algorithm is the existence of a continuous ‘anisotropic scalar potential’ P of the flow field, such that the velocity can be computed as (13) u=K0P(13) where K0 is a (3 × 3) positive-definite tensor. In the same paper it is shown that for any incompressible and viscous fluid this potential always exists and streamlines are always open but, due to numerical discretization, it is possible in the computed velocity field to get one or more loops where a single element without a flux entering from other elements of the same loop does not exist. In this condition the rank vector cannot be computed and, to apply MAST, we need to select a ‘cut’ face between two elements of the same loop with a small flux, which has to be assumed constant during the time step and equal to its initial value. See in Appendix 1 a very fast procedure, named Order, aimed to compute the Rk vector, where the flux entering from the ‘cut’ face of a loop is treated as a known boundary flux. After application of the Order algorithm, an open source subroutine ‘QUICKSORT’ in the package KB07Footnote1 can be used to order the elements according to their rank values.

After integration in space, the prediction step of Equation (7a) can be written, for any element e, as (14) Weutdw+Weuudw+WeΨkdwνWe2uk1/3dw=0e=1,,NT(14) with the symbols specified as above. The first and second terms on the l.h.s. of Equation (14) represent the local and convective inertial terms, respectively, the third term is the force over the element due to the gradient of the kinematic pressure and the last term accounts for the effect of the viscous forces.

From now on we will call j (in the local element reference, j = 1, … , 4) the face that tetrahedron e shares with its neighboring ep, and jp (in a local reference too) the face that ep shares with e. The symbols σje and nj represent the area and the unit outward orthogonal vector of the jth face of tetrahedron e, respectively.

Due to the cell sorting operation at the beginning of each time iteration, the solution of the momentum Equation (14) for element e with rank Rek, depends (a) on the incoming momentum fluxes from neighboring ep elements with Repk < Rek, (b) on the kinematic pressure gradients and viscous terms, which are known from the solution of the previous time step, and (c) on the initial solution inside the element at the beginning of the new time step. For these reasons, the PDEs system in Equation (14) can be regarded as a small (3 × 3) Ordinary Differential Equations (ODEs) system to be solved for the velocity components in element e.

Starting from a piecewise constant in space (P0) and divergence-free velocity distribution inside each tetrahedron e, uekXe and uekP0,e, we assume that at the generic time τ inside the time step (0 τ Δt) the velocity value is (15) ue(τ)=uek+Δuef(τ)(15) where Δuef(τ)P0,e and Δuef(0)=0. Applying the Green lemma to the second integral on the l.h.s. of Equation (14), the equilibrium ODEs system for tetrahedron e can be written as (16) d(Δuef)dτWe+j=1,4φjeMje,out(τ)+j=1,4(1φje)M¯je,in+SΨ,ek+VFek1/3=0e=1,,NT(16) where Mje,out(τ) is the leaving momentum flux from tetrahedron e to the neighboring tetrahedron ep, M¯je,in is the mean incoming momentum flux entering tetrahedron e crossing the jth face, computed along with the solution of the previous elements, φje = 1 for faces shared by elements with higher rank or boundary faces with positive flux, otherwise φje = 0. SΨ,ek and VFek1/3 are the sum of the kinematic pressure and the viscous forces over the four faces of tetrahedron e, respectively, computed in the previous time step as specified in sections 3.2 and 3.3. ODEs systems (16) are sequentially solved, one for each element, starting from the tetrahedrons with the smallest Rek value, and proceeding to the tetrahedrons with higher Rek values. We call this step MAST forward step (MAST-fs). To solve system (16) from time 0 to time Δt, we use an explicit Runge–Kutta (RK) code (Brankin et al., Citation1993). This code adopts an internal time sub-grid, selected within the interval [0 – Δt], on which the approximate ODEs solution is computed. The position of the nodes of the grid is automatically selected by the RK code according to a local error estimation (Brankin et al., Citation1993).

Due to the change of the velocity vector during the ODEs system solution we could obtain, in intermediate times between 0 and Δt, momentum fluxes moving from the element e into other ep elements with a lower rank (Repk < Rek). To avoid this, we approximate the leaving momentum flux in the MAST-fs forward step as (17) Mje,out(τ)=σjeue(τ)max[0,ue(τ)nj](17) with the symbols specified above. To restore the force integral neglected in the integration of Equation (14) due to the limit of the momentum flux assigned in Equation (17), after the end of the forward step we again perform the solution of the 3-ODEs system in sequential way, assuming as initial ue value the one computed at the end of the MAST-fs forward step. In MAST backward solution (MAST-bs) we start from the tetrahedrons with the highest Rek value and proceed to the tetrahedrons with smaller Rek values, saving only the inertial terms of Equation (14), that is: (18a) d(Δueb)dτWe+j=1,4(1φje)Mje,out(τ)+j=1,4φjeM¯je,in=0(18a) (18b) ue(τ)=uek+Δuef(Δt)+Δueb(τ)(18b) where Δueb(τ)P0,e, Δueb(0)=0, and the momentum fluxes Mje,out(τ) are computed as for the forward step.

During the solution of the ODEs system (16) or (18), we compute the solution at nG selected number of Gauss integration points chosen in the interval [0 – Δt].

At the end of the solution of each ODEs system (16) or (18), in the MAST-fs and MAST-bs problems we compute the momentum fluxes coming into element e from the neighboring ep tetrahedrons as (19) M¯je,in=M¯jpep,out(19) (20a) M¯jpep,out=l=1,nGφjpepσjpepmax[0,uep(τl)nj]m=1,4{φmepσmepmax[0,uep(τl)nm]}×Mjpep,out(τl)wl in MAST - fs(20a) (20b) M¯jpep,out=l=1,nG(1φjpep)σjpepmax[0,uep(τl)njp]m=1,4{(1φmep)σmepmax[0,uep(τl)nm]}×Mjpep,out(τl)wl in MAST - bs(20b) where τ l and wl are the time and the weight associated with the lth Gauss point, with 1 l nG.

The final velocity uk+1/3 is computed by Equation (18b) for τ =Δt. uk+1/3 is piecewise constant and divergence-free inside each tetrahedron (uek+1/3P0,e), but the fluxes crossing the same face of two neighboring elements will not be opposite for the two elements in the computed solution and this disrupts mass conservation at time level tk+1/3.

In the MAST-fs problem, we compute the boundary momentum fluxes of tetrahedron e, different from zero, as (21a) Mje,in(τ)=σjegj,e(τ)(gj,e(τ)nj)if (gj,e(τ)nj0 and jΓu)(21a) (21b) Mje,in(τ)=σjeue(τ)min(0,ue(τ)nj)if (ueknj0 and jΓσ)(21b) (21c) Mje,out(τ)=σjeue(τ)max(0,ue(τ)nj) if (ueknj0(21c) where gj,e is the velocity vector assigned on the face j ∈ Γu of element e.

In the MAST-bs problem, we compute the same fluxes as (22a) Mje,in(τ)=σjeue(τ)min(0,ue(τ)nj)if (ueknj>0 and jΓσ)(22a) (22b) Mje,out(τ)=σjeue(τ)max(0,ue(τ)nj) if ueknj<0(22b)

3.1.1. Parallel solution of the MAST prediction step

The solution of the MAST PS in 1D problems is inherently serial and cannot be achieved with parallel computing. On the opposite, in 2D and 3D problems it could be possible to carry out simultaneously the solution of all the elements with the same rank using several CPU physical processors, just saving the average entering momentum fluxes computed for each element in both the forward and backward steps. See in Figure (a) the scheme of the MAST-fs solution, for a computer with five processors, of a single time step of a model with 13 elements, ordered in two groups of rank 1 (7 elements) and rank 2 (6 elements). The white circles represent the initial value of the variables and the black circles the final one. TT is the computational time required to each processor for the solution of one element. Observe that, due to the need of solving all the elements with rank 1, before proceeding to the other ones, some processors have to solve 4 elements before moving to the next time step, where in any traditional marching in time method (MTM) only a maximum of 3 elements would be required, as shown in Figure (b). An additional time ϵ is also required, for each time step, for the rank computation and the element ordering in the MAST-fs (see Figure (a)). These last operations, as better specified in the tests run in section 5 with millions of elements, require a computational time two or three order of magnitude smaller than TT.

Figure 4. Solution of a single time step. (a) Scheme of the parallel solution of the MAST-fs (or MAST-bs). (b) Scheme of the parallel solution of traditional MArching in Time method (MTM).

Figure 4. Solution of a single time step. (a) Scheme of the parallel solution of the MAST-fs (or MAST-bs). (b) Scheme of the parallel solution of traditional MArching in Time method (MTM).

For meshes with at least a few hundred thousand elements, the maximum number Nr,k of elements within a single rank Rek at time step k is usually much larger than the available physical processors in standard computers. For test 3 (in section 5.3.3), we estimate Nr,k and predict the corresponding computational time of parallelization of the MAST algorithm for different number of processors.

3.2. The CS1 correction step

By integrating Equation (8b) in space, we obtain the following system, to be solved for the three unknown velocity components uek+2/3P0,e, from time level tk+1/3 to time level t k+2/3, (23) uek+2/3uek+1/3ΔtWeWeνΔ2uek+2/3dw=WeνΔ2uek1/3dwe=1,,NT(23)

Application of the Green lemma to the volume integral on the l.h.s. of Equation (23) leads to (24) νWeΔ2uek+2/3dw=νj=1,4σjeuek+2/3n=νj=1,4σjeuek+2/3uepk+2/3de,ep(24) where un is the derivative of the velocity vector along the direction orthogonal to face σje, de,ep is the distance of the circumcenters of the two neighboring tetrahedrons e and ep, computed as specified in Equation (25), and the other symbols have already been defined. We compute the distance as (25) de,ep=(|v1v2|)signwhere sign=1 if {((v2v1)v1<0.and|v1|>|v2|).or.((v1v2)v2<0.and|v2|>|v1|)sign=1 otherwise (25) where vectors v1 and v2 have been defined in Equation (10c). For each tetrahedron e we set (26) Δue=uek+2/3uek+1/3 with ΔueP0,e(26) such that the r.h.s. of Equation (24) becomes, (27) νσjeuek+2/3uepk+2/3de,ep=νσje(uek+1/3uepk+1/3de,ep+ΔueΔuepde,ep)(27)

On the r.h.s. of Equation (23) we assume (28) νWeΔ2uek1/3dwνWeΔ2uekdw=νj=1,4σjeuekuepkde,ep(28)

We set (29) Δu~e=uek+1/3uekwithΔu~eP0,e(29) and merging Equation (29) with Equation (27) we get (30) νσjeuek+1/3uepk+1/3de,ep=νσje(uekuepkde,ep+Δu~eΔu~epde,ep)(30)

According to Equations (26)-(30), system (30) can be rewritten as (31) ΔueΔtWe+νj=1,4σjeΔueΔuepde,ep=νj=1,4σjeΔu~eΔu~epde,ep(31) and solved for the components of the velocity correction Δue.

The matrix of system (31) is sparse, symmetric and positive definite, with diagonal and off-diagonal coefficients Me,eCS1 and Me,epCS1 equal to (32) Me,eCS1=WeΔt+νj=1,4σje1de,ep,Me,epCS1=νσje1de,ep(32) and the e-th element of the source term vector is the r.h.s. of Equation (31).

For the boundary face j Γu of tetrahedron e, the velocity at the circumcenter of the neighboring tetrahedron is replaced in the system of Equation (31) by the boundary velocity at the circumcenter of the boundary face and the distance de,ep with the distance dje between the tetrahedron and the face circumcenter, defined as (33) dje=vn(33) where v and n have been already defined in Equations (11) and (12). If the boundary face belongs to Γu and condition gj,e(tk+1/3)nj0 holds, the velocity correction Δuj at the boundary face, from tk+1/3 to tk+2/3, is (34a) Δuj=gj,e(tk+2/3)gj,e(tk+1/3)=gj,e(tk+Δt)gj,e(tk+Δt)(34a) equal to zero along with the corresponding off-diagonal matrix coefficient in Equation (32), and the contribution to the source term becomes (34b) Δu~j=gj,e(tk+1/3)gj,e(tk)=gj,e(tk+Δt)gj,e(tk)(34b)

If the boundary face belongs to Γu and condition gj,e(tk+1/3)nj>0 holds, the boundary velocity at time tk+1/3 is assumed to be the tetrahedral element velocity uek+1/3, that implies Δuj=gj,e(tk+Δt)uek+1/3 and (35) Δu~j=uek+1/3uek(35)

If the boundary face belongs to Γσ or to Γm, for simplicity’s sake we neglect the viscous tangent stress components, along with the terms in Equation (31) proportional to the viscosity.

If the EDP is satisfied (see section 2), due to Equations (11)-(12), the diagonal and off-diagonal coefficients in Equation (32) are positive and non-positive, respectively, such that the M-matrix property of the matrix is always guaranteed (Forsyth, Citation1991; Letniowski, Citation1992).

We adopt a preconditioned conjugate gradient method to solve system (31), using incomplete Cholesky factorizationFootnote2 and a compressed row storage (CRS) format, which makes it possible to save a lot of computational memory allocation (we store the main diagonal and the upper off-diagonal matrix coefficients). Since the matrix coefficients only depend on the geometrical variables, as well as the value of the kinematic viscosity, the matrix is only factorized once, before the time iterations loop starts, and this makes it possible to save a lot of computational effort.

After system (31) is solved, the velocity in each tetrahedron is updated according to Equation (26). At time level tk+2/3 for each tetrahedron e we compute the sum FVek+2/3 of the viscous forces over its four faces as (36) FVek+2/3=j=1,4νσjeuek+2/3uepk+2/3de,ep(36) and we neglect the change in it occurring along the next CS2 correction step.

At the end of the CS1 problem, the continuity of the fluxes crossing the same face of two neighboring elements e and ep is not yet restored, and, similarly to uek+1/3 at the end of the PS problem (section 3.1), uek+2/3P0,e, but mass conservation is not satisfied.

The spatial discretization of the derivative un proposed in Equation (24) is similar to the one presented by Younes et al. (Citation2004, Citation2006) for the 2D Mixed Hybrid Finite Element method lumped in the circumcenter of triangles. Unlike the formulation of the present work, Younes et al. (Citation2004) proved that their corresponding 3D discretization exists only for regular tetrahedrons, and cannot be extended to a general 3D tetrahedral discretization.

3.3. The CS2 correction step

Substituting Equations (5b) in Equations (8c), we get (37) uk+1uk+2/3Δt+(Ψk+1Ψk)=0(37) which has to be solved to restore the flux continuity disrupted in the prediction and in the first correction steps. To this end we set (38) uk+1uRT0k+2/3=ηΔt(38) where η is an unknown function with the same dimensions as ψ, and uRT0k+2/3Xe is the RT0 velocity at time level tk+2/3 computed by Equation (9) in each tetrahedron e as function of the weight mean flux Fl¯e,jk+2/3 through face j of e, given by (39) Fl¯j,ek+2/3=Flj,ek+2/3WepFljp,epk+2/3WeWe+Wepj,jp=1,,4(39) where Flj,ek+2/3 is the flux due to velocity uek+2/3 crossing face j of e, and Fljp,epk+2/3 is the flux due to velocity uepk+2/3 crossing the same face jp of ep. From Equation (39), flux Fl¯j,ek+2/3 is opposite for the two neighboring tetrahedrons e and ep and the weight of each flux is inversely proportional to the volume of the corresponding element. Because the common face can be thought of as the basis of the two tetrahedrons, this is equivalent to assuming an inverse proportionality of the flux with respect to the corresponding height of the tetrahedron. According to Equation (9), we get (40) ue,RT0k+2/3=j=1,4ωjeFl¯j,ek+2/3(40) with the symbols already specified. Since the sum of the four fluxes Flj,ek+2/3 j = 1, … , 4 is not zero, mass conservation is not satisfied by the uRT0k+2/3 velocity field in tetrahedron e, uRT0k+2/30 and uRT0k+2/3P1,e (piecewise linear inside the tetrahedrons), as explained in section 2.2. By subtracting uk from both members in Equation (38), we get (41) ηΔt+uRT0k+2/3uk=uk+1uk(41) Taking the divergence of Equation (41), we get (42) Δt2η+(uRT0k+2/3uk)=(uk+1uk)(42)

Integration of Equation (42) and application of the Green lemma leads to (43) Δtj=4σjeηeηepde,ep+j=4(Fl¯j,ek+2/3Flj,ek)=Fle(43) where the first term is the flux of the vector uk+1uRT0k+2/3, Fl¯j,ek+2/3 is defined in Equation (39), Flj,ek is the flux crossing face j of element e due to velocity ukXe and Fle is the total flux of the vector (uk+1-uk) crossing the surface of element e. For the gradient of η we adopt the spatial discretization already applied in section 3.2 for the velocity gradient (see Equation (24)).

To compute η, we impose the condition that both uk+1 and uk are divergence-free, with uk+1Xe and uk+1P0,e, which implies that in Equation (43) the flux Fle of the velocity (uk+1uk) crossing the total surface of the tetrahedron is zero (see in Figure a 1D sketch of velocity vectors inside tetrahedron e). The resulting system is (44) Δtj=4σjeηeηepde,ep+j=4(Fl¯j,ek+2/3Flj,ek)=0(44)

Figure 5. 1D sketch of velocity vectors inside tetrahedron e.

Figure 5. 1D sketch of velocity vectors inside tetrahedron e.

Equations (44) form a well-conditioned linear system to be solved for the η unknowns. The matrix of the system is sparse, symmetric and positive-definite. Diagonal and off-diagonal coefficients Me,eCS2 and Me,epCS2 are, respectively, (45a) Me,eCS2=j=1,4σjeΔtde,ep,Me,epCS2=σjepΔtde,ep(45a) and the same M-property of the coefficients of the matrix of the system of CS1 holds (see section 3.2). The e-th coefficient of the source term vector is (45b) SteCS2=j=4(Fl¯j,ek+2/3Flj,ek)(45b)

The solution of system (44)-(45) is performed in the same way as for system (31)-(32). In the present case too, matrix coefficients only depend on the geometrical variables and time step size, so that the system matrix is factorized only once before the time iteration loop starts, saving a lot of computational time.

We call (46) Flj,eη=Δtσjeηeηepde,ep(46) the flux crossing face j of e due to the gradient of η. This flux is continuous for the two neighboring elements e and ep, with Flj,eη=Fljp,epη. For the tetrahedron e we compute the final velocity at time level tk+1 from the fluxes on the l.h.s. of Equations (44), coupled with Equation (46), as (47) uek+1=j=1,4ωje(Fl¯j,ek+2/3+Flj,eηFlj,ek)+uek=j=1,4ωje(Fl¯j,ek+2/3+Flj,eη)(47)

Because the fluxes in the brackets of Equation (47) are continuous, mass conservation is satisfied along all the faces of each tetrahedron and inside each tetrahedron.

The pressure gradient at time level tk+1 is finally computed from Equation (37) as (48) Ψek+1=uek+2/3uek+1Δt+Ψek(48)

Observe, from Equation (38) and from the definition of ue,RT0k+2/3P1,e given in Equation (40), that the gradient η and η have respectively a linear and a quadratic variation inside each element, while the pressure gradient and the pressure correction in Equation (48) have respectively a linear and a constant variation. From Equations (48) and (38) we get (49) Ψek+1Ψek=uek+2/3uRT0k+2/3Δtη(49)

Equation (49) says that, inside the computational domain, the gradient of the function η is different from the gradient of the kinematic pressure correction Ψek+1Ψek. Integration of the divergence of both members of Equation (49) would allow us to compute the pressure at the element circumcenters. On the other hand, the kinematic pressure does not need to be known inside the domain in order to proceed to the solution of the next time steps. In section 5.2 we will show how to estimate the kinematic pressure at the computational nodes only at a given number of simulation times. If the velocity is known at the circumcenter of a boundary face belonging to Γu, we can set Fl¯j,ek+2/3 equal to the corresponding flux, include it in the r.h.s. of Equation (44) and set the corresponding flux Flj,eη equal to zero.

Observe that the tangent components of the velocity computed by Equation (47) are not the same as those used along the Γu boundary surface to compute the viscous boundary forces in the previous CS1 correction step. This implies a small difference between the computed boundary face and boundary element velocities.

At the circumcenter of the faces belonging to Γσ, where the hydrostatic stress boundary condition is set, the following Dirichlet type condition is finally assigned to the corresponding equations (50) ηc=Ψck+1Ψck(50) and the boundary flux Fl¯j,ek+2/3Flj,ek is computed ‘a posteriori’ from the corresponding equation of system (44). In all the other boundary faces the corrective flux Flj,eη is set equal to zero.

After computation of Ψek+1, the sum SΨ,ek+1 of the kinematic pressure forces over the four faces of tetrahedron e, can be computed applying the Green lemma as (51) SΨ,ek+1=ΩenΨek+1dΩ=WeΨek+1dW=Ψek+1We(51) Vector SΨ,ek+1 is assumed equal to SΨ,ek in the MAST PS in the next time iteration (see Equation (16) in section 3.1).

Unfortunately, even if in the 2D case it is always possible to get a mesh satisfying the EDP as defined in Equation (11), also for very irregular domain geometries (Aricò et al., Citation2011; Aricò & Tucciarelli, Citation2013; Forsyth, Citation1991, and cited references), in 3D space it is almost impossible to obtain a mesh satisfying the EDP and a 3D Delaunay mesh always has very irregular elements inside it (e.g. slivers, caps, skinny tetrahedrons, …), which could affect the stability of the numerical solution (Joe, Citation1986). This implies the need to extend the methodology presented in all section 3 to the more general case of non-Delaunay meshes.

3.4. The MAST-RT0 pseudo code in the case of Delaunay meshes

  1. Compute model constants, including CS1 and CS2 matrix coefficients by means of Equations (32) and (45a). Perform matrix factorization and set k=1

  2. Given uk and Ψk at time tk, apply the MAST prediction step (PS)

    • Compute velocity variation Δuef(Δt) and Δueb(Δt) by solving Equations (16) and (18).

    • Update uek+1/3=ue(Δt) for each tetrahedron e by means of Equation (18b)

  3. Apply the 1st corrective step (CS1)

    • Solve system (31) for the Δue unknowns

    • update velocity uek+2/3 for each tetrahedron e by means of Equation (26)

  4. Update the viscous forces FVek+2/3 for each tetrahedron e by means of Equation (36)

  5. Compute fluxes Fl¯e,jk+2/3(uk+2/3) for each face j (j = 1, … , 4) of tetrahedron e by means of Equation (39)

  6. Apply the 2nd corrective step (CS2)

    • Compute ηe for each tetrahedron e by solving system (44)

    • Compute fluxes Flj,eη for each tetrahedron e by means of Equation (46)

    • Update the final velocity uek+1 for each tetrahedron e according to Equation (47)

  7. Update the gradient of the kinematic pressure Ψk+1 according to Equation (48)

  8. Compute the kinematic pressure forces SΨ,ek+1 for each tetrahedron e by means of Equation (51)

  9. Update k with k+1 and go back to point two for the next time step

4. The numerical procedure for non-Delaunay meshes

4.1. Tetrahedron clusters

The discretization of the second-order derivative terms in CS1 and CS2 problems (i.e. Equations (31) and (44)) along a face of the computational mesh shared by two tetrahedrons with negative distance, as defined in Equations (25) and (33), can lead to positive off-diagonal matrix coefficients Me,epCS1 and Me,epCS2 and a negative contribution to the corresponding diagonal coefficients Me,eCS1 and Me,eCS2 (see Equations (31)-(32) and (44)-(45)), so that the diagonal dominance and the M-property of the matrix system could be lost (Forsyth, Citation1991; Joe, Citation1986; Letniowski, Citation1992). Moreover, unphysical numerical solutions could arise, corresponding to poorly oriented viscous forces and pressure gradients. To avoid this problem, we propose the procedure described in the present section to handle the PS, CS1 and CS2 steps in the case of non-Delaunay meshes.

Let us consider irregular a face shared by two tetrahedrons with a negative distance de,ep between the corresponding circumcenters, as defined in Equation (25). If one or more irregular faces are present in the mesh, the EDP condition is no longer satisfied (see section 2.2). We group all the tetrahedrons in clusters. A cluster is to be seen as a small non-empty group of neighboring tetrahedrons not sharing any irregular face with other clusters. For example, the two tetrahedrons in Figure (b) form a cluster. Each tetrahedron belongs to a single cluster. A single tetrahedron e forms a cluster by itself if it has no irregular faces. In the cluster, we distinguish the external faces, shared by two tetrahedrons of different clusters, and the other internal ones. According to the previous assumptions, all external faces are regular, and in a cluster composed of a single tetrahedron we do not have internal faces.

Let NC be the number of clusters, with NC ≤ NT. The general strategy is to write, instead of the dynamic equilibrium of each single tetrahedron, the dynamic equilibrium of each cluster as a function of a single velocity variation and finally to correct all the tetrahedron velocities inside the clusters, after the CS2 correction step, in order to guarantee flux continuity through all the faces.

From now on, indices m and mp refer to two neighboring clusters, NT,m is the number of tetrahedrons e belonging to the m-th cluster, Nf,mext and Nf,mint are the number of external and internal faces of the cluster m, l and r are the local counters of the external and internal faces of the cluster, respectively (l = 1, … , Nf,mext, r = 1, … , Nf,mint), and Wm is the volume of the cluster, Wm=e=1,NT,mWe (see also Figure ).

Figure 6. (a) 2D sketch of case Nf,mint = NT,m – 1. (b) 2D sketch case of Nf,mint = NT,m. Blue solid lines are traces of the external faces of the cluster, and red dashed lines are traces of the internal faces of the cluster.

Figure 6. (a) 2D sketch of case Nf,mint = NT,m – 1. (b) 2D sketch case of Nf,mint = NT,m. Blue solid lines are traces of the external faces of the cluster, and red dashed lines are traces of the internal faces of the cluster.

4.2. The PS and CS1 problems for non-Delaunay meshes

Solution of the MAST prediction step, as explained in section 3.1, is not affected by the existence of irregular faces. On the other hand, after solution of all tetrahedrons, we need to evaluate, for each cluster, a single velocity variation corresponding to the cluster dynamic equilibrium. Call upek+1/3P0,e the predicted tetrahedron velocity computed at time tk+1/3 as described in section 3.1 for uek+1/3 in the case of Delaunay meshes, and Δu~m the unknown velocity variation between time levels tk+1/3 and tk to be assigned to cluster m.

For each tetrahedron e of cluster m, we can write the correction velocity upek+1/3uekP0,e by summing the time integrals of the ODEs system (16) and (18), to get (52) (upek+1/3uek)We=(j=1,4M¯je,out+j=1,4M¯je,in+SΨ,ek+FVek1/3)×Δt(52) where the symbols have been defined in section 3.1. M¯je,out has been computed in the forward step if Rek<Repk and in the backward step otherwise, M¯je,in has been computed in the forward step if Rek>Repk and in the backward step otherwise. Observe that, in all internal faces of the cluster, the following condition holds (53) M¯je,out=M¯jpep,in(53)

This implies that, summing the Equation (52) of all the tetrahedrons of the same cluster, we get (54) e(upek+1/3uek)We=e(j=1,4M¯je,out+j=1,4M¯je,in+SΨ,ek+FVek1/3)Δte=1,,NT,m(54) where all the momentum fluxes belonging to internal faces as well as the corresponding viscous forces, sum zero and can be deleted. Equation (54) can be seen as the equilibrium equation of the cluster m, which can be approximated by setting (55) Δu~m=e(upek+1/3uek)WeWmwithΔu~mP0,m(55) and by replacing in each tetrahedron the previously computed upek+1/3 velocity with (56) uek+1/3=uek+Δu~me=1,,NT,m(56)

In the CS1 step, solution of Equation (31) in not EDP meshes is hindered by the negative distances de,ep holding between the two circumcenters of tetrahedrons sharing an irregular face. To circumvent the problem, we write the viscous forces equilibrium of the cluster, where the external forces act always on regular external faces. Following the same procedure applied in section 3.2, integrating in space and time Equation (7b) and applying the Green lemma, we get the following system, (57) e=1,NT,muek+2/3uek+1/3ΔtWe+νl=1,Nf,mextσlmuek+2/3uepk+2/3de,ep=νl=1,Nf,mextσlmuek+1/3uepk+1/3de,epm=1,,NC(57) where ep is the tetrahedron of cluster r, sharing with tetrahedron e the l-th external face of cluster m, with area σlm, distance de,ep is defined in Equation (25), and the other symbols have been previously specified.

We set, for the tetrahedrons of cluster m (58) uek+2/3=uek+1/3+Δume=1,,NT,mwithΔumP0,m(58) and, substituting Equations (56) and (58) in Equation (57), we obtain 3 systems, one for each of the unknown components of Δum, (59) ΔumΔtWm+νl=1,Nf,mextσlmΔumΔurde,ep=νl=1,Nf,mextσlmΔu~mΔu~rde,epm=1,,NC(59)

Matrix of system (59) has the same properties of matrix of system (31), is symmetric and positive definite, its diagonal and off-diagonal coefficients Me,eCS1 and Me,epCS1 are, respectively, (60) Mm,mCS1=WmΔt+νl=1,Nf,mextσlm1de,ep,Mm,rCS1=νσlm1de,ep(60) and the m-th coefficient of the source term vector is the r.h.s. of Equation (57). Diagonal and off-diagonal matrix coefficients in Equation (60) are positive and non-positive, respectively, since all the distances de,ep are positive.

We deal with the BCs of the CS1 problem as described in section 3.2. Observe that the previously described procedure fails to guarantee in the clusters a positive distance from a boundary element circumcenter and the circumcenter of its boundary face. On the other hand, to guarantee a positive distance from the boundary we can simply avoid any internal node with a distance from the circumcenter of each triangular boundary face smaller than the radius of the circle passing through the three nodes of the same boundary triangle. This can be easily done with a methodology that will be shown in section 5.1.

After solution of system (59), the velocity in the tetrahedrons e of each cluster m is updated according to Equation (58). At time level tk+2/3 we need to compute for each tetrahedron e the sum FVek+2/3 of the viscous forces over its four faces, including the irregular ones, needed for the next computational time step. To do that we assume, coherently with the approximation of Equation (58), the local inertia per unit volume in each tetrahedron, to get (61) FVek+2/3=(uek+1/3uek)ΔtWeSΨ,ekj=1,4M¯je,out+j=1,4M¯je,in(61) where the symbols of the momentum fluxes are the same used for the r.h.s of Equation (52).

As it happens for the CS1 problem in the EDP meshes (section 3.2), at time level tk+2/3, the continuity of the fluxes crossing the same face of two neighboring elements e and ep belonging to two different clusters, is not yet restored, uek+2/3P0,e, and mass conservation is not satisfied.

4.3. The CS2 problem for non-Delaunay meshes

In non-Delaunay meshes the solution of the CS2 problem is split into two sub-steps. In the first sub-step we apply the procedure described in section 3.3, including in the mass balance the fluxes crossing the external faces of the cluster instead of the fluxes of the four faces of the single tetrahedron (as in Equations (43) and (44)), and assuming a single η value for the circumcenters of all the tetrahedrons inside the cluster. System (44) becomes (62) Δtl=1,Nf,mextσleηmηmpde,ep+l=1,Nf,mext(Fl¯l,ek+2/3Fll,ek)=0m=1,,NC(62) with the symbols already specified and distance de,ep is always positive according to section 4.1. We call Fl¯l,ek+2/3 the flux crossing face l as defined in Equation (39) over face j (j = 1, … , 4) of e, and Fll,ek the flux due to velocity uekXe crossing the same face.

The diagonal and off-diagonal matrix coefficients for system (62a) Mm,mCS2 and Mm,mpCS2 are, respectively, (63a) Mm,mCS2=l=1,Nf,mextσleΔtde,ep,Mm,mpCS2=σlmΔtde,ep(63a) and the same M-property of the coefficients of the matrix of the system of CS1 holds (see section 3.2). The m-th coefficient of the source term vector is (63b) StmCS2=l=1,Nf,mext(Fl¯l,ek+2/3Fll,ek)(63b)

The solution of the system formed by Equations (62)–(63) guarantees flux continuity on the external faces and global mass conservation inside the cluster, but because the velocities uek+1 inside each cluster do not generally belong to a single RT0 space, it does not guarantee zero divergence condition inside the cluster, unless the cluster includes only one tetrahedron.

For clusters composed of more than one tetrahedron, we apply the following procedure. We change, in Equation (47), the fluxes (Fl¯j,ek+2/3+Flj,eη) crossing the internal faces of the cluster, with the unknown flux Flj,eint, to get (64) uek+1=j=1,4ωje[φ~jFlj,eint+(1φ~j)(Flj,eη+Fl¯j,ek+2/3)](64) where Flj,eint is the flux crossing face j of element e, which is internal to the cluster, and φ~j is 1 if face j is internal to the cluster, and 0 if it is external. We also assume (65) Flj,eint=Fljp,epint(65) and we look for the optimal set of internal fluxes that: (1) guarantees the condition uek+1 P0,e for all the elements of the cluster and, as a consequence of constraint (65), mass conservation too, (2) minimizes the kinetic energy inside the cluster.

To get the required Flj,eint fluxes, in the second sub-step of the CS2 problem, for each cluster we solve the following linearly constrained quadratic minimization problem, (66) Minimize =12e=1,NT,m{[j=1,4ωje(φ~jFlj,eint+(1φ~j)(Flj,eη+Fl¯j,ek+2/3))j=1,4]2We}(66) subject to Equation (65) and (67) j=1,4(φ~jFlj,eint+(1φ~j)(Flj,eη+Fl¯j,ek+2/3))=0e=1,,NT,m1(67) where functional is the total kinetic energy of the cluster, given by the sum of the kinetic energy of the single tetrahedrons inside it, and the term in the square brackets in Equation (66) is the velocity uek+1, according to Equation (64). The last mass conservation Equation (67) in tetrahedron NT,m has been skipped because solving Equation (62) guarantees global mass conservation inside the cluster and this implies that any Equation (67) can be written as a linear combination of all the other ones. A very efficient way to solve problem (66)-(67) is to compute the Lagrangian multipliers, along with the required unknowns, as the solution of the following unconstrained quadratic minimization problem, (68a) MinimizeFI,λ=12e=1,NT,m{j=1,4[ωje(φ~js=1,Nf,mintδe,jsFIs+(1φ~j)(Flj,eη+Fl¯j,ek+2/3)s=1,Nf,mint)]2We}+e=1,NT,m1{λej=1,4(φ~js=1,Nf,mintδe,jsFIs+(1φ~j)(Flj,eη+Fl¯j,ek+2/3)s=1,Nf,mint)}(68a) where FIs is the sth internal flux, occurring between tetrahedrons e and ep of the same cluster m, assumed positive if going from the element with the smallest index to the element with the highest index, and (68b) δe,js=1 if face s belongs to element e and e<ep(68b) (68c) δe,js=1 if faces belongs to element e and e>ep(68c) (68d) δe,js=0 if face s does not belong to element e(68d) and Flj,eint=δj,esFIs, if s the index of the face between elements e and ep. The unknowns λe are the so-called Lagrangian multipliers. Setting at zero the derivatives of functional with respect to FIs and λe, we get the linear system (69a) e=1,NT,m{j=1,4[q=1,3ωj,qe(φ~js=1,.Nf,mintδe,jsFIs+(1φ~j)(Flj,eη+Fl¯j,ek+2/3)s=1,.Nf,mint)ωj,qe]φ~jδe,jrWe}+e=1,NT,m1{λej=1,4(φ~jδe,jr)}=0 with r=1,,Nf,mint(69a) (69b) j=1,4(φ~js=1,,NS,mδe,jsFIs+(1φ~j)(Flj,eη+Fl¯j,ek+2/3))=0for e=1,,NT,m1(69b)

Observe that Equation (69b) represent the mass conservation equations for all the tetrahedrons of cluster m. Moreover, if Nf,mint= NT,m-1 (see the 2D sketch in Figure (a)), system (69b) can be solved independently of Equation (69a) and there is only one set of internal fluxes that satisfy the mass conservation equations. If Nf,mintNT,m (see the 2D sketch in Figure (b)), we also have to compute Equation (69a) along with Equation (69b), because we need to select, among all the sets of internal fluxes that satisfy mass conservation, the one that minimizes the kinetic energy within the cluster.

Equation system (69) has no special structure and has to be solved with direct solvers, but it is small and only includes a few tetrahedrons of the mesh, so its solution requires a negligible computational burden.

After Equations (64)-(69) are solved, in non-Delaunay meshes we cannot compute the gradient Ψek+1 of each tetrahedron by applying Equation (48) with the computed uek+1 velocity. The reason is that, after the first sub-step of the CS2 problem, the computed solution satisfies the momentum and continuity equations of the clusters (not of the single tetrahedrons) and only the fluxes crossing the external faces of the clusters are computed under these constraints. The fluxes crossing the internal faces of each cluster are computed, during the second sub-step of the CS2 problem, by satisfying the continuity equations only. For these reasons, we compute Ψek+1 by applying the following general procedure.

We look for a new velocity umc common to all the tetrahedrons of the cluster. Let be a scalar functional, defined as (70a) =l=1,Nf,mext(umcnlflu,l)2(70a) (70b) flu,l=(Fl¯l,ek+2/3+Fll,eη)σlm(70b) where flu,l is the flux per unitary area crossing the l-th external face of the cluster, nl is the unit outward vector normal to face l, and all the other symbols have already been defined. In the case of a cluster made up of a single tetrahedron, according to Equation (47) flu,l is the flux per unit area of the velocity uek+1 crossing the external l face.

The relationship of umc with the pressure gradient Ψek+1 is given by Equation (48), where uek+1 is replaced by umc in the case of NT,m > 1. Observe that the second step of the CS2 problem does not change the fluxes crossing the external faces and that the pressure gradient Ψek+1 is not affected by the corrected fluxes of the internal faces. To compute umc we minimize , which is equivalent to setting at zero the partial derivatives of the functional with respect to the three components of umc. The size of the resulting system is (3 × 3), it does not have a special structure, and, as for system (69), is solved using direct solvers.

Once Ψek+1 is computed, we obtain the kinematic pressure force on each element of the cluster by setting in all the elements (71) SΨ,ek+1=Ψek+1We(71) Vector SΨ,ek+1 is assumed equal to SΨ,ek in the MAST PS of the next time iteration (see Equation (16) in section 3.1).

4.4. The MAST-RT0 pseudo code in the case of non-Delaunay meshes

  1. Compute cluster geometry and model constants, including CS1 and CS2 matrix coefficients by means of Equations (60) and (63a). Perform matrix factorization and set k = 1

  2. Given uk and Ψk at time tk, apply the MAST prediction step (PS)

    1. Compute velocity variation Δuef(Δt) and Δueb(Δt) by solving Equations (16) and (18) for each tetrahedron e

    2. update upek+1/3=ue(Δt) for each tetrahedron e by means of Equation (18b)

    3. compute one velocity variation Δu~m for each cluster m according to Equation (55)

    4. update velocity uek+1/3 for each tetrahedron e of each cluster m by means of Equation (56)

  3. Apply the 1st corrective step (CS1)

    1. Solve system (59) for the Δum unknown for each cluster m

    2. update velocity uek+2/3 for each tetrahedron e of each cluster m by means of Equation (58)

  4. Update the viscous forces FVek+2/3 for each tetrahedron e by means of Equation (61)

  5. Compute fluxes Fl¯l,ek+2/3(uk+2/3) for each external face l of tetrahedron e of each cluster m, by means of Equation (39)

  6. Apply the 2nd corrective step (CS2)

    1. Compute ηm for each cluster m by solving system (62)

    2. Compute the fluxes Flj,eint of the internal faces j of tetrahedrons e of the cluster m according to Equations (66)-(69)

    3. Compute fluxes Flj,eη of the cluster external faces by means of Equation (46)

    4. Update the final velocity uek+1 for each tetrahedron e according to Equation (64)

    5. Compute umc by solving the minimization problem in Equations (70)

  7. Update the gradient of the kinematic pressure Ψk+1 according to Equation (48), where uek+1 is replaced by umc

  8. Compute the kinematic pressure forces SΨ,ek+1 for each tetrahedron e by means of Equations (70)–(71)

  9. Set k = k+1 and go to point 2 for the solution of the next time step.

5. Model applications

5.1. Construction of the computational mesh and preliminary model operations

As mentioned in section 3.3, in the 3D space it is almost impossible to obtain a mesh satisfying the EDP given in Equations (10)–(11) (Forsyth, Citation1991; Joe, Citation1986 and cited references). Some algorithms exist (e.g. Qhull in MatlabFootnote3), which generate a 3D tetrahedral mesh forming a convex hull with arbitrary node location. Unfortunately, the generated mesh has very irregular elements inside it (e.g. slivers, caps, skinny tetrahedrons, …), along with several distances de,ep (see Equation (25)) strictly equal or close to zero. These irregularities always lead to instability and poor accuracy of the numerical solution (Letniowski, Citation1992).

The computational mesh used by the present solver is created by an off-line procedure, using two open source mesh generators, Netgen (Schöberl, Citation1997) and Tetgen (Hang, Citation2015).

We discretize first the computational domain with tetrahedrons using the mesh generator Netgen. The tetrahedral elements of the output mesh are quite regular in shape and size, even if they do not satisfy the EDP. Netgen also allows us to change the size of the tetrahedrons to discretize the internal subdomains properly, with quite smooth transitions of element size. Netgen also allows ‘user-friendly’ handling of the boundary domain.

Let NNET be the number of nodes of the Netgen mesh. For each boundary triangle bt we compute the corresponding circumsphere Σbt whose diametral plane contains bt, and we check if any internal node (i.e. not belonging to boundary faces) is internal to Σbt. At the end of this operation, we remove all the internal nodes inside the circumsphere of the boundary triangles (usually 0.02-0.03% of NNET). We call the number of the removed nodes Nr. The ensemble of the (NNETNr) nodes is the input for the mesh generator TETGEN, which regenerates the mesh domain starting from the nodal positions of the (NNETNr) nodes, still preserving the input domain boundaries. In order to optimize some aspect-ratio and shape tetrahedrons conditions, Tetgen can also insert additional Steiner nodes (Hang, Citation2015). The present numerical solver uses as its input the Tetgen output mesh.

Preliminary model operations concern (1) generation of the topology of the tetrahedrons and clusters of tetrahedrons, saved in separate arrays, (2) calculation of the matrix coefficients of systems (59)–(60) and (62), and (3) their factorization before the time iterations loop start.

5.2. Computation of the kinematic pressure ψ and of the body forces

Let N be the number of nodes of the computational mesh. We approximate the function ψ according to a Galerkin Finite Element approach, (72) Ψ=i=1,NwiΨ~i(72) where Ψ~i is the unknown nodal pressure value and wi is the Galerkin shape function in node i. Once Ψek+1 has been obtained inside each tetrahedron e as specified in sections 3.3 and 4.3, we minimize the following scalar functional, (73a) Ψ=e=1,NT(Ψ~ek+1Ψek+1)2(73a) rewritten as (73b) Ψ=e=1,NT[q=1,3{i=1,N(wixqΨ~i(xqΨk+1)e)2}]q={123xq={xyz(73b) where (xqΨk+1)e is the q-th components of Ψk+1 in tetrahedron e, previously computed. Ψ is a convex function and its minimum is obtained by setting at zero the partial derivatives of Equation (73b) with respect to the nodal Ψ~ik+1 values, (74a) ΨΨik+1=0i=1,,N(74a)

Equation (74a) can be written, according to Equation (72), as (74b) ΨΨ~ik+1=e=1,NT[q=1,3{l=1,N(wixqΨ~l(xqΨk+1)ewixq)wixql=1,N}]=0(74b)

Equation (74b) represents a linear system solved for the Ψ~l unknowns, with an (N x N) system matrix that is sparse, symmetric and positive-definite. Over the boundary faces of Γσ, we assign Dirichlet BCs for Ψ~l according to the prescribed boundary values. The diagonal and off-diagonal coefficients are (75a) Mi,iΨ=q=1,3wixqwixq,Mi,lΨ=q=1,3wixqwlxq(75a) and the i-th coefficient of the source term vector is (75b) SiΨ=(xqΨk+1)ewixq(75b)

We solve system (74)–(75) as the previous ones in sections 3.2, 3.3, or 4.2 and 4.3.

The forces over the bodies are easily computed as, (76a) F=FΨ+Fν with FΨ=b=1,Nbf(σb(fb=1,2,3Ψ~fb)nb3)(76a) and (76b) Fν=b=1,Nbf(ν(uek+1gb)djeσb) if bΓu(76b) (76c) Fν=0 if bΓσΓm(76c) where Nbf is the number of triangular faces of the body, σb and nb are the area of the b-th body face and the unit normal vector coming into the body, respectively, Ψ~fb is the value of the kinematic pressure at fb-th node of the body face, gb is the velocity vector assigned to the face, uek+1 is the velocity vector in the tetrahedron e with the b-th body face, and dje is the distance between the circumcenter of tetrahedron e and the circumcenter of the body face, computed as in Equation (33).

5.3. Test cases

The proposed numerical solver was applied to four different numerical tests. In the first test we analyzed the spatial and temporal accuracy of the model, as well as the required computational costs. In tests 2 and 3 we present two well-known literature applications, the lid driven cavity and the flow past a fixed sphere, according to different values of the Reynolds number. In the last test the real case of hemodynamic blood flow inside an abdominal aorta affected by an aneurysm is solved by the model inside a very irregular boundary.

For post-processing of the model outputs, we used the open source program Paraview.Footnote4

5.3.1. Taylor-Green vortices test

We first analyzed the accuracy of the proposed solver by comparison of the computed results with the known analytical solution of the Taylor–Green vortex test (Taylor & Green, Citation1937). The velocity vector components and the pressure field are (Taylor & Green, Citation1937), (77) u=cos(αx)sin(αy)eβtv=sin(αx)cos(αy)eβtw=0p=cos(2αx)+cos(2αy)4e2βta=π2β=2νa2ν=0.01m2s(77) such that the r.h.s. of Equation (1) is always zero. The initial conditions are found by setting t = 0 in Equation (77). We solve this problem up to 2 s in the domain [−3, 3]2 x [0, 0.25], by setting time-dependent essential BCs for the velocity, according to Equation (77), in the circumcenters of the boundary triangular faces of the lateral walls of the domain (Γu in section 2.1), as well as the hydrostatic stress in the nodes of the upper and lower wall (Γσ in section 2.1). The initial velocity vectors are set in the circumcenters of each tetrahedron, and assumed piecewise constant, while the initial kinematic pressures are set in the nodes. The initial values of the forces SΨ,e0 and VFe0 of the momentum equations of the MAST PS of the first time iteration are analytically found as (78a) SΨ,e0=Ψe0We(78a) (78b) VFe0=νΔ2ue0We(78b) where Ψe0 is computed from Equation (77) in the center of mass of tetrahedron e.

We discretize the domain using 5 meshes with mean element characteristic size hl (l = 1, … , 5) ranging from 0.00996 m to 0.066 m. hl is the mean value of the length of the sides of the tetrahedrons.

The discretized ICs uek=ue,0 do not satisfy the momentum and continuity equations of each element at time t = 0, as ue,0 P0,e, but the flux continuity through each face is missing. In order to circumvent this problem, before the beginning of the transient flow simulation with the BCs given by Equation (77), we compute the steady-state asymptotic solution corresponding to the BCs at t = 0 and we use it as ICs of the transient problem. We assume that the steady-state solution is attained when the L2 norm of the relative scatters of the computed u, v, w, and ψ, compared with the ones of the previous iteration, are small and the following tolerance holds (79) e=1,NT(u(v,w)ek+1u(v,w)eku(v,w)ek)2<1e04.andi=1,N(Ψik+1ΨikΨik)2<1e04(79) See in Tables and the L2 and L norms of the relative errors of the steady-state x and y velocity components and kinematic pressure with respect to the values computed at the face circumcenters according to Equation (77), for meshes with different size. The error of the z velocity component is negligible compared to the x and y errors.

Table 1. Test 1. L2 norms of relative errors and spatial rate of convergence.

Table 2. Test 1. L∞ norms of relative errors and spatial rate of convergence.

We also assume that the relative error errl, computed for the mesh with mean element size hl, is proportional to a power of hl, (80a) errl=(hl)rc,s(80a) where rc,s is the spatial rate of convergence, obtained by comparing the relative errors of two sequential sizes hl and hl+1 as (80b) rc,s=log(L2errlL2errl+1)log(hlhl+1)(80b)

The rate of convergence is shown in Tables and . The computed rc,s of the velocity components are slightly greater than 1 (ranging from 1.14–1.22) due to the piecewise approximation of u and v inside each tetrahedron. The convergence rate obtained for the kinematic pressure is greater (ranging from 1.45–1.59) and the reason could be the nodal pressure distribution inside each tetrahedron, as described in section 5.2. In Figure we plot the iso-contour lines, over a horizontal plane with z = 0.125 m, of the relative errors of the norm of u and of ψ obtained for the coarsest mesh. Close to the lateral boundary walls, the errors of the velocity decrease, due to the imposed BCs. On the opposite, the highest values of the relative errors of the kinematic pressure are close to the boundary lateral walls, since the Dirichlet BCs of ψ have been imposed over the upper and lower horizontal walls.

Figure 7. Test 1. Iso-contours of the relative error of the norm of (a) u, (b) ψ.

Figure 7. Test 1. Iso-contours of the relative error of the norm of (a) u, (b) ψ.

In Figure (a) we plot the norms of the relative errors, along with the 1st order convergence line.

Figure 8. Test 1. Investigation of the spatial and temporal accuracy. (a) Norms of relative errors vs. mean mesh size, (b) norms of relative errors vs. time step size.

Figure 8. Test 1. Investigation of the spatial and temporal accuracy. (a) Norms of relative errors vs. mean mesh size, (b) norms of relative errors vs. time step size.

With this test we also analyzed the time convergence rate of the algorithm. In order to cancel out the error due to the spatial discretization, we assumed as the reference solution the one obtained over the finest mesh (with mean size hl = 0.00996 m), and a time step size Δt = 0.001 s. At simulation time 2 s, we compared with the reference solution the numerical solutions obtained over the same mesh, assuming five different Δt values, ranging from 0.0015 s to 0.02 s. The L2 and L norms of the relative errors are shown in Tables and , along with the time rate of convergence, rc,t, computed by comparing the relative errors of two sequential time step sizes Δtl and Δtl+1 (80c) rc,t=log(L2errlL2errl+1)log(ΔtlΔtl+1)(80c)

Table 3. Test 1. L2 norms of relative errors and time rate of convergence.

Table 4. Test 1. L∞ norms of relative errors and time rate of convergence.

The rate of convergence rc,t is always greater than 1, ranging from 1.21–1.28, even if the model is 1st order accurate in time. This could be due to a twofold reason related to the MAST-PS, (1) the use of the internal time sub-grid during the ODEs solution, and (2) a polynomial time approximation order of the leaving momentum fluxes, using nG = 3 Gauss points. In Figure (b) we plot the norms of the relative errors along with the 1st-order convergence line.

For the finest mesh, for each time step adopted for the time convergence rate analysis, we computed the maximum CFL number, as (81) CFLmax=max(ΔtWe3||ue||)e=1,,NT(81) CFLmax ranges from 2.56 (for Δt = 0.02) to 0.0128 (for the reference solution).

We also investigated the computational (CPU) times required by the different model steps, using a single Intel Core i7 at 3.49 GHz. Because computational times strongly depends on the adopted computer and on the specific algorithm coding, we focused on the correlation existing between the the computational time of the single step and the number of elements. We set the average CPU time per iteration and per model step equal to (82a) CPU¯step=exp(c)NTβ(82a) and we assumed that a single step is efficiently solved as much as the β power exponent in the correlation 82(a) is small and close to 1. Equation (82a) in logarithmic space becomes (82b) ln(CPU¯step)=c+βln(NT)(82b)

See in Figure the CPU¯step time required for the solution of the single model steps, i.e. cell sorting (CPU¯S), solution of the MAST-PS step (CPU¯MPS), solution of the CS1 and CS2 steps (CPU¯CS1 and CPU¯CS2, respectively), as well as the kinematic pressure computation (CPU¯Ψ). MAST-PS is the most demanding one, but in this case the CPU is simply proportional to NT, and in Equation (82a) power β is equal to one. The CPU required by the other model steps grows in the logarithm space more than linearly with the number of tetrahedrons due to their ‘non-explicit nature’, since solution of large linear systems is involved, but β is smaller than 1.20 for the CS2 step, and smaller than 1.12 in all the other ones. The sorting cell operation is the least demanding algorithm step and its CPU time is 2–3 magnitude orders smaller than the MAST-PS one.

Figure 9. Test 1. Computational time of model steps.

Figure 9. Test 1. Computational time of model steps.

5.3.2. Lid driven cavity flow at different Reynolds numbers

In this test the flow is confined inside a square cavity, and it is driven by the upper wall displacement in horizontal direction. The set-up of the test is shown in Figure (a). Cavity is [0, L] x [0, 0.2] x [0, L] with L = 1 m. At the east, west and bottom walls we set zero velocity with no slip BCs (Γu), while at the front and rear walls we set free slip BCs (Γm), and we impose a constant in time horizontal velocity equal to 1 m/s over the top wall with no slip BCs (Γu). We set zero kinematic pressure in the lowest left corner with coordinates (0, 0, 0), as shown in Figure (a). ICs are zero velocity and pressure inside domain.

Figure 10. Test 2. (a) set-up of the numerical test and BCs. (b) section of the mesh with a cutting plane.

Figure 10. Test 2. (a) set-up of the numerical test and BCs. (b) section of the mesh with a cutting plane.

Let the Reynolds number be (83) Re=vmaxLν(83) where vmax = 1 m/s. We run simulations for Re = 100, 400 and 1000.

The imposed horizontal velocity of the upper lid drives the fluid inside the cavity into a vortical flow. The resulting complex flow structure shows a large central vortex and small recirculating zones close to the cavity corners, whose shape depends on the value of Re.

We discretize the domain with 87,740 tetrahedrons and 17,388 nodes (see in Figure (b) an intersection of the mesh with a cutting plane), resulting in 84,385 clusters. The time step size Δt is set to 0.025 s, and the maximum computed CFL values are 2.18, 2.08 and 1.96 for Re = 100, 400 ad 1000, respectively.

In Figure we plot the streamlines, as well as the vorticity and the iso-pressure fields. The minimum and maximum pressures are computed respectively at the upper-left and upper-right corners, where a discontinuity in the boundary condition occurs. Observe that, because of this singularity, no ad-hoc handling is required in the model, unlike what is found in Botella and Peyret (Citation1998), Boppana and Gajjar (Citation2010), Kuhlmann and Romanò (Citation2019), and cited references. The minimum pressure values are associated with the foci of the vortices, due to the high centrifugal acceleration occurring around them. We obtain good agreement with the literature results (e.g. Dalal et al., Citation2008 and cited references), and the results provided by the present solver match very well the ones of the 2D benchmark solutions given by Ghia et al. (Citation1982) and shown in Figure . The results in Figures and refer to the cutting plane (x-z) with y = 0.1 m (i.e. the diametral plane of the domain). The results obtained for other cutting planes (x-z) are very similar to the previous ones, and for brevity are not shown here.

Figure 11. Test 2. Velocity streamlines (left panels), vorticity (ωz) (central panels), iso (kinematic) pressure (right panels). Top Re = 100, middle Re = 400, bottom Re = 1000.

Figure 11. Test 2. Velocity streamlines (left panels), vorticity (ωz) (central panels), iso (kinematic) pressure (right panels). Top Re = 100, middle Re = 400, bottom Re = 1000.

Figure 12. Test 2. x velocity component (top panel), z velocity component (bottom panel) (Nomenclature ‘Ref.’ are the results by Ghia et al. Citation1982).

Figure 12. Test 2. x velocity component (top panel), z velocity component (bottom panel) (Nomenclature ‘Ref.’ are the results by Ghia et al. Citation1982).

Observe that the value of the vertical velocity component computed for Re = 400 and x = 0.9063 provided by Ghia et al. (Citation1982) (wGhia = −0.23827) (see table 2 of the referred paper) is quite different from the result obtained by the present solver. On the other hand, the wGhia result is missing in most of the papers where the lid-driven test is used as bench mark, including the papers reporting the solution by Ghia et al. (Citation1982) for many other points (e.g. Xue & Burton, Citation2013 and many others), whereas in other papers the mentioned wGhia vertical velocity component does not match the result, like in Dalal et al. (Citation2008).

5.3.3. Flow past a stationary sphere at different Reynolds numbers

In the past few decades, a plethora of experimental, theoretical and numerical studies of viscous flow past a stationary sphere S have been presented to investigate wake structures. The Reynolds number, defined as (84) Re=U0Dsν(84) based on the uniform undisturbed flow velocity U0, on the diameter of the sphere Ds, and on the surrounding fluid viscosity ν, is used as a parameter to classify the wake structure (e.g. Johnson & Patel, Citation1999; Ploumhans et al., Citation2002; Sakamoto & Haniu, Citation1990, and cited references). The wake structure has been a strongly debated topic, and several controversial findings have been obtained by authors in the literature studies (e.g. Johnson & Patel, Citation1999; Ploumhans et al., Citation2002; Sakamoto & Haniu, Citation1990, and cited references). According to experimental and numerical studies, above 270 < Re < 290, the flow becomes unsteady but periodic, and vortex shedding starts around Re = 300 (e.g. Johnson & Patel, Citation1999; Ploumhans et al., Citation2002; Sakamoto & Haniu, Citation1990, and cited references), with formation of hairpin vortices (as shown in Figure , from the experimental studies by Sakamoto and Haniu (Citation1990)). By progressively increasing Re, the vortices start to intertwine with each other, and above Re = 500 periodicity is lost (e.g. Johnson & Patel, Citation1999; Ploumhans et al., Citation2002; Sakamoto & Haniu, Citation1990, and cited references). For a more comprehensive review, we refer the readers to the cited works.

Figure 13. Test 3. Pattern of vortex shedding. Left panels Re = 300, (a) side view, (b) upper view. Right panel (c) 480 < Re < 800 (from Sakamoto & Haniu Citation1990)).

Figure 13. Test 3. Pattern of vortex shedding. Left panels Re = 300, (a) side view, (b) upper view. Right panel (c) 480 < Re < 800 (from Sakamoto & Haniu Citation1990)).

In the research referred to here we investigated the flow structures around a stationary sphere for Re = 300 and 600. The fluid is assumed to be water. In Figure we plot the 3D view of the domain and the setup of the numerical test. We assume the sphere S, with Ds = 0.0254 m, symmetrically placed inside a large cylinder C with diameter DC = 0.3 m ( 12 Ds) and length 1.1 m ( 43.5 Ds). The center of S is located 0.15 m ( 6 Ds) downstream of the inflow upstream face of C. At the upstream inflow section of C we set u uniform and constant in time (U0, 0, 0), and the same velocity is imposed at the lateral walls of C in such a way that the flow around S is only weakly affected by the walls. Zero pressure is assumed at the downstream outflow section. Over the surface of S we assume no-slip BC. We assume flow at rest and zero pressure inside C at t = 0.

Figure 14. Test 3. (a) 3D view of the domain. (b) setup and BCs of the numerical runs.

Figure 14. Test 3. (a) 3D view of the domain. (b) setup and BCs of the numerical runs.

The mesh size is refined in zone I (around S and downstream of it, as shown in Figure (a)) in order to reproduce the strong velocity gradients close to the surface of S and the fluid vortices in the wake. A larger mesh size is adopted for the rest of the domain (zone II in Figure (a)). The mesh size (defined as in test 1) used for zones I and II is 0.00055 and 0.0128 m, respectively, with a smooth transition between the two zones. The total number of tetrahedrons in the mesh is 2,309,771, with 2,243,199 clusters and 392,850 nodes, and the surface of S is discretized with 4332 triangles.

Unfortunately, description of initiation of vortex shedding is most often ignored in numerical studies. In physical experiments, initiation of vortex shedding is generated by flow instabilities amplifying small flow disturbances (Sungsu, Citation2000). Such disturbances include, among others, asymmetric domain geometry, vibrations of the pipe, non-uniformity and turbulence of the inflow velocity, non-uniform roughness of the pipe walls and sphere surface, … (Sungsu, Citation2000). All these sources are missing in numerical experiments, and, for a stationary sphere inside a symmetric domain, symmetric steady solution are attained even for Reynolds numbers at which experimental unsteady vortex shedding has been detected (Sungsu, Citation2000). In numerical experiments, vortex shedding could be generated by (1) computational truncation and round-off errors, strongly dependent on the characteristics of the numerical solver and the computer, or (2) specifically introduced numerical perturbations (Sungsu, Citation2000). Examples of such numerical perturbations can be found, for example, in Ploumhans et al. (Citation2002), Sungsu (Citation2000) and cited references.

We performed a first series of simulations at Re = 300 without numerical perturbations, with an impulsive start of the inflow velocity. After rapid changes during the early stages of the process, the flow characteristics converted to a stationary solution. In Figure we plot the streamlines of the stationary flow field in the (x-y) plane. After the early transient process due to the impulsive flow start, we computed the stationary values of drag and lift coefficients CD and CL listed in Table . These were obtained as CD=Fx/(1/2ρU0πDS2/4) and CL=Fy/(1/2ρU0πDS2/4), where Fx and Fy are the x and y components of the total force, sum of the pressure and the viscous forces. The streamlines in the (x-z) plane are almost symmetrical, and the mean-in-time value of the side coefficient CS, obtained as CS=Fz/(1/2ρU0πDS2/4), is 1.34e-05.

Figure 15. Test 3, Re = 300. Velocity streamlines in the (x–y) plane, stationary case.

Figure 15. Test 3, Re = 300. Velocity streamlines in the (x–y) plane, stationary case.

Table 5. Test 3. Values of the drag and lift coefficients.

It is important to underline that CD and CL are in very good agreement with the mean in time values provided by literature studies, which are reported in Table .

Similarly to Ploumhans et al. (Citation2002), the most efficient method for the present solver to trigger vortices has been, after the impulsive flow start, to set the y velocity component (85) v=sin(π(τ3))U04 with τ=tU0/DS(85) in the interval 3τ4, along the inflow section and the lateral walls of C. The time step size used for the simulations is 0.05 s and the maximum computed value of the CFL number is 3.5.

In Figure we plot the 3D view of the vorticity structures identified by the Q-criterium (Hunt et al., Citation1988). In Figures and we show the velocity streamlines and the iso-contour lines of the kinematic pressure in the (x-y) and (x-z) planes, respectively. The time difference among the panels is one quarter of a period, and after the fourth panel (3/4 of period), the cycle repeats again. The present model satisfactorily reproduces the hairpin shapes of the vortical structure. Observe that, due to the strong pressure gradients feeding the movement of the vortices, the pressure minima in this case do not match their foci.

Figure 16. Test 3, Re = 300. 3D periodic time evolution of the vortical structures.

Figure 16. Test 3, Re = 300. 3D periodic time evolution of the vortical structures.

Figure 17. Test 3, Re = 300. Periodic time evolution of the velocity streamlines (left panels) and kinematic pressure (right panels), (x–z) plane.

Figure 17. Test 3, Re = 300. Periodic time evolution of the velocity streamlines (left panels) and kinematic pressure (right panels), (x–z) plane.

Figure 18. Test 3, Re = 300. Periodic time evolution of the velocity streamlines (left panels) and kinematic pressure (right panels), (xy) plane.

Figure 18. Test 3, Re = 300. Periodic time evolution of the velocity streamlines (left panels) and kinematic pressure (right panels), (x–y) plane.

After a rapid transient phase, due to the impulsive start and to the imposed numerical perturbation, the time evolution of the CD and CL coefficients becomes periodic in time. The mean and amplitude values are listed in Table , and compared with the results provided by other literature works.

We also simulate the case with Re = 600. According to experimental observations, the shedding vortices become irregular for Re > 480 (Sakamoto & Haniu, Citation1990). In Figure (c) we have previously shown the pattern of vortex shedding for 480 < Re < 800 (from Sakamoto & Haniu, Citation1990). The setting of the ICs and BCs is the same as for Re = 300, as is the impulsive flow start. In this case, perturbation for generating vortex shedding was not necessary. The time step size used for the numerical runs was 0.025 s and the maximum value of the CFL number attained during the simulations was 3.12.

In Figure (a) we show the 3D vortical structure at τ* = 35, where τ* is defined in Equation (85) and the irregularity of the hairpin vortices and their intertwining with each other is evident, as experimentally observed by Sakamoto and Haniu (Citation1990). In Figure (b) we plot the time histories of the drag lift and side coefficients. As expected, we lost the periodic trend of CD and CL observed for Re = 300, and the value of the side coefficient is of the same order as CL.

Figure 19. Test 3, Re = 600. (a) 3D vortical structure at τ* = 35. (b) Time evolution of the drag, lift and side coefficients.

Figure 19. Test 3, Re = 600. (a) 3D vortical structure at τ* = 35. (b) Time evolution of the drag, lift and side coefficients.

As mentioned in section 3.1.1, for the case of Re = 300, we estimate Nr,k and predict the corresponding computational time of parallelization of the MAST algorithm assuming different Np number of processors. The maximum Nr,k is 3847. Neglecting the time ϵ and the other parallelization costs, the MAST solution time TMAST would be equal to (86) TMAST=i=1,Niter(r=1,NR,k(Int(Nr,kNp)+1))2TT(86) where Niter is the number of time steps and NR,k is the number of ranks at iteration k. This time is of course larger than the time Tmin computed assuming all available processors working together, because Nr,i can be smaller than Np and the ratio between TMAST and Tmin grows along with Np.

See in Table the ratios TMAST / TT and TMAST / Tmin computed in the simulation of test 3, with a mesh of 2,309,771 tetrahedrons, assuming a number of processors in the range 4-250, at the present time corresponding to small-medium workstations. You can observe that the ratio TMAST / Tmin attains a maximum value of 1.4 with the maximum number of 250 processors. This means that parallelization should work very well with this type of very popular computers even with large size problems.

Table 6. MAST solution time vs. no. of processors.

5.3.4. Simulation of blood flow inside aneurism

In this test, we simulated the hemodynamic flow conditions inside a real abdominal aorta affected by a large aneurysm without thrombus in the lumen. The computational domain was computed starting from the kinematic field of a real (44-year-old female) patient-specific aortic wall, obtained from the data recorded by an electrocardiogram-gated computer tomography angiography (CTA) during a stabilized cardiac cycle, as described in (Aricò et al., Citation2020 and cited references).

Besides the CTA images, additional input data were the measurements, in a resting condition, of pressure (on the left arm) and aorta volumetric flow rate (in the carotid artery), during a stabilized cardiac cycle (Aricò et al., Citation2020). The cardiac cycle Tc of the patient was 0.83333 s.

The real aortic segment was approximately 0.16 m long (see Figure (a)). The computational domain was extended with respect to the real one by means of two transition stretches, and the real cross-sections were linearly morphed into circles of equivalent radii r=A/π (Figure (a)). The diameters of the inflow and outflow artificial sections, Di and Do, are 0.03213 and 0.0256 m, respectively, and the two stretches were approximately Di long.

Figure 20. (a) Test 4. Real and computational domain. (b) computational mesh. (c) Section of the mesh with a generic plane.

Figure 20. (a) Test 4. Real and computational domain. (b) computational mesh. (c) Section of the mesh with a generic plane.

For the numerical model simulations, we set an inflow velocity profile and a uniform spatial pressure distribution along the upstream and downstream sections of the computational domain, respectively. The BCs of the present model were obtained as described in Aricò et al. (Citation2020) and cited references.

In Figure (a) we plot the waveforms of the inflow velocity (obtained by dividing the waveform of the flow rate by the area of the inflow cross-section), and the outlet pressure. The ‘time of the diastolic (systolic) pressure’ is the time corresponding to the minimal (maximal) aortic pressure – tdp (tsp) in Figure . tdp and tsp were computed to be 0.0589 and 0.2946 s after the start of the cycle (Nagy et al., Citation2015).

Figure 21. Test 4. (a) waveforms of mean-in-time inflow velocity and outlet pressure. (b) Womersley inflow velocity profiles at four significant times (from Aricò et al., Citation2020).

Figure 21. Test 4. (a) waveforms of mean-in-time inflow velocity and outlet pressure. (b) Womersley inflow velocity profiles at four significant times (from Aricò et al., Citation2020).

A Womersley velocity profile of the pulsatile flow (Womersley, Citation1955) was analytically computed for the diameter Di, as described in Aricò et al. (Citation2020), and assigned to the upstream boundary of the computational domain. The Womersley number α can be regarded as the ratio between the unsteady inertial forces and the viscous forces, and it is defined as (Womersley, Citation1955) (87) α=Rων with ω=1/Tc(87) where R is the radius of the vessel at the boundary section. We set ν = 3.77 × 10−6 m2/s (blood kinematic viscosity). The original Poiseuille velocity profile is flattened proportionally to the α number. In the present case α is around 24, and Figure (b) shows the profiles computed for the significant times listed in the table of Figure (a).

The maximum value of the Reynolds number attained during the simulations was approximately 1200, which implies a fully laminar flow.

The computational domain corresponds to the fixed-in-time geometry computed, at the tsp time, applying the procedure proposed in Nagy et al. (Citation2015). The computational mesh had 790,346 tetrahedrons and 146,995 nodes, and the number of cluster is 729,038. The mesh size ranged from 1.e-04 m to 1e-03 m. In Figure (b,c) we show the mesh and a zoom of a cutting generic plane. The time step size was 0.01 s and the maximum value of the CFL number computed during the simulations was 3.45.

In Figure , for the significant times listed in Figure (b), we show the computed velocity and the kinematic pressure fields. The black arrow indicates the main upstream-downstream flow direction. At tdp, the pressure gradient is oriented according to the main upstream-downstream direction. The velocity profile in the upstream portion of the studied reach is almost uniform along the radial vessel direction, and, close to the walls, recirculating flow zones arise (see the zoom in Figure ). These recirculating flows could be generated by the inflow velocity computed in the most lateral part of the Womersley profile (see Figure (b)). The blood flow decelerates in the central region of the aorta, due to the enlargement of the vessel because of the aneurysm, and accelerates downstream of the aneurysm, due to the reduction of the section of the vessel. Due to the reduction of the inflow velocity assigned at the inflow section, from time 0.25 Tc to tsp, and the corresponding increase of the assigned outlet pressure in the same time interval (see Figure ), the pressure gradient along the principal flow direction changes sign, becoming downstream-upstream oriented. The recirculating flow velocity zones close to the lateral walls then disappear, since the boundary Womersley velocities assigned at the inflow section are inward oriented. At time 0.54 Tc, the flow recirculation close to the vessel walls is stronger than at tdp. The size of the portion of the inflow section where the leaving Womersley velocities are set is similar to the one where the incoming velocities are assigned, and their norm is comparable to (or greater than) the values of the inflow velocity norm (see Figure (b)). Vorticities also arise in the central and downstream portions of the aortic vessel, as shown in the zoom of the velocity vector (Figure ).

Figure 22. Test 4. Computed velocity vectors and kinematic pressure at the significant times in Figure .

Figure 22. Test 4. Computed velocity vectors and kinematic pressure at the significant times in Figure 21.

Figure 23. Test 4. Zoom of the velocity fields for tdp and 0.54 Tc.

Figure 23. Test 4. Zoom of the velocity fields for tdp and 0.54 Tc.

In Figure we plot the values of the velocity components computed along the axis of the aneurism, shown in Figure , at four different times. The origin of the coordinates along the axis is also shown in Figure . These results are marked as ‘u (v, w) m1’. In Figure , we also superimpose the results obtained using a refined mesh, with 3,361,925 tetrahedrons, 599,988 nodes, and 3,116,998 clusters, marked as ‘u (v, w) m2’. The scatters between the two solutions are almost negligible.

Figure 24. Test 4. Velocity components computed along the axis shown in Figure , for two meshes.

Figure 24. Test 4. Velocity components computed along the axis shown in Figure 25, for two meshes.

Figure 25. Test 4. Initial and final axis position.

Figure 25. Test 4. Initial and final axis position.

6. Conclusions

A new algorithm for the numerical solution of the 3D Navier–Stokes equations for incompressible fluids has been presented and validated with synthetic tests. The algorithm is radically new and is based on the Raviart-Thomas first order spatial discretization of pressure and velocity. The convective terms were solved using the Marching in Space and Time (MAST) technique, previously applied only to groundwater transport and shallow water problems. The algorithm has the following merits with respect to the many other competitors: (1) It can be applied to any tetrahedral, unstructured and non-Delaunay mesh, generated inside irregular boundaries; (2) it fully preserves mass conservation, (3) the time step size is not constrained by the CFL limit, (4) the CPU time required for the solution of a single time step, with a single physical processor, grows with a β power almost equal to 1.1, due to the solution of linear systems with matrices that are always positive-definite, holding the M-property, factorized only once at the beginning of the first time step.

Code parallelization and introduction of a turbulence model to solve the turbulence structures within the element scale would make MAST-RT0 model ready to solve also many other problems of great interest, like Fluid-Structure Interactions (FSI), flows around moving boundaries (e.g. swimming fishes) (e.g. Salih et al., Citation2019), sloshing tanks (e.g. Ghalandari et al., Citation2019) and many others.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

References

Appendix 1. The Order subroutine

Call ET (j,k) the index of the jth tetrahedron neighbor to tetrahedron k and Fl (j,k) the flux between tetrahedrons k and ET (j,k). ET (j,k) = 0 if the jth face of tetrahedron k is a boundary face. Flux FL (j,k) is positive if it goes from k to ET (j,k), negative otherwise. We assume the fluxes of two neighbor tetrahedrons k and kp to have the same norm and opposite sign in the shared face. The Order input are the integer matrices ET, FL and the initial IORD vector.

Call INV(k) the position m of tetrahedron k in the IORD vector, such that INV(IORD(m)) = m. Vector INV is initialized according to the input IORD vector. Initialize also two other auxiliary vectors AUX and BACK and one auxiliary matrix JC with size[4, NT], where NT is the number of tetrahedrons. All vectors RANK, BACK and JC are initialized with 0.

The Order algorithm computes the vector RANK and updates the vector IORD. The output vector IORD provides, for each index m, the tetrahedron k = IORD(m) with the following properties: (A1) (RANK(ET(j,k))>0 or JC (j,k)0) if (FL (j,k)<0 and ET (j,k)>0)j=1,,4(A1) (A2) (RANK(k)>RANK (ET(j,k)) or JC (j,k)0)if (FL (j,k)<0 and ET (j,k)>0)j=1,,4(A2) (A3) (INV(ET(j,k))<INV (k) or JC (j,k)0) if (FL (j,k)<0 and ET(j,k)>0)for any value k=iord (s) and 0<s<m(A3)

The general strategy is to compute the mth tetrahedron of IORD and its rank RANK(IORD(m)) after the computation of the previous tetrahedrons IORD(1), … , IORD (m–1) and their rank. Order adopts the following subroutines, where apex i marks input variables and apex o marks output variables:

A1.1. Subroutine Switch

Input: nx, AUX, IORD, INV, FL, ET, RANK

Output: RANK, IORD, INV

Given the known index r = AUX(nx) of a tetrahedron which satisfies constraints A1 and has RANKi(r) = 0, compute the new rank of r as the maximum rank of the neighbor tetrahedrons that satisfy constraint (A1), plus one. This allows tetrahedron r to satisfy also requirement A2. Switch the position of tetrahedrons r and IORDi(m) in the IORDo vector, by setting: s = INVi(r), IORDo(m) = r, IORDo(s) = IORDi(,m), INVo(r) =m, INVo(IORDi(m)) = s. See the flow-chart in Figure .

A1.2. Subroutine Search

Input: nx, AUX, IORD, INV, FL, ET, RANK, BACK, JC

Output: nx, AUX, BACK, RANK, IORD, INV, JC

Call BACK(k) the neighbor of tetrahedron k with RANK(k) = 0 and RANK(BACK(k)) = 0 with the maximum flux going from to BACK(k) to k. Given a length nxi≥ 1, check if constraint (A1) is satisfied for k = AUX(nx). If it is satisfied, apply Switch and set nxo = nxi – 1. Otherwise, select the neighbor tetrahedron kp with the minimum (maximum absolute value) entering flux smaller than zero and with RANKi(kp) = 0. If BACK(kp) ≠ 0, we have a loop. In this case apply subroutine Cut and iterate the check. If constraint (A1) is not satisfied, select the neighbor tetrahedron kp with RANK(kp) = 0 and minimum entering flux, set BACK(k) = kp, update nxo with nxi + 1, set AUX(nxo) = kp, and iterate until constraint (A1) is satisfied. See the flow-chart in Figure .

A1.3. Subroutine Cut

Input: nx, AUX, BACK, JC, FL, ET

Output: nx, BACK, JC

If a loop is found, compute the index mb and the tetrahedron kb = AUXi (mb) corresponding to the minimum positive flux going from BACK (AUXi (m)) to AUXi (m) for mp ≤ m ≤ nxi, where AUXi (mp) = BACK (AUXi (nxi)). Set nxo = mb and BACKo (AUXi (m)) = 0, for mb ≤ m ≤ nxi. Set ka = BACK0i (kb), JCo (n1,ka) = kb and JCo (n2,kb) = ka, where n1 and n2 are the local indices of the face common to tetrahedrons ka and kb. See the flow-chart in Figure .

In Order we loop the index p from 1 to nel. At each iteration, if nx > 0 we apply Search. If nx = 0, we test the tetrahedron k = IORD(p). If constraint (A1) is satisfied, we apply Switch. If it is not satisfied, we set nx = 1, AUX(nx) = k and apply Search. See the flow-chart in Figure .

Observe that, if zero loops and zero flux sign changes occur, the solution obtained in the MAST procedure is the same with any adopted sorting rule providing a sequence vector IORD also different from the vector computed in the Order subroutine, if constraint (A3) is satisfied for all the tetrahedrons. Due to loop cuts and flux sign changes, this is not true and a much better solution turns out to be the one obtained by solving sequentially all the tetrahedrons with the same rank, starting from 1, also using parallel computing if available. Any open source subroutine, like QUICKSORT, can be used to order all the tetrahedrons according to their rank value after the RANK solution of Order is found.

See the flow-chart of the Order, Search, Cut and Switch subroutines in Figures A1–A4.

Figure A1. Flow-chart of the Order subroutine.

Figure A1. Flow-chart of the Order subroutine.

Figure A2. Flow-chart of the Search subroutine.

Figure A2. Flow-chart of the Search subroutine.

Figure A3. Flow-chart of the Cut subroutine.

Figure A3. Flow-chart of the Cut subroutine.

Figure A4. Flow-chart of the Switch subroutine.

Figure A4. Flow-chart of the Switch subroutine.