The coupled-cluster formalism – a mathematical perspective

ABSTRACT

The Coupled-Cluster (CC) theory is one of the most successful high precision methods used to solve the stationary Schrödinger equation. In this article, we address the mathematical foundation of this theory with focus on the advances made in the past decade. Rather than solely relying on spectral gap assumptions (non-degeneracy of the ground state), we highlight the importance of coercivity assumptions – Gårding type inequalities – for the local uniqueness of the CC solution. Based on local strong monotonicity, different sufficient conditions for a local unique solution are suggested. One of the criteria assumes the relative smallness of the total cluster amplitudes (after possibly removing the single amplitudes) compared to the Gårding constants. In the extended CC theory the Lagrange multipliers are wave function parameters and, by means of the bivariational principle, we here derive a connection between the exact cluster amplitudes and the Lagrange multipliers. This relation might prove useful when determining the quality of a CC solution. Furthermore, the use of an Aubin–Nitsche duality type method in different CC approaches is discussed and contrasted with the bivariational principle.

1. Introduction

One of the most successful high accuracy ab initio computational schemes is the Coupled-Cluster (CC) approach [Citation1]. It goes back to Coester [Citation2], who in 1958 suggested using an exponential parametrisation of the wave function. This parametrisation was derived independently by Hubbard [Citation3] and Hugenholtz [Citation4] in 1957 as an alternative to summing many-body perturbation theory (MBPT) contributions order by order. At that time, Coester was not able to come up with working equations that one might try to solve. Those were presented by Čížek [Citation5] after the relevant concepts had been introduced in the context of quantum chemistry. In this work, Čížek mentioned the projective approach to the equations, which is exploited in all conventional CC methods to this day. Firstly, in [Citation5] the working amplitude and energy equations were derived for the case in which the cluster operator is approximated by double excitations only (CCD). Secondly, the CC theory was compared with MBPT, configuration interaction (CI), and the pair cluster expansions of Sinanoğlu [Citation6]. Thirdly, the first ever CCD and linearised CCD computations were reported for nitrogen and a model of benzene. For a more detailed description of the CC history, we refer to reviews by pioneers of the theory. For example, Kümmel [Citation7] and Čížek [Citation8] wrote such articles within the workshop 'Coupled Cluster Theory of Electron Correlation'. Furthermore, see the articles by Bartlett [Citation9], Paldus [Citation10], Arponen [Citation11] and Bishop [Citation12].

Unlike the CI method, the CC formalism does not arise from the Rayleigh–Ritz variational principle and is therefore said to be non-variational in that sense. This yields the well-known fact that the CC energy is in general not equal to the expectation value of the Hamiltonian and in general not an upper bound to the ground-state energy. The reliability of quantum chemical methods is in most cases based on benchmarking and on the results' physical and chemical consistency with existing theory. The gold standard of quantum chemistry – the CCSD(T) method [Citation13,Citation14] – is no exception to this. It is the importance of sharp statements about an ab initio method's reliability that motivates this work. Here, we build on a local analysis [Citation15] of the CC theory that also holds in the exact, so-called continuous, formulation with infinitely many one-particle basis functions [Citation16,Citation17].

There is a rich history of mathematical investigations addressing CC methods prior to the local analyses in [Citation15–17]. To give a complete historical account is beyond the scope of this article. We therefore limit ourselves and mention only a few important results. As a system of polynomial equations, the CC equations can have real or, if the cluster operator is truncated, complex solutions. Furthermore, using quasi-Newton–Raphson methods to compute solutions of non-linear equations can lead to divergence since the approximated Jacobian may become singular. This is, in particular, the case when strongly correlated systems are considered. These and other related aspects of the CC theory have been addressed by Živković and Monkhorst [Citation18,Citation19] and Piecuch et al. [Citation20]. Significant advances in the understanding of the nature of multiple solutions of single-reference CC have been made by Živković and Monkhorst [Citation19], Kowalski and Jankowski [Citation21], and by Piecuch and Kowalski [Citation22]. An interesting attempt to address the existence of a cluster operator and cluster expansion in the open-shell case was made by Jeziorski and Paldus [Citation23]. We would also like to mention the coupled-electron pair approximation (CEPA) [Citation24–27]. This approach was introduced as a size-consistent alternative to the CISD method, achieved by modifying (through topological factors [Citation28]) the CI equations to account for higher excitations. This makes CEPA non-variational (for an adapted variational formulation of CEPA see [Citation29]). CEPA can be regarded as an approximation of the CC method and does not form a truncation hierarchy that converges to the full-CI limit [Citation30].

Mathematical analysis is a well-established part of many natural sciences. Plenty of examples show how various fields benefit from mathematical rigor and that mathematical analysis can define a framework for a method's applicability. This work takes off from recent developments of local analyses of CC methods, including the single-reference CC, the extended CC, the tailored CC (TCC) and its special case the CC method tailored by tensor network states (TNS–TCC) [Citation15–17,Citation31,Citation32]. In the spirit of Robert Parr's fundamental approach to quantum chemistry, which was honored during the 58th Sanibel Symposium, we here present some mathematical concepts used to analyse CC methods in a functional analytic framework. These yield rigorous analytical results that are independent of benchmarks and interpretations but rather based on mathematical assumptions. Adapting these assumptions to cover the computations performed in practice remains a challenge and is the subject of future work. The local analysis imposes, as a sufficient but not necessary condition, that the cluster amplitudes are small relative to other constants. We discuss a possible way out of this restriction motivated by the fact that CC calculations are known to work for large (single) amplitudes as well. We furthermore address the $t_{1}$ -diagnostic [Citation33] and mathematically derive a more sophisticated strategy that includes all cluster amplitudes and offers a sufficient condition for a locally unique and quasi-optimal solution (after possibly rotating out the single amplitudes) rather than rejection based on just large single amplitudes. We furthermore complement the literature by a detailed discussion on spectral gap assumptions. In this context, spectrum refers to the point spectrum, i.e. the eigenvalues of relevant operators. 
Although a gap between the highest occupied molecular orbital and the lowest unoccupied molecular orbital (HOMO–LUMO gap), or a spectral gap of the exact Hamiltonian $\hat{H}$ (non-degenerate ground state), is crucial for the analysis, we highlight the importance of coercivity conditions, either for $\hat{H}$ or the Fock operator $\hat{F}$ . Additionally, we derive an optimal constant in the monotonicity proof of the CC function for the finite dimensional case, i.e. the projected CC theory. Comparing the CC Lagrangian with the extended CC formulation [Citation31], we propose, by means of the bivariational principle, an alternative way to measure the quality of the Lagrange multipliers, here interpreted as wave function parameters.

This article is structured as follows: In Section 2, a brief summary of the CC theory is presented. We introduce the set of admissible wave functions and moreover define cluster operators, the CC function, and the CC energy (for a full-scope treatment of the mathematical formulation of CC theory presented here we refer to [Citation16,Citation17]). In Section 3, we discuss the use of local analysis within different CC methods. Key concepts here are (see Section 3.1 for definitions) local strong monotonicity and local Lipschitz continuity of the CC function f, which – if fulfilled – are sufficient conditions for a locally unique solution of f=0 by Zarantonello's theorem. In particular, the importance of so-called Gårding inequalities is demonstrated. This is done both for the Hamiltonian, Section 3.2.1, and for the Fock operator, Section 3.2.2. We conclude in Section 3.3 with an overview of the Aubin–Nitsche method and the bivariational principle as they are used in CC methods for estimating the truncation error of the energy.

The authors are thankful to the organisers of the 58th Sanibel Symposium, during which many ideas presented here took form. Moreover, the anonymous referee greatly improved a previous draft of this article – especially by putting the local analysis, under consideration here, into the context of the rich quantum chemistry literature on CC methods. This work was supported by the European Research Council (ERC-STG-2014) through the Grant No. 639508, and furthermore supported by the Norwegian Research Council through the CoE Hylleraas Centre for Quantum Molecular Sciences Grant No. 262695. AL and FMF thank Simen Kvaal and Thomas Bondo Pedersen for useful comments and discussions.

2. Wave functions on an exponential manifold

The aim of electronic many-body methods, such as the CC approach, is to solve the electronic Schrödinger equation (SE) $\hat{H} ψ = E_{0} ψ$ of an N-electron system. Here, $E_{0}$ is the ground-state energy and $\hat{H}$ the self-adjoint Coulomb Hamiltonian. In this work, we restrict our attention to real Hamiltonians and wave functions. We emphasise that the mathematical framework of Hermitian operators is not sufficient to support the necessary spectral theory for quantum mechanics. Thirring exemplified this with the radial momentum operator ${\hat{P}}_{r} = - i ℏ (\partial / \partial r)$ on $D ({\hat{P}}_{r}) = \{ ψ \in L^{2} ((0, \infty)) : ψ (r = 0) = 0 \text{ and } {\hat{P}}_{r} ψ \in L^{2} ((0, \infty)) \}$ [Citation34].

From a mathematical viewpoint the Coulomb Hamiltonian, like most differential operators, is studied in its weak form to allow a larger variety of solutions. Set $Ω = \mathbb{R}^{3} \times \{ \pm \frac{1}{2} \}$ (or any other appropriate region in space and number of spin states) and let $\int_{Ω^{N}} d τ$ denote both integration and summation over spatial and spin degrees of freedom. Multiplying the SE on both sides with a smooth and compactly supported function $φ \in C_{c}^{\infty} (Ω^{N})$ , a so-called test function, and integrating by parts yields (with $\nabla = (\nabla_{r_{1}}, \dots, \nabla_{r_{N}})$ ) $\frac{1}{2} \int_{Ω^{N}} \nabla ψ \cdot \nabla φ d τ + \int_{Ω^{N}} ψ {\hat{V}}_{C} φ d τ = E_{0} \int_{Ω^{N}} ψ φ d τ,$ (1) where ${\hat{V}}_{C}$ denotes the Coulomb operator (containing both the Coulomb attraction and repulsion) and ψ a solution of the SE. It follows immediately that the l.h.s. of Equation (1) defines a bilinear form $a (\cdot, \cdot) : C_{c}^{\infty} \times C_{c}^{\infty} \to \mathbb{K}$ with $\mathbb{K}$ being the underlying algebraic field. Boundedness and ellipticity of this bilinear form, however, are non-trivial consequences that go back to Hardy–Rellich inequalities proving that ${\hat{V}}_{C} : C_{c}^{\infty} \to L^{2}$ is bounded (for a general introduction see [Citation35]). Note that this treatment of the SE extends the set of admissible wave functions to the set of antisymmetric $L^{2}$ -functions ψ of finite kinetic energy $K (ψ)$ , i.e. $∥ ψ ∥_{2}^{2} := \int_{Ω^{N}} | ψ |^{2} d τ < + \infty$ and $K (ψ) := \frac{1}{2} \sum_{i = 1}^{N} \int_{Ω^{N}} | \nabla_{r_{i}} ψ |^{2} d τ < + \infty .$ We denote this space $H^{1} (Ω^{N})$ and impose the norm $∥ \cdot ∥ : H^{1} \to \mathbb{R}; ψ \mapsto \sqrt{∥ ψ ∥_{2}^{2} + 2 K (ψ)} .$ In this topology $C_{c}^{\infty} (Ω^{N}) \subseteq H^{1} (Ω^{N})$ is dense. 
Hence, the bilinear form $a (\cdot, \cdot)$ is continuously extendable to $H^{1} (Ω^{N})$ . We define the operator ${\hat{H}}_{w} : H^{1} \to (H^{1})^{'}; ψ \mapsto {\hat{H}}_{w} ψ = a (ψ, \cdot),$ where $(H^{1})^{'}$ is the dual space of $H^{1}$ , which we shall denote $H^{- 1}$ from now on. Note that ${\hat{H}}_{w}$ indeed maps into $H^{- 1}$ since boundedness and ellipticity are preserved under continuous extensions. Furthermore, the r.h.s. of Equation (1) can be generalised to the dual pairing, which allows the SE to be reformulated as an operator equation: Find $ψ \in H^{1}$ such that ${\hat{H}}_{w} ψ = E_{0} ψ^{'}$ , with $ψ^{'}$ being the Riesz representation of ψ. This general approach to the SE was, to the best of our knowledge, not considered in the mathematical analyses of CC theory prior to the work of Schneider and Rohwedder [Citation15–17]. Subsequently, we consider this weak formulation and for simplicity write ${\hat{H}}_{w} = \hat{H}$ .

Different parameterisations of ψ lead to different approximation schemes. The subject of this article is the CC scheme, i.e. we parameterise ψ on an exponential manifold. We assume that the solution $ψ_{*}$ can be written $ψ_{*} = φ_{0} + ψ_{⊥}$ , where $φ_{0}$ is a reference determinant of N one-electron functions and $ψ_{⊥}$ is an element of $\{ φ_{0} \}^{⊥}$ , the $L^{2}$ -orthogonal complement of $φ_{0}$ . We denote the $L^{2}$ -inner product by $⟨ \cdot | \cdot ⟩$ and follow the quantum chemistry notation for expectation values of operators, i.e. $⟨ ψ | \hat{A} | ψ ⟩$ . In particular, assuming that $\hat{H}$ supports a ground state, which is always the case for Coulomb systems [Citation36], the Rayleigh–Ritz variational principle reads $E_{0} = \min_{ψ \neq 0} \frac{⟨ ψ | \hat{H} | ψ ⟩}{⟨ ψ | ψ ⟩} =: \min_{ψ \neq 0} R (ψ),$ with $ψ \in H^{1}$ . Note that although we assume ψ to be normalisable ( $L^{2}$ -summable), we do not impose $∥ ψ ∥_{2} = 1$ , but rather $∥ φ_{0} ∥_{2} = 1$ . Furthermore, by construction of the solution $ψ_{*}$ , we assume intermediate normalisation $⟨ φ_{0} | ψ ⟩ = 1$ .
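
The Rayleigh–Ritz principle under intermediate normalisation can be illustrated with a minimal sketch. It assumes a hypothetical real symmetric 2×2 matrix in place of $\hat{H}$ and the trial function $ψ = φ_{0} + s φ_{1}$; the matrix entries, the scan range and all function names are chosen for this example only and do not come from the article.

```python
import math

# Hypothetical 2x2 real symmetric "Hamiltonian" in the basis (phi_0, phi_1).
H = [[-1.0, 0.5],
     [0.5, 1.0]]


def rayleigh_quotient(s):
    """R(psi) for psi = phi_0 + s * phi_1, so that <phi_0|psi> = 1."""
    numerator = H[0][0] + 2.0 * H[0][1] * s + H[1][1] * s * s
    denominator = 1.0 + s * s  # <psi|psi>, not normalised to one
    return numerator / denominator


def lowest_eigenvalue():
    """Closed-form lowest eigenvalue of the symmetric 2x2 matrix H."""
    mean = 0.5 * (H[0][0] + H[1][1])
    half_gap = 0.5 * (H[0][0] - H[1][1])
    return mean - math.sqrt(half_gap ** 2 + H[0][1] ** 2)


# Crude scan over the single free amplitude s in [-2, 2].
grid = [-2.0 + 4.0 * k / 40000 for k in range(40001)]
s_min = min(grid, key=rayleigh_quotient)
E_scan = rayleigh_quotient(s_min)
E_exact = lowest_eigenvalue()
print(s_min, E_scan, E_exact)
```

In this toy setting the scanned minimum of $R$ agrees with the lowest eigenvalue up to the grid resolution, and it is attained without normalising ψ itself.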

Next, let $\{ χ_{k} \} \subset H^{1} (Ω)$ be an $L^{2} (Ω)$ -orthonormal one-electron basis of the space of admissible one-electron wave functions. Unless we explicitly write $\{ χ_{k} \}_{k = 1}^{K}$ we refer to the infinite dimensional setting. We construct from this set an $L^{2} (Ω^{N})$ -orthonormal Slater basis in the usual fashion, denoted $\{ φ_{μ} \}$ . Note that the N-particle basis functions $\{ φ_{μ} \}$ span the infinite dimensional space of all possible excitations with respect to the reference determinant $φ_{0}$ . In this notation we have $ψ = φ_{0} + ψ_{⊥} = φ_{0} + \sum_{μ} s_{μ} φ_{μ}$ , where $\{ s_{μ} \}$ are the $L^{2}$ -weights of ψ in the given Slater basis, i.e. $s_{μ} = ⟨ φ_{μ} | ψ ⟩$ . We formally define the cluster operators by $\hat{S} = \sum_{μ} s_{μ} {\hat{X}}_{μ}$ , where ${\hat{X}}_{μ}$ excites the reference state $φ_{0}$ to the state $φ_{μ}$ . We obtain $ψ = (\hat{I} + \hat{S}) φ_{0}$ with $\hat{I}$ denoting the identity operator. The coefficients $s_{μ}$ are called cluster amplitudes and we say that $s = \{ s_{μ} \}$ is a set of admissible cluster amplitudes if $\hat{S} φ_{0} \in L^{2}$ and $K (\hat{S} φ_{0}) < + \infty$ . Due to the one-to-one relationship between cluster amplitudes and linearly parametrised wave functions, a natural choice for a norm on the space of admissible cluster amplitudes is the corresponding wave function norm of $\hat{S} φ_{0}$ [Citation15,Citation16], i.e. $∥ s ∥^{2} = ∥ \hat{S} φ_{0} ∥^{2} = ∥ \hat{S} φ_{0} ∥_{2}^{2} + 2 K (\hat{S} φ_{0}) .$

2.1. The exponential ansatz

The CC theory is based on an exponential parametrisation of wave functions. This is an alternative and, assuming full excitation rank (explained below) of the cluster operators, equivalent description of the full CI (FCI) wave function. Since its introduction by Hubbard [Citation3] and, independently, Hugenholtz [Citation4], the unique parametrisation of a wave function ψ by the exponential $ψ = e^{\hat{T}} φ_{0}$ was assumed to be true and motivated from formal manipulations. However, the unique representation of functions in a Hilbert space is by nature a mathematical problem and was rigorously proven for the exponential parametrisation in the infinite dimensional case by Rohwedder [Citation16].

A key element in deriving the exponential parameterisation from the mathematical viewpoint is the well-definedness of the exponential of $\hat{T}$ (or equivalently the logarithm of $\hat{I} + \hat{S}$ ), which is the subject of functional calculus. We emphasise that the applicability of functional calculus depends strongly on the operator's domain since different domains may imply different properties of the operator, e.g. boundedness, essential self-adjointness, sectorial spectrum, etc. Since Rohwedder [Citation16] showed the $H^{1}$ -continuity of cluster operators in a continuous setting, the functional calculus for bounded operators was proven to be applicable.

In the finite dimensional case this result was known in the quantum chemistry community, see, e.g. Živković and Monkhorst [Citation19]. However, this result was revisited by Schneider [Citation15] using the Cauchy–Dunford calculus. To the best of our knowledge, the subtleties addressed in [Citation15,Citation16] have not been part of previous considerations in mathematical analysis of CC theory. These important results demonstrate how quantum chemistry benefits from mathematics on a very fundamental level. The continuous CC theory amounts to the exact formulation where the set $\{ χ_{k} \}$ forms a basis (in the strict mathematical sense) of the one particle space $H^{1} (Ω)$ . In a form appropriate for this article, we recall Rohwedder's result [Citation16]:

(i) Let $φ_{0}$ denote a reference determinant, e.g. the Hartree–Fock solution. Given a wave function $ψ_{⊥} \in \{ φ_{0} \}^{⊥} \cap L^{2}$ , i.e. $⟨ ψ_{⊥} | φ_{0} ⟩ = 0$ , set $S = S_{ψ_{⊥}}$ where $S_{ψ_{⊥}} φ_{0} = ψ_{⊥}$ and note that $S \in B (L^{2}, L^{2})$ , i.e. a bounded linear operator from $L^{2}$ into $L^{2}$ . Then, $ψ_{⊥} \in H^{1}$ if and only if $S \in B (H^{1}, H^{1})$ . Furthermore, there exists a constant C independent of $ψ_{⊥}$ such that $∥ ψ_{⊥} ∥ \leq ∥ S ∥ \leq C ∥ ψ_{⊥} ∥ .$ An equivalent statement holds for the $L^{2}$ -adjoint of S.

(ii) The exponential map $\hat{T} \mapsto e^{\hat{T}}$ is a $C^{\infty}$ isomorphism between $C := \{ \hat{T} : \hat{T} \in B (H^{1}, H^{1}) \}$ and $I + C := \{ \hat{I} + \hat{T} : \hat{T} \in B (H^{1}, H^{1}) \}$ . In particular, for any $ψ \in H^{1}$ with $⟨ φ_{0} | ψ ⟩ = 1$ there exists a unique $\hat{T}$ such that $ψ = e^{\hat{T}} φ_{0}$ .

Note that this result holds for any orthonormal set of N-particle basis functions spanning the space of selected excitations with respect to the reference determinant $φ_{0}$ . However, it is required that the excitation rank of the cluster operators remains untruncated, i.e. $\hat{T} = \sum_{k = 1}^{N} {\hat{T}}_{k}$ , where ${\hat{T}}_{1}$ corresponds to single excitations, ${\hat{T}}_{2}$ to double excitations, …, ${\hat{T}}_{N}$ to N-fold excitations. Consequently, we have $ψ = \exp ({\hat{T}}_{1} + \dots + {\hat{T}}_{N}) φ_{0}$ (2) in the case of full excitation rank.

The usual identification between the linear and exponential parametrisation holds [Citation37]: Write $\hat{S} = {\hat{S}}_{1} + \dots + {\hat{S}}_{N}$ and suppose that the linear parametrisation is given by $ψ = (\hat{I} + {\hat{S}}_{1} + \dots + {\hat{S}}_{N}) φ_{0} .$ (3) Expanding the exponential in Equation (2), and comparing with Equation (3), then yields ${\hat{T}}_{1} = {\hat{S}}_{1}, {\hat{T}}_{2} = {\hat{S}}_{2} - \frac{1}{2} {\hat{S}}_{1}^{2}, \dots$ and for the amplitudes $t_{i}^{a} = c_{i}^{a} / c_{0}, t_{i j}^{a b} = c_{i j}^{a b} / c_{0} - (c_{i}^{a} c_{j}^{b} - c_{j}^{a} c_{i}^{b}) / c_{0}^{2}, \dots,$ where $c_{0}$ is the FCI coefficient of the reference determinant (here $c_{0} = 1$ ). This shows a one-to-one relation for untruncated linear and exponential parameterisations. When the parametrisation is restricted to the sub-manifold of excitation rank k<N, this one-to-one relationship is in general not true (see Remark 2 in [Citation31]): Consider CCSD for N>2 particles, i.e. $ψ = e^{{\hat{T}}_{1} + {\hat{T}}_{2}} φ_{0}$ . Expanding the exponential yields ${\hat{T}}_{1} + {\hat{T}}_{2} + \frac{({\hat{T}}_{1} + {\hat{T}}_{2})^{2}}{2} + \dots + \frac{({\hat{T}}_{1} + {\hat{T}}_{2})^{N}}{N!} = \hat{S},$ which is not a CISD parametrisation, except in the trivial case ${\hat{T}}_{1} = {\hat{T}}_{2} = 0$ .
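
The identification ${\hat{T}}_{1} = {\hat{S}}_{1}$ , ${\hat{T}}_{2} = {\hat{S}}_{2} - \frac{1}{2} {\hat{S}}_{1}^{2}$ can be checked in a toy model. The sketch below is not an implementation of cluster operators: it assumes a commutative nilpotent algebra generated by a 3×3 shift matrix `N` with `N^3 = 0`, where `N` stands in for a single excitation and `N^2` for a double excitation. The amplitudes `s1`, `s2` and all helper names are hypothetical.

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def lincomb(a, A, b, B):
    """Entrywise linear combination a*A + b*B."""
    return [[a * x + b * y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(A, B)]


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Shift matrix with N^3 = 0 (toy stand-in for excitation operators).
N = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
N2 = matmul(N, N)


def log_one_plus(S):
    """log(I + S) = S - S^2/2; the series stops because S^3 = 0 here."""
    return lincomb(1.0, S, -0.5, matmul(S, S))


def exp_nilpotent(T):
    """exp(T) = I + T + T^2/2; the series stops because T^3 = 0 here."""
    return lincomb(1.0, lincomb(1.0, IDENTITY, 1.0, T), 0.5, matmul(T, T))


s1, s2 = 0.3, 0.2                 # hypothetical linear (CI-like) amplitudes
S = lincomb(s1, N, s2, N2)        # S = s1*N + s2*N^2
T = log_one_plus(S)               # exponential (CC-like) parametrisation
t1, t2 = T[1][0], T[2][0]         # coefficients of N and N^2 in T
reconstructed = exp_nilpotent(T)  # should reproduce I + S
print(t1, t2)
```

Here `t1` coincides with `s1` and `t2` with `s2 - s1**2 / 2`, and exponentiating `T` recovers `I + S`, mirroring the untruncated one-to-one relation in this simplified setting.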

2.2. The CC energy

Being able to express any wave function in $H^{1}$ on an exponential manifold, it is straightforward to derive the linked CC equations [Citation37]: $\begin{aligned} E (t) & = ⟨ φ_{0} | e^{- \hat{T}} \hat{H} e^{\hat{T}} | φ_{0} ⟩, \\ (f (t))_{μ} & = ⟨ φ_{μ} | e^{- \hat{T}} \hat{H} e^{\hat{T}} | φ_{0} ⟩ = 0, \quad \text{for all } φ_{μ} . \end{aligned}$ (4) Here, $φ_{0}$ and all the determinants $φ_{μ}$ that are excited relative to $φ_{0}$ are assumed to form a basis of the anti-symmetric part of $H^{1}$ . Note that the above equation defines the CC function f and the CC energy function $E$ . Theorem 5.3 from [Citation16] demonstrates that the CC theory provides a wave function that satisfies $R (ψ_{*}) = E_{0} = E (t_{*})$ :

The continuous (and with full excitation rank) CC amplitudes $t_{*}$ solve $f (t_{*}) = 0$ fulfilling $E (t_{*}) = E_{0}$ if and only if the corresponding function $ψ_{*} = e^{{\hat{T}}_{*}} φ_{0}$ solves the SE $\hat{H} ψ_{*} = E_{0} ψ_{*}$ .

By this fact and with $E_{0} = E (t_{*})$ , if $t_{*}$ solves $f (t_{*}) = 0$ the SE yields $⟨ ψ_{*} | \hat{H} | ψ_{*} ⟩ = E_{0} ⟨ ψ_{*} | ψ_{*} ⟩ .$ Hence, the CC amplitudes describe a function $ψ_{*}$ that provides the system's energy in the usual quantum mechanical setting, i.e. $R (ψ_{*}) = E (t_{*})$ .
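
As a sanity check of these statements, consider a two-level sketch with a single excited determinant, so that $\hat{T} = t \hat{X}$ with ${\hat{X}}^{2} = 0$ and $e^{\pm \hat{T}} = \hat{I} \pm t \hat{X}$ . Under this assumption Equation (4) reduces to $E (t) = H_{00} + H_{01} t$ and $f (t) = H_{10} + (H_{11} - H_{00}) t - H_{01} t^{2}$ . The matrix entries below are hypothetical and the function names are chosen for illustration only.

```python
import math

# Hypothetical real symmetric Hamiltonian in the basis (phi_0, phi_1).
H = [[-1.0, 0.5],
     [0.5, 1.0]]


def cc_energy(t):
    """E(t) = <phi_0| exp(-T) H exp(T) |phi_0> for T = t*X with X^2 = 0."""
    return H[0][0] + H[0][1] * t


def cc_function(t):
    """f(t) = <phi_1| exp(-T) H exp(T) |phi_0> for T = t*X with X^2 = 0."""
    return H[1][0] + (H[1][1] - H[0][0]) * t - H[0][1] * t * t


def rayleigh(t):
    """Rayleigh quotient of psi = exp(T) phi_0 = phi_0 + t * phi_1."""
    numerator = H[0][0] + 2.0 * H[0][1] * t + H[1][1] * t * t
    return numerator / (1.0 + t * t)


def cc_roots():
    """Both roots of the quadratic f(t) = 0 (requires H[0][1] != 0)."""
    a, b, c = -H[0][1], H[1][1] - H[0][0], H[1][0]
    root_disc = math.sqrt(b * b - 4.0 * a * c)
    return sorted([(-b + root_disc) / (2.0 * a),
                   (-b - root_disc) / (2.0 * a)], key=abs)


t_ground, t_excited = cc_roots()
for t in (t_ground, t_excited):
    print(t, cc_function(t), cc_energy(t), rayleigh(t))
```

In this untruncated toy model each root of $f$ reproduces an eigenvalue of the matrix, and the CC energy coincides with the Rayleigh quotient of $e^{\hat{T}} φ_{0}$ . The example also shows that the polynomial CC equations can have more than one real root even in the simplest case.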

In practice, computations are carried out using a finite basis $\{ χ_{k} \}_{k = 1}^{K}$ and furthermore with a truncated excitation rank ${\hat{T}}^{(n)} = {\hat{T}}_{1} + \dots + {\hat{T}}_{n}$ , n<N. The total truncation level can then be denoted by $d = (K, n)$ , and we solve f=0 on $V^{(d)}$ to obtain $f (t_{d}) = 0$ , $E_{d} = E (t_{d})$ . We note the following from the literature:

(i) Given a finite one-electron basis $\{ χ_{k} \}_{k = 1}^{K}$ , we denote the span of the corresponding Slater basis by $H_{K}^{1}$ . With full excitation rank (n=N) Proposition 4.7 in [Citation15] gives: $f (t_{d}) = 0$ and $E_{K} = E (t_{d})$ if $ψ_{d} = e^{{\hat{T}}_{d}} φ_{0}$ solves the SE on $H_{K}^{1}$ , i.e. $\hat{H} e^{{\hat{T}}_{d}} φ_{0} = E_{K} e^{{\hat{T}}_{d}} φ_{0}$ . By the argument of Monkhorst in [Citation38] we can establish the reverse: Assume $f (t_{d}) = 0$ and set $E_{K} = E (t_{d})$ , then since ${\hat{I}}_{K} = | φ_{0} ⟩ ⟨ φ_{0} | + \sum_{μ} | φ_{μ} ⟩ ⟨ φ_{μ} |$ we obtain (Equation (38) in [Citation38]) $\begin{aligned} & ⟨ φ_{0} | e^{{\hat{T}}_{d}^{†}} e^{{\hat{T}}_{d}} | φ_{0} ⟩ R (e^{{\hat{T}}_{d}} φ_{0}) \\ & \quad = ⟨ φ_{0} | e^{{\hat{T}}_{d}^{†}} e^{{\hat{T}}_{d}} {\hat{I}}_{K} e^{- {\hat{T}}_{d}} \hat{H} e^{{\hat{T}}_{d}} | φ_{0} ⟩ \\ & \quad = ⟨ φ_{0} | e^{{\hat{T}}_{d}^{†}} e^{{\hat{T}}_{d}} | φ_{0} ⟩ E (t_{d}) + \sum_{μ} ⟨ φ_{0} | e^{{\hat{T}}_{d}^{†}} e^{{\hat{T}}_{d}} | φ_{μ} ⟩ (f (t_{d}))_{μ} . \end{aligned}$ From this we can conclude $R (e^{{\hat{T}}_{d}} φ_{0}) = E (t_{d}) = E_{K}$ , i.e. the CC wave function gives the energy when inserted into the Rayleigh–Ritz quotient. Furthermore, we have (where $C_{d}$ denotes the truncated version of $C$ ) $\inf \{ R (ψ) : ψ = e^{\hat{T}} φ_{0}, \hat{T} \in C_{d} \} = \inf \{ R (ψ) : ψ = (\hat{I} + \hat{S}) φ_{0}, \hat{S} \in C_{d} \} = E_{K},$ by the equivalence between linear and exponential parametrisation as long as full excitation rank is kept. Consequently, $ψ_{d} = e^{{\hat{T}}_{d}} φ_{0}$ solves the SE on $H_{K}^{1}$ , which establishes the reversed implication in Proposition 4.7 in [Citation15].

(ii) However, for n<N we have in general (see for instance Remark 4.9 in [Citation15]) $E_{d}^{\mathrm{var}} := \inf \{ R (ψ) : ψ = e^{\hat{T}} φ_{0}, \hat{T} \in C_{d} \} \neq E_{d}, \quad n < N,$ which gives the well-known result that the computed $E_{d}$ is in general not an upper bound to $E_{K}$ . Hence, $E_{d} \neq R (e^{{\hat{T}}_{d}} φ_{0})$ where $f (t_{d}) = 0$ and $E_{d} = E (t_{d})$ .

(iii) By (ii), strictly speaking, CC methods do not compute wave functions, as $ψ_{d}$ does not provide the system's energy and therewith does not fulfil the Copenhagen interpretation's first principle [Citation39]. However, as mathematical analyses in [Citation15–17,Citation31,Citation32] have demonstrated, CC methods do provide approximate wave functions that converge to the solution of the SE (as $K \to \infty$ , $n \to N$ ). The Copenhagen interpretation is formulated for full systems, which correspond to the continuous CC formulation, and does not contain any statement about approximate solutions. This raises the fundamental question of what properties should be demanded of approximate solutions.

(iv) To contrast with the next section, we would also like to point out the work [Citation19] where, for a finite basis, the CC equations were analysed in a perturbational setting. Writing $e^{{\hat{T}}^{(n)}} = \hat{I} + {\hat{T}}_{1} + \dots + {\hat{T}}_{n} + λ ({\hat{T}}_{n + 1} (n) + {\hat{T}}_{n + 2} (n)),$ where we follow the notation in [Citation19] (see Equations (A9) and (A10)), the CI equations are obtained at $λ = 0$ , while $λ = 1$ corresponds to the CC case. From this and under the assumption of a finite one-electron basis, both the reality and multiplicity of the CC solutions were investigated with respect to pole and branch cut singularities in the complex plane. The emergence of multiple solutions is certainly interesting and worth pursuing; however, the local analysis studied here instead deals with the establishment of a locally unique solution under certain assumptions. Note that the local behaviour of a solution is important for the applicability and convergence of Newton–Raphson and quasi-Newton methods.

3. Local analysis in CC theory

The CC equations (linked and unlinked) can be formulated as a non-linear Galerkin scheme, which is a well-established framework in numerical analysis to convert the continuous Schrödinger equation to a discrete problem. Instead of solving the full problem, Galerkin methods solve the CC equations in a finite dimensional subspace $H_{d} \subseteq H^{1}$ . Note that the CC equations remain the same, only the space spanned by the considered $\{ φ_{μ} \}$ has changed. Reducing the problem to a finite-dimensional vector subspace allows one to compute an approximate solution numerically via Newton–Raphson or quasi-Newton methods. Galerkin methods allow a local analysis, which is useful for CC theory owing to the multiplicity of solutions [Citation18–23] and the use of quasi-Newton methods that require certain local behaviour of the solutions. Local analysis furthermore allows reliable statements about the existence and local uniqueness of Galerkin solutions as well as quantitative statements on the basis-truncation error. Its backbone is formed by a local version of Zarantonello's theorem [Citation40]:

Let $f : X \to X^{'}$ be a map between a Hilbert space $(X, ⟨ \cdot, \cdot ⟩, ∥ \cdot ∥)$ and its dual $X^{'}$ , and let $x_{*} \in B_{δ}$ be a root, $f (x_{*}) = 0$ , where $B_{δ}$ is an open ball of radius δ around $x_{*}$ . Assume that f is Lipschitz continuous and strongly monotone in $B_{δ}$ with constants L>0 and $γ > 0$ , respectively. Then the root $x_{*}$ is unique in $B_{δ}$ . Indeed, there is a ball $C_{ε} \subset X^{'}$ with $0 \in C_{ε}$ such that the solution map $f^{- 1} : C_{ε} \to X$ exists and is Lipschitz continuous, implying that the equation $f (x_{*} + x) = y$ has a unique solution $x = f^{- 1} (y) - x_{*}$ , depending continuously on y, with norm $∥ x ∥ \leq δ$ . Moreover, let $X^{(d)} \subset X$ be a closed subspace such that $x_{*}$ can be approximated sufficiently well, i.e. the distance $d (x_{*}, X^{(d)})$ is small. Then, the projected problem $f_{d} (x_{d}) = 0$ has a unique solution $x_{d} \in X^{(d)} \cap B_{δ}$ and $∥ x_{*} - x_{d} ∥ \leq \frac{L}{γ} d (x_{*}, X^{(d)}),$ i.e. $x_{d}$ is a quasi-optimal solution.

The concept of quasi-optimality was introduced in Jean Céa's dissertation [Citation41] in 1964 for linear Galerkin schemes and was extended over the years to the non-linear case. It ensures that the Galerkin solution in a fixed approximation space is, up to a multiplicative constant, the closest element to the exact solution. For obvious reasons this is a desired property for CC schemes. The different CC methods vary, however, in more than just minor details, which makes establishing this property a conceptually different and challenging task for each method.

3.1. Local unique solutions and quasi-optimality

We start by elaborating on the assumptions of Zarantonello's theorem in a more demonstrative way. Here, the notation $⟨ s, t ⟩ = \sum_{μ} s_{μ} t_{μ}$ is used for sequences $s = \{ s_{μ} \}$ and $t = \{ t_{μ} \}$ . In the context of the CC theory, the CC function f from Equation (4) is said to be strongly monotone if there exists a $γ > 0$ such that for all sets of cluster amplitudes $t = \{ t_{μ} \}$ and $t^{'} = \{ t_{μ}^{'} \}$ $⟨ f (t) - f (t^{'}), t - t^{'} ⟩ \geq γ ∥ t - t^{'} ∥^{2} .$ (5) If this inequality is true for all $t, t^{'} \in B_{δ} (t_{*})$ then f is said to be locally strongly monotone. The CC function f is further said to be Lipschitz continuous if there exists a constant L>0 such that $∥ f (t) - f (t^{'}) ∥ \leq L ∥ t - t^{'} ∥ .$ (6) In direct analogy with local strong monotonicity, we speak of local Lipschitz continuity if Equation (6) is fulfilled for all cluster amplitudes $t, t^{'}$ inside some ball.

To exemplify these concepts in a simple way we consider a smooth function $f : \mathbb{R} \to \mathbb{R}$ . By the Cauchy–Schwarz inequality, the strong monotonicity implies that the derivative $f^{'} (t) \geq γ$ , i.e. f is a strictly monotonically increasing function. Note that strictly monotone functions are injective (one-to-one), which implies local invertibility. Hence, this already ensures local uniqueness of the function's root $t_{*}$ , if one exists. Lipschitz continuity on the other hand implies that $- L \leq f^{'} (t) \leq L$ . Hence, the assumptions in Zarantonello's theorem are restrictions on the function's slope, namely $0 < γ \leq f^{'} (t) \leq L .$ By introducing normed operator spaces, these restrictions can be generalised to vector valued and even infinite dimensional functions f.
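
The scalar picture can be made concrete with a short sketch. The function $f (t) = 2 t + \sin t - 1$ is an assumption chosen for this example; its slope $2 + \cos t$ lies between $γ = 1$ and $L = 3$ everywhere. The damped fixed-point iteration $t \mapsto t - (γ / L^{2}) f (t)$ used below is the contraction commonly employed in proofs of Zarantonello-type results; it is shown here only to illustrate that the slope bounds force a unique root, not as a practical CC solver.

```python
import math

GAMMA = 1.0   # lower slope bound (strong monotonicity constant)
LIP = 3.0     # upper slope bound (Lipschitz constant)


def f(t):
    """Toy function with GAMMA <= f'(t) <= LIP for all real t."""
    return 2.0 * t + math.sin(t) - 1.0


def f_prime(t):
    """Derivative of the toy function."""
    return 2.0 + math.cos(t)


def damped_fixed_point(t0, steps=400):
    """Iterate t <- t - (GAMMA / LIP**2) * f(t), a contraction here."""
    alpha = GAMMA / LIP ** 2
    t = t0
    for _ in range(steps):
        t = t - alpha * f(t)
    return t


# Two very different starting points end up at the same root.
root_a = damped_fixed_point(5.0)
root_b = damped_fixed_point(-7.0)
slopes = [f_prime(-10.0 + 0.01 * k) for k in range(2001)]
print(root_a, root_b, f(root_a))
```

Because the slope never leaves the interval $[γ, L]$ , each step shrinks the distance to the root by a fixed factor, so the limit does not depend on the starting point.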

Returning to the general case, the Lipschitz continuity is key to deriving the quasi-optimality of Galerkin solutions. We assume that $X^{(d)} ⊊ X$ is the considered approximation space supporting the Galerkin solution $t_{d}$ , i.e. $⟨ f (t_{d}), s ⟩ = 0$ for all $s \in X^{(d)}$ . Since $f (t_{*}) = 0$ , it follows that $f (t_{*}) - f (t_{d}) \in (X^{(d)})^{⊥}$ , i.e. $⟨ f (t_{*}) - f (t_{d}), u ⟩ = 0$ for all $u \in X^{(d)}$ , in particular for $u = t_{d}$ and for every difference $t_{d} - u$ with $u \in X^{(d)}$ . Starting from the strong monotonicity, we deduce for any $u \in X^{(d)}$ that $\begin{aligned} γ ∥ t_{*} - t_{d} ∥^{2} & \leq ⟨ f (t_{*}) - f (t_{d}), t_{*} - t_{d} ⟩ \\ & = ⟨ f (t_{*}) - f (t_{d}), t_{*} - u ⟩ \\ & \leq L ∥ t_{*} - t_{d} ∥ ∥ t_{*} - u ∥ . \end{aligned}$ Because $u \in X^{(d)}$ was chosen arbitrarily, dividing by $γ ∥ t_{*} - t_{d} ∥$ and minimising over u yields the quasi-optimality: $∥ t_{*} - t_{d} ∥ \leq \frac{L}{γ} \min_{u \in X^{(d)}} ∥ t_{*} - u ∥ .$ (7)
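The estimate in Equation (7) can be illustrated on a small model problem. The sketch below is not a CC calculation: it assumes a hypothetical linear map $f (t) = A t - b$ with a non-symmetric matrix A whose symmetric part is positive definite, so that γ is the smallest eigenvalue of the symmetric part and L is the spectral norm of A. The approximation space is spanned by the columns of a randomly chosen orthonormal matrix Q.

```python
import numpy as np

def galerkin_demo(n=12, d=4, seed=0):
    """Galerkin solution of f(t) = A t - b = 0 on a d-dimensional subspace."""
    rng = np.random.default_rng(seed)
    B = rng.standard_normal((n, n))
    S = rng.standard_normal((n, n))
    # Symmetric part I + B B^T / n is positive definite; the skew part does not affect gamma.
    A = np.eye(n) + B @ B.T / n + 0.5 * (S - S.T)
    b = rng.standard_normal(n)

    gamma = np.linalg.eigvalsh(0.5 * (A + A.T)).min()  # strong monotonicity constant
    L = np.linalg.norm(A, 2)                           # Lipschitz constant

    t_star = np.linalg.solve(A, b)                     # exact root of f
    Q, _ = np.linalg.qr(rng.standard_normal((n, d)))   # orthonormal basis of X^(d)
    y = np.linalg.solve(Q.T @ A @ Q, Q.T @ b)          # <f(t_d), s> = 0 for all s in X^(d)
    t_d = Q @ y

    err = np.linalg.norm(t_star - t_d)
    best = np.linalg.norm(t_star - Q @ (Q.T @ t_star)) # distance of t_star to X^(d)
    return gamma, L, err, best

gamma, L, err, best = galerkin_demo()
print(f"error = {err:.4f}, best approximation = {best:.4f}, bound = {L / gamma * best:.4f}")
```

The Galerkin error lies between the best-approximation error and $L / γ$ times that error, as the derivation above predicts.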

To apply Zarantonello's theorem to CC methods, the main challenge is to demonstrate a strictly positive γ in Equation (5) such that strong monotonicity holds locally around the solution that corresponds to the ground state. The original idea in [Citation15] to obtain such a result in the finite-dimensional projected CC theory assumed the existence of a HOMO–LUMO gap. Further, more technical assumptions on the Fock operator $\hat{F}$ (see the Gårding inequality below) were needed to achieve a generalisation to the continuous CC setting [Citation17], which also has a counterpart for $\hat{H}$ . We refer the reader to [Citation15–17,Citation31,Citation32] for the detailed proofs and the assumptions made, not only within the traditional CC formalism, but also for the TCC and extended CC methods. However, we remark that these assumptions are sufficient but not necessary conditions. One example is given by metals: despite their typically small or negligible HOMO–LUMO gaps, the single-reference CC method can often describe metallic effects quite well. This suggests that the HOMO–LUMO gap assumption, which limits the results' applicability, can be lifted in the case of non-multi-configuration systems [Citation32]. See also [Citation23] for a CC theory that considers open-shell systems where no HOMO–LUMO gap exists.

Here, we extend the results in [Citation15–17,Citation31,Citation32] by optimising the strong monotonicity constant γ, which yields weaker restrictions on the solution's cluster amplitudes $t_{*} = \{(t_{*})_{μ}\}$ . Further investigations need to be undertaken before the presented analysis can lead to practical results on the reliability of the CC approach. However, we suggest an estimate on the CC amplitudes that is sufficient to guarantee the existence of a locally unique CC solution (see Equation (13)) and contrast it with the single amplitudes diagnostic of [Citation33].

3.2. Local strong monotonicity of the CC function

In the literature there are two different proofs that the infinite dimensional (continuous) CC function f is locally strongly monotone [Citation17] (see also [Citation31] for the extended CC function). Even though spectral-gap assumptions enter the arguments, it is the so-called Gårding constants that give a sufficient condition for the local strong monotonicity, as will be demonstrated below. This fact emerges from the analysis in [Citation17] but was noted and elaborated within the analysis of the extended CC method in [Citation31]. We here furthermore improve the existing analysis by optimising the constants. We start by defining the Gårding inequality that will be used extensively in the sequel:

An operator $\hat{A}$ fulfills a Gårding inequality if there exists a real constant e such that $\hat{A} + e$ is coercive, i.e. there exists a constant c>0 that depends on e (we denote this dependence by $c (e)$ ) such that $⟨ ψ | \hat{A} + e | ψ ⟩ \geq c (e) ∥ ψ ∥^{2} .$ The coercivity above describes a particular growth behaviour of $\hat{A} + e$ as the lower bound becomes large when the wave function is at the extreme of the space, e.g. wave functions with a large kinetic energy. Subsequently, we denote the l.h.s. of Equation (5) by Δ, i.e. for two sets of CC amplitudes $t = \{t_{μ}\}$ and $t^{'} = \{t_{μ}^{'}\}$ we have $Δ = ⟨ f (t) - f (t^{'}), t - t^{'} ⟩ .$ We further set $Δ \hat{T} = \hat{T} - {\hat{T}}^{'}$ , which yields by the CC equations in Equation (4) the equality $Δ = ⟨ Δ \hat{T} φ_{0} | e^{- \hat{T}} \hat{H} e^{\hat{T}} - e^{- {\hat{T}}^{'}} \hat{H} e^{{\hat{T}}^{'}} | φ_{0} ⟩ .$ (8) Next, we elaborate on Gårding inequalities for two different operators that imply local strong monotonicity of the CC function, by bounding the r.h.s. of Equation (8). Interestingly, for the finite-dimensional (projected) CC method, only the latter approach has a counterpart (using the particular structure of the Fock operator $\hat{F}$ ).
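To make the definition of the Gårding constant concrete, the following sketch computes $c (e)$ for a toy model. Everything in it is an assumption made for illustration: a one-dimensional finite-difference kinetic matrix K, a hypothetical Gaussian potential well V, the model operator $H = K + V$ , and an energy-type norm $∥ ψ ∥^{2} = ⟨ ψ | K + 1 | ψ ⟩$ standing in for the norm of the continuous theory. For a given shift e, the best constant is the smallest eigenvalue of the generalised eigenvalue problem for the pair $(H + e, K + 1)$ .

```python
import numpy as np

def garding_constant(H, M, e):
    """Largest c with <psi|H + e|psi> >= c <psi|M|psi> for all psi (M symmetric positive definite)."""
    Lc = np.linalg.cholesky(M)                    # M = Lc Lc^T
    Linv = np.linalg.inv(Lc)
    G = Linv @ (H + e * np.eye(H.shape[0])) @ Linv.T
    G = 0.5 * (G + G.T)                           # remove rounding asymmetry
    return np.linalg.eigvalsh(G).min()

# Toy model on m interior grid points of [0, 1] (all choices are hypothetical).
m = 50
h = 1.0 / (m + 1)
x = np.linspace(h, 1.0 - h, m)
K = (2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / (2.0 * h**2)
V = np.diag(-50.0 * np.exp(-((x - 0.5) / 0.1) ** 2))
H = K + V
M = K + np.eye(m)                                 # matrix of the energy-type norm

E0 = np.linalg.eigvalsh(H).min()                  # lowest eigenvalue of the model operator
shifts = [-E0 + 0.1, -E0 + 1.0, -E0 + 10.0]
constants = [garding_constant(H, M, e) for e in shifts]
for e, c in zip(shifts, constants):
    print(f"e = {e:8.3f}  c(e) = {c:.5f}")
```

In this toy model $c (e)$ is positive for every $e > - E_{0}$ , increases with e, and tends to zero as e approaches $- E_{0}$ from above, because $H + e$ then becomes singular.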

3.2.1. A Gårding inequality for the Hamiltonian

We here assume a spectral gap $γ_{*}$ of $\hat{H}$ , i.e. for all ψ that are $L^{2}$ -orthogonal to the ground state $ψ_{*}$ we have $R (ψ) - E_{0} \geq γ_{*}$ , for some $γ_{*} > 0$ ; in particular, we assume a non-degenerate ground state. We also assume that $φ_{0}$ is a good approximation of the exact wave function, i.e. $ε = ∥ ψ_{*} - φ_{0} ∥_{2}$ is small. It then holds (see Lemma 11 in [Citation31]) $⟨ \hat{T} φ_{0} | \hat{H} - E_{0} | \hat{T} φ_{0} ⟩ \geq γ_{*} (ε) ∥ \hat{T} φ_{0} ∥_{2}^{2},$ (9) with $γ_{*} (ε) = γ_{*} (1 - 4 ε + O (ε^{2}))$ . Thus, $γ_{*} (ε)$ is close to $γ_{*}$ and strictly positive if ε is sufficiently close to zero. Using the argument in [Citation17,Citation31] (see proof of Theorem 3.4 in [Citation17], and also Equation (16) with ${\hat{Λ}}_{*} = 0$ together with the proof of Theorem 16 in [Citation31]), we obtain $\begin{aligned} Δ & \geq ⟨ Δ \hat{T} φ_{0} | \hat{H} - E_{0} | Δ \hat{T} φ_{0} ⟩ \\ & \quad - (∥ e^{- {\hat{T}}_{*}^{†}} - \hat{I} ∥ + ∥ e^{- {\hat{T}}_{*}^{†}} ∥ ∥ e^{{\hat{T}}_{*}} - \hat{I} ∥) ∥ Δ \hat{T} φ_{0} ∥^{2} . \end{aligned}$ (10) In [Citation17], the first term of Equation (10) was bounded from below by a constant times $∥ Δ \hat{T} φ_{0} ∥^{2}$ , achieved by combining the Gårding inequality with Equation (9).

From Lemma 11 in [Citation31], it follows that $⟨ Δ \hat{T} φ_{0} | \hat{H} - E_{0} | Δ \hat{T} φ_{0} ⟩ \geq \frac{γ_{*} (ε)}{γ_{*} (ε) + e + E_{0}} c (e) ∥ Δ \hat{T} φ_{0} ∥^{2} .$ However, this can be further strengthened to $⟨ Δ \hat{T} φ_{0} | \hat{H} - E_{0} | Δ \hat{T} φ_{0} ⟩ \geq η_{opt} (ε) ∥ Δ \hat{T} φ_{0} ∥^{2},$ with the optimal constant $η_{opt} (ε) := \max_{e > 0} \frac{γ_{*} (ε)}{γ_{*} (ε) + e + E_{0}} c (e) .$ From this we conclude $\begin{aligned} Δ & \geq (η_{opt} (ε) - ∥ e^{- {\hat{T}}_{*}^{†}} - \hat{I} ∥ \\ & \quad - ∥ e^{- {\hat{T}}_{*}^{†}} ∥ ∥ e^{{\hat{T}}_{*}} - \hat{I} ∥) ∥ t - t^{'} ∥^{2}, \end{aligned}$ (11) which yields the following sufficient condition for the local strong monotonicity of f, namely $η_{opt} (ε) > ∥ e^{- {\hat{T}}_{*}^{†}} - \hat{I} ∥ + ∥ e^{- {\hat{T}}_{*}^{†}} ∥ ∥ e^{{\hat{T}}_{*}} - \hat{I} ∥ .$ (12) Given $γ_{*} > 0$ , we observe that a sufficiently small ε and $t_{*}$ , such that $∥ {\hat{T}}_{*} ∥$ is small enough relative to $η_{opt} (ε)$ , guarantee that Equation (12) is fulfilled. (Recall that $∥ t ∥$ and $∥ \hat{T} ∥$ are equivalent, see Section 2.1.)
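The dependence of Equation (12) on the size of the cluster operator can be made explicit by a short worked estimate. It rests on an assumption that is not part of the text above: the operator norm is submultiplicative and both $∥ {\hat{T}}_{*} ∥$ and $∥ {\hat{T}}_{*}^{†} ∥$ are bounded by the same number τ. Expanding the exponentials in their power series then gives

```latex
\begin{aligned}
\| e^{-\hat{T}_*^{\dagger}} - \hat{I} \| &\le \sum_{k \ge 1} \frac{\tau^{k}}{k!} = e^{\tau} - 1, \qquad
\| e^{-\hat{T}_*^{\dagger}} \| \le e^{\tau}, \qquad
\| e^{\hat{T}_*} - \hat{I} \| \le e^{\tau} - 1, \\
\| e^{-\hat{T}_*^{\dagger}} - \hat{I} \| + \| e^{-\hat{T}_*^{\dagger}} \| \, \| e^{\hat{T}_*} - \hat{I} \|
&\le (e^{\tau} - 1)(1 + e^{\tau}) = e^{2\tau} - 1 = 2\tau + O(\tau^{2}).
\end{aligned}
```

Under this assumption, Equation (12) holds whenever $η_{opt} (ε) > e^{2 τ} - 1$ , i.e. whenever $τ < \frac{1}{2} \ln (1 + η_{opt} (ε))$ .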

To finalise this section, we offer the following interpretation of Equation (11), providing a more descriptive approach to Equation (12). We see that as e tends to $- E_{0}$ from above, the quotient $γ_{*} (ε) / (γ_{*} (ε) + e + E_{0})$ goes to one from below. Furthermore, assume that $c (e)$ goes to zero from above as e approaches $- E_{0}$ from above. This suggests an optimal value $e_{opt} > - E_{0}$ . For instance, choosing $e_{n} = - E_{0} + γ_{*} (ε) / n$ implies $\frac{γ_{*} (ε)}{γ_{*} (ε) + e_{n} + E_{0}} c (e_{n}) = \frac{1}{1 + 1 / n} c (e_{n})$ such that $γ_{*} (ε)$ is eliminated from the expression. Assuming further that $e_{opt}$ corresponds to an $n_{opt} ≫ 1$ yields $η_{opt} \approx c (e_{opt})$ . In conclusion, as long as $γ_{*} (ε) > 0$ , the Gårding constant $c (e_{opt})$ offers a direct estimate of the monotonicity constant $γ \approx c (e_{opt}) - 2 ∥ {\hat{T}}_{*} ∥ + O (∥ {\hat{T}}_{*} ∥^{2})$ . We therefore obtain the following (approximate) sufficient condition for local strong monotonicity $c (e_{opt}) > 2 ∥ {\hat{T}}_{*} ∥ .$ (13) Note that $∥ \hat{T} ∥ \geq K ∥ t ∥$ , for some constant K. However, a sharp estimate for this constant is the object of current research. Thus, for Zarantonello's theorem to guarantee a locally unique solution, the exact amplitudes $t_{*} = \{(t_{*})_{μ}\}$ cannot be too large relative to $c (e_{opt})$ . We remark that by an appropriate choice of the reference determinant $φ_{0}$ , the single amplitudes $t_{1} = \{(t_{1})_{μ}\}$ do not contribute to (the overall) $∥ t ∥$ . Thus, if $∥ t ∥$ is too large then this is a consequence of $t_{2}, t_{3}, \dots$ (doubles, triples, etc.).
Numerical investigations are left for future work but we can already compare this mathematically derived sufficient condition for locally unique CC solutions with the $t_{1}$ -diagnostic of [Citation33]. Given the truncation level n of the excitation rank, the diagnostic proposed here uses all cluster amplitudes $t_{1}, t_{2}, \dots, t_{n}$ and not just the single amplitudes $t_{1}$ . This is a clear advantage since, as mentioned above, orbital rotations can be used to rotate out the single amplitudes. However, our diagnostic offers only a sufficient and not a necessary criterion for a locally unique solution, i.e. for large $t_{2}, t_{3}, \dots$ the current diagnostic is agnostic about local uniqueness and only states that local strong monotonicity cannot be inferred from this particular analysis. We hope that future work will clarify the situation further.
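The leading-order behaviour $2 ∥ {\hat{T}}_{*} ∥$ of the right-hand side of Equation (12) can be observed numerically. The sketch below is only an illustration under stated assumptions: the cluster operator is replaced by a hypothetical random strictly lower triangular real matrix (nilpotent, like excitation operators in a finite basis), the adjoint is the transpose, and all norms are spectral norms.

```python
import numpy as np

def expm_nilpotent(T):
    """Exponential of a nilpotent n x n matrix via its finite Taylor series."""
    n = T.shape[0]
    result = np.eye(n)
    term = np.eye(n)
    for k in range(1, n):          # T^n = 0, so terms up to T^(n-1) suffice
        term = term @ T / k
        result = result + term
    return result

def rhs_of_condition(T):
    """|| e^{-T^dagger} - I || + || e^{-T^dagger} || * || e^{T} - I || in the spectral norm."""
    identity = np.eye(T.shape[0])
    e_minus = expm_nilpotent(-T.T)
    e_plus = expm_nilpotent(T)
    return (np.linalg.norm(e_minus - identity, 2)
            + np.linalg.norm(e_minus, 2) * np.linalg.norm(e_plus - identity, 2))

rng = np.random.default_rng(1)
T0 = np.tril(rng.standard_normal((6, 6)), k=-1)   # strictly lower triangular, hence nilpotent
T0 /= np.linalg.norm(T0, 2)                       # normalise to spectral norm one

for tau in (0.4, 0.1, 0.01):
    r = rhs_of_condition(tau * T0)
    print(f"tau = {tau:5.2f}  rhs = {r:.5f}  2 tau = {2 * tau:.5f}  exp(2 tau) - 1 = {np.exp(2 * tau) - 1:.5f}")
```

For small τ the computed value approaches $2 τ$ , and for every τ it stays below $e^{2 τ} - 1$ , which follows from the power series of the exponential.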

3.2.2. A Gårding inequality for the Fock operator

On the other hand, assume a HOMO–LUMO gap $γ_{0} > 0$ of the Fock operator $\hat{F}$ and that $φ_{0}$ is the Hartree–Fock solution, i.e. $\hat{F} φ_{0} = Λ_{0} φ_{0}$ with $⟨ ψ | \hat{F} - Λ_{0} | ψ ⟩ \geq γ_{0} ∥ ψ ∥_{2}^{2}, \quad \text{for all } ψ ⊥ φ_{0} .$ The HOMO–LUMO gap thus corresponds to a spectral gap of the Fock operator and we regard $Λ_{0}$ as the ground-state energy of $\hat{F}$ . Let $\hat{F} = \sum_{i = 1}^{N} \hat{f} (r_{i})$ and choose $\{χ_{k}\}$ as eigenbasis of $\hat{f}$ , i.e. $\hat{f} χ_{k} = λ_{k} χ_{k}$ for all k. We observe that $Λ_{0} = \sum_{i = 1}^{N} λ_{i}$ , $γ_{0} = λ_{N + 1} - λ_{N} > 0$ and $\hat{F} φ_{μ} = (Λ_{0} + ε_{μ}) φ_{μ}$ with $ε_{μ} = \sum_{l \leq | μ |} (λ_{a_{l}} - λ_{i_{l}})$ . The argument proving that the CC function f is locally strongly monotone can then be outlined as follows.
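The quantities $Λ_{0}$ , $γ_{0}$ and $ε_{μ}$ are simple functions of the orbital energies. A minimal sketch, assuming a hypothetical list of spin-orbital energies and N = 2 electrons (the numbers are made up for illustration), enumerates single and double excitations and evaluates them:

```python
from itertools import combinations

def fock_spectrum(orbital_energies, n_electrons, max_rank=2):
    """Return Lambda_0, the HOMO-LUMO gap gamma_0 and eps_mu for excitations up to max_rank."""
    lam = sorted(orbital_energies)
    occ = range(n_electrons)                      # occupied spin orbitals i_l
    virt = range(n_electrons, len(lam))           # virtual spin orbitals a_l
    Lambda0 = sum(lam[i] for i in occ)
    gamma0 = lam[n_electrons] - lam[n_electrons - 1]
    eps = {}
    for rank in range(1, max_rank + 1):
        for holes in combinations(occ, rank):
            for parts in combinations(virt, rank):
                # eps_mu = sum over l of (lambda_{a_l} - lambda_{i_l})
                eps[(holes, parts)] = sum(lam[a] for a in parts) - sum(lam[i] for i in holes)
    return Lambda0, gamma0, eps

# Hypothetical spin-orbital energies, two electrons.
Lambda0, gamma0, eps = fock_spectrum([-1.0, -0.5, 0.2, 0.7, 1.1], n_electrons=2)
print(Lambda0, gamma0, min(eps.values()))
```

Every $ε_{μ}$ is at least $γ_{0}$ , since each virtual energy is at least $λ_{N + 1}$ and each occupied energy is at most $λ_{N}$ ; this is the spectral gap of $\hat{F}$ on the orthogonal complement of $φ_{0}$ used above.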

The considered Fock operator is assumed to fulfill a Gårding inequality. Thus there exists a constant e such that $\hat{F} + e$ is coercive, i.e. $⟨ ψ | \hat{F} + e | ψ ⟩ \geq c (e) ∥ ψ ∥^{2} .$ For the sake of simplicity we use the same symbols for the Gårding constants of $\hat{F}$ as for the Hamiltonian. In complete analogy with $\hat{H}$ , the argument in [Citation15,Citation31] shows that $⟨ ψ | \hat{F} - Λ_{0} | ψ ⟩ \geq \max_{e > 0} \frac{γ_{0}}{γ_{0} + e + Λ_{0}} c (e) ∥ ψ ∥^{2}$ (14) and we moreover define $η_{opt}^{(0)} := \max_{e > 0} \frac{γ_{0}}{γ_{0} + e + Λ_{0}} c (e) .$ (15) Following [Citation17], for a fixed $φ_{0}$ we define the map from the space of cluster amplitudes into the space of wave functions $O_{φ_{0}} : t \mapsto \hat{O} (t) φ_{0}$ , with $\hat{O} : t \mapsto [[\hat{F}, \hat{T}], \hat{T}] + e^{- \hat{T}} \hat{W} e^{\hat{T}}$ . Hence, $\begin{aligned} e^{- \hat{T}} \hat{H} e^{\hat{T}} φ_{0} & = e^{- \hat{T}} (\hat{F} + \hat{W}) e^{\hat{T}} φ_{0} \\ & = (\hat{F} + [\hat{F}, \hat{T}]) φ_{0} + \hat{O} (t) φ_{0}, \end{aligned}$ (16) where $\hat{H} = \hat{F} + \hat{W}$ , and assume that for some L>0 (not too large) $⟨ Δ \hat{T} φ_{0} | \hat{O} (t) - \hat{O} (t^{'}) | φ_{0} ⟩ \geq - L ∥ t - t^{'} ∥^{2} .$ (17) As a technical remark, the assumption in [Citation17] is the stronger requirement that $t \mapsto \hat{O} (t) φ_{0}$ is Lipschitz continuous as a map from the space of cluster amplitudes to $H^{- 1}$ . However, we here note that Equation (17) is sufficient to derive the CC function's local strong monotonicity, as will be evident shortly.
Inserting the identity (a consequence of Equation (16) and $\hat{F} φ_{0} = Λ_{0} φ_{0}$ ) $e^{- \hat{T}} \hat{H} e^{\hat{T}} φ_{0} = (\hat{F} + (\hat{F} - Λ_{0}) \hat{T}) φ_{0} + \hat{O} (t) φ_{0}$ into Equation (8), as well as using Equations (14) and (17), we obtain $\begin{aligned} Δ & = ⟨ Δ \hat{T} φ_{0} | \hat{F} - Λ_{0} | Δ \hat{T} φ_{0} ⟩ \\ & \quad + ⟨ Δ \hat{T} φ_{0} | \hat{O} (t) - \hat{O} (t^{'}) | φ_{0} ⟩ \\ & \geq (η_{opt}^{(0)} - L) ∥ t - t^{'} ∥^{2} . \end{aligned}$ (18) Consequently, local strong monotonicity holds if $η_{opt}^{(0)} > L$ . Repeating the argument presented in the previous section, with the obvious adaptations, we obtain $c (e_{opt}) > L$ (19) as a sufficient condition for f to be locally strongly monotone. Here, no explicit assumption on $∥ t_{*} ∥$ enters. The main drawback of the assumption in Equation (19) is that the constant L of the inequality in Equation (17) has to be determined. Further analysis of this constant is postponed to later work.

Before we conclude this section we exemplify how the Gårding constant c can be chosen in the finite dimensional setting. In this case the commutator $[\hat{F}, \hat{T}]$ is an excitation operator (which implies $[[\hat{F}, \hat{T}], \hat{T}] = 0$ ) and $\hat{O} (t)$ is simply the similarity transformation of the fluctuation potential $\hat{W}$ . This offers the following insight into the optimal constant $η_{opt}^{(0)}$ in Equation (14) for the truncated case. As in [Citation15], we define the norm on $\{φ_{0}\}^{⊥}$ by $∥ \hat{T} φ_{0} ∥_{F}^{2} = \sum_{μ} ε_{μ} t_{μ}^{2} = ∥ t ∥_{F}^{2} .$ It follows that $⟨ Δ \hat{T} φ_{0} | \hat{F} - Λ_{0} | Δ \hat{T} φ_{0} ⟩ = \sum_{μ} ε_{μ} (Δ t)_{μ}^{2} = ∥ Δ \hat{T} φ_{0} ∥_{F}^{2} .$ Using $∥ \hat{T} φ_{0} ∥_{F}$ instead of $∥ \hat{T} φ_{0} ∥$ and making the assumption in Equation (17) also for the truncated theory (denoting the Lipschitz constant in this new topology by $L^{'}$ ), we obtain $\begin{aligned} Δ & = \sum_{μ} ε_{μ} (Δ t)_{μ}^{2} + ⟨ Δ \hat{T} φ_{0} | \hat{O} (t) - \hat{O} (t^{'}) | φ_{0} ⟩ \\ & = ∥ t - t^{'} ∥_{F}^{2} + ⟨ Δ \hat{T} φ_{0} | \hat{O} (t) - \hat{O} (t^{'}) | φ_{0} ⟩ \\ & \geq (1 - L^{'}) ∥ t - t^{'} ∥_{F}^{2} . \end{aligned}$ (20) Comparing the local strong monotonicity estimates in Equations (18) and (20) suggests that the finite-dimensional version of $η_{opt}^{(0)}$ equals one. Furthermore, at first glance it appears that the estimate in Equation (20) is obtained without imposing a Gårding inequality. A key observation here is that the choice of the norm makes $\hat{F}$ on $\{φ_{0}\}^{⊥}$ fulfill a Gårding inequality with $e_{opt} = - Λ_{0}$ and $c (e_{opt}) = 1$ . Indeed, the inequality is saturated, meaning that equality holds. It then follows immediately from Equation (15) that $η_{opt}^{(0)} = c (e_{opt}) = 1$ . Thus, in agreement with Equation (19) we have obtained the condition $η_{opt}^{(0)} - L^{'} > 0$ .
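The step from the weighted norm to $η_{opt}^{(0)} = 1$ can be written out explicitly. The following lines only combine relations already stated above, namely $\hat{F} φ_{μ} = (Λ_{0} + ε_{μ}) φ_{μ}$ , the orthonormality of the $φ_{μ}$ , and the quotient appearing in Equation (15) evaluated at $e = - Λ_{0}$ :

```latex
\begin{aligned}
\langle \hat{T}\varphi_0 | \hat{F} - \Lambda_0 | \hat{T}\varphi_0 \rangle
  &= \sum_{\mu} \varepsilon_{\mu} t_{\mu}^{2}
   = 1 \cdot \| \hat{T}\varphi_0 \|_{F}^{2}
  &&\Rightarrow\quad c(e_{opt}) = 1 \ \text{at}\ e_{opt} = -\Lambda_0, \\
\frac{\gamma_0}{\gamma_0 + e_{opt} + \Lambda_0}\, c(e_{opt})
  &= \frac{\gamma_0}{\gamma_0} \cdot 1 = 1 .
\end{aligned}
```

Since $ε_{μ} \geq γ_{0}$ for every excitation, the weighted norm also controls the unweighted one, $∥ t ∥_{F}^{2} \geq γ_{0} \sum_{μ} t_{μ}^{2}$ .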

To conclude this section, we note that we have formulated an alternative to the diagnostic in Equation (13): Assume a finite one-electron basis and suppose that $\hat{O} (t)$ satisfies Equation (17) with the norm $∥ \cdot ∥_{F}$ and $L^{'} < 1$ locally around the solution amplitudes. Then local strong monotonicity implies a locally unique CC solution. Whether Equation (17) with $L^{'} < 1$ holds without the assumption of a small $∥ t_{*} ∥$ is an interesting and still open question. Furthermore, the above analysis can be generalised to any single particle operator fulfilling certain properties (see [Citation15,Citation32]).

3.3. The CC method's numerical analysis

Since CC methods are computational schemes, their convergence behaviour is one of the main objects of study. This covers whether or not the method converges towards the exact solution as well as how fast it converges. We note that the quasi-optimality as given in Equation (7) yields $t_{d} \to t_{*}$ as $d \to \infty$ (for increasing approximation spaces $X^{(d)}$ ). Furthermore, in the case of the CC method one studies the CC-energy residual $| E (t_{*}) - E (t) | .$ A major difference between the CI and CC method is that the CC formalism is not variational in the Rayleigh–Ritz sense. Consequently, it is not evident that the CC energy error decays quadratically with respect to the error of the wave function or cluster amplitudes. In the sequel we present two approaches that were used in previous mathematical analyses of different CC methods to derive such quadratic error bounds [Citation17,Citation31,Citation32].

3.3.1. The Aubin–Nitsche duality method

The Aubin–Nitsche duality method is a standard tool for deriving a priori error estimates for finite element methods. It was introduced independently by Aubin [Citation42], Nitsche [Citation43] and Oganesyan–Ruchovets [Citation44]. We here elaborate on the Aubin–Nitsche duality type method used in [Citation15,Citation17,Citation32] to derive a quadratic error bound for the CC method and the closely related TNS-TCC method, a special case of the tailored CC method [Citation45]. This approach exploits the mathematical framework introduced by Bangerth and Rannacher [Citation46]. The untruncated Euler–Lagrange method gives the Lagrangian $L (t, s) = E (t) - ⟨ f (t), s ⟩$ with f and $E$ from Equation (4). The corresponding Gâteaux derivative in direction $(u, v)$ is denoted $L^{'} (\cdot, \cdot) (u, v)$ and we study $(t_{*}, s_{*})$ fulfilling $L^{'} (t_{*}, s_{*}) (u, v) = \left\{ \begin{matrix} E^{'} (t_{*}) u - ⟨ f^{'} (t_{*}) u, s_{*} ⟩ \\ - ⟨ f (t_{*}), v ⟩ \end{matrix} \right\} = 0,$ (21) for all pairs of CC amplitude vectors $(u, v)$ . Under the assumption that f is locally strongly monotone inside a ball around $t_{*}$ , there exists a unique $s_{*}$ determined by $t_{*}$ such that $(t_{*}, s_{*})$ solves Equation (21). Note that the assumptions imposed to ensure local strong monotonicity are different for the single-reference CC method [Citation15,Citation17] and the TNS-TCC method [Citation32]. Moreover, there exists a solution $s_{d}$ to the corresponding discretisation of the problem that approximates $s_{*}$ quasi-optimally [Citation17,Citation32].
Equipped with these so-called dual solutions, the energy-error characterisation given by Bangerth and Rannacher [Citation46] yields $\begin{aligned} 2 (E (t_{*}) - E (t_{d})) & = R_{d}^{(3)} + ρ (t_{d}) (s_{*} - υ_{d}) \\ & \quad + ρ^{*} (t_{d}, s_{d}) (t_{*} - w_{d}), \end{aligned}$ with arbitrarily chosen discrete CC amplitudes $υ_{d}, w_{d}$ . The given remainder term $R_{d}^{(3)}$ is cubic in the primal and dual errors $e = t_{*} - t_{d}$ and $e^{*} = s_{*} - s_{d}$ . Using this energy-error characterisation, a quadratic energy-error bound for the single-reference CC method [Citation15,Citation17] and the TNS-TCC method [Citation32] follows.

3.3.2. The bivariational approach

The extended version of the CC method rests on Arponen's bivariational approach [Citation47,Citation48]. This unconventional formulation of the CC method parametrises two independent wave functions and thus makes use of two sets of cluster amplitudes $t = \{t_{μ}\}$ and $λ = \{λ_{μ}\}$ . It gained recent attention in the study [Citation31] and has a major advantage as far as the error analysis is concerned, namely, the energy itself is stationary, i.e. the solution $(t_{*}, λ_{*})$ is a critical point of the bivariational energy, see Equation (22). Consequently, when the corresponding Galerkin solution $(t_{d}, λ_{d})$ is close to the exact solution, a quadratic error estimate is guaranteed. Subsequently, we elaborate on this further.

Considering the Rayleigh–Ritz quotient $R$ , we write $E_{0} = R (ψ_{*}) = \min_{ψ \neq 0} R (ψ) .$ Hence, $ψ_{*}$ is a stationary point of $R$ , i.e. $R^{'} (ψ_{*}) = 0$ . By Taylor expanding $R$ around $ψ_{*}$ we obtain the quadratic error estimate for the Rayleigh–Ritz quotient $\begin{aligned} | R (ψ) - R (ψ_{*}) | & \leq \frac{1}{2} ∥ R^{″} (ψ_{*}) ∥ ∥ ψ - ψ_{*} ∥^{2} \\ & \quad + O (∥ ψ - ψ_{*} ∥^{3}) . \end{aligned}$ As mentioned before, the CC formalism does not arise from the Rayleigh–Ritz variational principle. However, it can be described by Arponen's bivariational approach, as follows. Let the bivariate quotient be $B (ψ, \tilde{ψ}) = \frac{⟨ \tilde{ψ} | \hat{H} | ψ ⟩}{⟨ \tilde{ψ} | ψ ⟩} .$ (22) Equation (22) can be seen as a generalisation of the Rayleigh–Ritz quotient where a stationary point $(ψ_{*}, {\tilde{ψ}}_{*})$ is given by a left and right eigenvector of $\hat{H}$ with corresponding eigenvalue $E = B (ψ_{*}, {\tilde{ψ}}_{*})$ . Note that $B$ is no longer bounded from below, hence critical points do not necessarily correspond to extremal points as they do for $R$ . In the extended CC theory, the bivariational quotient is studied indirectly by means of the so-called flipped gradient [Citation31]. Following [Citation47], we assume $⟨ φ_{0} | ψ ⟩ = ⟨ ψ | \tilde{ψ} ⟩ = 1$ and note that there exists $\hat{T}$ such that $ψ = e^{\hat{T}} φ_{0}$ (cf. Section 2.1). Then $1 = ⟨ ψ | \tilde{ψ} ⟩ = ⟨ φ_{0} | e^{{\hat{T}}^{†}} | \tilde{ψ} ⟩$ and consequently there exists a cluster operator $\hat{Λ}$ so that $e^{{\hat{T}}^{†}} \tilde{ψ} = e^{\hat{Λ}} φ_{0}$ . This defines a smooth coordinate map Φ from cluster amplitudes $(t, λ)$ to wave functions $(ψ, \tilde{ψ})$ . The flipped gradient is then given by $F (t, λ) := \hat{R} \nabla B (Φ (t, λ))$ , where we introduced the flipping map $\hat{R} = \begin{pmatrix} 0 & \hat{I} \\ \hat{I} & 0 \end{pmatrix} .$ Under certain assumptions, $F$ is locally strongly monotone [Citation31]. By the extended CC approach [Citation31], ${\tilde{ψ}}_{*} = e^{- {\hat{T}}_{*}^{†}} e^{{\hat{Λ}}_{*}} φ_{0}$ and $ψ_{*} = e^{{\hat{T}}_{*}} φ_{0}$ solve the SE if and only if $F (t_{*}, λ_{*}) = 0$ . Note that $F (t_{*}, λ_{*}) = 0$ implies $\nabla B (Φ (t_{*}, λ_{*})) = 0$ and therewith a quadratic energy error.
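The quadratic behaviour of the bivariational quotient near a stationary point can be checked on a matrix model. The sketch below is not an extended CC computation: it assumes a hypothetical real, non-symmetric but diagonalisable matrix H built from a chosen eigenbasis X, takes a right eigenvector ψ and the matching left eigenvector ψ̃ (normalised so that their product is one), and perturbs both by vectors of size δ.

```python
import numpy as np

def bivariational_quotient(H, psi, psi_tilde):
    """B(psi, psi_tilde) = <psi_tilde|H|psi> / <psi_tilde|psi> for real vectors."""
    return (psi_tilde @ H @ psi) / (psi_tilde @ psi)

rng = np.random.default_rng(2)
n = 8
X = np.eye(n) + 0.1 * rng.standard_normal((n, n))   # hypothetical (non-orthogonal) eigenbasis
Xinv = np.linalg.inv(X)
D = np.linspace(-1.0, 2.0, n)                        # chosen real eigenvalues
H = X @ np.diag(D) @ Xinv                            # non-symmetric model operator

E = D[0]
psi = X[:, 0]            # right eigenvector: H psi = E psi
psi_tilde = Xinv[0, :]   # left eigenvector: psi_tilde H = E psi_tilde, psi_tilde . psi = 1

a = rng.standard_normal(n); a /= np.linalg.norm(a)   # perturbation direction of psi
b = rng.standard_normal(n); b /= np.linalg.norm(b)   # perturbation direction of psi_tilde

def energy_error(delta):
    return bivariational_quotient(H, psi + delta * a, psi_tilde + delta * b) - E

for delta in (1e-1, 1e-2, 1e-3):
    print(f"delta = {delta:.0e}  energy error = {energy_error(delta):+.3e}")
```

Because $(\hat{H} - E) ψ = 0$ and ${\tilde{ψ}}^{†} (\hat{H} - E) = 0$ , the numerator of the error reduces to $δ^{2} ⟨ b | \hat{H} - E | a ⟩$ , so halving δ reduces the energy error by roughly a factor of four, and the error is bilinear in the two perturbations rather than quadratic in one of them.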

Furthermore, by identifying $e^{\hat{Λ}} = \hat{I} + \hat{S}$ we obtain from Equation (22) the CC Lagrangian, i.e. $\begin{aligned} B (e^{\hat{T}} φ_{0}, e^{- {\hat{T}}^{†}} e^{\hat{Λ}} φ_{0}) & = ⟨ φ_{0} | e^{- \hat{T}} \hat{H} e^{\hat{T}} | φ_{0} ⟩ \\ & \quad + \sum_{μ} s_{μ} ⟨ φ_{μ} | e^{- \hat{T}} \hat{H} e^{\hat{T}} | φ_{0} ⟩ =: L (t, s) . \end{aligned}$ (23) Introducing the Lagrangian is a general method for optimisation with constraints. In the special case of CC theory with fixed orbitals, as in this article, Equation (23) demonstrates the equivalence to Arponen's bivariational method [Citation47]. In the context of efficiently evaluating the CC energy gradient, the derivative of the variational functional was obtained by Bartlett [Citation49]. The functional itself (Equation (23)) was first used in quantum chemistry by Helgaker and Jørgensen [Citation50] to derive CC energy derivatives. We would also like to mention the related extended CC work of Piecuch and Bartlett [Citation51]. Note that their assumption that the reference determinant $φ_{0}$ is both a left and a right eigenvector of the doubly similarity transformed $\hat{H}$ can be rigorously proven in the continuous case (see Lemma 13 in [Citation31]).

Denoting the dual solution $s_{*} = \{(s_{*})_{μ}\}$ as in Section 3.3.1, it can then be seen that $s_{*}$ also describes cluster amplitudes parameterising the wave function ${\tilde{ψ}}_{*}$ . Indeed, using the relation $e^{{\hat{Λ}}_{*}} = \hat{I} + {\hat{S}}_{*}$ , we obtain that ${\tilde{ψ}}_{*} = e^{- {\hat{T}}_{*}^{†}} (\hat{I} + {\hat{S}}_{*}) φ_{0}$ together with $ψ_{*} = e^{{\hat{T}}_{*}} φ_{0}$ solve the SE corresponding to the same energy $B (ψ_{*}, {\tilde{ψ}}_{*})$ . Assuming non-degeneracy and using the constraint $⟨ {\tilde{ψ}}_{*} | ψ_{*} ⟩ = 1$ , we arrive at the condition $e^{- {\hat{T}}_{*}^{†}} (\hat{I} + {\hat{S}}_{*}) φ_{0} = \frac{1}{∥ e^{{\hat{T}}_{*}} φ_{0} ∥^{2}} e^{{\hat{T}}_{*}} φ_{0}$ for the primal and dual solutions $t_{*}$ and $s_{*}$ . Thus, from the extended CC theory we have obtained a constraint relating $s_{*}$ to $t_{*}$ for the traditional CC method.

4. Conclusion

In this article, we have introduced the reader to a local analysis of the CC method and its variations. In particular, we have demonstrated that the Gårding inequalities for $\hat{F}$ and $\hat{H}$ are key as far as a better understanding of the sufficient conditions for a locally unique and quasi-optimal solution of the CC equations is concerned. Moreover, these investigations are geared towards an a posteriori criterion for assessing the CC amplitudes from a given computation. This is a mathematical approach that is alternative to the controversial diagnostic suggested in [Citation33]. Indeed, the mathematically derived criteria in Equations (12) and (13) use the total $∥ t ∥$ and not just the single amplitudes $t_{1}$ . Since the single amplitudes could be removed by an appropriate choice of the reference determinant (i.e. an ideal choice of the basis functions), the sufficient condition for a locally unique solution given by Equation (13) puts constraints on the remaining amplitudes ( $t_{2}, t_{3}, \dots$ ). However, it is not yet a rejection criterion since it only implies locally unique and quasi-optimal solutions under certain conditions. As outlined, the upper bound in Equation (13) is fundamentally different from previous heuristic and potentially misleading diagnostics [Citation33] since the former is derived in a rigorous mathematical framework, where not just the single amplitudes are taken into consideration. We have also shown that the condition on the two particle operator in Equation (17) implies a locally unique CC solution.
Here, the condition does not explicitly depend on the amplitude norm and might offer a broader understanding of the reliability of a CC solution. Moreover, the derived condition is independent of the chosen single particle operator. In connection with the extended CC formalism, we have set up a constraint for the exact CC Lagrange multipliers $s_{*} = \{(s_{*})_{μ}\}$ , relating them to the exact CC amplitudes $t_{*} = \{(t_{*})_{μ}\}$ . Numerical investigations are left for future work.

Disclosure statement

No potential conflict of interest was reported by the authors.

ORCID

A. Laestadius http://orcid.org/0000-0001-7391-0396

Additional information

Funding

This work was supported by H2020 European Research Council [639508] and Norges Forskningsråd [262695].

References

R.J. Bartlett and M. Musiał, Rev. Mod. Phys. 79 (1), 291 (2007). doi: 10.1103/RevModPhys.79.291
F. Coester, Nucl. Phys. 7, 421–424 (1958). doi: 10.1016/0029-5582(58)90280-3
J. Hubbard, Proc. R. Soc. Lond. A 240 (1223), 539–560 (1957). doi: 10.1098/rspa.1957.0106
N. Hugenholtz, Physica 23 (1–5), 533–545 (1957). doi: 10.1016/S0031-8914(57)93009-4
J. Čížek, J. Chem. Phys. 45 (11), 4256–4266 (1966). doi: 10.1063/1.1727484
O. Sinanoğlu, J. Chem. Phys. 36 (3), 706–717 (1962). doi: 10.1063/1.1732596
H. Kümmel, Theor. Chim. Acta. 80 (2–3), 81–89 (1991). doi: 10.1007/BF01119615
J. Čížek, Theor. Chim. Acta. 80 (2–3), 91–94 (1991). doi: 10.1007/BF01119616
R. Bartlett, in Theory and Applications of Computational Chemistry: The First Forty Years, edited by C.E. Dykstra, G. Frenking, K.S. Kim, and G.E. Scuseria (2005), pp. 1191–1221.
J. Paldus, Theory and Applications of Computational Chemistry (Elsevier, 2005), pp. 115–147.
J.S. Arponen, Theor. Chim. Acta. 80 (2–3), 149–179 (1991). doi: 10.1007/BF01119618
R. Bishop, Theor. Chim. Acta. 80 (2–3), 95–148 (1991). doi: 10.1007/BF01119617
K. Raghavachari, G.W. Trucks, J.A. Pople and M. Head-Gordon, Chem. Phys. Lett. 157 (6), 479–483 (1989). doi: 10.1016/S0009-2614(89)87395-6
R.J. Bartlett, J. Watts, S. Kucharski and J. Noga, Chem. Phys. Lett. 165 (6), 513–522 (1990). doi: 10.1016/0009-2614(90)87031-L
R. Schneider, Numerische Mathematik 113 (3), 433–471 (2009). doi: 10.1007/s00211-009-0237-3
T. Rohwedder, ESAIM Math. Model. Numer. Anal. 47 (2), 421–447 (2013). doi: 10.1051/m2an/2012035
T. Rohwedder and R. Schneider, ESAIM: Math. Model. Numer. Anal. 47 (6), 1553–1582 (2013). doi: 10.1051/m2an/2013075
T.P. Živković, Int. J. Quantum. Chem. 12 (S11), 413–420 (1977). doi: 10.1002/qua.560120849
T.P. Živković and H.J. Monkhorst, J. Math. Phys. 19 (5), 1007–1022 (1978). doi: 10.1063/1.523761
P. Piecuch, S. Zarrabian, J. Paldus and J. Čížek, Phys. Rev. B 42 (6), 3351 (1990). doi: 10.1103/PhysRevB.42.3351
K. Kowalski and K. Jankowski, Phys. Rev. Lett. 81 (6), 1195 (1998). doi: 10.1103/PhysRevLett.81.1195
P. Piecuch and K. Kowalski, in Computational Chemistry: Reviews of Current Trends, edited by J. Leszczynski (World Scientific, Singapore, 2000), Vol. 5.
B. Jeziorski and J. Paldus, J. Chem. Phys. 90 (5), 2714–2731 (1989). doi: 10.1063/1.455919
W. Meyer, Int. J. Quantum. Chem. 5 (S5), 341–348 (1971). doi: 10.1002/qua.560050839
W. Meyer, J. Chem. Phys. 58 (3), 1017–1035 (1973). doi: 10.1063/1.1679283
R. Ahlrichs, F. Driessler, H. Lischka, V. Staemmler and W. Kutzelnigg, J. Chem. Phys. 62 (4), 1235–1247 (1975). doi: 10.1063/1.430638
W. Kutzelnigg, Methods of Electronic Structure Theory (Plenum Press, New York, 1977), p. 129.
R. Ahlrichs, P. Scharf and C. Ehrhardt, J. Chem. Phys. 82 (2), 890–898 (1985). doi: 10.1063/1.448517
C. Kollmar and F. Neese, Mol. Phys. 108 (19–20), 2449–2458 (2010). doi: 10.1080/00268976.2010.496743
F. Wennmohs and F. Neese, Chem. Phys. 343 (2–3), 217–230 (2008). doi: 10.1016/j.chemphys.2007.07.001
A. Laestadius and S. Kvaal, SIAM. J. Numer. Anal. 56 (2), 660–683 (2018). doi: 10.1137/17M1116611
F.M. Faulstich, A. Laestadius, S. Kvaal, Ö. Legeza and R. Schneider, arXiv preprint arXiv:1802.05699 (2018).
T.J. Lee and P.R. Taylor, Int. J. Quantum. Chem. 36 (S23), 199–207 (1989). doi: 10.1002/qua.560360824
C.W. Kilmister and E. Schrödinger, Schrödinger: Centenary Celebration of a Polymath (Cambridge University Press, Cambridge, 1987).
D. Yafaev, J. Funct. Anal. 168 (1), 121–144 (1999). doi: 10.1006/jfan.1999.3462
H. Yserentant, Regularity and Approximability of Electronic Wave Functions (Springer, Berlin, 2010).
T. Helgaker, P. Jørgensen, and J. Olsen, Molecular Electronic-Structure Theory (John Wiley & Sons, New York, 2014).
H.J. Monkhorst, Int. J. Quantum. Chem. 12 (S11), 421–432 (1977). doi: 10.1002/qua.560120850
W. Heisenberg, Die Kopenhagener Deutung der Quantentheorie (Battenberg, Stuttgart, 1963).
E. Zeidler, Nonlinear Functional Analysis and Its Applications (Springer, New York, 1990).
J. Céa, Ann. Inst. Fourier (Grenoble) 14 (fasc. 2), 345–444 (1964). doi: 10.5802/aif.181
J.P. Aubin, Annali della Scuola Normale Superiore di Pisa-Classe di Scienze 21 (4), 599–637 (1967).
J. Nitsche, Numerische Mathematik 11 (4), 346–348 (1968). doi: 10.1007/BF02166687
L.A. Oganesyan and L.A. Rukhovets, USSR Comput. Math. Math. Phys. 9 (5), 158–183 (1969). doi: 10.1016/0041-5553(69)90159-1
T. Kinoshita, O. Hino and R.J. Bartlett, J. Chem. Phys. 123 (7), 074106 (2005). doi: 10.1063/1.2000251
W. Bangerth and R. Rannacher, Adaptive Finite Element Methods for Differential Equations (Birkhäuser, Basel, 2013).
J. Arponen, Ann. Phys. 151 (2), 311–382 (1983). doi: 10.1016/0003-4916(83)90284-1
P.O. Löwdin, J. Math. Phys. 24 (1), 70–87 (1983). doi: 10.1063/1.525604
R. Bartlett, in Geometrical Derivatives of Energy Surfaces and Molecular Properties, edited by P. Jørgensen and J. Simons (Reidel, Dordrecht, 1986).
T. Helgaker and P. Jørgensen, in Advances in Quantum Chemistry, edited by Per-Olov Löwdin (Academic Press, Cambridge, MA, 1988), Vol. 19, pp. 183–245.
P. Piecuch and R.J. Bartlett, in Advances in Quantum Chemistry (Academic Press, Cambridge, MA, 1999), Vol. 34, pp. 295–380.

