
An Introduction to Kristof’s Theorem for Solving Least-Square Optimization Problems Without Calculus

Pages 190-198 | Published online: 11 Jan 2018
 

ABSTRACT

Kristof’s Theorem (Kristof, 1970) describes a matrix trace inequality that can be used to solve a wide class of least-square optimization problems without calculus. Considering its generality, it is surprising that Kristof’s Theorem is rarely used in statistics and psychometric applications. The underutilization of this method likely stems, in part, from the mathematical complexity of Kristof’s (1964, 1970) writings. In this article, I describe the underlying logic of Kristof’s Theorem in simple terms by reviewing four key mathematical ideas that are used in the theorem’s proof. I then show how Kristof’s Theorem can be used to provide novel derivations of two cognate models from statistics and psychometrics. This tutorial includes a glossary of technical terms and an online supplement with R (R Core Team, 2017) code to perform the calculations described in the text.
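To give a concrete sense of what a trace inequality of this kind asserts, the following is a minimal numerical sketch of the simplest single-matrix case as it is commonly stated: for an orthogonal matrix X and a diagonal matrix Γ with non-negative diagonal entries, tr(XΓ) ≤ tr(Γ), with equality at X = I. This is an illustration written in Python, not the R code from the author's online supplement, and the function names (`random_orthogonal`, `trace_x_gamma`, `check_bound`) are hypothetical. The general theorem concerns products of several such matrices and is not reproduced here.

```python
import math
import random


def random_orthogonal(n, rng):
    """Return an n x n matrix (list of rows) with orthonormal rows,
    built by Gram-Schmidt orthogonalization of random Gaussian vectors."""
    rows = []
    while len(rows) < n:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Remove the components of v that lie along rows already accepted.
        for r in rows:
            dot = sum(a * b for a, b in zip(v, r))
            v = [a - dot * b for a, b in zip(v, r)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-8:  # skip draws that are nearly linearly dependent
            rows.append([a / norm for a in v])
    return rows


def trace_x_gamma(x, gamma):
    """tr(X Gamma), where Gamma is diagonal and stored as a list of its
    diagonal entries; only the diagonal of X contributes to the trace."""
    return sum(x[i][i] * gamma[i] for i in range(len(gamma)))


def check_bound(n=4, trials=1000, seed=0):
    """Compare the largest tr(X Gamma) found over random orthogonal X
    with the claimed upper bound tr(Gamma)."""
    rng = random.Random(seed)
    gamma = [rng.uniform(0.0, 5.0) for _ in range(n)]  # non-negative entries
    upper = sum(gamma)  # tr(Gamma), attained when X is the identity
    worst = max(
        trace_x_gamma(random_orthogonal(n, rng), gamma) for _ in range(trials)
    )
    return worst, upper


if __name__ == "__main__":
    worst, upper = check_bound()
    print("largest tr(X Gamma) found:", worst)
    print("upper bound tr(Gamma):    ", upper)
```

The bound holds in this case because every diagonal entry of an orthogonal matrix lies between -1 and 1, so each term of the sum is at most the corresponding diagonal entry of Γ. No derivative is taken at any point, which is the sense in which such arguments proceed without calculus.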

Article information

Conflict of Interest Disclosures: The author signed a form for disclosure of potential conflicts of interest. The author did not report any financial or other conflicts of interest in relation to the work described.

Ethical Principles: The author affirms having followed professional ethical guidelines in preparing this work. These guidelines include obtaining informed consent from human participants, maintaining ethical treatment and respect for the rights of human or animal participants, and ensuring the privacy of participants and their data, such as ensuring that individual participants cannot be identified in reported results or from publicly available original or archival data.

Funding: This work was not supported.

Role of the Funders/Sponsors: None of the funders or sponsors of this research had any role in the design and conduct of the study; collection, management, analysis, and interpretation of data; preparation, review, or approval of the manuscript; or decision to submit the manuscript for publication.

Acknowledgments: The author would like to thank Mr. Casey Giordano and Dr. Jeff Jones for their comments on prior versions of this manuscript. Special thanks are extended to my next-door neighbor and friend, Dr. Greg Anderson, for bringing Simon (2005) to my attention. The ideas and opinions expressed herein are those of the author alone, and endorsement by the author's institution is not intended and should not be inferred.

Notes

1 As of March 22, 2017, Google Scholar reports that Kristof’s (1970) paper has been cited only 30 times.

2 As of March 22, 2017, Google Scholar reports that Levin’s (1979) paper has been cited only once.

3 In this article, the terms “orthonormal” and “orthogonal” will be used interchangeably when referring to matrices.

4 I have adopted the following notation conventions (Abadir & Magnus, 2002): boldface lower-case letters will denote vectors; boldface uppercase letters will denote matrices; and lower-case (normal font) letters (x) will denote scalars. Other notational conventions are introduced as needed.

5 Where tr denotes the trace operator.

6 We will assume throughout this paper that vectors and matrices contain only real-valued scalars.

7 When working with a fixed origin, vectors can also represent points in space.

8 Other definitions of vector norms exist but are not reviewed in this paper.

9 A basis is a set of linearly independent vectors that span a space. Basis vectors are fundamental in linear algebra because every vector in a space can be constructed as a weighted linear combination of the basis vectors. Each column of an identity matrix is aligned with a unique axis of the Cartesian coordinate system, and the columns together are therefore called the standard basis of that space.

10 A scalar function is a function that returns a scalar.

11 A lemma is a subsidiary theorem that is used in the proof of a larger theorem.

12 Actually, Kristof noted that we need only require that the diagonal entries of and that .

13 It can be shown that maximizing Equation 28 is equivalent to minimizing .
