Optimization with Penalization

Iterative Proportional Scaling Revisited: A Modern Optimization Perspective

Pages 48-60 | Received 01 Sep 2016, Published online: 29 Oct 2018

References

  • Agresti, A. (2012), Categorical Data Analysis (3rd ed.), New York: Wiley.
  • Akaike, H. (1974), “A New Look at the Statistical Model Identification,” IEEE Transactions on Automatic Control, 19, 716–723.
  • Klimova, A., and Rudas, T. (2015), “Iterative Scaling in Curved Exponential Families,” Scandinavian Journal of Statistics, 42, 832–847.
  • Berger, A. L., Della Pietra, S. A., and Della Pietra, V. J. (1996), “A Maximum Entropy Approach to Natural Language Processing,” Computational Linguistics, 22, 39–71.
  • Bertsekas, D. P. (2015), Convex Optimization Algorithms, Belmont: Athena Scientific.
  • Bishop, Y. M. M., Fienberg, S. E., and Holland, P. W. (1975), Discrete Multivariate Analysis: Theory and Practice, Cambridge: The MIT Press.
  • Böhning, D., and Lindsay, B. G. (1988), “Monotonicity of Quadratic-approximation Algorithms,” Annals of the Institute of Statistical Mathematics, 40, 641–663.
  • Boyd, S., and Vandenberghe, L. (2004), Convex Optimization, New York: Cambridge University Press.
  • Chen, J., and Chen, Z. (2008), “Extended Bayesian Information Criteria for Model Selection with Large Model Spaces,” Biometrika, 95, 759–771.
  • Csiszár, I. (1975), “I-divergence Geometry of Probability Distributions and Minimization Problems,” The Annals of Probability, 3, 146–158.
  • Darroch, J. N., and Ratcliff, D. (1972), “Generalized Iterative Scaling for Log-Linear Models,” The Annals of Mathematical Statistics, 43, 1470–1480.
  • Deming, W. E., and Stephan, F. F. (1940), “On a Least Squares Adjustment of a Sampled Frequency Table When the Expected Marginal Totals are Known,” The Annals of Mathematical Statistics, 11, 427–444.
  • Dudík, M., Phillips, S. J., and Schapire, R. E. (2004), “Performance Guarantees for Regularized Maximum Entropy Density Estimation,” Berlin, Heidelberg: Springer, pp. 472–486.
  • Elith, J., Phillips, S. J., Hastie, T., Dudík, M., Chee, Y. E., and Yates, C. J. (2011), “A Statistical Explanation of Maxent for Ecologists,” Diversity and Distributions, 17, 43–57.
  • Fienberg, S. E. (1970), “An Iterative Procedure for Estimation in Contingency Tables,” The Annals of Mathematical Statistics, 41, 907–917.
  • Fienberg, S. E., and Meyer, M. M. (2006), “Iterative Proportional Fitting,” Encyclopedia of Statistical Sciences, 6, 3723–3726.
  • Fienberg, S. E., and Rinaldo, A. (2012), “Maximum Likelihood Estimation in Log-linear Models,” Annals of Statistics, 40, 996–1023.
  • Good, I. J. (1963), “Maximum Entropy for Hypothesis Formulation, Especially for Multidimensional Contingency Tables,” The Annals of Mathematical Statistics, 34, 911–934.
  • Haberman, S. J. (1974), The Analysis of Frequency Data, Chicago: The University of Chicago Press.
  • Hunter, D. R., and Lange, K. (2004), “A Tutorial on MM Algorithms,” The American Statistician, 58, 30–37.
  • Ireland, C. T., and Kullback, S. (1968), “Contingency Tables with Given Marginals,” Biometrika, 55, 179–188.
  • Kurras, S. (2015), “Symmetric Iterative Proportional Fitting,” in Proceedings of the 18th International Conference on Artificial Intelligence and Statistics (AISTATS) 2015, San Diego, CA, JMLR: W&CP, pp. 526–534.
  • Lahr, M., and De Mesnard, L. (2004), “Biproportional Techniques in Input-output Analysis: Table Updating and Structural Analysis,” Economic Systems Research, 16, 115–134.
  • Lange, K. (2013), Optimization (2nd ed.), New York: Springer.
  • Lange, K., Hunter, D. R., and Yang, I. (2000), “Optimization Transfer Using Surrogate Objective Functions,” Journal of Computational and Graphical Statistics, 9, 1–20.
  • Lauritzen, S. L. (1996), Graphical Models, New York: Oxford University Press.
  • Liu, D. C., and Nocedal, J. (1989), “On the Limited Memory BFGS Method for Large Scale Optimization,” Mathematical Programming, 45, 503–528.
  • Luo, Z.-Q., and Tseng, P. (1992), “On the Convergence of the Coordinate Descent Method for Convex Differentiable Minimization,” Journal of Optimization Theory and Applications, 72, 7–35.
  • McCallum, A., Freitag, D., and Pereira, F. C. N. (2000), “Maximum Entropy Markov Models for Information Extraction and Segmentation,” in Proceedings of the 17th International Conference on Machine Learning, ACM, pp. 591–598.
  • Moro, S., Cortez, P., and Rita, P. (2014), “A Data-driven Approach to Predict the Success of Bank Telemarketing,” Decision Support Systems, 62, 22–31.
  • Nesterov, Y. (1988), “On an Approach to the Construction of Optimal Methods of Minimization of Smooth Convex Functions,” Ekonomika i Matem Metody, 24, 509–517.
  • ——— (2012), “Efficiency of Coordinate Descent Methods on Huge-scale Optimization Problems,” SIAM Journal on Optimization, 22, 341–362.
  • Nutini, J., Schmidt, M., Laradji, I. H., Friedlander, M., and Koepke, H. (2015), “Coordinate Descent Converges Faster with the Gauss-Southwell Rule than Random Selection,” in Proceedings of the 32nd International Conference on Machine Learning, Lille, France, JMLR: W&CP, pp. 1632–1641.
  • Phillips, S. J., Dudík, M., and Schapire, R. E. (2004), “A Maximum Entropy Approach to Species Distribution Modeling,” in Proceedings of the 21st International Conference on Machine Learning, ACM, pp. 655–662.
  • Della Pietra, S., Della Pietra, V., and Lafferty, J. (1997), “Inducing Features of Random Fields,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 19, 380–393.
  • Pukelsheim, F. (2014), “Biproportional Scaling of Matrices and the Iterative Proportional Fitting Procedure,” Annals of Operations Research, 215, 269–283.
  • Razaviyayn, M., Hong, M., and Luo, Z.-Q. (2013), “A Unified Convergence Analysis of Block Successive Minimization Methods for Nonsmooth Optimization,” SIAM Journal on Optimization, 23, 1126–1153.
  • Richtárik, P., and Takáč, M. (2014), “Iteration Complexity of Randomized Block-Coordinate Descent Methods for Minimizing a Composite Function,” Mathematical Programming, 144, 1–38.
  • Schmidt, M. (2012), “minFunc: Unconstrained Differentiable Multivariate Optimization in Matlab,” available at http://www.cs.ubc.ca/~schmidtm/Software/minFunc.html.
  • Sinkhorn, R. (1964), “A Relationship between Arbitrary Positive Matrices and Doubly Stochastic Matrices,” The Annals of Mathematical Statistics, 35, 876–879.
  • Sinkhorn, R., and Knopp, P. (1967), “Concerning Nonnegative Matrices and Doubly Stochastic Matrices,” Pacific Journal of Mathematics, 21, 343–348.
  • Tseng, P. (2001), “Convergence of a Block Coordinate Descent Method for Nondifferentiable Minimization,” Journal of Optimization Theory and Applications, 109, 475–494.
  • Tseng, P. (2010), “Approximation Accuracy, Gradient Methods, and Error Bound for Structured Convex Optimization,” Mathematical Programming, 125, 263–295.
  • Wang, N., Rauh, J., and Massam, H. (2016), “Approximating Faces of Marginal Polytopes in Discrete Hierarchical Models,” arXiv preprint arXiv:1603.04843.
  • Wright, S. J. (2015), “Coordinate Descent Algorithms,” Mathematical Programming, 151, 3–34.
