Theory and Methods

Nonlinear Causal Discovery with Confounders

Pages 1205-1214 | Received 03 Jun 2021, Accepted 06 Feb 2023, Published online: 15 Mar 2023

