Learning linear non-Gaussian networks: A new view from matrix identification

Dingcheng Feng Department of Automation, Tsinghua National Laboratory for Information Science and Technology (TNList), Tsinghua University, Beijing 100084, ChinaCorrespondence[email protected]

Feng Chen Department of Automation, Tsinghua National Laboratory for Information Science and Technology (TNList), Tsinghua University, Beijing 100084, China

Wenli Xu Department of Automation, Tsinghua National Laboratory for Information Science and Technology (TNList), Tsinghua University, Beijing 100084, China

Abstract

Identifying causal structures from observations is fundamental in many applications. Probabilistic graphical models provide a unifying framework for capturing complex causal dependencies among random variables. Recently, a linear non-Gaussian acyclic model (LiNGAM) and some smart algorithms (ICA-LiNGAM, DirectLiNGAM) have been proposed, which outperform previous graphical models and learning methods in identifying variable orders. We propose new solutions (TMFLiNGAM, SchurLiNGAM and RCLiNGAM) to the LiNGAM learning task from the perspective of matrix identification. TMFLiNGAM and SchurLiNGAM recover orders more directly, and RCLiNGAM can improve the accuracy of previous algorithms on uniform and sparse structures. The perspective also facilitates the learning of sparse models where the performance of all independent component analysis-based algorithms can be improved by reconstructing variable orders from the inverse of separation matrices. Experimental results under various settings provide average evaluations over the learning methods, and verify the effectiveness of our perspective and algorithms.

Keywords:

Acknowledgements

We would like to thank the anonymous editors and reviewers for their constructive suggestions and comments. This work was supported by the National Natural Science Foundation of China (Project Nos 60772050 and 61071131), No. 2 Important National Science and Technology Specific Projects (Project No. 2009ZX02001), the National Key Basic Research and Development Program of China (No. 2009CB320602) and United Technologies Research Center (UTRC).

Notes

Notes

1. For example, it is difficult to determine whether x ₁ → x ₂ (x ₁ causes x ₂) or x ₂ → x ₁ (x ₂ causes x ₁) from a bivariate Gaussian distribution since x ₁ and x ₂ are symmetrical in p(x ₁, x ₂) (Daniušis et al. Citation2010).

2. c implies that the expectation of disturbances can always be assumed to 0, and the LiNGAM equation can be simplified as x = B x + e by a translation transform x ← x − (I − B)⁻¹ c.

3. Learning structures are more important and difficult in our tasks, so we focus on estimating the variable orders. In fact, the parameters of LiNGAM, i.e. the coefficients between observed variables, can be easily determined by regression after the variable orders are correctly identified.

4. P _x incorporates the variable information, and is equivalent in describing the variable order as K. Representing the variable order from K to P _x brings clearer description of the LiNGAM learning problem since all the variables can now be written as matrices.

5. This statement holds in general situations such as all the connection strength coefficients are non-zero, but does not hold in some situations such as B is sparse (many coefficients are zero). In the latter case, Q is not unique since the LiNGAM learning problem is underdetermined.

6. In automatic control, signal processing and machine learning, the system state vector x is usually hidden and unobservable, but can be measured as a ‘observation vector’ y by various kinds of sensors, which is usually represented by a linear mapping y = T x.

7. We get Q from in ICA-LiNGAM, and find a permutation matrix P _x closest to Q by linear assignment algorithm. This modification of ICA-LiNGAM can be called SchurLiNGAM. Our numerical experiments in Section 7 shows that determining P _x from by Schur factorisation requires more observational data to achieve similar accuracies as previous ICA-LiNGAM and DirectLiNGAM algorithms.

8. Identifying the permutation and scaling can remove the unknown scaling of each row in the connection matrix B, and can improve the performance of RCLiNGAM algorithm. However, this step increases computational complexity since it calls the linear assignment algorithm (Hoyer et al. Citation2006).

Reprints and Corporate Permissions

Please note: Selecting permissions does not provide access to the full text of the article, please see our help page How do I view content?

To request a reprint or corporate permissions for this article, please click on the relevant link below:

Order Reprints Request Corporate Permissions

Academic Permissions

Please note: Selecting permissions does not provide access to the full text of the article, please see our help page How do I view content?

Obtain permissions instantly via Rightslink by clicking on the button below:

Request Academic Permissions

If you are unable to obtain permissions via Rightslink, please complete and submit this Permissions form. For more information, please visit our Permissions help page.

Learning linear non-Gaussian networks: A new view from matrix identification

Information for

Open access

Opportunities

Help and information

Learning linear non-Gaussian networks: A new view from matrix identification

Abstract

Acknowledgements

Notes

Reprints and Corporate Permissions

Academic Permissions

Related research

To cite this article:

Download citation

Information for

Open access

Opportunities

Help and information

Keep up to date

Your download is now in progress and you may close this window

Login or register to access this feature