Materials data analysis and utilization

Crystal structure map for materials classification and modeling

Article: 2355860 | Received 02 Jan 2024, Accepted 06 May 2024, Published online: 03 Jun 2024

Figures & data

Figure 1. Three-step workflow described in the present study. Names in capital italics denote the individual procedures. CIF2ESC: conversion from a CIF file to electronic-structure-calculation data, with output of the fingerprint as a structure feature for subsequent structure analyses; DISTANCE: evaluation of distances between structure feature vectors; DISTANCE_DISTRIBUTION: calculation of the distance distribution; DIAGNOSIS: extraction of inequivalent structures; CMDS: classical multidimensional scaling; PCEV: calculation of principal component eigenvectors; PROJECTION: projection of an arbitrary structure onto the dimension-reduced map; ATLS: generation of the atom-type list; FINDSYM*: space-group finder (*Refs. [15,16]). Open arrows denote data transfers between the steps; solid and dotted arrows denote data transfers inside each step. Dotted arrows apply only when necessary.
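As a rough illustration of the DISTANCE and CMDS procedures named in Figure 1, the sketch below computes pairwise distances between fingerprint vectors and embeds them with textbook classical multidimensional scaling. It is not the authors' implementation: the function names `pairwise_distances` and `classical_mds` are hypothetical, and the cosine distance is assumed here to be one minus the cosine similarity, which may differ from the definition used in the paper.

```python
import numpy as np


def pairwise_distances(F, metric="euclidean"):
    """Distance matrix between the rows of F (one fingerprint vector per row)."""
    F = np.asarray(F, dtype=float)
    if metric == "euclidean":
        diff = F[:, None, :] - F[None, :, :]
        return np.sqrt((diff ** 2).sum(axis=-1))
    if metric == "cosine":
        # Assumed convention: 1 - cosine similarity (rows must be non-zero).
        norms = np.linalg.norm(F, axis=1)
        cos = (F @ F.T) / np.outer(norms, norms)
        return 1.0 - np.clip(cos, -1.0, 1.0)
    raise ValueError(f"unknown metric: {metric}")


def classical_mds(D, n_components=2):
    """Classical MDS: double-center the squared distances and keep the
    leading eigenpairs. Returns (coordinates, eigenvalues)."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                # inner-product (Gram) matrix
    eigvals, eigvecs = np.linalg.eigh(B)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    lam = np.clip(eigvals[order], 0.0, None)   # negative eigenvalues are dropped
    coords = eigvecs[:, order] * np.sqrt(lam)
    return coords, eigvals[order]
```

For points that are exactly embeddable (for example, three points in a plane with two components), the Euclidean distances between the returned coordinates reproduce the input distance matrix.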

Table 1. List of Al2O3 structures in order of calculated total energy E (meV/atom) relative to the most stable R3̄c structure, together with the energy gap EG (eV). Figures in parentheses are values given on the Materials Project site [11]. The color scheme is used to distinguish the different structures in Figures 2, 4, and 7.

Figure 2. F-fingerprints of all the Al2O3 systems listed in Table 1 except for the 2D film model. A cutoff radius Rmax = 10 Å is assumed; the fingerprint dimensions 0–100, 101–200, and 201–300 correspond to O-O, O-Al, and Al-Al pairs, respectively. Line colors follow the color scheme in Table 1.

Figure 3. Eigenvalues calculated from the fingerprints of Al2O3 polymorphs using (a) Rmax = 10 Å and (b) Rmax = 5 Å with Euclidean (blue) and cosine (red) distances.

Figure 4. Maps of Al2O3 structures in the space of the five principal components (PCs), obtained with (a) the Euclidean distance and Rmax = 10 Å, (b) the cosine distance and Rmax = 10 Å, (c) the Euclidean distance and Rmax = 5 Å, and (d) the cosine distance and Rmax = 5 Å. Dot colors follow the color scheme listed in Table 1. Square dots in each panel denote the three lowest-energy structures (SG: R3̄c (black), C2/m (maroon), and Pna21 (blue)); small and large red dots denote amorphous model structures (SG: P1) from the Materials Project [11] and from Momida [43], respectively.

Figure 5. Eigenvectors of the five principal components (PCs) for Al2O3 polymorphs, obtained with (a) the Euclidean distance and Rmax = 10 Å, (b) the cosine distance and Rmax = 10 Å, (c) the Euclidean distance and Rmax = 5 Å, and (d) the cosine distance and Rmax = 5 Å, calculated using Equation (17), $u_{+} = (X^{T})^{-1}\bar{X}^{T} Q_{+} \Lambda_{+}^{-1/2} = (X^{T})^{-1}\bar{X}^{T} X_{+} \Lambda_{+}^{-1}$. Red, lime, blue, maroon, and green lines denote PCs C1, C2, C3, C4, and C5, respectively. The horizontal axis denotes the dimension of the original fingerprint space (see Figure 2). In (a) and (b), dimensions 1–100, 101–200, and 201–300 correspond to the O-O, O-Al, and Al-Al pair radii, respectively; in (c) and (d), dimensions 1–50, 51–100, and 101–150 correspond to the O-O, O-Al, and Al-Al pair radii, respectively.

Figure 6. Regression analysis of the total energy using descriptors up to second order in the five PCs for Al2O3 polymorph structures: (a) Euclidean distance, Rmax = 10 Å; (b) cosine distance, Rmax = 10 Å; (c) Euclidean distance, Rmax = 5 Å; and (d) cosine distance, Rmax = 5 Å. Blue, red, and green dots and lines denote the coefficient of determination (R²), its leave-one-out cross-validated counterpart (Q²), and the root mean square error (RMSE), respectively.
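The three scores plotted in Figure 6 can be illustrated with a short sketch. It assumes ordinary least squares on descriptors built from the PC coordinates (a constant, the linear terms, and all second-order products) and takes Q² as 1 − PRESS/SS_tot, a common leave-one-out convention. The paper's descriptor selection procedure and exact definitions are not reproduced here, and the names `second_order_descriptors` and `fit_scores` are hypothetical.

```python
import numpy as np


def second_order_descriptors(P):
    """Design matrix with columns 1, p_i, and p_i * p_j (i <= j) built from
    the PC coordinates P (n_samples x n_pcs)."""
    P = np.asarray(P, dtype=float)
    n, k = P.shape
    cols = [np.ones(n)] + [P[:, i] for i in range(k)]
    cols += [P[:, i] * P[:, j] for i in range(k) for j in range(i, k)]
    return np.column_stack(cols)


def fit_scores(X, y):
    """Return (R2, Q2, RMSE) for an ordinary least-squares fit of y on X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ss_tot = ((y - y.mean()) ** 2).sum()
    r2 = 1.0 - (resid ** 2).sum() / ss_tot
    rmse = np.sqrt((resid ** 2).mean())

    # Leave-one-out: refit without sample i, then predict sample i.
    press = 0.0
    idx = np.arange(len(y))
    for i in idx:
        mask = idx != i
        b, *_ = np.linalg.lstsq(X[mask], y[mask], rcond=None)
        press += (y[i] - X[i] @ b) ** 2
    q2 = 1.0 - press / ss_tot
    return r2, q2, rmse
```

Because each left-out sample is predicted by a model that never saw it, Q² drops below R² when the model overfits, which is why the two are usually read together.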

Figure 7. Double cross-validation test of the total-energy regression model constructed with the descriptors from the Euclidean distance and Rmax = 5 Å for Al2O3 polymorph structures. Dot colors follow the color scheme listed in Table 1. The red open circle denotes one of the amorphous structures (mp-1245063). The model with eight descriptors shows the smallest standard deviation (∼0.1 eV).

Figure 8. Crystal structure of La2Fe26−ySiy. Green, orange, and brown balls denote the La, Fe1, and Fe2 atomic sites, respectively. Partial occupation of the Fe2 sites by Si is shown in blue. The drawing was made with VESTA [9].

Figure 9. Heats of formation H in eV/atom (Equation (22), $H(y;i) = \frac{1}{28}\{E[\mathrm{La_2Fe_{26-y}Si_y};i] - 2E[\mathrm{La}] - (26-y)E[\mathrm{Fe}] - yE[\mathrm{Si}]\}$) and ΔH in meV/atom (Equation (23), $\Delta H(y;i) = H(y;i) - \min_j H(y;j)$), and spin magnetic moments in μB per Fe atom calculated for La2Fe26−ySiy. Red dots, green dots, and blue circles denote values for configurations with Si occupying only the 8b sites, both the 8b and 96i sites, and only the 96i sites, respectively. Black dots indicate the corresponding values for La2Fe26 (y = 0).
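Equations (22) and (23) quoted in Figure 9 amount to simple arithmetic on total energies, sketched below. The function names are hypothetical and any energies passed in are placeholders, not values from the paper. The sketch assumes all energies are in eV, with the elemental reference energies given per atom, and converts ΔH to meV/atom as in the figure.

```python
def heat_of_formation(e_total, y, e_la, e_fe, e_si, n_atoms=28):
    """H(y; i) in eV/atom for one configuration i of La2Fe(26-y)Si(y),
    following Equation (22). e_total is the total energy of the 28-atom
    cell; e_la, e_fe, e_si are elemental reference energies per atom."""
    return (e_total - 2 * e_la - (26 - y) * e_fe - y * e_si) / n_atoms


def relative_heats_mev(heats_ev):
    """Delta H(y; i) in meV/atom, following Equation (23): each H(y; i)
    minus the minimum over all configurations j at the same y."""
    h_min = min(heats_ev)
    return [1000.0 * (h - h_min) for h in heats_ev]
```

By construction the most stable configuration at a given y has ΔH = 0, so ΔH ranks the Si arrangements at fixed composition, while H compares stability across compositions.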

Figure 10. (a) F-fingerprints and (b) Euclidean eigenvectors of the three principal components over the Si-Si (0–80) and Fe-Si (80–160) dimensions for La2Fe24Si2 (case 2–3). Line colors used for the F-fingerprints in (a) correspond to those of the dots in Figure 11. Red, lime, and blue lines in (b) denote the eigenvectors of C1, C2, and C3, respectively.

Figure 11. Structure map of La2Fe24Si2 (case 2–3) obtained by dimension reduction of the Euclidean distance. Dot colors denote the different configurations, corresponding to the F-fingerprints shown in Figure 10(a).

Figure 12. (a) Selected F-fingerprints and (b) eigenvectors of the three principal components over the Si-Si (0–80) and Fe-Si (80–160) dimensions for La2Fe23Si3 (case 3–3). Red, lime, and blue lines in (b) denote the eigenvectors of C1, C2, and C3, respectively.

Figure 13. Structure map of La2Fe23Si3 (case 3–3) obtained by dimension reduction of the Euclidean distance. Dot colors denote the different configurations, corresponding to the F-fingerprints shown in Figure 12(a); the F-fingerprints of the small black dots are not included there.

Data availability statement

The raw data required to reproduce these findings are available from the corresponding author upon e-mail request.