Theory and Methods

Survival Mixed Membership Blockmodel

Pages 1647-1656 | Received 14 Apr 2021, Accepted 24 Apr 2023, Published online: 27 Jun 2023

Figures & data

Table 1 Time complexity of MCMC steps.

Fig. 1 Boxplots of coverage probabilities across 100 replicated synthetic datasets: (a) K = 3, n_ij ∼ Pois(25) + 5, and the connectivity probability p_c varies over (0.2, 0.3, 0.4, 0.5); (b) K = 3, p_c = 0.3, n_ij ∼ Pois(μ) + 5, and μ varies over (10, 15, 20, 25, 30, 35); (c) p_c = 0.3, n_ij ∼ Pois(25) + 5, and K varies over (2, 3, 4, 5).

Fig. 2 Running time of 100,000 iterations of the MCMC algorithm on one core of an Intel Xeon Gold 6226R processor. The number of observations per edge is n_ij ∼ Pois(μ) + 5. (a) The connectivity probability p_c = 0.3 and μ = 25 are fixed while the number of roles K varies from 2 to 5. (b) K = 3 and μ = 25 are fixed while p_c varies from 0.2 to 0.5. (c) K = 3 and p_c = 0.3 are fixed while μ varies from 10 to 35.
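
The per-edge sample sizes described in the captions of Fig. 1 and Fig. 2 can be illustrated with a minimal sketch. It rests on assumptions not stated in the captions: edges are taken to be directed, each ordered pair of distinct nodes is taken to be connected independently with probability p_c, and the names `sample_poisson`, `simulate_edge_counts`, and `num_nodes` are hypothetical. It is not the authors' simulation code.

```python
import math
import random


def sample_poisson(mu, rng):
    """Draw one Poisson(mu) variate with Knuth's multiplication method.

    Adequate for moderate mu such as the 10 to 35 range in the captions.
    """
    threshold = math.exp(-mu)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_edge_counts(num_nodes, p_c, mu, seed=0):
    """Return {(i, j): n_ij} with n_ij ~ Pois(mu) + 5 on each sampled edge.

    Assumption: every ordered pair (i, j), i != j, is an edge with
    probability p_c, independently of all other pairs.
    """
    rng = random.Random(seed)
    counts = {}
    for i in range(num_nodes):
        for j in range(num_nodes):
            if i != j and rng.random() < p_c:
                counts[(i, j)] = sample_poisson(mu, rng) + 5
    return counts


# Hypothetical network size; the captions do not state the number of nodes.
counts = simulate_edge_counts(num_nodes=30, p_c=0.3, mu=25)
```

The shift by 5 guarantees at least five observations on every sampled edge, so no edge is left with an empty time-to-event sample.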

Fig. 3 Patterns learned from the Enron E-mail corpus. (a), (b) Scatterplots of the employee-specific probability π_i1 of belonging to the first role against the number of E-mails on the log scale, (a) when the SMMB is applied to time-to-event data and (b) when the MMSB is applied to relational data. Each node represents an employee, with the color indicating his or her position and the size indicating the number of E-mails related to the employee. (c) The estimated baseline survival curve exp{−exp(β̂_1lk)[1 − S_0(t | λ̂_lk)]} for replies from role l to role k when x = (1, 0, 0, …, 0)^T.
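
The curve in panel (c) of Fig. 3 can be evaluated numerically with a short sketch. The caption does not specify the form of S_0(t | λ); the exponential form S_0(t | λ) = exp(−λt) used here is an assumption made purely for illustration, and the function name and parameter values are hypothetical. With x = (1, 0, 0, …, 0)^T only the first coefficient β_1lk enters the expression.

```python
import math


def baseline_survival_curve(t, beta1, lam):
    """Evaluate exp{-exp(beta1) * [1 - S0(t | lam)]} at time t.

    Assumption: S0(t | lam) = exp(-lam * t), an exponential survival
    function chosen only to make the sketch concrete.
    """
    s0 = math.exp(-lam * t)
    return math.exp(-math.exp(beta1) * (1.0 - s0))


# Hypothetical parameter values, not estimates from the Enron analysis.
curve = [baseline_survival_curve(t, beta1=0.5, lam=0.2) for t in range(0, 11)]
```

Whenever S_0 decreases from 1 to 0, an expression of this form starts at 1 and levels off at exp{−exp(β_1lk)} instead of reaching zero, so a positive fraction of E-mails is never replied to.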

Table 2 Estimated role probabilities π_i of the four CEOs, obtained by the SMMB from time-to-event data and by the MMSB from binary data.

Table 3 The posterior mean, posterior standard deviation (SD), and 95% credible interval (CI) of the coefficients β_rlk for the SMMB learned from the Enron E-mail corpus without considering confidential information.

Supplemental material
