1,302
Views
2
CrossRef citations to date
0
Altmetric
Theory and Methods

Latent Space Modeling of Hypergraph Data

ORCID Icon, , ORCID Icon &
Received 12 Mar 2020, Accepted 22 Sep 2023, Published online: 11 Dec 2023

Abstract

The increasing prevalence of relational data describing interactions among a target population has motivated a wide literature on statistical network analysis. In many applications, interactions may involve more than two members of the population and this data is more appropriately represented by a hypergraph. In this article, we present a model for hypergraph data that extends the well-established latent space approach for graphs and, by drawing a connection to constructs from computational topology, we develop a model whose likelihood is inexpensive to compute. A delayed acceptance MCMC scheme is proposed to obtain posterior samples and we rely on Bookstein coordinates to remove the identifiability issues associated with the latent representation. We theoretically examine the degree distribution of hypergraphs generated under our framework and, through simulation, we investigate the flexibility of our model and consider estimation of predictive distributions. Finally, we explore the application of our model to two real-world datasets. Supplementary materials for this article are available online.

1 Introduction

A hypergraph describes interactions which occur among arbitrary subsets of a population of interest. This data structure is comprised of a node set, which indexes the population, and a hyperedge set, which indicates which members of the population interact. Examples of hypergraphs occur in image co-tagging (see ), where hyperedges indicate users who appear in the same photograph in an online platform, and coauthorship (see ), where hyperedges represent the set of authors who collaborated on an academic article. A key feature of a hypergraph representation is that can express higher-order interactions and, whilst these data could be analyzed according to an extensive graph modeling literature (see Kolaczyk Citation2009), this results in a loss of structural information which cannot be recovered (see ). This represents the main motivation of this article where we develop a novel model for hypergraph data, detail a procedure for inference and present theoretical results on the degree distribution.

Fig. 1 Examples of hypergraph datasets. (b) shows co-tagging data and (c) shows a subsample of the coauthorship network of Ji and Jin (Citation2016). The figures were made with R packages HyperG (Marchette Citation2021) and igraph (Csardi and Nepusz Citation2006).

Fig. 1 Examples of hypergraph datasets. (b) shows co-tagging data and (c) shows a subsample of the coauthorship network of Ji and Jin (Citation2016). The figures were made with R packages HyperG (Marchette Citation2021) and igraph (Csardi and Nepusz Citation2006).

More formally, a hypergraph G=(V,E) consists of a set of N node labels V and M hyperedges E, where eE contains no repeated elements and eV (see ). This data type can also be equivalently represented as a bipartite graph with an (N×M) adjacency tensor in which each hyperedge is indexed by a node and an edge from a population node to a hyperedge node indicates presence in a hyperedge (see ). Although a significant portion of the existing literature relies on the bipartite representation, we focus on the former representation since it allows us to develop a model which allows for an arbitrary number of hyperedges by avoiding conditioning on M.

We consider extending the latent distance model of Hoff, Raftery, and Handcock (Citation2002) to the hypergraph setting using the representation in . In this approach, a low-dimensional coordinate is associated with each node and nodes whose latent coordinates are close in terms of Euclidean distance are assumed more likely to share a tie. This modeling choice is appealing since the latent space offers an intuitive visualization of the data, encourages transitive relationships due to the metric property of Euclidean distance and allows control over the joint distribution of subgraph counts. Models of this type also allow quantification of uncertainty in interaction probabilities and prediction of future interactions. In our model, we wish to harness hypergraph analogues of these properties and we focus on the following inferential questions.

  • (Q1)How can we determine an intuitive visualization of an observed hypergraph?

  • (Q2)How do we expect new nodes to interact with the observed hypergraph relationships?

There is a growing literature concerned with hypergraph modeling (Battiston et al. Citation2020), and several graph models have been extended to the bipartite setting. Examples include exponential random graphs (Wang, Pattison, and Robins Citation2013), blockmodels (Doreian, Batagelj, and Ferligoj Citation2004) and latent space network models (Friel et al. Citation2016). However, in contrast to our work, models for bipartite representations assume the number of hyperedges is known and fixed.

Examples of models for general hypergraphs include the β model of Stasi et al. (Citation2014), preferential attachment models (Wang et al. Citation2010) and the clustering-based model of Ng and Murphy (Citation2021). In a m-uniform hypergraph all hyperedges contain exactly m elements, and extensions of stochastic blockmodels (see Ghoshdastidar and Dukkipati Citation2017; Chien, Lin, and Wang Citation2018) and graphons (Balasubramanian Citation2021) have been studied for this hypergraph type. Simplicial hypergraphs, or rather simplicial complexes (see Edelsbrunner and Harer Citation2010), are a significant class of hypergraphs in which the presence of a hyperedge implies the presence of all subsets of that hyperedge. Proposed models for simplicial complexes include an analogue of exponential random graphs (Zuev, Eisenberg, and Krioukov Citation2015), preferential attachment based models (Bianconi and Rahmede Citation2017), simplicial configuration models (Young et al. Citation2017) and the recursive models of Kahle (Citation2014). A discussion of random simplicial complexes is presented in Kahle (Citation2014) and work on geometric simplicial complexes (Bobrowski and Kahle Citation2018) is most related to our proposed procedure. Simplicial complexes also appear more broadly in the statistics literature (Chazal and Michel Citation2017; Lunagómez et al. Citation2017) and our terminology “simplicial hypergraph” is nonstandard.

Our focus on inference for the full generality of hypergraphs marks a distinction between our work and much of the existing literature. By considering general hypergraphs, we avoid restrictions inherent to simplicial and uniform representations and this allows us to consider how hyperedges of different orders interact in the data. Non-simplicial hypergraph arise in many settings and particular instances of non-simplicial coauthorship networks are presented in and Section 7.2.

The main contributions of this article are as follows. First, using the representation shown in , we develop a computationally tractable latent space model for non-simplicial hypergraph data. Second, we explore the novel application of Bookstein coordinates to address non-identifiability in the latent representation. Third, we perform inference on hypergraph data that is outside the reach of competing methods and consider a novel application of delayed acceptance MCMC methodology. Finally, we study properties of the degree distribution for our model and present approximate theoretical results.

The rest of this article is organized as follows. Section 2 provides the background for the hypergraph model presented in Section 3. Section 4 considers the degree distribution of our model and Section 5 describes a procedure for obtaining posterior samples. The simulation studies and real data examples are presented in Sections 6 and 7, respectively, and we conclude with a discussion in Section 8.

2 Background

2.1 Latent Space Network Modeling

Latent space models were introduced for network data in Hoff, Raftery, and Handcock (Citation2002). This modeling framework associates the nodes of a network with low-dimensional latent coordinates and expresses the probability of edges forming as a function of the latent representation.

To describe this model for a N node network, we let Y={yij}i,j=1,2,,N denote the observed (N×N) adjacency matrix, where yij represents the connection between nodes i and j. For binary interactions, we take yij = 1 if i and j share an edge and yij = 0 otherwise. We also let uiRd denote the d-dimensional latent coordinate associated with the ith node, for i={1,2,,N}=[N]. The presence of an edge is then given by (1) YijBernoulli(pij),pij=P(yij=1|ui,uj,θ)=1/(1+exp{f(ui,uj,θ)}),(1) for i,j[N], where θ represents additional model parameters and f controls how ui and uj affect the propensity for ties to form. Typically, f is specified as monotonically decreasing in a measure of similarity between ui and uj . For example, Hoff, Raftery, and Handcock (Citation2002) take (2) f(ui,uj,θ)=αuiuj,(2) where · is the Euclidean distance, and θ=(α) represents the base-rate tendency for edges to form. The function f may also be adapted to incorporate covariate information so that nodes which share certain characteristics are more likely to be connected. The likelihood, conditional on U={ui}i=1N and θ, is given by (3) L(U,θ;Y)i<jP(yij=1|ui,uj,θ)yij×[1P(yij=1|ui,uj,θ)]1yij.(3)

Properties of this model are well-understood and we refer to Rastelli, Friel, and Raftery (Citation2016) for details.

2.2 Random Geometric Graphs

Random geometric graphs (RGGs) (Penrose Citation2003) can be viewed as a special case of the latent space network model outlined in Section 2.1. To express the connection probabilities we let Br(ui)={uRd|uuir}, where · is Euclidean distance, denote a ball of radius r with center ui . The presence of an edge is then expressed deterministically as (4) pij=P(yij=1|ui,uj,r)=1(Br(ui)Br(uj))=1(uiuj2r),(4) so that nodes i and j share a tie only if their corresponding sets have a nonempty intersection (see ) or, equivalently, when uiuj2r.

Fig. 2 Example of a Čech complex. Left: Br(ui) for {ui=(ui1,ui2)}i=17 in R2. Middle: the graph obtained by taking pairwise intersections. Right: the hypergraph obtained by taking intersections of arbitrary order. The shaded region indicates an order 3 hyperedge.

Fig. 2 Example of a Čech complex. Left: Br(ui) for {ui=(ui1,ui2)}i=17 in R2. Middle: the graph obtained by taking pairwise intersections. Right: the hypergraph obtained by taking intersections of arbitrary order. The shaded region indicates an order 3 hyperedge.

We can now express the likelihood of observing Y conditional on U and r as (5) L(U,r;Y)i<j1(uiuj2r)yij×[11(uiuj2r)]1yij.(5)

Since a RGG is deterministic conditional on U and r, (5) is equal to 1 only when there is a perfect correspondence between the observed connections Y and the connections induced by the latent positions and radius, and is otherwise equal to 0.

2.3 Random Geometric Hypergraphs

We can extend the RGG from Section 2.2 to hypergraphs by considering the full intersection pattern of convex sets. To begin, we introduce the concept of a nerve (Edelsbrunner and Harer Citation2010, sec. 3.2) in the following definition.

Definition 2.1.

(Nerve) Let A={Ai}i=1N represent a collection of non-empty sets. The nerve of A is given by Nrv(A)={σ{1,2,,N}|jσAj}.

The nerve represents the sets of indices whose corresponding regions have a non-empty intersection. The sets {1},{2},,{N} are included in Nrv(A) and |σ|N for σNrv(A), where |σ| is the order of the set. The nerve defines a hypergraph where σNrv(A) represents a hyperedge, and for sets σ1Nrv(A) and σ2σ1, it follows immediately that σ2Nrv(A) implying that the hypergraph is simplicial (Kahle Citation2017).

Taking Ai=Br(ui), as for the RGG in Section 2.2, gives rise to the well-studied Čech complex (Edelsbrunner and Harer Citation2010, sec. 3.2). This is defined below.

Definition 2.2

(Čech Complex). For a set of coordinates U={ui}i=1N and a radius r, the Čech complex Cr(U) is given by Cr(U)=Nrv({Br(ui)}i=1N).

We now introduce a subset of the Čech complex known as the k-skeleton.

Definition 2.3

(k-skeleton of the Čech Complex). Let Cr(U) denote the Čech complex, as given in Definition 2.2. The k-skeleton of Cr(U) is given by Cr(k)(U)={σCr(U) ||σ|k}.

Example 2.1.

For the Čech complex in , Cr(U)={{1},{2},{3},{4},{5},{6},{7},{2,4}, {2, 7}, {3, 5}, {3, 6}, {5, 6}, {3,5,6}},Cr(1)(U)={{1},{2},{3},{4},{5},{6},{7}}, Cr(2)(U)=Cr(1){{2,4},{2,7}, {3,5},{3,6},{5,6}}, and Cr(3)(U)=Cr(2)(U){3,5,6}.

3 Latent Space Hypergraphs

3.1 Motivation and Notation

In the motivating examples from Section 1, we observed arbitrary hypergraph interactions. Since the models discussed in Section 2 express either pairwise or simplicial interactions, they are not appropriate for this data. This motivates the development of a non-simplicial latent space hypergraph model. We aim to propose a model which has (i) a convenient and computationally tractable likelihood with (ii) a support that is simple to describe.

We consider general hypergraphs on N nodes and let K denote the maximum hyperedge order, where 2KN. Let GN,K=(VN,EN,K) denote the space of hypergraphs on N and K, where VN={1,2,,N} and EN,K represent the node labels and hyperedges up to order K on N nodes, respectively. Additionally, let ENk represent hyperedges of exactly order k on N nodes, so that EN,K=k=2KENk, and let ekENk denote an order k interaction, where |ek|=k and ek contains no repeated elements.

3.2 Combining k-skeletons

To extend the model in Section 2.3 to express non-simplicial hypergraphs we allow the radii to differ for each hyperedge order. We generate a hypergraph by isolating the order k hyperedges from the complex Crk(U) and combining these for k=2,3,,K, where rkrk for kk. For example, when K = 3, we take radii r2 and r3 along with their Čech complexes Cr2(U) and Cr3(U). Then, we consider the order 2 hyperedges in Cr2(U) and the order 3 hyperedges in Cr3(U), as shown in . We refer to a hypergraph constructed in this way as a non-simplicial random geometric hypergraph (nsRGH).

Fig. 3 Example of a nsRGH (see Definition 3.1) with U={ui}i=17={(ui1,ui2)}i=17. Left: Cr2(U). Middle: Cr3(U). Right: k=23Drk(k)(U).

Fig. 3 Example of a nsRGH (see Definition 3.1) with U={ui}i=17={(ui1,ui2)}i=17. Left: Cr2(U). Middle: Cr3(U). Right: ∪k=23Drk(k)(U).

Definition 3.1

(nsRGH). Take r=(r2,r3,,rK) such that rk>rk1>0 for k=3,4,,K. We define a nsRGH on N nodes as the hypergraph gN,K(U,r) with hyperedges given by k=2KDrk(k)(U), where Drk(k)(U)=Crk(k)(U)Crk(k1)(U) denotes the hyperedges of exactly order k in the Čech complex with radius rk and Crk(k)(U) is as in Definition 2.3.

Example 3.1.

For the nsRGH shown in the right panel of we have g7,3(U,r)=Dr2(2)(U)Dr3(3)(U), where Dr2(2)(U)={{2,4},{2,7},{3,5},{5,6}}, and Dr3(3)(U)={3,5,6}.

In Definition 3.1 constraints are imposed on the radii r=(r2,r3,,rK) to ensure that the hypergraphs are non-simplicial since for example, if r3<r2 and the hyperedge {i, j, k} is present in the hypergraph, it follows that {i,j},{i,k}, and {j, k} must also be present.

3.3 Generative Model and Likelihood

Our procedure for generating a nsRGH gN,K(U,r) (see Definition 3.1) is deterministic conditional on U and r. Practical application of this model is therefore challenging since (i) the probability observing a hypergraph hN,KGN,K is binary (similarly to the RGG in Section 2.2) and (ii) it is difficult to characterize the space of hypergraphs which can be expressed as a nsRGH. These issues hinder estimation via likelihood methods and make it difficult to say whether a latent representation can be found a priori.

To address these issues we consider a modification of the nsRGH gN,K(U,r), denoted gN,K*(U,r,φ), obtained by independently modifying the state of each order k hyperedge in gN,K(U,r) with probability φk[0,1]. More precisely, if yek(g){0,1} is the state of ek in gN,K*(U,r,φ) where yek(g)=1 indicates presence, we take yek(g*)=(yek(g)+sk)mod2, where yek(g*){0,1} is the state of hyperedge ek in gN,K*(U,r,φ) and SkBernoulli(φk). This means that yek(g) is modified from either present to absent or absent to present with probability φk to obtain yek(g*). The noise terms {φk}k=2K control the amount of modification of gN,K(U,r), and so setting φk close to 0 obtains a hypergraph that is similar to a nsRGH.

To fully specify a generative model, we assign a probability distribution on the latent coordinates U. We let uiiidN(μ,Σ), for i[N] to reflect the intuition that nodes placed more centrally in the latent representation are likely to have high degrees and neighbors with high degrees. A hypergraph can now be generated by the procedure given in Algorithm 1, and details of efficient implementation of this are given in supplements C and E.3.

Algorithm 1

Sample a hypergraph gN,K* given N,K,r,φ,μ, and Σ.

Sample U={ui}i=1N such that uiiidN(μ,Σ), for i=1,2,,N.

For k=2,3,,K,

a) Given U and rk , check which ek={i1,i2,,ik}ENk satisfy yek(g)=1.

To determine if yek(g)=1, check that l=1kBrk(uil).

b) For all ekENk, sample SkBernoulli(φk).

Let yek(g*)=(yek(g)+sk)mod2

To express the likelihood of observing the hypergraph hN,KGN,K, we let (6) dk(gN,K(U,r),hN,K)=ekENk|yek(g)yek(h)|(6) denote the discrepancy between the order k hyperedges in gN,K(U,r) and hN,K, where k=2,3,,K and yek(h){0,1} indicates the state of ek in hN,K. Intuitively, the distance (6) corresponds to the number of hyperedges whose state differs between hN,K and gN,K(U,r). We note that evaluating this distance does not require the k=2K(Nk) computations suggested by (6), and refer to the supplement for details.

Given this notion of hypergraph distance the likelihood of observing hN,K, conditional on U,r, and φ, can be written as (7) L(U,r,φ;hN,K)k=2Kφkdk(gN,K(U,r),hN,K)×(1 φk)(Nk)dk(gN,K(U,r),hN,K).(7)

This follows by considering which hyperedges in gN,K(U,r) must have their state modified to match the hyperedges in hN,K, and which hyperedges are the same as in hN,K. Since our likelihood is of the same form as (Lunagómez, Olhede, and Wolfe Citation2020, proof of Proposition 3.1), it follows that hypergraphs with a greater number of hyperedge modification are less likely for 0< φk<1/2 and so (7) behaves in an intuitive way.

The model specification is complete with the following priors for k=2,3,,K (8) μN(mμ,Σμ)ΣμW1(Φ,ν),rkexp(λk),andφkB(ak,bk),(8) where W1(·,·) and B(·,·) denote an Inverse-Wishart and Beta, respectively.

3.4 Can We Improve Model Flexibility?

In a nsRGH, the constraint rk>rk1, for k=3,,K, ensures that the interactions are non-simplicial. However, this implies rk will impact the higher-order hyperedges and, to improve model flexibility, we introduce additional modification parameters. In Algorithm 1 the noise φk is applied independently across all hyperedges of order k and, alternatively, we can modify each hyperedge depending on its state in gN,K(U,r). For k=2,3,,K, let ψ0=(ψ20,ψ30,,ψK0)[0,1] denote the probability of modifying the state of a hyperedge in gN,K(U,r) from absent to present, and let ψ1=(ψ21,ψ31,,ψK1)[0,1] denote the probability of modifying the state of a hyperedge in gN,K(U,r) from present to absent. Crucially, for large k, the term ψk1 may increase to account for the discrepancy between the density of order k hyperedges in hN,K and gN,K(U,r). This generative model is summarized in Algorithm A.1 in the supplement and, as commented in Section 3.3, we can implement this without the suggested k=2K(Nk) likelihood computations. Further details of this are given in the supplement, as highlighted in Section 3.3.

As in Section 3.3, the likelihood of observing hN,K is based on a distance metric, (9) dk(ab)(gN,K(U,r),hN,K)=#{ekENk|yek(g)=ayek(h)=b},(9) which records the number of hyperedges that have state a{0,1} in gN,K(U,r) and state b{0,1} in hN,K. For example, dk(01)(gN,K(U,r),hN,K) represents the number of hyperedges absent in gN,K(U,r) and present in hN,K. Efficient evaluation of (9) is discussed in supplement E.3.2 and the likelihood conditional on U,r,ψ1, and ψ0 is given by (10) L(U,r,ψ1,ψ0;hN,K)k=2K[(ψk1)dk(10)(gN,K(U,r),hN,K)×(1ψk1)dk(11)(gN,K(U,r),hN,K)×(ψk0)dk(01)(gN,K(U,r),hN,K)×(1ψk0)dk(00)(gN,K(U,r),hN,K)].(10)

We obtain (10) in a similar way to (7), and note that (10) is equivalent to (7) when ψk1=ψk0= φk, for k=2,3,,K.

The model specification is complete with the following priors for k=2,3,,K (11) μN(mμ,Σμ),ΣμW1(Φ,ν),rkexp(λk),ψk0B(ak0,bk0),ψk1B(ak1,bk1),(11) where W1(·,·) and B(·,·) denote an Inverse-Wishart and Beta, respectively.

3.5 Identifiability

For the model presented in Section 3.3, the joint distribution of the hypergraph gN,K*(U,r,φ) and its associated nsRGH (see Definition 3.1) gN,K(U,r) can be decomposed as follows. (12) p(gN,K*(U,r,φ),gN,K(U,r)|μ,Σ,φ,r)=p(gN,K*(U,r,φ)|gN,K(U,r),φ)p(gN,K(U,r)|μ,Σ,r).(12)

Note that a similar decomposition exists for the model given in Section 3.4.

The hyperedges expressed by the conditional distribution p(gN,K(U,r)|μ,Σ,r) are determined through the positions of their relative latent coordinates and are therefore unchanged given distance-preserving transformations of U and scaling of U and r. We address this non-identifiability by defining U on the Bookstein space of coordinates (see Bookstein Citation1986; Dryden and Mardia Citation1998, sec. 2.3.3 and details in Supplement B) which determine a translation, rotation and rescaling of U with respect to a set of anchor points. These anchor points are fixed throughout the posterior sampling procedure (see Section 5) and also address the non-identifiability associated with rescaling r.

Removing non-identifiability in U is typically achieved via Procrustes analysis (Dryden and Mardia Citation1998) in which a post-processing step is included to remove the effect of distance-preserving transformations of U (Hoff, Raftery, and Handcock Citation2002). This approach can be used for our model, providing that scaling between U and r is accounted for. By using Bookstein coordinates, we instead work directly with our posterior samples on a quotient space (Dryden and Mardia Citation1998) which accounts for distance-preserving transformations. This approach avoids the additional computation associated with the Procrustes transformation.

Finally, a hyperedge in our model can be expressed by either the geometric or noise component. To maintain the properties imposed by gN,K(U,r), we wish to keep the noise φ small. When there are few observed hyperedges, it will become increasingly difficult to distinguish between these competing sources. This implies an additional source of non-identifiability which we address by ensuring the noise terms φ are sufficiently small via prior distributions.

4 Theoretical Results

We now study the behavior of the node degree in the hypergraph model detailed in Algorithm 1. Since the nodes in our hypergraph model are exchangeable, it is sufficient to study the degree properties of the ith node, and it is straightforward to extend the results in this section to the model detailed in Algorithm A.1.

To begin, we make explicit the following properties of our hypergraph model: (P1) A hypergraph generated from our model is a modification of a nsRGH (see Definition 3.1), and (P2) Conditional on U and r, hyperedges of each order occur independently in our model. We also make the following assumptions: (A1) The number of nodes N and maximum hyperedge order K are fixed, and (A2) The covariance of the latent coordinates Σ is diagonal, where Σll=σll2 for l=1,2, ,d. Note that (A2) is not restrictive since we can obtain this by applying a distance-preserving transformation to our point cloud in Rd.

Following Section 3, we denote a hypergraph obtained by modifying the nsRGH gN,K(U,r) as gN,K*(U,r,φ), and we let yek(g) and yek(g*) indicate the presence or absence of the hyperedge ek in gN,K(U,r) and gN,K*(U,r,φ), respectively. The degree of node i in a hypergraph is given by the number of hyperedges which contain i and, for node i in gN,K(U,r), we denote the order k degree as Deg(i,k)(g)={ekENk|iek}yek(g) and the full degree as Deg(i)(g)=k=2KDeg(i,k). Analogous expressions for gN,K*(U,r,φ) are obtained by replacing g with g*. To organize our results, we separate the discussion into hyperedges with order k = 2 and k3. Proofs can be found in Supplement F.

4.1 Degree Properties for k = 2

We begin by restating a result from Rastelli, Friel, and Raftery (Citation2016) which allows us to express the degree distribution and average degree for order k = 2 hyperedges in our model. Throughout, we write hyperedges of order 2 that contain node i as {i, j}.

Theorem 4.1

(Reproduced from (Rastelli, Friel, and Raftery Citation2016, Theorem 1)). Let p{i,j}(ui|θ) denote the probability of node i with latent coordinate ui belonging to an order 2 hyperedge in the model detailed in Section 3.3, conditional on θ=(r2, φ2,μ,Σ). The degree distribution of order 2 hyperedges for the ith node for our hypergraph model, conditional on θ, is given by (13) p(Degi,2(g*)=l|θ)=p(ui|μ,Σ)(N1l)p{i,j}(ui|θ)l(1p{i,j}(ui|θ))N1l dui,(13) where Deg(i,2)(g)={e2EN,2|ie2}ye2(g), and the average degree is given by (14) d¯2(g*)(θ)=(N1)p(ui|μ,Σ)p{i,j}(ui|θ) dui.(14)

Proof.

See Supplement F. □

In order to apply this result, we must derive an expression for p{i,j}(ui|θ). This is given in the following proposition.

Proposition 4.1

(Probability of an order k = 2 hyperedge). Let e2={i,j} denote an order k = 2 hyperedge. The probability of e2 occurring in g*, given that node i has latent position ui , can be expressed as (15) p{i,j}(yij(g*)=1|θ1)=p(y{i,j}(g)=1|θ2)(1φ2)+(1p(y{i,j}(g)=1|θ2)) φ2,(15) where θ1=(ui,r2, φ2,μ,Σ),θ2=(ui,r2,μ,Σ) and p(y{i,j}(g)=1|θ2) denotes the probability of e2={i,j} being present in gN,K(U,r) when node i is positioned at ui . It follows that p(y{i,j}(g)=1|θ2)=p((Uj*)TUj*4r22), where Uj*=UjuiN(μui,Σ). Then (16) (Uj*)TUj*l=1dσl2χ1,((μui)l/σl)22,(16) where (μui)l is the lth component of (μui), for l=1,2,,d and Σ=diag(σ12,σ22,,σd2).

Proof.

See Supplement F. □

Theorem 4.1 and Proposition 4.1 allow us to evaluate properties of the node degree of order k = 2 hyperedges, where the intractable integrals (13) and (14) are evaluated using numerical methods. It is straightforward to calculate (16) when the variances are constant, but for general diagonal Σ, we require numerical evaluation as implemented in the R package CompQuadForm (Duchesne and de Micheaux Citation2010). The expression for the connection probabilities in Proposition 4.1 is validated for an example in .

Fig. 4 Comparison of theoretical (solid) and Monte Carlo (dashed) estimates of p(yek,i(g)=1|ui,rk,μ,Σ) for varying rk . We take Σ=diag(1,1),μ=(0,0) and consider connection probabilities for k = 2, 3, 4. In the left plot ui=μ and in the right plot ui=(1,2). The same study with Σ=diag(1,2) is provided in Supplement F.3.

Fig. 4 Comparison of theoretical (solid) and Monte Carlo (dashed) estimates of p(yek,i(g)=1|ui,rk,μ,Σ) for varying rk . We take Σ=diag(1,1),μ=(0,0) and consider connection probabilities for k = 2, 3, 4. In the left plot ui=μ and in the right plot ui=(1,2). The same study with Σ=diag(1,2) is provided in Supplement F.3.

4.2 Degree Properties for k3

We now present some results for the order k degree distribution. First, note that the expression for the degree distribution of order k = 2 hyperedges, given in Theorem 4.1, arises since hyperedges {i, j} occur independently given ui . This is no longer true for higher-order hyperedges since, for example, if k = 3 it is clear that hyperedges {i, j, l} and {i, j, m} are not independent given i. Deriving an exact expression for the order k degree distribution is therefore challenging, and we consider an approximation of this in the following proposition where we denote an order k hyperedge containing node i as ek,i.

Proposition 4.2

(Approximate order k degree distribution). Let pek,i(yek,i(g*)=1|θk) denote the probability of node i with latent position ui belonging to an order k hyperedge, denoted as ek,i, in the model detailed in Section 3.3, conditional on θk=(ui,rk, φk,μ,Σ). Conditional on rk, φk,μ, and Σ, the degree distribution of order k3 hyperedges for the ith node for our hypergraph model is approximately (17) p(Degi,k(g*)=l|θ)p(ui|μ,Σ)(λ(N,k,θk))lexp(λ(N,k,θk))l! dui(17) where λ(N,k,θk)=(N1k1)p(yek,i(g*)=1|θk) and Deg(i,k)(g*)={ekENk|iek}yek(g*).

Proof.

See Supplement F. □

This proposition relies on a Poisson approximation to the sum of dependent Bernoulli trials. Bounds for the quality of this approximation have been studied (e.g., see Teerapabolarn Citation2014), and it is well-understood that this approximation will degrade as either the number of possible hyperedges, as controlled by N and k, or the connection probability pek,i(ui|rk, φk,μ,Σ) grows, or both.

As in the previous section, we require an expression for pek,i(ui|rk, φk,μ,Σ) to explore this distribution. As in Proposition 4.1, an order k hyperedge can be expressed either by the geometric or noise component of our model, where the probability of occurrence in gN,K(U,r) is the more challenging to determine. Recall that this quantity is given by the probability that the balls of radius rk corresponding to the nodes in ek,i have a non-empty intersection, so that the hyperedge ek,i={i,i2,i3,,ik} is present in gN,K(U,r) if jek,iBrk(uj). This condition is equivalent to the coordinates {uj}jek,i lying within a ball of radius rk (see Supplement E.3.1). Since the latent coordinates are assumed to follow a Normal distribution, evaluating this probability requires the evaluation of an intractable integral (see Gilliland Citation1962). To address this we consider an approximation of this quantity in which the region of integration is approximated by a square of side length sk , denoted as Ssk. We choose sk=πrk so that it has an equal area to a disk of radius rk . This choice simplifies the region of integration by allowing each dimension to be considered independently. Alternative choices for the distribution on the latent coordinates, such as the uniform, may be more amenable to analytic expressions for the degree distribution and we leave exploration of this to future work.

Lemma 4.1

(Probability of an order k3 hyperedges). We can express the probability of an order k hyperedge containing node i, denoted as ek,i, being present in g* conditional on the latent coordinate ui as (18) p(yek,i(g*)=1|θ1,k)=p(yek,i(g)=1|θ2,k)(1φk)+(1p(yek,i(g)=1|θ2,k)) φk,(18) where θ1,k=(ui,rk, φk,μ,Σ),θ2,k=(ui,rk,μ,Σ) and p(yek,i(g)=1|θ2,k) is the probability of the order k hyperedge ek,i={i,i2,i3, ,ik} being present in gN,K(U,r) given ui .

The probability of ek,i occurring in gN,K(U,r) with UjN(μ,Σ), conditional on ui , can then be approximated as follows. (19) p(yek,i(g)=1|θ2,k)l=1d[(k1)f(u|μl,σl)×[F(u+πrk|μl,σl)F(u|μl,σl)]k2×1(ui(u,u+πrk)) du+[F(ui+πrk|μl,σl)F(ui|μl,σl)]k1],(19) where f(·|μl,σl) and F(·|μl,σl) denote the pdf and cdf of the random variable with distribution N(μl,σl2), μl denotes the lth element of μ and Σ=diag(σ12,σ22,,σd2).

Proof.

See Supplement F. □

These results allow us to examine the behavior of the degree distribution. We consider the quality of the p(yek,i(g)=1|ui,rk,μ,Σ) approximation and the Poisson approximation in and , respectively. Since hyperedges in our model may come from the latent coordinates and or the random noise, similar connection probabilities can occur through different parameter combinations. However, we should find such combinations have notable differences in the degree distribution and other properties such as, for instance, motif counts and measures of transitivity. The above calculations suggest that examining such properties theoretically is likely to be challenging. Techniques such as those used in Kahle (Citation2017) may be applied here, however, our choice of underlying distribution on the latent coordinates is likely to limit their applicability.

Fig. 5 Comparison between empirical (dashed line) and Poisson approximation (points) of the order k = 3 degree distribution conditional on the latent coordinate ui . We take N=10,μ=(0,0),Σ=diag(1,1) and evaluate the distribution for r3(0.1,0.4,1.0). The left plot shows ui=μ and the right plot shows ui=(1,2). The equivalent Figures with Σ=diag(1,2) and N = 20 are given in Supplement F.3.

Fig. 5 Comparison between empirical (dashed line) and Poisson approximation (points) of the order k = 3 degree distribution conditional on the latent coordinate ui . We take N=10,μ=(0,0),Σ=diag(1,1) and evaluate the distribution for r3∈(0.1,0.4,1.0). The left plot shows ui=μ and the right plot shows ui=(1,2). The equivalent Figures with Σ=diag(1,2) and N = 20 are given in Supplement F.3.

5 Posterior Sampling

Here we outline our posterior sampling scheme for the model detailed in Section 3.3 and note that this can be modified to sample from the model in Section 3.4. Posterior samples are obtained via a Metropolis-Hastings-within-Gibbs MCMC scheme (Gamerman and Lopes Citation2006, sec. 6.4.2) with a delayed acceptance (DA) update (Banterle et al. Citation2019) for the latent positions and algorithmic details of our MCMC scheme are given in Supplement E.

First, recall that we define U on the Bookstein space of coordinates (Section 3.5) so that a set of anchor points remain fixed throughout estimation. For uiRd, let the d anchor points be denoted by {uB1,uB2,,uBd}. Then, for i{1,2,,N}\{B1,B2,,Bd} we propose U* with ith entry ui*=ui+ϵu where ϵuN(0,σuId) and uj*=uj for ji. This proposal is then accepted as a sample from p(U|μ,Σ,r,ψ0,ψ1,hN,K) with probability (20) ARui=min{1,L(U*,r,ψ1,ψ0;hN,K)p(ui*|μ,Σ)L(U,r,ψ1,ψ0;hN,K)p(ui|μ,Σ)}.(20)

Calculating the hypergraph gN,K(U,r) presents a computational bottleneck in evaluating the likelihood (20). We utilize the GUDHI library (The GUDHI Project Citation2021) to perform this calculation and find that the cost of calculating the order k hyperedges in gN,K(U,r) grows with both N and k. This suggests our target is a natural candidate for delayed acceptance in which terms of the likelihood are sequentially included in order of increasing computational cost, allowing for a faster and computationally cheaper rejection of poor proposals. Let hNk denote the observed hyperedges of order k and write (21) ARui=min{1,p(hN,2|U*,r2,ψ21,ψ20)p(ui*|μ,Σ)p(hN,2|U,r2,ψ21,ψ20)p(ui|μ,Σ)ρ2(U,U*)    ×k=3Kp(hNk|U*,rk,ψk1,ψk0)p(hNk|U,rk,ψk1,ψk0)ρk(U,U*)},(21) where p(hNk|U,rk,ψk1,ψk0) denotes the contribution of the order k hyperedge terms in (10), for k=2,3,,K. Following Banterle et al. (Citation2019), we take ρk(U,U*) as specified in (21) and accept U* if each intermediary rate min{1,ρk(U,U*)} indicates acceptance as outlined in Algorithm E.1 in the supplement.

The kth radii is updated according to a standard random-walk MH step, where the proposal rk*=rk+ϵr with ϵrN(0,σr), is accepted with probability (22) min{1,p(hNk|U,rk*, φk)p(rk*|λk)p(hNk|U,rk, φk)p(rk|λk)}.(22)

The remaining parameters can be sampled directly from their full conditionals and updated via a Gibbs step. Full details are presented in the supplement and we note here that our implementation also incorporates adaptive updates for U and r as detailed in Vihola (Citation2012) and implemented in the R library ramcmc (Helske Citation2021).

6 Simulations

Sections 6.1 evaluate the efficacy of our approach by investigating predictive distributions, and Section 6.2 explores the scalability of our sampling procedure. Additional simulation studies on model-depth and misspecification are presented in supplements H and K.

6.1 Assessing Model Fit

For this study, we simulate a hypergraph from the model outlined in Algorithm 1 with N=50,d=2,K=3,r=(0.35,0.45),φ0=φ1=(0.001,0.001),μ=(0,0) and Σ=(1001) and examine the performance of our MCMC scheme through inspection of the predictive degree distribution. After 5000 post burn-in iterations we obtain posterior mean estimates to (2 significant figures) Û, r̂=(0.11,0.15),ψ̂0=ψ̂1=(0.00108,0.00011),μ̂=(0.081,0.12) and Σ̂=(0.0820.0150.0150.091). Since the anchor coordinates specify a scaling on the estimated latent positions, we cannot make a direct comparison between the true and estimated model parameters (see and I.9).

We can instead examine the posterior predictive distribution and compare this to the simulated hypergraph. If the sampling procedure has produced meaningful estimates, we expect a reasonable correspondence between these two quantities. Since a hypergraph is a complex object, we make a comparison by summarising each hypergraph by its degree distribution. shows this separately for the degree distributions of order k = 2 and k = 3 hyperedges. Overall we see that there is reasonable overlap between these distributions. For completeness, we also summarize the output of the MCMC scheme in Supplement I where we see that the sampling procedure appears to have converged and the estimated latent coordinates correspond reasonably to the truth.

Fig. 6 Comparison of true and posterior predictive degree distributions for hyperedges of order k = 2 (left) and k = 3 (right). Vertical lines and black dots correspond to the range and median of the probabilities of observing each degree as calculated via the posterior predictive. The observed degree is shown as green triangles.

Fig. 6 Comparison of true and posterior predictive degree distributions for hyperedges of order k = 2 (left) and k = 3 (right). Vertical lines and black dots correspond to the range and median of the probabilities of observing each degree as calculated via the posterior predictive. The observed degree is shown as green triangles.

6.2 Scalability Study

Our MCMC scheme (see Section 5) updates the latent coordinates via a delayed acceptance (DA) step that allows poor proposals to be quickly rejected on a subset of the likelihood. Here we substantiate our claim that this offers computational advantages over a standard Metropolis-Hastings update on data simulated according to Algorithm 1 by comparing average per-iteration cost times for an MCMC scheme implemented with and without DA. Since the computational cost is closely connected to the number of nodes N and the density of the hypergraphs, we consider timings for hypergraphs with increasing N under two different density regimes, referred to as (R1) and (R2). We specify the parameters for each regime so that (R1) generates sparser hypergraphs than (R2). Details of the specific parameters are given in Supplement J alongside plots of the density of the simulated data.

This study allows us to examine the scalability of our approach as N grows and to demonstrate the reduction in computational cost offered by DA. reports average per-iteration times for the DA MCMC scheme for both regimes in addition to the ratio of average per-iteration times for a DA and non-DA MCMC scheme. This figure confirms our intuitions that hypergraphs with a larger N are more computationally challenging, and this cost increases as the density of the hypergraphs grows. By inspecting the right-hand plot of , we observe that incorporating DA into our MCMC offers a significant improvement in average per-iteration cost. We note that this relative improvement increases with N and is largely unaffected by the density of the hypergraphs. Whilst K remains fixed throughout this section, we do expect the improvements offered by DA to increase with K.

Fig. 7 Summary of average per-iteration costs after 200 iterations for an MCMC implemented with and without delayed acceptance (DA) for 10 datasets sampled from data regimes (R1) (black, circle) and (R2) (red, triangle). Left: average per-iteration cost in seconds for a DA scheme. Right: ratio of average per-iteration cost for a DA scheme and a scheme without DA. Numbers less than 1 imply DA offers a computational speed-up.

Fig. 7 Summary of average per-iteration costs after 200 iterations for an MCMC implemented with and without delayed acceptance (DA) for 10 datasets sampled from data regimes (R1) (black, circle) and (R2) (red, triangle). Left: average per-iteration cost in seconds for a DA scheme. Right: ratio of average per-iteration cost for a DA scheme and a scheme without DA. Numbers less than 1 imply DA offers a computational speed-up.

7 Real Data Examples

In this section we analyze two hypergraph datasets using the model from Algorithm A.1 of the supplement. These examples describe grocery co-purchases and academic coauthorship ().

Fig. 8 Visualization of the datasets considered in Section 7. The hyperedges correspond to the observations for each dataset. Each figure shows the full hypergraph (left) and a subset sampled according to a random walk (right).

Fig. 8 Visualization of the datasets considered in Section 7. The hyperedges correspond to the observations for each dataset. Each figure shows the full hypergraph (left) and a subset sampled according to a random walk (right).

7.1 Grocery Items

Here we consider a subset of the grocery transaction data available as part of the R package arules (Hahsler et al. 2007). Here, each node corresponds to a category of items and each hyperedge indicates whether a set of items were bought together. To analyze this data we take a random subset of size N = 41 with K = 6 and hold back N*=4 nodes for validation of the predictive distribution for additional nodes. Since the hyperedges can be expressed through the geometric or noise component of our model, we impose constraints on the noise terms to ensure that the set of hyperedges explained by the latent positions is non-empty. More specifically, we limit the order k noise terms so that they cannot exceed 90% of the density of the order k hyperedges in the observed hypergraph. With these constraints, we implemented our MCMC scheme for 100,000 iterations and discarded the first three quarters as burn-in. Figure L.14 in the supplement depicts the observed hypergraph with the nodes positioned at posterior mean of the latent coordinates Û and we obtain the posterior mean estimate of the radii r̂=(1.096,1.097,1.101,1.107,1.11). Traceplots for each parameter are shown in Supplement L and the average per-iteration cost was 0.012 min.

To select the N* nodes for which we evaluate the predictive distribution, we take the nodes with the smallest eigen-centralities in the graph obtained by connecting all node pairs who share a hyperedge. To reflect how these nodes were selected, we then sample additional latent coordinates from the posterior predictive by first obtaining N+N* samples from N(μ(i),Σ(i)) and then selecting the N* coordinates that are furthest from the mode μ(i) in terms of Euclidean distance. Similarly to Section 6.1, we summarize the predictive hypergraphs by considering node degree and motif counts and reports the results of this. In particular, for each measure, we report the proportion of predictive samples which have discrepancy D from the true observed value where D denotes the absolute difference between the prediction and the truth. Overall, we see that the majority of samples have either zero or close to zero discrepancy from the observed values, suggesting that we reasonably capture the unobserved interaction patterns. We believe that the uptick in the upper tail for some of the motif counts does not have a structural explanation and is simply an artifact of this particular data example. Note that we do not observe this behavior for the coauthorship dataset presented in Section 7.2.

Fig. 9 Summary of predictive distributions for N* additional nodes for the Grocery dataset. For each measure we report the proportion of predictions which are distance D from the truth, where D is the absolute difference between the predictive and the truth.

Fig. 9 Summary of predictive distributions for N* additional nodes for the Grocery dataset. For each measure we report the proportion of predictions which are distance D from the truth, where D is the absolute difference between the predictive and the truth.

7.2 Coauthorship for Statisticians

We now repeat the analysis from the previous section for a hypergraph with larger N. In this section we consider a coauthorship hypergraph constructed from the DBLP dataset,Footnote1 presented in Mao, Sarkar, and Chakrabarti (Citation2020) and available online.Footnote2 From this dataset we construct a subset of size N = 99 with K = 4 via a random walk and, similarly to Section 7.1, we hold out N*=10 of these nodes according to the graph eigen-centrality and use these to validate predictive inference. We fit the model outlined in Section 3.4 by implementing our MCMC for 400,000 iterations and taking the initial 3/4th of these as burn-in. Given equivalent restrictions on the noise parameters as described for the previous example, we obtain the posterior mean of the radii r̂=(0.23,0.25,0.31) and the posterior means of the latent position as shown in Figure L.14 in the supplement. The average per-iteration cost for our scheme was 0.012 min and traceplots showing convergence for the model parameters are given in supplement L. A summary of the predictive distribution of the N* additional nodes is given in and, for all summaries of the predicted interactions, we find that the majority of the discrepancies between the predicted and true values are zero or close to zero. This suggests the predicted interaction patterns are close to the truth.

Fig. 10 Summary of predictive distributions for N* additional nodes for the coauthorship dataset. For each measure we report the proportion of predictions which are distance D from the truth, where D is the absolute difference between the predictive and the truth.

Fig. 10 Summary of predictive distributions for N* additional nodes for the coauthorship dataset. For each measure we report the proportion of predictions which are distance D from the truth, where D is the absolute difference between the predictive and the truth.

To provide further justification for our assumption of increasing radii we compare our model to a restricted version with a simplicial geometric component in which rk = r for k=2,3,,K. Since this data example is non-simplicial, we expect that the latent representation in the restricted model will be less informative for the hyperedges than the latent representation for our model. To verify this, we present the latent coordinate traceplots for a subset of the data in for both models. By comparing the middle and right panels of this figure, we see that the latent coordinates are more constrained for our model with a non-simplicial geometric component. This behavior occurs because the restricted model expresses the observed interactions by deleting hyperedges from a simplicial motif whilst our model aims to express the non-simplicial motif directly through the geometric component. Since a simplicial motif occurs when all latent positions are sufficiently close, this means that the traceplot does not allow us to distinguish between the latent positions in the restricted model. Contrast this with our model in which the node labeled “22” is distinctly separated from the nodes labeled “27” and “31”.

Fig. 11 Comparison between latent positions for subset of the coauthorship network on nodes {22, 27, 31}. Left: observed interactions. Middle: traceplot of latent positions for model with rk = r. Right: traceplot of latent positions for our model with rk>rk1.

Fig. 11 Comparison between latent positions for subset of the coauthorship network on nodes {22, 27, 31}. Left: observed interactions. Middle: traceplot of latent positions for model with rk = r. Right: traceplot of latent positions for our model with rk>rk−1.

8 Discussion

Our proposed model allows hyperedges to be expressed by a geometric or noise component, and we keep the noise term small to impose desirable properties on the hypergraphs. The geometric component of our model may be unable to express certain hypergraph relationships and this is made clear by considering, for example, that the maximum number of leaves in a star is limited by the latent space dimension. This may be mediated by choosing an alternative geometric simplicial complex to generate the nerve, increasing the probability of hyperedge modification, or taking a different specification for the latent positions.

There exist several modeling extensions which could be considered for our framework, such as modeling nonbinary hyperedges, the addition of community structure analogously to Handcock, Raftery, and Tantrum (Citation2007), or the introduction of covariate information. Additional avenues for future work follow from considering other choices for the nerve construction, latent geometry and restrictions on the radii. For example, we could instead use the more scalable Vietoris-Rips complex (Edelsbrunner and Harer Citation2010), consider non-Euclidean embeddings (Krioukov et al. Citation2010) or explore models with node-specific radii. However, it is important to stress that estimation for a given modification of our model may be challenging. Even for our proposed model, scalability remains a key limitation and this may be exacerbated by certain extensions. While we found that delayed acceptance offered computational improvements, it remains an open line of enquiry to consider how methodology to improve scalability for latent space networks (Raftery et al. Citation2012; Salter-Townshend and Murphy Citation2013) may be adapted to this setting.

Supplemental material

Supplemental Material

Download Zip (17.2 MB)

Supplemental Material

Download PDF (120 KB)

Acknowledgments

Thank you to Tin Lok James Ng for providing the Star Wars dataset analyzed in a previous version of this work and Matthew Kahle, Sandipan Roy, Brendan Murphy and Marco Battiston for helpful comments. We acknowledge the High End Computing facility at Lancaster University and are grateful to Daniel Grose for creating the associated R package.

Supplementary Materials

The supplementary materials contain additional details on Bookstein coordinates.

Disclosure Statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

We acknowledge the support of the EPSRC funded EP/L015692/1 STOR-i CDT and EPSRC grants EP/S00159X/1, EP/R01860X/1 and EP/V022636/1. This work was supported, in part, by NSF awards CAREER IIS-1149662 and IIS-1409177, by ONR awards YIP N00014-14-1-0485 and N00014-17-1-2131.

Notes

References

  • Balasubramanian, K. (2021), “Nonparametric Modeling of Higher-Order Interactions via Hypergraphons,” Journal of Machine Learning Research, 22, 1–35.
  • Banterle, M., Grazian, C., Lee, A., and Robert, C. P. (2019), “Accelerating Metropolis-Hastings Algorithms by Delayed Acceptance,” Foundations of Data Science, 1, 103–128. DOI: 10.3934/fods.2019005.
  • Battiston, F., Cencetti, G., Iacopini, I., Latora, V., Lucas, M., Patania, A., Young, J.-G., and Petri, G. (2020), “Networks Beyond Pairwise Interactions: Structure and Dynamics,” Physics Reports, 874, 1–92. DOI: 10.1016/j.physrep.2020.05.004.
  • Bianconi, G., and Rahmede, C. (2017), “Emergent Hyperbolic Network Geometry,” Scientific Reports, 7, 41974. DOI: 10.1038/srep41974.
  • Bobrowski, O., and Kahle, M. (2018), “Topology of Random Geometric Complexes: A Survey,” Journal of Applied and Computational Topology, 1, 331–364. DOI: 10.1007/s41468-017-0010-0.
  • Bookstein, F. L. (1986), “Size and Shape Spaces for Landmark Data in Two Dimensions,” Statistical Science, 1, 181–222. DOI: 10.1214/ss/1177013696.
  • Chazal, F., and Michel, B. (2017), “An Introduction to Topological Data Analysis: Fundamental and Practical Aspects for Data Scientists,” Frontiers in Artificial Intelligence, 4, 667963. DOI: 10.3389/frai.2021.667963.
  • Chien, I., Lin, C.-Y., and Wang, I.-H. (2018), “Community Detection in Hypergraphs: Optimal Statistical Limit and Efficient Algorithms,” in International Conference on Artificial Intelligence and Statistics (Vol. 84), pp. 871–879, PMLR.
  • Csardi, G., and Nepusz, T. (2006), “The igraph Software Package for Complex Network Research,” InterJournal, complex systems, 1695.5, 1–9.
  • Doreian, P., Batagelj, V., and Ferligoj, A. (2004), “Generalized Blockmodeling of Two-Mode Network Data,” Social Networks, 26, 29–53. DOI: 10.1016/j.socnet.2004.01.002.
  • Dryden, I. L., and Mardia, K. V. (1998), Statistical Shape Analysis, Chichester: Wiley.
  • Duchesne, P., and de Micheaux, P. L. (2010), “Computing the Distribution of Quadratic Forms: Further Comparisons between the Liu-Tang-Zhang Approximation and Exact Methods,” Computational Statistics and Data Analysis, 54, 858–862. DOI: 10.1016/j.csda.2009.11.025.
  • Edelsbrunner, H., and Harer, J. (2010), Computational Topology - an Introduction, Providence, RI: American Mathematical Society.
  • Friel, N., Rastelli, R., Wyse, J., and Raftery, A. E. (2016), “Interlocking Directorates in Irish Companies Using a Latent Space Model for Bipartite Networks,” Proceedings of the National Academy of Sciences, 113, 6629–6634. DOI: 10.1073/pnas.1606295113.
  • Gamerman, D., and Lopes, H. F. (2006), Markov Chain Monte Carlo: Stochastic Simulation for Bayesian Inference (2nd ed.), London: Chapman and Hall/CRC.
  • Ghoshdastidar, D., and Dukkipati, A. (2017), “Consistency of Spectral Hypergraph Partitioning Under Planted Partition Model,” The Annals of Statistics, 45, 289–315. DOI: 10.1214/16-AOS1453.
  • Gilliland, D. C. (1962), “Integral of the Bivariate Normal Distribution over an Offset Circle,” Journal of the American Statistical Association, 57, 758–768. DOI: 10.1080/01621459.1962.10500813.
  • Hahsler, C. B., Buchta, C., Gruen, B., Hornik, K., Borgelt, C., Johnson, I., and Ledmi, M. (2021), arules: Mining Association Rules and Frequent Itemsets, R package version 1.6-8.
  • Handcock, M. S., Raftery, A. E., and Tantrum, J. M. (2007), “Model-based Clustering for Social Networks,” Journal of the Royal Statistical Society, Series A, 170, 301–354. DOI: 10.1111/j.1467-985X.2007.00471.x.
  • Helske, J. (2021), ramcmc: Robust Adaptive Metropolis Algorithm, R package version 0.1.1.
  • Hoff, P. D., Raftery, A. E., and Handcock, M. S. (2002), “Latent Space Approaches to Social Network Analysis,” Journal of the American Statistical Association, 97, 1090–1098. DOI: 10.1198/016214502388618906.
  • Ji, P., and Jin, J. (2016), “Coauthorship and Citation Networks for Statisticians,” The Annals of Applied Statistics, 10, 1779–1812. DOI: 10.1214/15-AOAS896.
  • Kahle, M. (2017), “Random Simplicial Complexes,” in Handbook of Discrete and Computational Geometry, eds. J. E. Goodman, J. O’Rourke, and C. D. Tóth, pp. 581–604, Boca Raton, FL: CRC Press.
  • Kahle, M. (2014), “Topology of Random Simplicial Complexes: A Survey,” AMS Contemporary Mathematics, 620, 201–222.
  • Kolaczyk, E. D. (2009), Statistical Analysis of Network Data: Methods and Models (1st ed.), New York: Springer.
  • Krioukov, D., Papadopoulos, F., Kitsak, M., Vahdat, A., and Boguñá, M. (2010), “Hyperbolic Geometry of Complex Networks,” Physical Review E, 82, 036106. DOI: 10.1103/PhysRevE.82.036106.
  • Lunagómez, S., Mukherjee, S., Wolpert, R. L., and Airoldi, E. M. (2017), “Geometric Representations of Random Hypergraphs,” Journal of the American Statistical Association, 112, 363–383. DOI: 10.1080/01621459.2016.1141686.
  • Lunagómez, S., Olhede, S. C., and Wolfe, P. J. (2020), “Modeling Network Populations via Graph Distances,” Journal of the American Statistical Association, 116, 2023–2040. DOI: 10.1080/01621459.2020.1763803.
  • Mao, X., Sarkar, P., and Chakrabarti, D. (2020), “Estimating Mixed Memberships with Sharp Eigenvector Deviations,” Journal of the American Statistical Association, 116, 1928–1940. DOI: 10.1080/01621459.2020.1751645.
  • Marchette. (2021), HyperG: Hypergraphs in R, R package version 1.0.0.
  • Ng, T. L. J., and Murphy, T. B. (2021), “Model-based Clustering for Random Hypergraphs,” Advances in Data Analysis and Classification, 16, 691–723. DOI: 10.1007/s11634-021-00454-7.
  • Penrose, M. (2003), Random Geometric Graphs. Oxford Studies in Probability. Oxford: Oxford University Press.
  • Raftery, A. E., Niu, X., Hoff, P. D., and Yeung, K. Y. (2012), “Fast Inference for the Latent Space Network Model Using a Case-Control Approximate Likelihood,” Journal of Computational and Graphical Statistics, 21, 901–919. DOI: 10.1080/10618600.2012.679240.
  • Rastelli, R., Friel, N., and Raftery, A. E. (2016), “Properties of Latent Variable Network Models,” Network Science, 4, 407–432. DOI: 10.1017/nws.2016.23.
  • Salter-Townshend, M., and Murphy, T. B. (2013), “Variational Bayesian Inference for the Latent Position Cluster Model for Network Data,” Computational Statistics & Data Analysis, 57, 661–671. DOI: 10.1016/j.csda.2012.08.004.
  • Stasi, D., Sadeghi, K., Rinaldo, A., Petrović, S., and Fienberg, S. E. (2014), “β Models for Random Hypergraphs with a Given Degree Sequence,” ArXiv e-prints.
  • Teerapabolarn, K. (2014), “An Improvement of Poisson Approximation for Sums of Dependent Bernoulli Random Variables,” Communications in Statistics - Theory and Methods, 43, 1758–1777. DOI: 10.1080/03610926.2012.674606.
  • The GUDHI Project (2021), GUDHI User and Reference Manual. GUDHI Editorial Board, 3.4.1 edition.
  • Vihola, M. (2012), “Robust Adaptive Metropolis Algorithm with Coerced Acceptance Rate,” Statistics and Computing, 22, 997–1008. DOI: 10.1007/s11222-011-9269-5.
  • Wang, J.-W., Rong, L.-L., Deng, Q.-H., and Zhang, J.-Y. (2010), “Evolving Hypernetwork Model,” The European Physical Journal B, 77, 493–498. DOI: 10.1140/epjb/e2010-00297-8.
  • Wang, P., Pattison, P., and Robins, G. (2013), “Exponential Random Graph Model Specifications for Bipartite Networks—A Dependence Hierarchy,” Social Networks, 35, 211–222. DOI: 10.1016/j.socnet.2011.12.004.
  • Young, J.-G., Petri, G., Vaccarino, F., and Patania, A. (2017), “Construction of and Efficient Sampling from the Simplicial Configuration Model,” Physical Review E, 96, 032312. DOI: 10.1103/PhysRevE.96.032312.
  • Zhong, Y., Arandjelovic, R., and Zisserman, A. (2018), “Compact Deep Aggregation for Set Retrieval,” in Proceedings of the European Conference on Computer Vision Workshops.
  • Zuev, K., Eisenberg, O., and Krioukov, D. (2015), “Exponential Random Simplicial Complexes,” Journal of Physics A: Mathematical and Theoretical, 48, 465002. DOI: 10.1088/1751-8113/48/46/465002.