ABSTRACT
Personalized medicine has attracted much attention in recent decades, and identifying the effects of risk factors is essential for personalized prevention and treatment. Hypertension is a major modifiable risk factor for cardiovascular disease and is influenced by a complex set of factors. To reduce the incidence of hypertension effectively, subjects should be divided into subgroups according to their characteristics. In this study, we propose a heterogeneous logistic regression combined with a concave fusion penalty to analyze population-based survey data that include common influencing factors of hypertension. The analysis comprises two steps: (1) identifying the most important predictor; (2) estimating subgroup-based heterogeneous effects. For the primary hypertension data considered here, the prediction accuracy of our method exceeded 99%, whereas that of the classical logistic regression was zero. These findings can serve as a practical guide for implementing further individualized measures.
Disclosure statement
No potential conflict of interest was reported by the authors.
7.2. Initial value
To facilitate the parameter updates in steps (2.8) to (2.10) of the ADMM iterative algorithm, we need to specify proper initial values. We obtain the regression estimators at the first step by minimizing a ridge fusion criterion.
The ridge tuning parameter is set to a small fixed value, and the criterion can be written compactly in matrix notation.
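A plausible reconstruction of this criterion, adapting the ridge fusion criterion of Ma and Huang (2016) to the logistic log-likelihood (the notation $\mu_i$ for subject-specific intercepts, $\boldsymbol{\beta}$ for the common coefficients, and $\lambda^{*}$ for the fixed ridge parameter is assumed here rather than recovered from the original display), is
\[
L^{R}(\boldsymbol{\mu},\boldsymbol{\beta}) = -\sum_{i=1}^{n}\Big[y_i\big(\mu_i+\mathbf{x}_i^{\top}\boldsymbol{\beta}\big)-\log\big(1+e^{\mu_i+\mathbf{x}_i^{\top}\boldsymbol{\beta}}\big)\Big]+\frac{\lambda^{*}}{2}\sum_{1\le i<j\le n}(\mu_i-\mu_j)^{2}.
\]
In matrix notation the penalty equals $\frac{\lambda^{*}}{2}\,\boldsymbol{\mu}^{\top}\big(n\mathbf{I}_n-\mathbf{1}_n\mathbf{1}_n^{\top}\big)\boldsymbol{\mu}$, by the identity $\sum_{1\le i<j\le n}(\mu_i-\mu_j)^{2}=n\|\boldsymbol{\mu}\|^{2}-(\mathbf{1}_n^{\top}\boldsymbol{\mu})^{2}$.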
7.3. Tuning parameter
From a grid of values, we select the optimal tuning parameter by minimizing a modified Bayesian information criterion (BIC).
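One plausible form of this criterion, following the strategy of Ma and Huang (2016) and written for a likelihood-based fit (the notation $\hat K(\lambda)$ for the estimated number of subgroups, $p$ for the number of covariates, and the constant $c>0$ is our assumption), is
\[
\mathrm{BIC}(\lambda) = -\frac{2}{n}\,\ell_n\big(\hat{\boldsymbol{\mu}}(\lambda),\hat{\boldsymbol{\beta}}(\lambda)\big)+C_n\frac{\log n}{n}\big(\hat K(\lambda)+p\big),\qquad C_n=c\log\big(\log(n+p)\big),
\]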
where $C_n$ is a positive number depending on $n$ and $p$. We adopt the strategy of Ma and Huang (2016) for the choice of these constants.
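For concreteness, a minimal grid-search sketch under the assumptions above; fit_model is a hypothetical stand-in for the paper's ADMM solver and is not part of the original text:

import numpy as np

def modified_bic(loglik, n, p, K_hat, c=1.0):
    # Goodness-of-fit term -2*loglik/n plus the dimension penalty
    # C_n * log(n)/n * (K_hat + p), with C_n = c*log(log(n + p))
    # (assumed form; see the display above).
    C_n = c * np.log(np.log(n + p))
    return -2.0 * loglik / n + C_n * np.log(n) / n * (K_hat + p)

# Hypothetical usage: fit over a grid of lambda values and keep the
# fit minimizing the modified BIC.
# lambdas = np.linspace(0.1, 2.0, 40)
# fits = [fit_model(y, X, lam) for lam in lambdas]
# best = min(fits, key=lambda f: modified_bic(f.loglik, len(y), X.shape[1], f.K_hat))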
7.4. Proof of Theorem 1
First, we prove consistency in the constrained lower-dimensional space: we show that there exists a strict local maximizer of the objective function that satisfies the stated bound. We define an event,
where
where the latter set denotes the boundary of the closed neighborhood. On this event, there clearly exists a local maximizer of the objective inside the neighborhood. We then need to prove that, when the sample size is large enough, the probability of this event tends to 1. To this end, we analyze the objective function on the boundary.
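The argument has the following generic structure (the notation $\theta^{0}$ for the true parameter, $\tau_n$ for the neighborhood radius, and $\ell_n$ for the log-likelihood is assumed here):
\[
\mathcal{N}_{\tau_n}=\{\theta:\|\theta-\theta^{0}\|\le\tau_n\},\qquad E_n=\Big\{\sup_{\theta\in\partial\mathcal{N}_{\tau_n}}\ell_n(\theta)<\ell_n(\theta^{0})\Big\}.
\]
On $E_n$, continuity of $\ell_n$ over the compact set $\mathcal{N}_{\tau_n}$ yields a local maximizer in the interior, so $P\big(\|\hat\theta-\theta^{0}\|\le\tau_n\big)\ge P(E_n)$.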
By Taylor's theorem, we can expand the objective at any boundary point around the true parameter; the Taylor expansion is
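Presumably this is the standard second-order form (again in assumed notation):
\[
\ell_n(\theta)-\ell_n(\theta^{0})=(\theta-\theta^{0})^{\top}\nabla\ell_n(\theta^{0})+\tfrac{1}{2}(\theta-\theta^{0})^{\top}\nabla^{2}\ell_n(\theta^{*})(\theta-\theta^{0}),
\]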
where the intermediate point lies on the segment joining the boundary point and the true parameter. Based on the moment properties of the score function,
we get
which, together with Markov's inequality, entails that
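A generic form of this step (with $q$ denoting the parameter dimension, an assumed symbol): the score has mean zero, so $E\|\nabla\ell_n(\theta^{0})\|^{2}$ equals the trace of the information matrix, and
\[
P\Big(\|\nabla\ell_n(\theta^{0})\|\ge M\sqrt{nq}\Big)\le\frac{E\|\nabla\ell_n(\theta^{0})\|^{2}}{M^{2}nq}=O(M^{-2}).
\]
On the boundary, the linear term of the expansion is then $O_p(\tau_n\sqrt{nq})$, which is dominated by the negative quadratic term of order $n\tau_n^{2}$ once $\tau_n\gg\sqrt{q/n}$.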
From the above, we have
Combining these bounds, we obtain the required inequality on the boundary. This proves the consistency claim.
Next, we prove the asymptotic normality of the estimator. On the event above, we have shown that the estimator is a strict local maximizer, so the corresponding score equation holds at the estimator. We then expand the score around the true parameter to first order componentwise. By the properties of the first and second derivatives of the log-likelihood, we have
It then follows that
which, together with the properties noted above, entails
where the small-order term is understood under the stated norm.
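In generic notation, expanding the score equation $\nabla\ell_n(\hat\theta)=0$ around $\theta^{0}$ gives the familiar linearization (assuming the observed information is invertible):
\[
\hat\theta-\theta^{0}=\big\{-\nabla^{2}\ell_n(\theta^{0})\big\}^{-1}\nabla\ell_n(\theta^{0})+o_p\big(n^{-1/2}\big),
\]
which is the representation used in the next step.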
Then, we can show the asymptotic normality. Define a row vector with the normalization stated in the theorem; it then follows from (5) that
where the variance term is as defined. Thus, by Slutsky's lemma, to show the stated convergence, it suffices to prove a central limit theorem for the leading term. Next, we consider the asymptotic distribution of the linear combination
where the summands are independent with mean 0, and
as the sample size tends to infinity. By the Cauchy-Schwarz inequality, we have
Then, an application of Lyapunov's central limit theorem completes the proof of Theorem 1.
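For completeness, the version of Lyapunov's condition being invoked (the symbols $w_{ni}$ and $s_n$ are assumed notation): if $S_n=\sum_{i=1}^{n}w_{ni}$ with independent mean-zero summands $w_{ni}$ and $s_n^{2}=\sum_{i=1}^{n}\mathrm{Var}(w_{ni})$, then
\[
\lim_{n\to\infty}\frac{1}{s_n^{2+\delta}}\sum_{i=1}^{n}E|w_{ni}|^{2+\delta}=0\ \text{for some }\delta>0\quad\Longrightarrow\quad S_n/s_n\xrightarrow{d}N(0,1),
\]
which, combined with Slutsky's lemma, yields the asserted asymptotic normality.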
7.5. Proof of Theorem 2
Define
and let the true subgroup partition and its common values be as defined in Section 2. Let the first mapping take each vector that is constant within subgroups to the lower-dimensional vector whose kth component equals the common value shared within the kth subgroup, and let the second mapping be its subgroup-averaging counterpart, as spelled out below.
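Following the construction in Ma and Huang (2016) (the symbols $\mathcal{M}_{\mathcal{G}}$, $T$, $T^{*}$, and $\mathcal{G}_k$ are assumed notation), these objects are presumably
\[
\mathcal{M}_{\mathcal{G}}=\big\{\boldsymbol{\mu}\in\mathbb{R}^{n}:\mu_i=\mu_j\ \text{for all}\ i,j\in\mathcal{G}_k,\ 1\le k\le K\big\},
\]
with $T:\mathcal{M}_{\mathcal{G}}\to\mathbb{R}^{K}$ sending $\boldsymbol{\mu}$ to the $K$-vector whose $k$th component is the common value of $\mu_i$ for $i\in\mathcal{G}_k$, and $T^{*}:\mathbb{R}^{n}\to\mathbb{R}^{K}$ defined by $\{T^{*}(\boldsymbol{\mu})\}_k=|\mathcal{G}_k|^{-1}\sum_{i\in\mathcal{G}_k}\mu_i$. Note that $T^{*}$ restricted to $\mathcal{M}_{\mathcal{G}}$ coincides with $T$.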
Consider the following neighborhood of the true parameter:
where the radius is as specified. We show that the candidate estimator is a strict local minimizer of the proposed penalized objective function, with probability tending to one, through the following two steps:
(i) On an event whose probability tends to 1, the stated comparison inequality holds between any point in the given neighborhood and its image in the subgroup-structured subspace.
(ii) There is a second event with probability tending to 1 such that, on the intersection of the two events, there is a neighborhood of the candidate estimator within which the penalized objective at any subgroup-structured point is no smaller than at the estimator itself. Together, (i) and (ii) imply that the estimator is a strict local minimizer.
Step (i) can be shown by following Ma and Huang (2016). To show the result in (ii), we consider a shrinking neighborhood indexed by a positive sequence. For points in this neighborhood, by Taylor's expansion, we have
where
Here the remaining quantities are defined in analogy with the expansion in the proof of Theorem 1. Note that
Substituting this choice, we have
where
For the first term, by Condition (C2), we have
we conclude that there is an event with probability tending to 1 such that, under this event and Condition (C3)(i),
Thus, we have
and
Combining the preceding bounds, the desired inequality holds for all sufficiently large n, which completes the proof of Theorem 2.
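A remark on the mechanism behind this final step: concave penalties such as the MCP (stated here as one concrete choice; the paper's penalty may differ) have a derivative that vanishes beyond the threshold $a\lambda$,
\[
\rho_{\lambda}(t)=\lambda\int_{0}^{t}\Big(1-\frac{s}{a\lambda}\Big)_{+}\,ds\quad\Longrightarrow\quad\rho_{\lambda}'(t)=0\ \text{for}\ t\ge a\lambda,
\]
so when the minimal gap between distinct subgroup values exceeds $a\lambda$, the fusion penalty is locally constant near the oracle-type estimator, and the penalized and unpenalized criteria share the same local minimizer there.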