ABSTRACT
Personalized medicine has attracted much attention in recent decades, and identifying the effects of risk factors is essential for personalized prevention and treatment. Hypertension is a major modifiable risk factor for cardiovascular disease and is influenced by a complex set of factors. To reduce the incidence of hypertension effectively, subjects should be divided into subgroups according to their characteristics. In this study, we propose a heterogeneous logistic regression combined with a concave fusion penalty to analyze population-based survey data that include common influencing factors of hypertension. The analysis comprises two steps: (1) identifying the most important predictor; (2) estimating subgroup-based heterogeneous effects. For the primary hypertension data considered here, the prediction accuracy of our method exceeded 99%, whereas that of the classical logistic regression was zero. These findings can serve as a practical guide for implementing further individualized measures.
Disclosure statement
No potential conflict of interest was reported by the authors.
7.2. Initial value
To facilitate the parameter updates in steps (2.8) to (2.10) of the ADMM iterative algorithm, we need to specify proper initial values. We obtain the regression estimators at the first step by minimizing a ridge fusion criterion.
The ridge tuning parameter is set to a small fixed value, and the criterion can be written compactly in matrix notation.
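A plausible reconstruction of this criterion, adapting the ridge fusion criterion of Ma and Huang (2016) to the logistic log-likelihood (the notation $\mu_i$ for subject-specific intercepts, $\boldsymbol{\beta}$ for the common coefficients, and $\lambda^{*}$ for the fixed ridge parameter is assumed here rather than recovered from the original display), is
\[
L^{R}(\boldsymbol{\mu},\boldsymbol{\beta}) = -\sum_{i=1}^{n}\Big[y_i\big(\mu_i+\mathbf{x}_i^{\top}\boldsymbol{\beta}\big)-\log\big(1+e^{\mu_i+\mathbf{x}_i^{\top}\boldsymbol{\beta}}\big)\Big]+\frac{\lambda^{*}}{2}\sum_{1\le i<j\le n}(\mu_i-\mu_j)^{2}.
\]
In matrix notation the penalty equals $\frac{\lambda^{*}}{2}\,\boldsymbol{\mu}^{\top}\big(n\mathbf{I}_n-\mathbf{1}_n\mathbf{1}_n^{\top}\big)\boldsymbol{\mu}$, by the identity $\sum_{1\le i<j\le n}(\mu_i-\mu_j)^{2}=n\|\boldsymbol{\mu}\|^{2}-(\mathbf{1}_n^{\top}\boldsymbol{\mu})^{2}$.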
7.3. Tuning parameter
From a grid of values, we select the optimal tuning parameter by minimizing a modified Bayesian information criterion (BIC).
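One plausible form of this criterion, following the strategy of Ma and Huang (2016) and written for a likelihood-based fit (the notation $\hat K(\lambda)$ for the estimated number of subgroups, $p$ for the number of covariates, and the constant $c>0$ is our assumption), is
\[
\mathrm{BIC}(\lambda) = -\frac{2}{n}\,\ell_n\big(\hat{\boldsymbol{\mu}}(\lambda),\hat{\boldsymbol{\beta}}(\lambda)\big)+C_n\frac{\log n}{n}\big(\hat K(\lambda)+p\big),\qquad C_n=c\log\big(\log(n+p)\big),
\]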
where $C_n$ is a positive number depending on $n$ and $p$. We adopt the strategy of Ma and Huang (2016) for the choice of these constants.
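For concreteness, a minimal grid-search sketch under the assumptions above; fit_model is a hypothetical stand-in for the paper's ADMM solver and is not part of the original text:

import numpy as np

def modified_bic(loglik, n, p, K_hat, c=1.0):
    # Goodness-of-fit term -2*loglik/n plus the dimension penalty
    # C_n * log(n)/n * (K_hat + p), with C_n = c*log(log(n + p))
    # (assumed form; see the display above).
    C_n = c * np.log(np.log(n + p))
    return -2.0 * loglik / n + C_n * np.log(n) / n * (K_hat + p)

# Hypothetical usage: fit over a grid of lambda values and keep the
# fit minimizing the modified BIC.
# lambdas = np.linspace(0.1, 2.0, 40)
# fits = [fit_model(y, X, lam) for lam in lambdas]
# best = min(fits, key=lambda f: modified_bic(f.loglik, len(y), X.shape[1], f.K_hat))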
7.4. Proof of Theorem 1
First, we prove consistency in the constrained lower-dimensional space: we show that there exists a strict local maximizer of the objective function that satisfies the stated bound. We define an event,
where
where the latter set denotes the boundary of the closed neighborhood. On this event, there clearly exists a local maximizer of the objective inside the neighborhood. We then need to prove that, when the sample size is large enough, the probability of this event tends to 1. To this end, we analyze the objective function on the boundary.
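The argument has the following generic structure (the notation $\theta^{0}$ for the true parameter, $\tau_n$ for the neighborhood radius, and $\ell_n$ for the log-likelihood is assumed here):
\[
\mathcal{N}_{\tau_n}=\{\theta:\|\theta-\theta^{0}\|\le\tau_n\},\qquad E_n=\Big\{\sup_{\theta\in\partial\mathcal{N}_{\tau_n}}\ell_n(\theta)<\ell_n(\theta^{0})\Big\}.
\]
On $E_n$, continuity of $\ell_n$ over the compact set $\mathcal{N}_{\tau_n}$ yields a local maximizer in the interior, so $P\big(\|\hat\theta-\theta^{0}\|\le\tau_n\big)\ge P(E_n)$.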
By Taylor's theorem, we can expand the objective at any boundary point around the true parameter; the Taylor expansion is
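Presumably this is the standard second-order form (again in assumed notation):
\[
\ell_n(\theta)-\ell_n(\theta^{0})=(\theta-\theta^{0})^{\top}\nabla\ell_n(\theta^{0})+\tfrac{1}{2}(\theta-\theta^{0})^{\top}\nabla^{2}\ell_n(\theta^{*})(\theta-\theta^{0}),
\]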
where the intermediate point lies on the segment joining the boundary point and the true parameter. Based on the moment properties of the score function,
we get
which, together with Markov's inequality, entails that
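A generic form of this step (with $q$ denoting the parameter dimension, an assumed symbol): the score has mean zero, so $E\|\nabla\ell_n(\theta^{0})\|^{2}$ equals the trace of the information matrix, and
\[
P\Big(\|\nabla\ell_n(\theta^{0})\|\ge M\sqrt{nq}\Big)\le\frac{E\|\nabla\ell_n(\theta^{0})\|^{2}}{M^{2}nq}=O(M^{-2}).
\]
On the boundary, the linear term of the expansion is then $O_p(\tau_n\sqrt{nq})$, which is dominated by the negative quadratic term of order $n\tau_n^{2}$ once $\tau_n\gg\sqrt{q/n}$.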
From the above, we have
Combining these bounds, we obtain the required inequality on the boundary. This proves the consistency claim.
Next, we prove the asymptotic normality of the estimator. On the event above, we have shown that the estimator is a strict local maximizer, so the corresponding score equation holds at the estimator. We then expand the score around the true parameter to first order componentwise. By the properties of the first and second derivatives of the log-likelihood, we have
It then follows that
which, together with the properties noted above, entails
where the small-order term is understood under the stated norm.
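In generic notation, expanding the score equation $\nabla\ell_n(\hat\theta)=0$ around $\theta^{0}$ gives the familiar linearization (assuming the observed information is invertible):
\[
\hat\theta-\theta^{0}=\big\{-\nabla^{2}\ell_n(\theta^{0})\big\}^{-1}\nabla\ell_n(\theta^{0})+o_p\big(n^{-1/2}\big),
\]
which is the representation used in the next step.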
Then, we can show the asymptotic normality. Define a row vector with the normalization stated in the theorem; it then follows from (5) that
where the variance term is as defined. Thus, by Slutsky's lemma, to show the stated convergence, it suffices to prove a central limit theorem for the leading term. Next, we consider the asymptotic distribution of the linear combination
where the summands are independent with mean 0, and
as the sample size tends to infinity. By the Cauchy-Schwarz inequality, we have
Then, an application of Lyapunov's central limit theorem completes the proof of Theorem 1.
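For completeness, the version of Lyapunov's condition being invoked (the symbols $w_{ni}$ and $s_n$ are assumed notation): if $S_n=\sum_{i=1}^{n}w_{ni}$ with independent mean-zero summands $w_{ni}$ and $s_n^{2}=\sum_{i=1}^{n}\mathrm{Var}(w_{ni})$, then
\[
\lim_{n\to\infty}\frac{1}{s_n^{2+\delta}}\sum_{i=1}^{n}E|w_{ni}|^{2+\delta}=0\ \text{for some }\delta>0\quad\Longrightarrow\quad S_n/s_n\xrightarrow{d}N(0,1),
\]
which, combined with Slutsky's lemma, yields the asserted asymptotic normality.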
7.5. Proof of Theorem 2
Define
and let the true subgroup partition and its common values be as defined in Section 2. Let the first mapping take each vector that is constant within subgroups to the lower-dimensional vector whose kth component equals the common value shared within the kth subgroup, and let the second mapping be its subgroup-averaging counterpart, as spelled out below.
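Following the construction in Ma and Huang (2016) (the symbols $\mathcal{M}_{\mathcal{G}}$, $T$, $T^{*}$, and $\mathcal{G}_k$ are assumed notation), these objects are presumably
\[
\mathcal{M}_{\mathcal{G}}=\big\{\boldsymbol{\mu}\in\mathbb{R}^{n}:\mu_i=\mu_j\ \text{for all}\ i,j\in\mathcal{G}_k,\ 1\le k\le K\big\},
\]
with $T:\mathcal{M}_{\mathcal{G}}\to\mathbb{R}^{K}$ sending $\boldsymbol{\mu}$ to the $K$-vector whose $k$th component is the common value of $\mu_i$ for $i\in\mathcal{G}_k$, and $T^{*}:\mathbb{R}^{n}\to\mathbb{R}^{K}$ defined by $\{T^{*}(\boldsymbol{\mu})\}_k=|\mathcal{G}_k|^{-1}\sum_{i\in\mathcal{G}_k}\mu_i$. Note that $T^{*}$ restricted to $\mathcal{M}_{\mathcal{G}}$ coincides with $T$.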
Consider the following neighborhood of the true parameter:
where the radius is as specified. We show that the candidate estimator is a strict local minimizer of the proposed penalized objective function, with probability tending to one, through the following two steps:
(i) On an event whose probability tends to 1, the stated comparison inequality holds between any point in the given neighborhood and its image in the subgroup-structured subspace.
(ii) There is a second event with probability tending to 1 such that, on the intersection of the two events, there is a neighborhood of the candidate estimator within which the penalized objective at any subgroup-structured point is no smaller than at the estimator itself. Together, (i) and (ii) imply that the estimator is a strict local minimizer.
Step (i) can be shown by following Ma and Huang (2016). To show the result in (ii), we consider a shrinking neighborhood indexed by a positive sequence. For points in this neighborhood, by Taylor's expansion, we have
where
Here the remaining quantities are defined in analogy with the expansion in the proof of Theorem 1. Note that
Substituting this choice, we have
where
For the first term, by Condition (C2), we have
we conclude that there is an event with probability tending to 1 such that, under this event and Condition (C3)(i),
Thus, we have
and
Combining the preceding bounds, the desired inequality holds for all sufficiently large n, which completes the proof of Theorem 2.
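A remark on the mechanism behind this final step: concave penalties such as the MCP (stated here as one concrete choice; the paper's penalty may differ) have a derivative that vanishes beyond the threshold $a\lambda$,
\[
\rho_{\lambda}(t)=\lambda\int_{0}^{t}\Big(1-\frac{s}{a\lambda}\Big)_{+}\,ds\quad\Longrightarrow\quad\rho_{\lambda}'(t)=0\ \text{for}\ t\ge a\lambda,
\]
so when the minimal gap between distinct subgroup values exceeds $a\lambda$, the fusion penalty is locally constant near the oracle-type estimator, and the penalized and unpenalized criteria share the same local minimizer there.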