ABSTRACT
Personalized medicine has gained much attention in recent decades, and identifying the effects of risk factors is essential for personalized prevention and treatment. Hypertension is a major modifiable risk factor for cardiovascular disease and is influenced by complex factors. To reduce the incidence of hypertension effectively, subjects should be divided into subgroups according to their characteristics. In this study, we propose a heterogeneous logistic regression combined with a concave fusion penalty to analyze population-based survey data containing common influencing factors of hypertension. The analysis proceeds in two steps: (1) identifying the most important predictor; (2) estimating subgroup-based heterogeneous effects. On the primary hypertension data considered here, the prediction accuracy under our method exceeded 99%, while it was zero under the classical logistic regression. These findings could provide a practical guide for implementing further individualized measures.
Disclosure statement
No potential conflict of interest was reported by the authors.
7.2. Initial value
To facilitate the update of , at the step in (2.8) to (2.10) of the ADMM iterative algorithm, we need to specify a proper initial value. We obtain the regression estimators at the first step by minimizing a ridge fusion criterion by setting ; using matrix notation, we have
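The ridge-fusion initialization described above can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the subject-specific intercept `mu`, the shared slope `beta`, the `1/n` scaling of the pairwise ridge penalty, and the plain gradient-descent solver are all assumptions made for the sketch.

```python
import numpy as np

def ridge_fusion_init(X, y, lam=1.0, lr=0.1, n_iter=500):
    """Sketch of a ridge-fusion initializer for a heterogeneous logistic model:
    minimize the logistic negative log-likelihood plus
    (lam / (2n)) * sum_{i<j} (mu_i - mu_j)^2,
    where mu_i is a subject-specific intercept and beta a shared slope.
    (All names and the 1/n penalty scaling are illustrative choices.)"""
    n, p = X.shape
    mu = np.zeros(n)      # subject-specific intercepts
    beta = np.zeros(p)    # shared slope coefficients
    for _ in range(n_iter):
        eta = mu + X @ beta
        prob = 1.0 / (1.0 + np.exp(-eta))
        r = prob - y                    # d(neg log-lik)/d(eta_i)
        # gradient of (1/2) sum_{i<j} (mu_i - mu_j)^2 w.r.t. mu_k is
        # n*mu_k - sum_j mu_j, so with the 1/n scaling it is lam*(mu - mean(mu))
        g_mu = r + lam * (mu - mu.mean())
        g_beta = X.T @ r
        mu -= lr * g_mu / n
        beta -= lr * g_beta / n
    return mu, beta
```

A larger `lam` pulls the subject-specific intercepts toward a common value, which is the intended behavior of a ridge fusion start: smooth, non-sparse estimates that the concave-fusion ADMM can then refine into subgroups.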
7.3. Tuning parameter
From a grid of values, we select the optimal tuning parameter by minimizing a modified , where is a positive number dependent on . We adopt the strategy of Ma and Huang (2016) and take , and .
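The grid search above can be sketched with a BIC-type criterion of the kind used in the subgroup-analysis literature. The exact criterion, the form of the positive sequence `c_n`, and the `fit` interface below are illustrative assumptions, not the paper's definitions.

```python
import numpy as np

def modified_bic(neg_loglik, n, n_groups, p, c_n=None):
    """BIC-type criterion sketch: a goodness-of-fit term plus a penalty
    proportional to the effective model size (n_groups subgroup-specific
    parameters + p common slopes). c_n is a positive number depending on n;
    log(log(n + p)) is one common choice (illustrative here)."""
    if c_n is None:
        c_n = np.log(np.log(n + p))
    df = n_groups + p
    return 2.0 * neg_loglik / n + c_n * np.log(n) * df / n

def select_lambda(fit, lam_grid):
    """Pick the lambda on the grid minimizing the modified BIC.
    `fit` maps lambda -> (neg_loglik, n, n_groups, p); it stands in for the
    penalized estimation routine, which is not shown."""
    scores = [modified_bic(*fit(lam)) for lam in lam_grid]
    return lam_grid[int(np.argmin(scores))]
```

Small `lam` overfits with many subgroups; large `lam` fuses everything into one group at the cost of fit, and the criterion trades the two off.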
7.4. Proof of Theorem 1
First, we prove the consistency in the -dimensional space by constraining . We prove that there exists a strict local maximizer of that satisfies . We define an event , where and denotes the boundary of the closed set , and . It is easy to see that there exists a local maximizer of on the event in . Then we need to prove that, when is large enough, is close to 1 as . Hence we need to analyze the function on the boundary .
By Taylor’s theorem, for any , the expansion is , where , , , and lies on the segment joining and . Based on , we get , which together with Markov’s inequality entails that . From , we have ; with , we get . This proves .
Next, we prove the asymptotic normality of . On the event , we have shown that is a strict local maximizer of , so we can easily obtain . Next, we expand around to the first order componentwise. Then, by the properties of and , we have . It follows from that , which together with the properties of entails , where , and the small-order term is understood under the norm.
Then we can show the asymptotic normality of . Define as a row vector such that . It then follows from (5) that , where . Thus, by Slutsky’s lemma, to show , it suffices to prove . Next, we consider the asymptotic distribution of the linear combination , where . Clearly, the ’s are independent and have mean 0, and as . By the Cauchy–Schwarz inequality, we have . Then, by Lyapunov’s central limit theorem, we complete the proof.
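For reference, the generic form of the Lyapunov central limit theorem invoked here, stated for independent mean-zero summands (the proof applies it to its own linear combination, whose exact form is given above):

```latex
\[
\frac{1}{s_n^{2+\delta}} \sum_{i=1}^{n} E\,|X_i|^{2+\delta} \longrightarrow 0
\ \text{ for some } \delta>0
\;\Longrightarrow\;
\frac{1}{s_n}\sum_{i=1}^{n} X_i \xrightarrow{\;d\;} N(0,1),
\qquad s_n^2=\sum_{i=1}^{n}\operatorname{Var}(X_i).
\]
```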
7.5. Proof of Theorem 2
Define and let , . Let be the mapping such that is the vector consisting of vectors of dimension , whose vector component equals the common value of for . Let be the mapping such that . Consider the neighborhood of : , where . We show that is a strict local minimizer of the proposed penalized objective function almost surely through the following two steps:
(i) In the event , where , for any and , where .
(ii) There is an event such that , and in , there is a neighborhood of such that, for , .
It is easy to show (i) following Ma and Huang (2016). To show the result in (ii), we consider a positive sequence . For , by Taylor’s expansion, we have , where . Here, . Note that . Setting , we have , where . For , since , with Condition (C2) we have . We conclude that there is an event such that , and under the event and Condition (C3)(i), . Thus, we have and . Let , and then . Since , we have for a sufficiently large , which completes the proof of Theorem 2.