Structural and mechanistic insights into nuclear transport and delivery of the critical pluripotency factor Oct4 to DNA

Pages 767-778 | Received 20 Sep 2016, Accepted 26 Dec 2016, Published online: 17 Feb 2017

Abstract

Oct4 is a master regulator of the induction and maintenance of cellular pluripotency, and has crucial roles in early stages of differentiation. It is the only factor that cannot be substituted by other members of the same protein family to induce pluripotency. However, although Oct4 nuclear transport and delivery to target DNA are critical events for reprogramming to pluripotency, little is known about the molecular mechanism. Oct4 is imported to the nucleus by the classical nuclear transport mechanism, which requires importin α as an adaptor to bind the nuclear localization signal (NLS). Although structures are available for the NLSs of transcription factors (TFs) in complex with importin α, there are none for complexes involving intact TFs. We have therefore modeled the structure of the complex of the whole Oct4 POU domain and importin α2 using protein-protein docking and molecular dynamics. The model explains how the Ebola virus VP24 protein has a negative effect on the nuclear import of STAT1 by importin α but not on Oct4, and how Nup50 facilitates cargo release from importin α. The model demonstrates the structural differences between Oct4 in the importin α bound state and in the DNA bound crystal state. We propose that the ‘expanded linker’ between the two DNA-binding domains of Oct4 is an intrinsically disordered region and that its conformational changes have a key role in the recognition of, and binding to, both DNA and importin α. Moreover, we propose that this structural change enables efficient delivery to DNA after release from importin α.

1. Introduction

Differentiated somatic cells can be reprogrammed into induced pluripotent stem (iPS) cells by the forced expression of Oct4, Sox2, Klf4, and c-Myc (Takahashi & Yamanaka, Citation2006). Oct4 (also known as Oct3 or Oct3/4) is the only one of these factors that cannot be substituted by other members of the same protein family to induce pluripotency (Feng et al., Citation2009; Nakagawa et al., Citation2008; Shi et al., Citation2008). It has also become clear that Oct4 is a master regulator of the induction and maintenance of cellular pluripotency and has crucial roles in the early stages of differentiation (Jerabek, Merino, Schöler, & Cojocaru, Citation2014). Oct4 is encoded by the gene POU5F1 and is a member of the octamer motif (consensus ATGCAAAT)-binding subgroup of the POU family of transcription factors (TFs) (Latchman, Citation2015). The POU domain (a bipartite helix-turn-helix DNA-binding domain) consists of two subdomains, the POU-specific domain (POUS) with four α-helices and the POU homeodomain (POUHD) with three α-helices, which are connected by a variable linker region (Figure S1) (Esch et al., Citation2013). Oct4 functions by recognizing and binding to DNA regulatory regions alone or in cooperation with other TFs and recruits many different factors such as epigenetic factors (e.g. chromatin remodeling complexes), co-activator complexes, other TFs and components of the basal transcription machinery for regulating the expression of its target genes (Ding, Xu, Faiola, Ma’ayan, & Wang, Citation2012; Esch et al., Citation2013; Ho et al., Citation2009; Pardo et al., Citation2010; Singhal et al., Citation2010; van den Berg et al., Citation2010).

Understanding the processes involved in the subcellular localization of Oct4 is critical to the understanding of somatic cell reprogramming and the maintenance of pluripotency in embryonic stem (ES) cells (Oka, Moriyama, Asally, Kawakami, & Yoneda, Citation2013). Oct4 is imported into the nucleus by the classical nuclear transport mechanism (Goldfarb, Corbett, Mason, Harreman, & Adam, Citation2004; Mosammaparast & Pemberton, Citation2004); that is to say, Oct4 containing a nuclear localization signal (NLS) is imported by heterodimeric import receptors consisting of importin α and importin β (Figure S2a). Importin α is the adaptor protein, which directly binds to the NLS (223RKRKR227) at the N terminus of the POUHD of Oct4 (Figures S1 and S2a) (Li, Sun, & Jin, Citation2008; Pan, Qin, Liu, Scholer, & Pei, Citation2004; Yasuhara et al., Citation2007; Young, Major, Miyamoto, Loveland, & Jans, Citation2011). Importin β mediates the interaction of the trimeric complex with the nuclear pore while it translocates into the nucleus (Bayliss, Littlewood, & Stewart, Citation2000) and dissociates in the nucleus upon binding of Ran GTP followed by importin α release (Figure S2a) (Goldfarb et al., Citation2004; Mosammaparast & Pemberton, Citation2004). Currently, seven and six importin α subtypes of human (Pumroy & Cingolani, Citation2015) and mouse (Loveland et al., Citation2015) origin are known, respectively. Importin αs are composed of 10 armadillo (ARM) motifs, each of which is constructed from three α-helices and which have two classical NLS (cNLS) binding sites: the major binding site spans ARM repeats 2–4, whereas the minor site spans ARM repeats 6–8 (Figure S2b) (Pumroy & Cingolani, Citation2015). 
Our collaborators have revealed that the subtype switching of importin α triggers neural differentiation of ES cells by changing the nuclear transport network (Yasuhara et al., Citation2007) and that importin α2 (KPNA2, karyopherin alpha 2) has a novel NLS binding site in its C-terminal region and maintains the undifferentiated state of ES cells by inhibiting nuclear import of particular POU TFs that induce differentiation (Yasuhara et al., Citation2013). However, although Oct4 nuclear transport and subsequent delivery to target DNA are critical events for reprogramming to pluripotency, little is known about the molecular mechanism. This arises from a lack of three-dimensional structural information; that is, although the crystal structure of the complex between the Oct4 POU domain and DNA is available (Esch et al., Citation2013), there is no such structural information about Oct4 bound to importin α.

Here, we predict the structure of the complex between the whole Oct4 POU domain and importin α2 using protein-protein docking and molecular dynamics (MD) calculations. In this paper, we identify the structural differences between Oct4 in the nuclear transport factor bound and DNA bound states, and discuss the molecular mechanism of Oct4 nuclear transport and its delivery to DNA.

2. Materials and methods

2.1. Preparation for calculation

Chain B (which contains no cis-peptide bonds) of PDB ID: 3L1P (Esch et al., Citation2013) was used as the primary crystal structure of the POU domain of Oct4. The main chain of a segment lacking coordinates because of poor electron density (residues 217–220) was built by modeling and energy optimization using Discovery Studio 4.1 (BIOVIA). Residues 217–220 correspond to residues 87–90 in the original PDB file 3L1P. Note that the residue numbering scheme counted from the initial methionine in the gene is used for Oct4 throughout this paper (refer to Figure S1).
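The correspondence between the two numbering schemes can be sketched as a simple offset. This is only an illustration: the function names are hypothetical, and it assumes that the offset implied by the stated pair (217–220 in gene numbering, 87–90 in the PDB file) holds for every residue of the chain, which the text does not state explicitly.

```python
# Offset inferred from the text: gene residues 217-220 <-> PDB 3L1P residues 87-90.
# ASSUMPTION: the same constant offset applies along the whole chain.
PDB_TO_GENE_OFFSET = 217 - 87  # 130


def pdb_to_gene(pdb_resnum: int) -> int:
    """Convert a residue number from the PDB file to gene-based numbering."""
    return pdb_resnum + PDB_TO_GENE_OFFSET


def gene_to_pdb(gene_resnum: int) -> int:
    """Convert a gene-based residue number to the numbering in the PDB file."""
    return gene_resnum - PDB_TO_GENE_OFFSET


if __name__ == "__main__":
    # The rebuilt segment, expressed in PDB-file numbering.
    print([gene_to_pdb(r) for r in range(217, 221)])
```

Under this assumption, the NLS (223RKRKR227) would be found at residues 93–97 in the PDB file.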

2.2. Protein-protein docking

PDB ID: 1Q1T (Fontes et al., Citation2003) was used as the crystal structure of importin α2 (KPNA2). Although 1Q1T lacks 33 residues at the C-terminus (497–529) of the NLS binding domain (70–529), it includes the full structure of the 10 ARM motifs. We therefore used the coordinates as they appear in the database. For Oct4, the coordinates completed as described in Section 2.1 were used. The protein-protein docking calculation of importin α2 and Oct4 was performed using ClusPro 2.0 (Comeau, Gatchell, Vajda, & Camacho, Citation2004a, 2004b; Kozakov, Brenke, Comeau, & Vajda, Citation2006; Kozakov et al., Citation2013). To reproduce the interactions in the complex between importin α2 and the SV40 NLS (Figure S2c), the calculations were set up so that an attractive force acts between specified side chains (Arg223(P1)-Asn239; Lys224(P2)-Thr155, Asp192; Arg225(P3)-Trp231; Lys226(P4)-Ile112; Arg227(P5)-Trp142, Gln181, Trp184). Interactions such as hydrogen bonds, salt bridges, pi-cation interactions, and van der Waals interactions were examined using Discovery Studio 4.1 (BIOVIA).

2.3. MD calculation

The MD simulation was carried out using Amber 11 (Case et al., Citation2010). We first carried out a simulation in a vacuum environment for 100 ps at 300 K using a distance-dependent dielectric constant (ε = r). We expected conformational changes of the proteins to be accelerated in vacuum, because they are not slowed down by explicitly treated surrounding solvent molecules. We used the ff12SB force field. The SHAKE algorithm was applied to all bonds involving hydrogen atoms, and a time step of 1 fs was used. The temperature was controlled by Langevin dynamics with a collision frequency of 5 ps⁻¹. During the simulation, the six important interaction distances between specific pairs of atoms, Arg223(P1)[CZ]-Asn239[OD1]; Lys224(P2)[NZ]-Asp192[OD1]; Arg227(P5)[CZ]-Trp142[CD2], Gln181[OE1], Trp184[CD2], were restrained to a range between 3 and 5 Å, which is ideal for the interaction (Figure S2c). These restraints were successively imposed in the order P5, P2, and P1 with a force constant of 10 kcal/(mol·Å²). In this calculation, we imposed three additional restraints. (i) Heavy atoms of importin α2 were restrained to their initial positions in the docked structure. (ii) There are two bundles (POUS and POUHD) of α-helices in the Oct4 POU domain (chain B), and these are known to be well conserved. Thus, the hydrogen bonds formed in the backbone of the POU domain were retained during the MD simulation by applying a strong restraining potential to the Oi–Ni+4 distances of about 3 Å (specifically, the potential was applied when a distance deviated by more than 0.2 Å from its value in the modeled structure). (iii) The backbone structure of residues 223 to 227 in Oct4 was also kept constant by restraining the torsion angles φ, ψ and ω. Specifically, strong restraining potentials were applied when φ and ψ deviated by more than 20° from their values in the docked structure.
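The idea of restraining a distance to the 3–5 Å window can be illustrated with a flat-bottom harmonic penalty. The sketch below is a simplification, not the functional form used in the calculation itself: it assumes a penalty of k·Δ² (no factor of ½) outside the window and zero inside it, and the function name and defaults are hypothetical, with the bounds and force constant taken from the text.

```python
def flat_bottom_restraint(distance, lower=3.0, upper=5.0, k=10.0):
    """Penalty energy (kcal/mol) for an interatomic distance given in angstroms.

    ASSUMPTION: simple flat-bottom harmonic form, E = k * delta**2 outside
    [lower, upper] and zero inside; k is in kcal/(mol*A^2).
    """
    if distance < lower:
        return k * (distance - lower) ** 2
    if distance > upper:
        return k * (distance - upper) ** 2
    return 0.0


if __name__ == "__main__":
    # No penalty inside the window, quadratic growth outside it.
    for d in (2.5, 4.0, 6.0):
        print(d, flat_bottom_restraint(d))
```

With this form, a pair of atoms anywhere inside the window feels no restraining force, so the side chains remain free to adjust as long as the contact is maintained.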

Secondly, we carried out simulations in an aqueous environment, including surrounding water molecules under periodic boundary conditions. Unless otherwise noted, we employed the same protocol as for the simulations in vacuum. The modeled complex was solvated with TIP3P water molecules to fill a periodic box. Five Cl⁻ ions were placed around the proteins using the LEaP module in AMBER to obtain electrostatic neutrality. The total number of water molecules was 21,168. Joung/Cheatham ion parameters for TIP3P water were used. We performed a 2000-step minimization with positional restraints of 100 kcal/(mol·Å²) placed on the heavy atoms of importin α2 and the main-chain atoms of Oct4. The entire system was then heated from 0 to 300 K over 20 ps. After a 0.1 ns NVT MD simulation, again with the positional restraints, the density was close to 1 g/cc. Finally, the resultant structure was used to carry out 0.7 ns of MD simulation in the NPT ensemble at a constant temperature of 300 K and a pressure of 1 bar. The pressure was controlled by the Berendsen method with a relaxation time of 2 ps. Note that the same restraints as in the vacuum simulations were imposed. The structures obtained in the aqueous environment and in vacuum were quite similar to each other.
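For orientation, the lengths of the individual stages can be converted into numbers of integration steps. The short sketch below assumes that the 1 fs time step stated for the vacuum run also applies to the aqueous stages (the text says only that the same protocol was used unless otherwise noted); the stage labels are ours.

```python
TIME_STEP_FS = 1  # ASSUMPTION: 1 fs, as stated for the vacuum run, is used throughout

# Stage durations in picoseconds, as given in the text.
STAGES_PS = {
    "vacuum MD": 100,
    "heating 0-300 K": 20,
    "NVT equilibration": 100,  # 0.1 ns
    "NPT production": 700,     # 0.7 ns
}


def n_steps(duration_ps, time_step_fs=TIME_STEP_FS):
    """Number of integration steps needed to cover duration_ps."""
    return round(duration_ps * 1000 / time_step_fs)


if __name__ == "__main__":
    for name, duration in STAGES_PS.items():
        print(f"{name}: {n_steps(duration)} steps")
```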

Nevertheless, it must be noted that our MD calculation is less than ideal for two reasons. First, presently available force fields that work well for folded proteins have been shown to be imperfect when applied to intrinsically disordered proteins (Best, Zheng, & Mittal, Citation2014; Piana, Donchev, Robustelli, & Shaw, Citation2015). Second, the simulation time required for equilibration has been estimated to be very long (10 to 100 μs for estimating delicate structural properties) (Best et al., Citation2014; Bhowmick et al., Citation2016), and such a long simulation is beyond the scope of this article. Our model can, however, be used as an initial structure for much longer simulations.

2.4. Molecular graphics and movies

All molecular graphics, the morphing and MD simulation animations, and the atom-based 3D superimposition of molecules in this article were produced using Chimera (Pettersen et al., Citation2004).

3. Results and discussion

3.1. Prediction of Oct4-importin α complex structure by protein-protein docking

The monomer (chain B) in the crystal structure of the mouse Oct4 POU domain homodimer bound to the PORE (palindromic-octamer-recognition-element) DNA (PDB ID: 3L1P) (Esch et al., Citation2013) was used as the Oct4 structure for the docking calculations. The missing coordinates of the variable linker region between the POUS and POUHD were initially generated using modeling and energy optimization. The crystal structure of mouse importin α2 (KPNA2) in complex with the SV40 (simian-virus-40) large T-antigen NLS peptide (PDB ID: 1Q1T) (Fontes et al., Citation2003) was used as the importin α structure. Using these two structures, protein-protein docking was performed with the ClusPro 2.0 server (Comeau et al., Citation2004b; Kozakov et al., Citation2013). The Oct4 POU domain was docked to the major NLS binding site of importin α2, using the interactions between the SV40 NLS (KKKRK) and the major site as a reference during docking (Figure S2c). All 30 complex structures obtained by the docking (model Nos. 0–29; Figure S3) were visually inspected in detail on a 3-D display (SGI or PC), and the top three models (model Nos. 1, 26, and 29), in which the Oct4 POU domain was bound to the major site groove of importin α2, were selected (Figure S4). The distances between the NLS residues (P1–P5) and the NLS-binding site residues were compared in these three models (Table S1 and Figure S2c). The backbone root mean square deviations (RMSD) between the SV40 NLS bound to importin α2 and the Oct4 NLS were also compared (Table S2). A smaller RMSD represents a smaller deviation from the binding mode observed in the crystal structure of the SV40 NLS bound to the major site of importin α2. We concluded that Model 26 was the best complex model based on comparisons of the distances (Table S1) and RMSD (Table S2). The structure of Model 26 is shown in Figure 1(a).
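The RMSD criterion used to rank the models can be written out in a few lines. This is a minimal sketch rather than the procedure actually used: it assumes the two complexes have already been superimposed on importin α2 and that the backbone atoms of the two NLS peptides are supplied in matching order; the function name is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (same unit as the input) between paired atoms.

    ASSUMPTION: no fitting is performed here; both coordinate sets are taken
    to be in a common frame (e.g. after superimposing the importin chains).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    # Toy coordinates only: every atom displaced by 2 A along z.
    reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
    model = [(0.0, 0.0, 2.0), (1.5, 0.0, 2.0)]
    print(rmsd(reference, model))
```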

Figure 1. The structure of the Oct4-importin α2 complex obtained by protein-protein docking.

The structure before (a) and after (b) MD refinement. Overall view (left column) and enlarged view of NLS binding site (right column). Importin α2 is represented by a cyan ribbon. NLS binding site residues (I112, W142, T155, Q181, W184, D192, W231 and N239) of importin α2 are depicted as a stick model with carbon atoms in cyan, oxygen in red and nitrogen in blue, and labeled in cyan. Oct4 before MD (= model 26) and after MD is displayed in blue and purple ribbon, respectively. In the NLS (223RKRKR227), each stick carbon atom and label is also colored blue and purple in the same way.

3.2. Refinement of complex model using MD

Although we concluded that Model 26 was the best, the interaction distances and RMSD were still unsatisfactory. This reflects the limitations of the rigid-body docking method, which treats both protein molecules as rigid bodies. The limitation is especially relevant because a number of TFs are believed to undergo major structural changes in vivo; calculations that assume Oct4 to be a rigid body are therefore problematic. MD calculations were therefore performed using Model 26 as the initial structure (Videos S1 and S2). The final structure after the MD is shown in Figure 1(b). The interaction distances (Table S3 and Figure S2c) and RMSDs (Table S4) of Model 26 before and after MD were compared, using the same evaluation method as for the comparison between the three models above. The data show that both the interaction distances and the RMSD are considerably improved after the MD (refer also to Figures 1(a) and 1(b)). It is noteworthy that the interaction distances at P3 are still large, although the P3 (Arg225) side chain flipped to the same side as Trp231 of importin α2.

3.3. Relationship between Ebola virus VP24 protein, STAT1 and Oct4 – Validity and application of the model: Part 1

It is known that the NPI-1 subfamily (importin α1: KPNA1, α5: KPNA5, and α6: KPNA6) mediates nuclear transport of tyrosine-phosphorylated STAT1 (PY-STAT1) via a unique nonclassical NLS (ncNLS) (Sekimoto, Imamoto, Nakajima, Hirano, & Yoneda, Citation1997). It is thought that importin α1 (KPNA1, one member of subfamily NPI-1) binds between two STAT1 monomers which comprise the dimeric PY-STAT1 (Chen et al., Citation1998), via two extensive binding areas: STAT1 DNA binding domain-importin α1 major NLS binding groove (ARM 1–4) interaction and the STAT1 SH2 domain-importin α1 C terminus (ARM 9–10) interaction (Nardozzi, Wenta, Yasuhara, Vinkemeier, & Cingolani, Citation2010) (Figure 2(a)). Many viruses, including the Ebola virus, actively antagonize STAT1 signaling to counteract the antiviral effects of interferon. Ebola virus VP24 protein (eVP24) binds importin α to inhibit PY-STAT1 nuclear transport and render cells refractory to interferons. Recently, the crystal structure of human importin α5 (KPNA5, one member of subfamily NPI-1) C terminus in complex with eVP24 was resolved (Xu et al., Citation2014). The structure shows that eVP24 and STAT1 bind overlapping sites in ARM 8–10 and that eVP24 and cNLS cargo occupy independent binding sites on NPI-1 subfamily KPNAs. This leads to the prediction that eVP24 binding, via a portion of the region used by PY-STAT1 via ARMs 8–10, inhibits PY-STAT1 nuclear translocation, but not the transport of cNLS containing cargo. The group of Amarasinghe et al. have proposed that eVP24 counters cell-intrinsic innate immunity by selectively targeting PY-STAT1 nuclear import while leaving the transport of other cargo that may be required for viral replication unaffected. In this way Ebola virus disables cell-intrinsic antiviral signaling in order to facilitate virus replication without impacting normal cellular cargo transport (Xu et al., Citation2014).

Figure 2. The model structure of STAT1, Oct4, Ebola virus VP24 and Nup50 bound to importin α.

(a) Model of complex between tyrosine-phosphorylated STAT1 and importin α1 (KPNA1). This model was built using reference (Nardozzi et al., Citation2010). Importin α1 (PDB ID: 4B1B) (Jeong et al., Citation2015) is displayed as a green ribbon. PY-STAT1 dimer (PDB ID: 1BF5) (Chen et al., Citation1998) is represented by yellow and pink ribbons. Phospho-Tyr701 is depicted as a stick model with phosphorus atom in orange. The DNA binding site (residues 400–413) and SH-2 domain (558–634) are drawn in blue and cyan, respectively. (b) Model of Oct4 and Ebola virus VP24 bound to importin α1. This model was built by superimposing the crystal structure of importin α5 C terminus in complex with eVP24 (PDB ID: 4U2X) (Xu et al., Citation2014) and our model (importin α2-Oct4) with importin, and then by superimposing these onto importin α1 (PDB ID: 4B1B) (Jeong et al., Citation2015). ARM 1–4, 5–7 and 8–10 of importin α1 are represented as red, green and blue surface models. Oct4 and eVP24 are displayed as purple and yellow ribbons, respectively. (c) Model of Oct4 and Nup50 bound to importin α2. The major and minor NLS binding sites of importin α2 (cyan) are highlighted in red and orange, respectively. Nup50 (PDB ID: 2C1M) (Matsuura & Stewart, Citation2005) is drawn as a yellow ribbon.

Strikingly, our model supports these expectations (Figure 2(b)). Figure 2(b) shows the model structure of human importin α1 (KPNA1, one member of subfamily NPI-1) (Jeong et al., Citation2015) in complex with Oct4 and eVP24. From the model, we can clearly see that Oct4 and eVP24 are able to bind independently and simultaneously to two distinct sites. Therefore, we subscribe to the idea that eVP24 inhibits PY-STAT1 recognition and nuclear transport, but not the transport of normal cellular cargos, such as Oct4, that bind only to the ‘major’ cNLS binding groove (ARM 1–4). Note that this discussion would not be possible without our model, in which the whole structure of a TF domain, such as the Oct4 POU domain, is visualized bound to importin α.

3.4. Relationship between Nup50 and Oct4 – validity and application of the model: Part 2

Dissociation of importin β occurs upon binding of RanGTP to importin β in the importin α/β/cargo ternary complex after nuclear transport. There are a number of possible mechanisms for the subsequent release of cargo from importin α, but here two mechanisms are considered (Stewart, Citation2007) (Figure S2a): the first mechanism is that the importin β binding (IBB) domain of importin α displaces the NLS-containing cargoes from the NLS binding site of importin α (the right of lower right corner of Figure S2a). The second mechanism is that Nup50 actively releases the cargo by binding to the minor NLS binding site and C terminus of importin α (the left of lower right corner of Figure S2a).

We examined the relative positions of Oct4 and Nup50 bound to importin α to study whether Nup50 could be involved in the release of Oct4 from importin α. We generated a structure in which both Oct4 and Nup50 are bound to importin α2 simultaneously by superimposing our model on the crystal structure of Nup50 bound to mouse importin α2 (KPNA2) (Matsuura & Stewart, Citation2005) (Figure 2(c)). Figure 2(c) shows an obvious steric clash between the POUHD of Oct4 and the N terminus (around residues 7–22) of Nup50 around the NLS minor binding site of importin α. Based on this finding, we propose that even a protein like Oct4, which is considered to bind mainly to the NLS major binding site, is subject to direct steric inhibition by Nup50. Accordingly, we propose that Nup50 functions in the release not only of cargo bound to the NLS minor binding site and C terminus of importin α but also of a number of other cargos from importin α through steric inhibition.
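A steric clash of this kind can be quantified, once the two structures share a coordinate frame, by counting atom pairs that come closer than a chosen distance. The sketch below is illustrative only: the clash in this work was identified by inspecting the superimposed structures, and the function name and the 2.5 Å cutoff are assumptions of ours rather than values from the analysis.

```python
import math


def count_clashes(atoms_a, atoms_b, cutoff=2.5):
    """Count atom pairs (one atom from each molecule) closer than cutoff angstroms.

    ASSUMPTION: both coordinate lists are already in the same frame, e.g.
    after superimposing the importin chains; the cutoff is a hypothetical choice.
    """
    return sum(
        1
        for a in atoms_a
        for b in atoms_b
        if math.dist(a, b) < cutoff
    )


if __name__ == "__main__":
    # Toy coordinates only: one pair overlaps, the other is far apart.
    domain = [(0.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
    peptide = [(1.0, 0.0, 0.0), (0.0, 30.0, 0.0)]
    print(count_clashes(domain, peptide))
```

The all-against-all loop is adequate for a single domain and a short peptide segment; larger systems would normally use a spatial grid or neighbor list.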

3.5. DNA-Oct4 vs. importin α-Oct4: Structural differences in Oct4 in different binding modes

In the previous two sections, we demonstrated the validity of the Oct4-importin α complex model by applying it to VP24 and Nup50 and confirming that it provides good explanations for experimental observations. Next, we compared the binding mode of the Oct4 POU domain bound to DNA (Esch et al., Citation2013) with that of the domain bound to importin α. In the structure of Oct4 bound to DNA, the entire protein molecule twists around the DNA, and Oct4 holds the DNA between the POUS and POUHD (Figure 3(a)). In contrast, in Oct4 bound to importin α, importin α wraps around Oct4 from the side opposite to where DNA binds (Figure 3(b)). If we regard the DNA binding side (the side with the space between POUS and POUHD) as the ‘ventral side’ of Oct4 and the importin α binding side as the ‘dorsal side’, Oct4 is captured by importin α from the dorsal side and is bound to DNA from the ventral side (compare lower left of Figures 3(a) and 3(b) and see also Figure 4(a)). It is noteworthy that the only amino acid critical for binding of Oct4 to both DNA and importin α is Arg227 (Figure 3(a) upper left and Figure 3(b) upper left). There are no other amino acids commonly used for binding to importin α and DNA except for the amino acids around the NLS.

Figure 3. DNA-Oct4 vs. importin α-Oct4.

(a) Crystal structure of DNA-Oct4 complex. Oct4 is shown as a green ribbon and residues that have the interactions with DNA are highlighted (upper left). (b) Model structure of importin α-Oct4 complex. Oct4 is shown as a purple ribbon and residues that interact with importin α are highlighted (upper left).

Figure 4. Structural differences between DNA bound state (green) and importin α bound state (purple) of Oct4.

(a) Conformational changes of Oct4 backbone. POUS is on the left side of the picture. (b) Conformational changes of linker region in Oct4. See also Figure and Figure S1. POUS (upper left) and POUHD are represented by a surface model. The X-ray-invisible region (residues 217–219) of Oct4 in the DNA bound state (PDB ID: 3L1P chain A) (Esch et al., Citation2013) is represented by a green dashed line. Three rare cis-peptide bonds (221Q-222A, 222A-223R, 224K-225R) are emphasized in orange. The corresponding Cα positions in the NLS (223RKRKR227) are shown as cyan (DNA bound state) and red lines (importin α2 bound state). The ‘expanded linker’ (206–230) includes the traditional linker (206–222 including the linker helix) and the following extended region (223–230 including the NLS). (c) Arg227 of Oct4 in the DNA bound state and importin α bound state. DNA (yellow ribbon with carbon atoms in yellow) and importin α2 (cyan surface) are drawn in the same picture. Oct4 in the DNA bound and importin α bound states is shown in the same colors as above (green and purple, respectively).

The structures of the Oct4 monomer in the complex with DNA (Esch et al., Citation2013) and in the complex with importin α were compared to look for differences between the two bound forms (Figure 4). In the structure of Oct4 bound to importin α, the space between the POUS and POUHD was increased when compared with the structure of Oct4 bound to DNA (Figure 4(a)). This is mainly caused by a substantial movement of the whole POUHD relative to the POUS (a maximum of about 20–22 Å). This substantial structural change, accompanied by the rotation of the POUHD, is mainly caused by main-chain rotations in the flexible linker region. If the structure of Oct4 remained as observed in the structure bound to DNA, the POUS and POUHD would clash with importin α, and Oct4 would be unable to bind to importin α (see also Figures 3(a) and 3(b)). Therefore, this major structural change at the domain level is essential for binding to importin α and/or DNA.

We have also compared the two Oct4 complexes with respect to the local structure in the highly flexible region (residues 206–230), which includes the linker connecting the POUS and POUHD (residues 206–222, based on the definition in Esch et al., Citation2013) and the sequence immediately C-terminal to it (residues 223–230, which are N-terminal to the POUHD and include NLS223–227) (Figure 4(b)). The comparison demonstrates that the directions of the side chains of residues including those of the NLS (223RKRKR227) are significantly different between Oct4 bound to DNA and Oct4 bound to importin α (cyan and red lines in Figure 4(b)). These changes are caused not only by a difference in the rotation angles χ of the side chains but also by structural changes in the main chain itself between the two structures. The main chain of residues 217–219 is invisible in the crystal structure of chain A of Oct4 bound to DNA (green dashed line in Figure 4(b)). In addition, all three of the peptide bonds 221Q-222A, 222A-223R, and 224K-225R are cis peptide bonds (ω ≈ 0°) (orange arrows in Figure 4(b)); however, this is limited to Chain A (all the bonds in Chain B are trans). The vast majority of cis peptide bonds involve proline residues, specifically at X-Pro, X being any amino acid. Cis bonds between two non-proline residues (X-nonPro) occur much less frequently than X-Pro cis bonds (Pal & Chakrabarti, Citation1999). Based on the above, it is clear that the main chain of the linker and its C-terminal region (residues 206–230) has a special structure with extremely high mobility. The direction of side chains can be drastically changed due to this flexibility. The data demonstrate that some functional side chains of Oct4, including side chains essential for DNA binding and/or for nuclear localization, face opposite directions in the DNA bound and importin α bound states. 
For example, Arg227, which is contained in the NLS and also binds to a base of DNA, inverts between binding to DNA and binding to importin α (Figure 4(c)). It is likely that such local structural changes, i.e. inversion of side chains associated with structural changes in the main chain, are important for the function of Oct4.
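The cis/trans distinction drawn above rests on the peptide torsion angle ω, defined by the atoms Cα(i), C(i), N(i+1) and Cα(i+1). A minimal sketch of how ω could be computed from coordinates and classified is given below; the function names and the 30° tolerance are hypothetical choices of ours, not values taken from the analysis.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_deg(p0, p1, p2, p3):
    """Torsion angle in degrees, in (-180, 180], defined by four points."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    x = _dot(n1, n2)
    y = _dot(_cross(n1, n2), b2) / math.sqrt(_dot(b2, b2))
    return math.degrees(math.atan2(y, x))


def classify_peptide_bond(omega_deg, tolerance=30.0):
    """Label omega as cis (near 0), trans (near 180) or distorted.

    ASSUMPTION: the tolerance is an illustrative threshold.
    """
    if abs(omega_deg) <= tolerance:
        return "cis"
    if abs(omega_deg) >= 180.0 - tolerance:
        return "trans"
    return "distorted"


if __name__ == "__main__":
    # Toy planar geometry: CA(i), C(i), N(i+1), then CA(i+1) on either side.
    ca_i, c_i, n_next = (0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
    for ca_next in ((1.0, 1.0, 0.0), (1.0, -1.0, 0.0)):
        omega = dihedral_deg(ca_i, c_i, n_next, ca_next)
        print(classify_peptide_bond(omega))
```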

Finally, we examined whether the NLS structures bound to DNA and importin α are actually present among the structures of Oct4 in solution. Twenty NMR structures (Morita, Shirakawa, Hayashi, Imagawa, & Kyogoku, Citation1995) of the linker and POUHD (residues 217–282) of Oct4 were used as the structure in solution (Video S3). However, these structures contain no POUS; therefore, it is important to bear in mind that the mobility of the linker and the C-terminal end is higher than that in the crystal structure (Esch et al., Citation2013). The twenty NMR structures of the NLS were compared with the NLS structures bound to DNA and bound to importin α (Figure S5). The structure of NMR No. 15 was found to be closest to the structure of the NLS bound to DNA (main chain RMSD = 1.56 Å). In addition, the structure of NMR No. 11 was found to be the closest to the NLS structure bound to importin α (main chain RMSD = 1.10 Å). These comparisons show that both NLS structures of Oct4, that bound to DNA and that bound to importin α, can be formed in solution (Figure S5 and Video S3).

3.6. The linker and C-terminal region as an intrinsically disordered region

Recently it has become clear that many nuclear proteins, including TFs such as Oct4, epigenetic factors, and factors related to DNA replication and repair, are intrinsically disordered proteins (IDPs), a feature characteristic of eukaryotic organisms (Beh, Colwell, & Francis, Citation2012; Sandhu, Citation2009; Uversky & Dunker, Citation2010). We have proposed that the NLSs of these proteins (including Oct4) might be intrinsically disordered regions (IDRs), characterized by the so-called coupled folding and binding mechanism (Sugase, Dyson, & Wright, Citation2007), in which a polypeptide folds into a particular structure through interaction with a target molecule (although NLSs barely form a particular structure by themselves) (Yamagishi et al., Citation2015). As the group of Wilmanns and Schöler et al. has already demonstrated, an α-helix (residues 207–213) included in the linker of Oct4 (residues 206–222) (Figure S1 and Figure 4(b)) changes its structure to become exposed to the surface upon binding to DNA, and recruits many proteins such as epigenetic factors (e.g. chromatin remodeling complexes) that are crucial for reprogramming to pluripotency (Esch et al., Citation2013). Based on this significant finding, it is contended that the linker, especially the α-helix contained in the linker, is a functionally important IDR. However, we propose that not only this linker (residues 206–222) but also the following residues (223–230) have a very important function, for three reasons. 1) This region contains NLS223–227 as well as amino acids interacting with DNA bases (Arg227, and probably Arg225, too, although the tip of the Arg225 side chain cannot be seen in the crystal structure). 2) The structure of this region is different when binding to DNA and when binding to importin α. In other words, the structure of this region can change depending on the function it is needed to fulfill. 3) To realize the flexibility of the main chain of the entire linker, the main chain can rotate at the ω angle level in addition to the φ and ψ angle levels. For these reasons we have designated this flexible region (residues 206–230) as an ‘expanded linker’ region (Figure 4(b)). This region encompasses the traditional linker (residues 206–222), which connects the POUS and POUHD, and the following extended region (residues 223–230), which includes the NLS (223RKRKR227) and is N-terminal to the POUHD. Here, we newly propose that the entire expanded linker region is an IDR and has an extremely important role in the variety of functions of Oct4. Thus, we propose that this extremely flexible region enables interaction with a variety of partner molecules (Jerabek et al., Citation2014) (DNA, other TFs such as SOX2 and NANOG, epigenetic factors (e.g. chromatin remodeling complexes such as BAF and NuRD), co-activator complexes and components of the basal transcription machinery) and enables recruitment of these molecules to the target genes.

3.7. Nuclear transport and delivery to DNA

As discussed above, the structure of Oct4 is predicted to differ between the importin α-bound state, in which it is transported into the nucleus, and the DNA-bound state. Even after Oct4 is released from the importin α-Oct4 complex and delivered to DNA, it cannot bind to DNA unless it changes from the structure observed when bound to importin α. Based on the previous discussion (Figures (a–c)), we infer that the following structural change (Figure ) occurs after release from importin α. Main-chain structural changes associated with φ, ψ and ω rotations occur at two sites in the ‘expanded linker’ of Oct4, i.e. (i) around residues 217–219 (a region with high mobility and poor electron density in the crystal structure) and (ii) around residues 221–225 (a region that could form a cis peptide bond). These changes rotate the POUHD and bring the POUHD and POUS closer together, creating an optimal space in which to sandwich DNA. At the same time, side chains required for binding to DNA also change orientation to create a local site advantageous for interaction with DNA. It is surprising that a relatively large structural change (rotation of a domain) and a relatively small structural change (flip-flop of side chains) can be caused at the same time by only local and minimal changes, i.e. main-chain rotations at two sites in the expanded linker. We propose that these efficient structural changes in Oct4 are important for its effective delivery to DNA and enable it to exert its function.
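The geometric reason why a local main-chain rotation can move a whole domain can be illustrated with a short calculation. The following Python sketch is an illustration only, not the procedure used to build the model: it rotates a point about a bond axis using Rodrigues' rotation formula and reports how far the point moves. The function names, the 60° rotation and the distances of 2 Å and 20 Å from the axis are hypothetical values chosen for clarity and are not measurements from the Oct4 structures.

```python
import math


def rotate_about_axis(point, axis_start, axis_end, angle_deg):
    """Rotate `point` about the line from axis_start to axis_end (Rodrigues' formula)."""
    axis = [e - s for s, e in zip(axis_start, axis_end)]
    norm = math.sqrt(sum(c * c for c in axis))
    k = [c / norm for c in axis]  # unit vector along the bond
    v = [p - s for p, s in zip(point, axis_start)]
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    k_dot_v = sum(a * b for a, b in zip(k, v))
    k_cross_v = (k[1] * v[2] - k[2] * v[1],
                 k[2] * v[0] - k[0] * v[2],
                 k[0] * v[1] - k[1] * v[0])
    rotated = [v[i] * cos_t + k_cross_v[i] * sin_t + k[i] * k_dot_v * (1.0 - cos_t)
               for i in range(3)]
    return tuple(r + s for r, s in zip(rotated, axis_start))


def displacement(point, axis_start, axis_end, angle_deg):
    """Distance travelled by `point` under the rotation."""
    return math.dist(point, rotate_about_axis(point, axis_start, axis_end, angle_deg))


# Hypothetical geometry: the rotating bond lies on the z axis.
bond_start, bond_end = (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
near_atom = (2.0, 0.0, 1.0)   # assumed to sit 2 Å from the axis
far_atom = (20.0, 0.0, 5.0)   # assumed to sit 20 Å from the axis

for label, atom in (("near", near_atom), ("far", far_atom)):
    print(label, round(displacement(atom, bond_start, bond_end, 60.0), 2))
```

For a rotation by θ, a point at perpendicular distance r from the axis moves by 2r·sin(θ/2), so the displacement grows linearly with distance from the rotating bond. This is why a torsional change at one or two linker residues is sufficient, in principle, to reposition a domain attached downstream.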

Figure 5. A possible model for nuclear transport and delivery of Oct4 to the DNA.

After release of Oct4 from importin α, conformational changes in the ‘expanded linker’ region (green) of Oct4 are likely to occur as follows. (i) Rotation of the main chain, involving not only the φ and ψ angles but also the ω angle. (ii) Rotation of the POUHD and the consequent approach of the POUS and POUHD, which creates a narrow space for DNA binding. (iii) Flip-flop of functionally significant side chains such as Arg225 and Arg227. Changes (ii) and (iii) occur in concert with (i). Oct4 could thus be delivered to target DNA with minimal conformational changes in the expanded linker region. See also Figure 4.

Most information about the intranuclear environment and the diffusion rates of Oct4 and importin in the nucleus remains unclear. In addition, the detailed molecular mechanism by which Oct4 dissociates from the importin α/β/Oct4 ternary complex is also unknown. However, if importin α guides Oct4 to the vicinity of its target genes and delivers it to DNA in an appropriate orientation in vivo, then efficient delivery of Oct4 to DNA becomes possible with minimal conformational change, in accordance with the model shown in Figure . In this model, Oct4 is captured by importin α from the ventral side and is imported into the nucleus. After release from importin α as it approaches the target DNA, Oct4 is delivered directly to the DNA, as it is, from the dorsal side (Figure and see also lower lefts of Figures (a) and (b)). At least in living cells, importin α is often detected inside the nucleus in immunohistochemical studies (for example, see Major et al., Citation2015, Figure ), showing that not all importin α molecules are immediately exported from the nucleus after translocation through the nuclear pore; importin α could therefore migrate to DNA and release Oct4 there.

In this study, we have developed a model in which Oct4 binds to the major site of importin α. In addition, we have compared and discussed the structure of Oct4 as one molecule in a homodimer bound to the PORE sequence motif of DNA; however, the structure bound to the MORE (More palindromic Oct factor Recognition Elements) sequence motif should also be examined. Although it has so far proven difficult to determine experimentally the structure of whole proteins bound to the major site of importin α, the model in this study is beneficial for understanding the molecular mechanism of the nuclear transport of Oct4 and of its delivery to DNA. If these mechanisms can be elucidated in detail and the processes controlled, this could lead to the development of new methods for cellular reprogramming and new therapies for diseases characterized by deregulated gene expression, such as some forms of cancer.

Supplementary material

The supplementary material for this paper is available online at http://dx.doi.org/10.1080/07391102.2017.1289124.

Author contributions

H.K. designed research; T.O., R.Y. and J.S. performed research; M.I. and Y.M. analyzed data; and H.K. wrote the paper.

Disclosure statement

No potential conflict of interest was reported by the authors.

Funding

This work was supported by JSPS KAKENHI [grant number 26280108].

Acknowledgements

We thank Noriko Nakagawa, Masashi Kimura, and Masami Nakada for preliminary calculations and data analysis, and Drs. Noriko Yasuhara and Yoshihiro Yoneda for useful discussions and critical reading of the manuscript.

References

  • Bayliss, R., Littlewood, T., & Stewart, M. (2000). Structural basis for the interaction between FxFG nucleoporin repeats and importin-β in nuclear trafficking. Cell, 102, 99–108. doi:10.1016/S0092-8674(00)00014-3
  • Beh, L. Y., Colwell, L. J., & Francis, N. J. (2012). A core subunit of Polycomb repressive complex 1 is broadly conserved in function but not primary sequence. Proceedings of the National Academy of Sciences, 109, E1063–E1071. doi:10.1073/pnas.1118678109
  • Best, R. B., Zheng, W., & Mittal, J. (2014). Balanced protein-water interactions improve properties of disordered proteins and non-specific protein association. Journal of Chemical Theory and Computation, 10, 5113–5124. doi:10.1021/ct500569b
  • Bhowmick, A., Brookes, D. H., Yost, S. R., Dyson, H. J., Forman-Kay, J. D., Gunter, D., … Head-Gordon, M. (2016). Finding our way in the dark proteome. Journal of the American Chemical Society, 138, 9730–9742. doi:10.1021/jacs.6b06543
  • Case, D. A., Darden, T. A., Cheatham, T. E., Simmerling, C., Wang, J., Duke, R. E., … Kollman, P. A. (2010). AMBER11. San Francisco: University of California.
  • Chen, X., Vinkemeier, U., Zhao, Y., Jeruzalmi, D., Darnell, J. E., & Kuriyan, J. (1998). Crystal structure of a tyrosine phosphorylated STAT-1 dimer bound to DNA. Cell, 93, 827–839. doi:10.1016/S0092-8674(00)81443-9
  • Comeau, S. R., Gatchell, D. W., Vajda, S., & Camacho, C. J. (2004a). ClusPro: A fully automated algorithm for protein-protein docking. Nucleic Acids Research, 32, W96–W99. doi:10.1093/nar/gkh354
  • Comeau, S. R., Gatchell, D. W., Vajda, S., & Camacho, C. J. (2004b). ClusPro: An automated docking and discrimination method for the prediction of protein complexes. Bioinformatics, 20, 45–50. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/14693807. doi:10.1093/bioinformatics/btg371
  • Ding, J., Xu, H., Faiola, F., Ma’ayan, A., & Wang, J. (2012). Oct4 links multiple epigenetic pathways to the pluripotency network. Cell Research, 22, 155–167. doi:10.1038/cr.2011.179
  • Esch, D., Vahokoski, J., Groves, M. R., Pogenberg, V., Cojocaru, V., vom Bruch, H., … Schöler, H. R. (2013). A unique Oct4 interface is crucial for reprogramming to pluripotency. Nature Cell Biology, 15, 295–301. doi:10.1038/ncb2680
  • Feng, B., Jiang, J., Kraus, P., Ng, J.-H., Heng, J.-C. D., Chan, Y.-S., … Ng, H.-H. (2009). Reprogramming of fibroblasts into induced pluripotent stem cells with orphan nuclear receptor Esrrb. Nature Cell Biology, 11, 197–203. doi:10.1038/ncb1827
  • Fontes, M. R. M., Teh, T., Toth, G., John, A., Pavo, I., Jans, D. A., & Kobe, B. (2003). Role of flanking sequences and phosphorylation in the recognition of the simian-virus-40 large T-antigen nuclear localization sequences by importin-α. Biochemical Journal, 375, 339–349. doi:10.1042/BJ20030510
  • Goldfarb, D. S., Corbett, A. H., Mason, D. A., Harreman, M. T., & Adam, S. A. (2004). Importin α: A multipurpose nuclear-transport receptor. Trends in Cell Biology, 14, 505–514. doi:10.1016/j.tcb.2004.07.016
  • Ho, L., Jothi, R., Ronan, J. L., Cui, K., Zhao, K., & Crabtree, G. R. (2009). An embryonic stem cell chromatin remodeling complex, esBAF, is an essential component of the core pluripotency transcriptional network. Proceedings of the National Academy of Sciences, 106, 5187–5191. doi:10.1073/pnas.0812888106
  • Jeong, S. A., Kim, K., Lee, J. H., Cha, J. S., Khadka, P., Cho, H.-S., & Chung, I. K. (2015). Akt-mediated phosphorylation increases the binding affinity of hTERT for importin α to promote nuclear translocation. Journal of Cell Science, 128, 2287–2301. doi:10.1242/jcs.166132
  • Jerabek, S., Merino, F., Schöler, H. R., & Cojocaru, V. (2014). OCT4: Dynamic DNA binding pioneers stem cell pluripotency. Biochimica et Biophysica Acta (BBA) - Gene Regulatory Mechanisms, 1839, 138–154. doi:10.1016/j.bbagrm.2013.10.001
  • Kozakov, D., Beglov, D., Bohnuud, T., Mottarella, S. E., Xia, B., Hall, D. R., & Vajda, S. (2013). How good is automated protein docking? Proteins: Structure, Function, and Bioinformatics, 81, 2159–2166. doi:10.1002/prot.24403
  • Kozakov, D., Brenke, R., Comeau, S. R., & Vajda, S. (2006). PIPER: An FFT-based protein docking program with pairwise potentials. Proteins: Structure, Function, and Bioinformatics, 65, 392–406. doi:10.1002/prot.21117
  • Latchman, D. S. (2015). Gene control (2nd ed., pp. 159–204). New York, NY: Garland Science.
  • Li, X., Sun, L., & Jin, Y. (2008). Identification of karyopherin-alpha 2 as an Oct4 associated protein. Journal of Genetics and Genomics, 35, 723–728. doi:10.1016/S1673-8527(08)60227-1
  • Loveland, K. L., Major, A. T., Butler, R., Young, J. C., Jans, D. A., & Miyamoto, Y. (2015). Putting things in place for fertilization: Discovering roles for importin proteins in cell fate and spermatogenesis. Asian Journal of Andrology, 17, 537–544. doi:10.4103/1008-682X.154310
  • Major, A. T., Hogarth, C. A., Miyamoto, Y., Sarraj, M. A., Smith, C. L., Koopman, P., … Loveland, K. L. (2015). Specific interaction with the nuclear transporter importin α2 can modulate paraspeckle protein 1 delivery to nuclear paraspeckles. Molecular Biology of the Cell, 26, 1543–1558. doi:10.1091/mbc.E14-01-0678
  • Matsuura, Y., & Stewart, M. (2005). Nup50/Npap60 function in nuclear protein import complex disassembly and importin recycling. The EMBO Journal, 24, 3681–3689. doi:10.1038/sj.emboj.7600843
  • Morita, E. H., Shirakawa, M., Hayashi, F., Imagawa, M., & Kyogoku, Y. (1995). Structure of the Oct-3 POU-homeodomain in solution, as determined by triple resonance heteronuclear multidimensional NMR spectroscopy. Protein Science: A Publication of the Protein Society, 4, 729–739. doi:10.1002/pro.5560040412
  • Mosammaparast, N., & Pemberton, L. F. (2004). Karyopherins: From nuclear-transport mediators to nuclear-function regulators. Trends in Cell Biology, 14, 547–556. doi:10.1016/j.tcb.2004.09.004
  • Nakagawa, M., Koyanagi, M., Tanabe, K., Takahashi, K., Ichisaka, T., Aoi, T., … Yamanaka, S. (2008). Generation of induced pluripotent stem cells without Myc from mouse and human fibroblasts. Nature Biotechnology, 26, 101–106. doi:10.1038/nbt1374
  • Nardozzi, J., Wenta, N., Yasuhara, N., Vinkemeier, U., & Cingolani, G. (2010). Molecular Basis for the Recognition of Phosphorylated STAT1 by Importin α5. Journal of Molecular Biology, 402, 83–100. doi:10.1016/j.jmb.2010.07.013
  • Oka, M., Moriyama, T., Asally, M., Kawakami, K., & Yoneda, Y. (2013). Differential role for transcription factor Oct4 nucleocytoplasmic dynamics in somatic cell reprogramming and self-renewal of embryonic stem cells. Journal of Biological Chemistry, 288, 15085–15097. doi:10.1074/jbc.M112.448837
  • Pal, D., & Chakrabarti, P. (1999). Cis peptide bonds in proteins: Residues involved, their conformations, interactions and locations. Journal of Molecular Biology, 294, 271–288. doi:10.1006/jmbi.1999.3217
  • Pan, G., Qin, B., Liu, N., Schöler, H. R., & Pei, D. (2004). Identification of a nuclear localization signal in OCT4 and generation of a dominant negative mutant by its ablation. Journal of Biological Chemistry, 279, 37013–37020. doi:10.1074/jbc.M405117200
  • Pardo, M., Lang, B., Yu, L., Prosser, H., Bradley, A., Babu, M. M., & Choudhary, J. (2010). An expanded Oct4 interaction network: Implications for stem cell biology, development, and disease. Cell Stem Cell, 6, 382–395. doi:10.1016/j.stem.2010.03.004
  • Pettersen, E. F., Goddard, T. D., Huang, C. C., Couch, G. S., Greenblatt, D. M., Meng, E. C., & Ferrin, T. E. (2004). UCSF Chimera-A visualization system for exploratory research and analysis. Journal of Computational Chemistry, 25, 1605–1612. doi:10.1002/jcc.20084
  • Piana, S., Donchev, A. G., Robustelli, P., & Shaw, D. E. (2015). Water dispersion interactions strongly influence simulated structural properties of disordered protein states. The Journal of Physical Chemistry B, 119, 5113–5123. doi:10.1021/jp508971m
  • Pumroy, R. A., & Cingolani, G. (2015). Diversification of importin-α isoforms in cellular trafficking and disease states. Biochemical Journal, 466, 13–28. doi:10.1042/BJ20141186
  • Sandhu, K. S. (2009). Intrinsic disorder explains diverse nuclear roles of chromatin remodeling proteins. Journal of Molecular Recognition: JMR, 22(1), 1–8. doi:10.1002/jmr.915
  • Sekimoto, T., Imamoto, N., Nakajima, K., Hirano, T., & Yoneda, Y. (1997). Extracellular signal-dependent nuclear import of Stat1 is mediated by nuclear pore-targeting complex formation with NPI-1, but not Rch1. The EMBO Journal, 16, 7067–7077. doi:10.1093/emboj/16.23.7067
  • Shi, Y., Desponts, C., Do, J. T., Hahm, H. S., Schöler, H. R., & Ding, S. (2008). Induction of pluripotent stem cells from mouse embryonic fibroblasts by Oct4 and Klf4 with small-molecule compounds. Cell Stem Cell, 3, 568–574. doi:10.1016/j.stem.2008.10.004
  • Singhal, N., Graumann, J., Wu, G., Araúzo-Bravo, M. J., Han, D. W., Greber, B., … Schöler, H. R. (2010). Chromatin-remodeling components of the BAF complex facilitate reprogramming. Cell, 141, 943–955. doi:10.1016/j.cell.2010.04.037
  • Stewart, M. (2007). Molecular mechanism of the nuclear protein import cycle. Nature Reviews. Molecular Cell Biology, 8, 195–208. doi:10.1038/nrm2114
  • Sugase, K., Dyson, H. J., & Wright, P. E. (2007). Mechanism of coupled folding and binding of an intrinsically disordered protein. Nature, 447, 1021–1025. doi:10.1038/nature05858
  • Takahashi, K., & Yamanaka, S. (2006). Induction of pluripotent stem cells from mouse embryonic and adult fibroblast cultures by defined factors. Cell, 126, 663–676. doi:10.1016/j.cell.2006.07.024
  • Uversky, V. N., & Dunker, A. K. (2010). Understanding protein non-folding. Biochimica et Biophysica Acta (BBA) - Proteins and Proteomics, 1804, 1231–1264. doi:10.1016/j.bbapap.2010.01.017
  • van den Berg, D. L. C., Snoek, T., Mullin, N. P., Yates, A., Bezstarosti, K., Demmers, J., … Poot, R. A. (2010). An Oct4-centered protein interaction network in embryonic stem cells. Cell Stem Cell, 6, 369–381. doi:10.1016/j.stem.2010.02.014
  • Xu, W., Edwards, M. R., Borek, D. M., Feagins, A. R., Mittal, A., Alinger, J. B., … Amarasinghe, G. K. (2014). Ebola virus VP24 targets a unique NLS binding site on karyopherin alpha 5 to selectively compete with nuclear import of phosphorylated STAT1. Cell Host & Microbe, 16, 187–200. doi:10.1016/j.chom.2014.07.008
  • Yamagishi, R., Okuyama, T., Oba, S., Shimada, J., Chaen, S., & Kaneko, H. (2015). Comprehensive analysis of the dynamic structure of nuclear localization signals. Biochemistry and Biophysics Reports, 4, 392–396. doi:10.1016/j.bbrep.2015.11.001
  • Yasuhara, N., Shibazaki, N., Tanaka, S., Nagai, M., Kamikawa, Y., Oe, S., … Yoneda, Y. (2007). Triggering neural differentiation of ES cells by subtype switching of importin-α. Nature Cell Biology, 9, 72–79. doi:10.1038/ncb1521
  • Yasuhara, N., Yamagishi, R., Arai, Y., Mehmood, R., Kimoto, C., Fujita, T., … Yoneda, Y. (2013). Importin alpha subtypes determine differential transcription factor localization in embryonic stem cells maintenance. Developmental Cell, 26, 123–135. doi:10.1016/j.devcel.2013.06.022
  • Young, J. C., Major, A. T., Miyamoto, Y., Loveland, K. L., & Jans, D. A. (2011). Distinct effects of importin α2 and α4 on Oct3/4 localization and expression in mouse embryonic stem cells. The FASEB Journal, 25, 3958–3965. doi:10.1096/fj.10-176941