Selection of target-binding proteins from the information of weakly enriched phage display libraries by deep sequencing and machine learning

Tomoyuki Ito (a) Department of Biomolecular Engineering, Graduate School of Engineering, Tohoku University, Sendai, Japan

Thuy Duong Nguyen (b) Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, Japan

Yutaka Saito (b) Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, Japan; (c) AIST-Waseda University Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), Tokyo, Japan; (d) Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan; (e) Center for Advanced Intelligence Project, RIKEN, Tokyo, Japan

Yoichi Kurumida (b) Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, Japan

Hikaru Nakazawa (a) Department of Biomolecular Engineering, Graduate School of Engineering, Tohoku University, Sendai, Japan

Sakiya Kawada (a) Department of Biomolecular Engineering, Graduate School of Engineering, Tohoku University, Sendai, Japan

Hafumi Nishi (f) Department of Applied Information Sciences, Graduate School of Information Sciences, Tohoku University, Sendai, Japan; (g) Tohoku Medical Megabank Organization, Tohoku University, Sendai, Japan; (h) Faculty of Core Research, Ochanomizu University, Tokyo, Japan

Koji Tsuda (d) Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan; (e) Center for Advanced Intelligence Project, RIKEN, Tokyo, Japan; (i) Research and Services Division of Materials Data and Integrated Systems, National Institute for Materials Science, Tsukuba, Japan

Tomoshi Kameda (b) Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, Japan; (e) Center for Advanced Intelligence Project, RIKEN, Tokyo, Japan. Correspondence: [email protected]

https://orcid.org/0000-0001-9508-5366

Mitsuo Umetsu (a) Department of Biomolecular Engineering, Graduate School of Engineering, Tohoku University, Sendai, Japan; (e) Center for Advanced Intelligence Project, RIKEN, Tokyo, Japan. Correspondence: [email protected]

https://orcid.org/0000-0003-4390-0263

Figures & data

Figure 1. Three-dimensional structure of the entire sequence of 2u2f. The two randomized loops are in red.

Scaffold protein structure where the randomized region is colored red.

Figure 2. Workflow of biopanning. At each round, 1) target-bound phages were selected, 2) E. coli was infected with selected phages, and 3) phages were amplified in E. coli. Sub-libraries are surrounded by colored ellipses.

Workflow of four rounds of selecting target-bound phages from the initial and amplified phage pools.

Figure 3. Distribution of unique sequences in each sub-library. The frequency of unique sequences is shown for single reads in gray, 2–10 reads in blue, 11–100 reads in green, 101–200 reads in yellow, 201–1000 reads in brown, and >1000 reads in red.

Frequency of unique sequences in the sub-libraries, which are the phage pools or E. coli collected during biopanning.
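The read-count bins in the Figure 3 caption can be illustrated with a short sketch. It assumes that "frequency" means the fraction of unique sequences falling in each bin (the caption does not say whether the fraction is taken over unique sequences or over reads), and the function names and the list-of-strings input are hypothetical, not taken from the article.

```python
from collections import Counter

# Read-count bins as listed in the Figure 3 caption: (label, low, high), inclusive.
# A high bound of None means "no upper limit".
BINS = [
    ("1", 1, 1),
    ("2-10", 2, 10),
    ("11-100", 11, 100),
    ("101-200", 101, 200),
    ("201-1000", 201, 1000),
    (">1000", 1001, None),
]


def bin_label(count):
    """Return the caption bin label for a read count of one unique sequence."""
    for label, low, high in BINS:
        if count >= low and (high is None or count <= high):
            return label
    raise ValueError(f"read count must be a positive integer, got {count}")


def unique_sequence_distribution(reads):
    """Fraction of unique sequences in each read-count bin.

    `reads` is an iterable of sequence strings, one per sequencing read
    (a hypothetical input format chosen for illustration).
    """
    read_counts = Counter(reads)  # unique sequence -> number of reads
    n_unique = len(read_counts)
    if n_unique == 0:
        return {label: 0.0 for label, _, _ in BINS}
    bin_counts = Counter(bin_label(c) for c in read_counts.values())
    return {label: bin_counts.get(label, 0) / n_unique for label, _, _ in BINS}
```

Counting per unique sequence rather than per read keeps singletons visible; switching to a per-read fraction would only require weighting each bin by the read counts instead.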

Figure 4. Amino acid frequencies and rank distribution of the sequences predicted by machine learning. (a) Amino acid frequencies of the top 10,000 sequences predicted by machine learning, visualized by WebLogo [41]. (b) Amino acid frequencies of clustered sequences. (c) Rank distribution of each cluster. Black arrows indicate clusters containing the top 1,000 sequences.

The top 10,000 sequences predicted by machine learning were clustered into nine distinct sequence patterns. Each cluster had a rank distribution with a different average.

Figure 5. Binding function of wild-type 2u2f and obtained 2u2f variants. (a) Enzyme-linked immunosorbent assay of the purified candidate 2u2f variants on galectin-3 (Gal), NeutrAvidin (NAV), or blocking buffer (Skim). (b) EC₅₀ values of wild-type 2u2f and four functional variants with affinity to galectin-3. The plots show the absorbance for galectin-3 minus that for NAV. The EC₅₀ values were determined using the Hill equation.

In an enzyme-linked immunosorbent assay of the candidate 2u2f variants, four 2u2f variants bound specifically to the target molecule, with EC₅₀ values of 93 nM, 80 nM, 277 nM, and 201 nM.
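The Figure 5 caption states that the EC₅₀ values were determined using the Hill equation but does not give its form. A common four-parameter form is sketched here as an assumption; it is not necessarily the exact parametrization the authors fitted. The symbols A_min, A_max, and n (the Hill coefficient) are introduced for illustration, with c the variant concentration and A(c) the NAV-subtracted absorbance.

```latex
A(c) = A_{\min} + \frac{\left(A_{\max} - A_{\min}\right) c^{\,n}}{\mathrm{EC}_{50}^{\,n} + c^{\,n}}
```

Under this form, setting c = EC₅₀ gives A = (A_min + A_max)/2, so EC₅₀ is the concentration at which the response is halfway between its lower and upper plateaus.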

Figure 6. CD spectra of the functional 2u2f variants. Wild-type 2u2f is shown in blue, 1E2 in orange, 1H2 in red, 3B5 in gray, and 4H5 in magenta.

Circular dichroism spectra showing that the secondary structures of four prospective variants were similar to those of wild-type 2u2f.

Supplemental material

Supplemental material is provided as a Zip archive (3.4 MB).
