Computers & Computing

Dynamic Deep Genomics Sequence Encoder for Managed File Transfer


