
Unsupervised Feature Selection Approach for Cancer Prediction

Pages 1891-1896 | Published online: 03 Feb 2021
 

ABSTRACT

Detecting cancer at an early stage is an important application that reduces the risk of death for cancer patients. Tissue data generated with DNA microarray technology contain a very large number of genes/attributes and a relatively small number of samples, which makes it difficult to train a reliable classifier. Hence, several machine learning algorithms with an emphasis on feature selection have been proposed to handle the large number of genes/attributes. Most of the feature selection (FS) algorithms in the literature are supervised learning algorithms. The authors propose an unsupervised feature selection algorithm, which makes the selection independent of the class label. The first step of the proposed algorithm is a simple feature ranking based on Singular Value Decomposition (SVD) entropy. The SVD-entropy based method scores each attribute independently of the others, and hence reduces the complexity involved in handling multiple associations among attributes in large datasets. In the second stage, the correlation among attributes is used to remove attributes/features that are highly correlated with each other. Once the features are selected, a logistic regression model is used to predict the class label, that is, whether a tissue is cancerous or non-cancerous. Experiments were carried out on three datasets: Ovarian, Lung and Breast cancer. The accuracy scores achieved for these datasets are 100, 75 and 96.4 percent, respectively, which the authors present as evidence of the efficiency of the approach.
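The two selection stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the scoring formula, so the sketch assumes one common formulation of SVD-entropy ranking, in which a feature's score is the drop in normalized SVD entropy when that feature is left out. The function names (`svd_entropy`, `svd_entropy_scores`, `select_features`), the parameters `n_keep` and `corr_threshold`, and the synthetic data are all hypothetical. The logistic regression step is omitted; the selected columns could be passed to any standard implementation.

```python
import numpy as np


def svd_entropy(X):
    """Normalized entropy of the squared singular values of X (samples x features)."""
    s = np.linalg.svd(X, compute_uv=False)
    v = s ** 2
    total = v.sum()
    if total == 0 or len(v) < 2:
        return 0.0
    p = v / total
    p = p[p > 0]  # treat 0 * log(0) as 0
    return float(-(p * np.log(p)).sum() / np.log(len(v)))


def svd_entropy_scores(X):
    """Assumed scoring rule: entropy of the full matrix minus the entropy
    of the matrix with feature j removed (leave-one-out contribution)."""
    base = svd_entropy(X)
    return np.array(
        [base - svd_entropy(np.delete(X, j, axis=1)) for j in range(X.shape[1])]
    )


def select_features(X, n_keep, corr_threshold=0.9):
    """Stage 1: rank features by SVD-entropy score, highest first.
    Stage 2: walk the ranking and skip any feature whose absolute Pearson
    correlation with an already selected feature reaches the threshold.
    Assumes no constant columns (their correlation is undefined)."""
    scores = svd_entropy_scores(X)
    order = np.argsort(scores)[::-1]
    corr = np.abs(np.corrcoef(X, rowvar=False))
    selected = []
    for j in order:
        if all(corr[j, k] < corr_threshold for k in selected):
            selected.append(int(j))
        if len(selected) == n_keep:
            break
    return selected


if __name__ == "__main__":
    # Synthetic stand-in for microarray data: few samples, more features.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(6, 8))
    X[:, 1] = 2.0 * X[:, 0]  # feature 1 duplicates feature 0 up to scale
    print(select_features(X, n_keep=4))
```

Because neither stage looks at class labels, the selection stays unsupervised, as the abstract describes. Note that the leave-one-out score requires one SVD per feature, which may be costly for datasets with many thousands of genes.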

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Notes on contributors

Ummadi Janardhan Reddy

U Janardhan Reddy obtained his BTech (Computer Science and Engineering) from JNT University, Hyderabad, in 2007 and his MTech (Computer Science and Engineering) from JNT University, Kakinada, in 2011. He is pursuing his doctoral degree in the area of data mining in the Department of Computer Science and Engineering, JNTUA, Anantapur, India. He is an assistant professor at Vignan's University, Guntur, Andhra Pradesh, India. His areas of interest include data mining and machine learning techniques in the fields of bioinformatics and healthcare.

B. Venkata Ramana Reddy

B Venkata Ramana Reddy obtained his BTech (Computer Science and Engineering) from JNT University, Hyderabad, in 2007 and his MTech (Computer Science and Engineering) from JNT University, Kakinada, in 2011. He is pursuing his doctoral degree in the area of data mining in the Department of Computer Science and Engineering, JNTUA, Anantapur, India. He is an assistant professor at Vignan's University, Guntur, Andhra Pradesh, India. His areas of interest include data mining and machine learning techniques in the fields of bioinformatics and healthcare.

B. Eswara Reddy

B Eswara Reddy obtained his BTech from Sri Krishna Devaraya University in 1995, his MTech in software engineering from JNTU Hyderabad in 1999 and his PhD in computer science and engineering from JNTU Hyderabad in 2008. He has more than 20 years of teaching and research experience. He is a member of IEEE, CSI, ISTE, IE(I), ISCA and IAENG. He has acted as both UG and PG Board of Studies Chairman. He has served as NSS programme officer, officer in charge of examinations and the computer center, IEEE student branch counselor, program chair for the International Conference on Emerging Trends in Electrical, Communication and Information Technologies (ICECIT), coordinator for MSIT and the Incubation center, Head of the Department, Vice Principal and President of the Teacher's Association, and is presently serving as Principal, JNTUA College of Engineering, Kalikiri. He has published over 120 research papers in international conferences and journals. Eight scholars have received PhD degrees under his guidance. He has co-authored two engineering textbooks: Data Mining: Principles and Approaches, Elsevier publishers, ISBN: 978-93-82291-49-7, and Programming with Java, Pearson/Sanguine Publishers, ISBN: 978-81-317-5834-2. He received a grant of 10,66,000/- from UGC and completed a Major Research Project (MRP) titled ‘Cloud computing framework for rural health care in Indian scenario’. He visited New York, USA, in April 2016 to present a research paper at the IEEE Big Data Security conference held at Columbia University. His areas of interest include pattern recognition and image analysis, data mining and cloud computing.
