Original Articles

Feature Selection via Pareto Multi-objective Genetic Algorithms

Figures & data

Table 1. Data represented according to the feature-value format.
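For reference, the feature-value format places one example per row, one feature per column, and the class label in a final column. A minimal sketch, with made-up feature names and values:

```python
import pandas as pd

# Hypothetical data in the feature-value format: one example per row,
# one feature per column (quantitative or qualitative), class label last.
data = pd.DataFrame({
    "X1": [5.1, 4.9, 6.3],         # quantitative (QT) feature
    "X2": ["low", "high", "low"],  # qualitative (QL) feature
    "X3": [0.2, 0.4, 1.5],
    "class": ["yes", "no", "yes"],
})
print(data)
```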

Figure 1. Modules of the method developed in this work and their interactions. In particular, the monitor module manages the remaining components, which in turn perform feature selection and multi-objective optimization by a genetic algorithm.

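As a rough illustration of how the genetic-algorithm component can represent candidate feature subsets, the sketch below assumes the common bitstring encoding and two objectives to be minimized (subset size and an importance-based score); the paper's actual encoding and objectives may differ.

```python
import random

random.seed(0)
N_FEATURES = 10

def random_chromosome():
    # Bit i set to 1 means feature i is selected.
    return [random.randint(0, 1) for _ in range(N_FEATURES)]

def evaluate(chromosome):
    # Objective 1: number of selected features (smaller is better).
    size = sum(chromosome)
    # Objective 2: placeholder for an importance-based score of the subset
    # (smaller is better); a real filter would compute it from the data.
    score = random.random()
    return size, score

population = [random_chromosome() for _ in range(20)]
print([evaluate(c) for c in population[:3]])
```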

Table 2. Importance measures employed. These measures are associated with distinct categories and can deal with Quantitative (QT) and Qualitative (QL) feature values.
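Information gain is one commonly used importance measure for qualitative features; it is shown here only as an illustration, since Table 2 lists the measures actually employed.

```python
import math
from collections import Counter

def entropy(labels):
    # Shannon entropy of a sequence of class labels.
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    # Entropy reduction obtained by splitting the examples on the
    # values of a qualitative feature.
    n = len(labels)
    remainder = 0.0
    for v in set(feature_values):
        subset = [lab for f, lab in zip(feature_values, labels) if f == v]
        remainder += (len(subset) / n) * entropy(subset)
    return entropy(labels) - remainder

# Hypothetical qualitative feature perfectly aligned with the class:
print(information_gain(["low", "low", "high", "high"],
                       ["no", "no", "yes", "yes"]))  # 1.0
```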

Figure 2. Frequency of related papers per year of publication.

Figure 3. Percentage of related papers using specific numbers of datasets.

Figure 4. Categories of feature importance measures used in related work.

Figure 5. Experimental flow adopted in the evaluation of FS.

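A generic version of such a flow (not necessarily the paper's exact protocol) selects features, trains a classifier on the reduced data, and estimates accuracy by cross-validation, with selection refit inside each fold; the sketch below uses a bundled dataset and a placeholder filter for illustration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

pipeline = Pipeline([
    ("fs", SelectKBest(mutual_info_classif, k=10)),  # placeholder filter
    ("clf", SVC()),  # SVM, one of the classifiers evaluated in the paper
])

# Feature selection is refit on each training fold to avoid leakage.
scores = cross_val_score(pipeline, X, y, cv=10, scoring="accuracy")
print(f"mean accuracy: {scores.mean():.3f}")
```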

Table 3. Summary of the datasets used in this study: Australian (A), Crx (C), Dermatology (D), German (G), Ionosphere (I), Lung cancer (L), Promoter (P), Sonar (S), Soybean small (Y), Vehicle (V), Wisconsin Breast cancer (B) and Wine (W).

Figure 6. Comparison procedure based on error rate (axis X) and size of feature subsets (axis Y) (Lee, Monard, and Wu 2006).

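In that plane, a natural reading of the comparison is Pareto dominance: one FS result is preferable when it is no worse on both error rate and subset size and strictly better on at least one; the cited procedure adds its own details beyond this generic rule.

```python
def dominates(a, b):
    # a and b are (error_rate, subset_size) pairs; a dominates b when it is
    # no worse on both criteria and strictly better on at least one.
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

# Hypothetical results for two FS algorithms on the same dataset:
fs1 = (0.12, 8)
fs2 = (0.15, 14)
print(dominates(fs1, fs2))  # True: fs1 wins on both axes
```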

Table 4. Percentage of Reduction (PR) in the number of features for each dataset. Cells with average PR greater than 50% are highlighted in bold.
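PR is presumably the relative decrease in the number of features after selection; a minimal sketch of that computation (assumed definition, with made-up counts):

```python
def percentage_of_reduction(n_original, n_selected):
    # Assumed definition: relative drop in the number of features, in percent.
    return 100.0 * (n_original - n_selected) / n_original

print(percentage_of_reduction(34, 12))  # hypothetical counts -> ~64.7%
```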

Table 5. Performance of J48 and SVM models for each dataset. Cells for FS algorithms that obtained a PR greater than 50% are marked with an asterisk (*). Cells in italics indicate models whose accuracy is no better than the Majority Class Error baseline.

Table 6. Performance of NB and NN models for each dataset. Cells for FS algorithms that obtained a PR greater than 50% are marked with an asterisk (*). Cells in italics indicate models whose accuracy is no better than the Majority Class Error baseline.
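The Majority Class Error baseline is the error of the trivial classifier that always predicts the most frequent class; a minimal sketch:

```python
from collections import Counter

def majority_class_error(labels):
    # Error rate of always predicting the most frequent class.
    counts = Counter(labels)
    majority = counts.most_common(1)[0][1]
    return 1.0 - majority / len(labels)

print(majority_class_error(["yes"] * 70 + ["no"] * 30))  # 0.30
```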

Figure 7. Graphic summary of the predictive performance results. Each bar corresponds to the area of a polygon in which each axis represents the average accuracy achieved by a specific classifier and FS algorithm.

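One plausible reading of that summary metric: plot each classifier's average accuracy on its own equally spaced radar axis and take the area of the resulting polygon. A sketch under that assumption:

```python
import math

def radar_polygon_area(values):
    # Area of the polygon whose i-th vertex lies at distance values[i]
    # along the i-th of k equally spaced radar axes.
    k = len(values)
    angle = 2 * math.pi / k
    return 0.5 * sum(values[i] * values[(i + 1) % k] * math.sin(angle)
                     for i in range(k))

# Hypothetical average accuracies for four classifiers (e.g. J48, SVM, NB, NN):
print(round(radar_polygon_area([0.85, 0.80, 0.78, 0.82]), 3))
```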

Figure 8. Total number of classifiers built using features selected by the MOGA and SOGA filters for each category related to the trade-off between percentage of reduction in the number of features and classification accuracy.

Table 7. Comparison of SVM models built after FS. The models are grouped into six categories according to the trade-off between dimensionality reduction and predictive performance: excellent (▴▴▴), very good (▴▴), good (▴), poor (), very poor (▾) and all features (—).

Figure 9. Graphic comparison of SVM models built after FS with statistical test results.

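The figure reports the paper's own statistical analysis; purely as an illustration of the kind of test often used to compare several FS algorithms over multiple datasets, a Friedman test (an assumption here, not necessarily the test used) could be run as follows:

```python
from scipy.stats import friedmanchisquare

# Hypothetical per-dataset accuracies for three FS algorithms:
acc_fs_a = [0.85, 0.79, 0.91, 0.88, 0.76]
acc_fs_b = [0.83, 0.80, 0.89, 0.85, 0.75]
acc_fs_c = [0.80, 0.74, 0.86, 0.82, 0.71]

stat, p_value = friedmanchisquare(acc_fs_a, acc_fs_b, acc_fs_c)
print(f"Friedman statistic = {stat:.3f}, p = {p_value:.3f}")
```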

Figure 10. Comparison between MOGAs in terms of the number of non-dominated solutions.

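Counting non-dominated solutions amounts to keeping the members of the final population that no other member Pareto-dominates; a minimal sketch with both objectives minimized:

```python
def dominates(a, b):
    # a dominates b if it is no worse on every objective and strictly
    # better on at least one (both objectives minimized).
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated(solutions):
    # Solutions not dominated by any other: the Pareto front.
    return [s for s in solutions
            if not any(dominates(o, s) for o in solutions if o is not s)]

# Hypothetical (subset size, importance score) pairs from a MOGA run:
front = non_dominated([(3, 0.20), (5, 0.12), (8, 0.12), (4, 0.15)])
print(len(front), front)  # 3 non-dominated solutions
```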
