
      A Two-stage Text Feature Selection Algorithm for Improving Text Classification


          Abstract

          As the number of digital text documents grows daily, text classification is becoming increasingly challenging. Each document contains a large number of words (features), which reduces the efficiency of a classification algorithm. This article presents an optimized feature selection algorithm that reduces the feature set in order to improve the accuracy of text classification. The proposed algorithm uses noun-based filtering and word ranking to enhance the performance of the text classification algorithm. Experiments on three benchmark datasets compare the proposed algorithm with Term Frequency-Inverse Document Frequency, Balanced Accuracy Measure, GINI Index, Information Gain, and Chi-Square; the results show that the proposed algorithm achieves the highest accuracy among these methods.
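
The two stages named in the abstract (noun-based filtering, then word ranking) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on two assumptions. First, `TOY_NOUNS` is a hypothetical hand-written noun lexicon standing in for a real part-of-speech tagger. Second, the abstract does not specify the paper's ranking function, so the sketch ranks with the chi-square statistic, which is one of the baselines the abstract lists rather than the proposed method. The names `noun_filter`, `chi_square`, and `rank_terms` are illustrative.

```python
# Hypothetical toy noun lexicon; a real pipeline would use a POS tagger.
TOY_NOUNS = {"match", "team", "goal", "market", "stock", "bank"}


def noun_filter(tokens, nouns=TOY_NOUNS):
    """Stage 1: keep only tokens that appear in the noun lexicon."""
    return [t for t in tokens if t in nouns]


def chi_square(n11, n10, n01, n00):
    """Chi-square statistic for a 2x2 term/class contingency table.

    n11: documents in the class that contain the term
    n10: documents outside the class that contain the term
    n01: documents in the class that lack the term
    n00: documents outside the class that lack the term
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n01) * (n10 + n00) * (n11 + n10) * (n01 + n00)
    if denom == 0:
        return 0.0
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def rank_terms(docs, labels, target, top_k):
    """Stage 2: rank the surviving nouns by chi-square against one class.

    Ranking by chi-square is an assumption of this sketch; it stands in
    for the paper's own (unspecified here) word-ranking step.
    """
    filtered = [set(noun_filter(d)) for d in docs]
    vocab = sorted(set().union(*filtered))
    scores = {}
    for term in vocab:
        n11 = sum(1 for f, y in zip(filtered, labels) if term in f and y == target)
        n10 = sum(1 for f, y in zip(filtered, labels) if term in f and y != target)
        n01 = sum(1 for f, y in zip(filtered, labels) if term not in f and y == target)
        n00 = sum(1 for f, y in zip(filtered, labels) if term not in f and y != target)
        scores[term] = chi_square(n11, n10, n01, n00)
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(vocab, key=lambda t: (-scores[t], t))[:top_k]


docs = [
    ["the", "team", "scored", "a", "goal"],
    ["team", "won", "the", "match"],
    ["the", "stock", "market", "fell"],
    ["bank", "stock", "rose", "quickly"],
]
labels = ["sport", "sport", "finance", "finance"]
selected = rank_terms(docs, labels, target="sport", top_k=2)
```

Because chi-square measures association in either direction, a term that occurs only outside the target class can score as highly as one that occurs only inside it. The classifier would then be trained on the `selected` features alone instead of the full vocabulary.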

                Author and article information

                Journal
                ACM Transactions on Asian and Low-Resource Language Information Processing
                ACM Trans. Asian Low-Resour. Lang. Inf. Process.
                Association for Computing Machinery (ACM)
                ISSN: 2375-4699
                eISSN: 2375-4702
                May 2021
                Volume: 20
                Issue: 3
                Pages: 1-19
                Affiliations
                [1] Sri Ramachandra College of Engineering and Technology, Sri Ramachandra Institute of Higher Education and Research, Chennai, Tamil Nadu
                [2] Department of Mathematics and Computer Science, Brandon University; Research Center for Interneural Computing, China Medical University, Taichung, Taiwan, Republic of China
                [3] School of Information Technology, VIT, Vellore, Tamil Nadu
                Article
                DOI: 10.1145/3425781
                © 2021