A Novel Fuzzy-Bayesian Classification Method for Automatic Text Categorization

Authors

  • Swathi V  Student, Bachelor of Computer Science and Engineering, Department of Computer Science and Engineering, Sri Ramakrishna Engineering College, Coimbatore, Tamil Nadu, India
  • Swetha S Kumar  Professor, Department of Computer Science and Engineering, Sri Ramakrishna Engineering College, Coimbatore, Tamil Nadu, India
  • Dr. P. Perumal  Professor, Department of Computer Science and Engineering, Sri Ramakrishna Engineering College, Coimbatore, Tamil Nadu, India

Keywords:

Text Mining, Categorization, Machine Learning, Discrimination, Feature Selection.

Abstract

Text categorization automatically labels documents with topics drawn from a predefined set, and many advanced machine learning algorithms have been applied to it. This paper proposes a fuzzy rule combined with a Bayesian classification method for automatic text categorization using class-specific features. The method selects a separate feature subset for each class and then uses these class-specific features for classification. To do so, it follows Baggenstoss’s PDF Projection Theorem to reconstruct the PDF in the raw data space from the class-specific PDF in the low-dimensional feature space, and builds a fuzzy-based Bayes classification rule from the reconstructed PDFs. A notable advantage of the method is that most feature selection criteria, such as information gain and maximum discrimination, can be easily incorporated into it. Classification performance is evaluated on different datasets and compared against different feature selection methods. The experimental results illustrate the effectiveness of the proposed method and indicate its wide applicability in text categorization.
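
As an illustration only, the class-specific Bayes rule described above can be sketched in Python. Under the PDF Projection Theorem, the raw-data likelihood of class i is the class-specific feature likelihood divided by the same feature's likelihood under a common reference hypothesis, times a factor shared by all classes, so the decision reduces to comparing log prior plus a log likelihood ratio. Everything concrete below is an assumption rather than the authors' implementation: the multinomial term model with Laplace smoothing, the pooled training set as the reference hypothesis, the per-term KL contribution used as a stand-in selection criterion, the "other" bin that collects unselected terms, the toy data, and all function names. The fuzzy rule is not specified in the abstract and is not modeled here.

```python
import math
from collections import Counter


def term_probs(docs, vocab, alpha=1.0):
    """Laplace-smoothed multinomial term probabilities over vocab."""
    counts = Counter(t for d in docs for t in d if t in vocab)
    total = sum(counts.values()) + alpha * len(vocab)
    return {t: (counts[t] + alpha) / total for t in vocab}


def train(docs, labels, k=2):
    """Fit one feature subset and one term model per class (hypothetical helper)."""
    vocab = sorted({t for d in docs for t in d})
    k = min(k, len(vocab) - 1)  # keep the "other" bin non-empty
    ref = term_probs(docs, vocab)  # assumed reference hypothesis: all documents pooled
    model = {}
    for c in sorted(set(labels)):
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        theta = term_probs(class_docs, vocab)
        # Stand-in criterion: each term's contribution to KL(class || reference).
        score = {t: theta[t] * math.log(theta[t] / ref[t]) for t in vocab}
        selected = sorted(vocab, key=lambda t: score[t], reverse=True)[:k]
        prior = len(class_docs) / len(docs)
        model[c] = (prior, selected, theta)
    return model, ref


def classify(doc, model, ref):
    """Pick the class maximizing log prior + log p(z_i | class) - log p(z_i | reference)."""
    counts = Counter(t for t in doc if t in ref)  # drop out-of-vocabulary tokens
    n = sum(counts.values())
    best, best_score = None, -math.inf
    for c, (prior, selected, theta) in model.items():
        s = math.log(prior)
        n_sel = 0
        p_sel = q_sel = 0.0
        for t in selected:
            s += counts[t] * math.log(theta[t] / ref[t])
            n_sel += counts[t]
            p_sel += theta[t]
            q_sel += ref[t]
        # Unselected terms collapse into one "other" bin; the multinomial
        # coefficients cancel in the ratio, so only this term remains.
        s += (n - n_sel) * math.log((1.0 - p_sel) / (1.0 - q_sel))
        if s > best_score:
            best, best_score = c, s
    return best


# Toy data, invented for illustration.
docs = [
    ["goal", "match", "team"],
    ["team", "score", "goal"],
    ["stock", "market", "price"],
    ["market", "trade", "stock"],
]
labels = ["sport", "sport", "finance", "finance"]
model, ref = train(docs, labels, k=2)
print(classify(["goal", "team", "win"], model, ref))
print(classify(["stock", "price"], model, ref))
```

Because each class is scored on its own feature subset, the raw class-conditional likelihoods are not directly comparable; dividing by the reference likelihood of the same features is what puts the scores on a common scale. Another selection criterion, such as information gain, could replace the `score` line without changing the decision rule.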

Published

2017-04-30

Section

Research Articles

How to Cite

[1]
Swathi V, Swetha S Kumar, Dr. P. Perumal, "A Novel Fuzzy-Bayesian Classification Method for Automatic Text Categorization", International Journal of Scientific Research in Science and Technology (IJSRST), Online ISSN: 2395-602X, Print ISSN: 2395-6011, Volume 3, Issue 3, pp. 233-239, March-April 2017.