Analysis and Implementation of Text Mining for Different Documents

K. Maheswari; P. Packia Amutha Priya

doi:10.32628/ICASCT2518

Authors

K. Maheswari, Department of Computer Applications, Kalasalingam University, Krishnankoil, Tamil Nadu, India
P. Packia Amutha Priya, Department of Computer Applications, Kalasalingam University, Krishnankoil, Tamil Nadu, India

Keywords:

Text Mining, Data Mining, Word Frequency, Text File

Abstract

Text mining is the process of deriving structured data from unstructured and semi-structured text. In this work, text is modeled as a bag of words. The environment is set up with various documents in a database. Preprocessing removes unwanted numeric values, punctuation, and frequent words, and normalizes letter case. Words that occur at least fifty times in a document are then identified. Experimental results for words occurring at least twenty, twenty-five, fifty, and one hundred times in a document were analyzed and represented visually.
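The preprocessing and frequency-threshold steps described in the abstract could be sketched roughly as follows. This is a minimal illustration in Python, not the authors' implementation: the function names `preprocess` and `frequent_terms` and the `FREQUENT_WORDS` list are hypothetical, and the paper does not state which frequent words it removes.

```python
import re
from collections import Counter

# Hypothetical list of frequent words to drop; the paper does not give its list.
FREQUENT_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is"}

def preprocess(text):
    """Lowercase the text, strip digits and punctuation, drop listed frequent words."""
    text = text.lower()                   # normalize letter case
    text = re.sub(r"[0-9]+", " ", text)   # remove numeric values
    text = re.sub(r"[^\w\s]", " ", text)  # remove punctuation
    return [w for w in text.split() if w not in FREQUENT_WORDS]

def frequent_terms(text, min_count):
    """Return the bag-of-words counts for words occurring at least min_count times."""
    counts = Counter(preprocess(text))
    return {word: n for word, n in counts.items() if n >= min_count}
```

Calling `frequent_terms(document_text, 50)` would return the words that occur at least fifty times in one document; repeating the call with 20, 25, and 100 gives the other thresholds mentioned above, assuming `document_text` holds the raw text of a single document.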

