Web Page Noise Removal - A Survey

Authors

  • Dr. S. Vijayarani, Assistant Professor, Department of Computer Science, Bharathiar University, Coimbatore, Tamilnadu, India
  • K. Geethanjali, M.Phil Research Scholar, Department of Computer Science, Bharathiar University, Coimbatore, Tamilnadu, India

Keywords:

Web Content, Web page, Global Noise, Local Noise, Filtering.

Abstract

Web mining extracts useful information from websites, including their web documents and hyperlinks. The World Wide Web contains a wide range of web pages that are useful to many users. Web pages are composed of different kinds of data, such as text, audio, video and images. In addition, web pages now contain a large amount of unnecessary data, e.g., advertisement posters, navigation bars and disclaimer/copyright notices. This unnecessary data is called noisy data. It distracts the user and increases the time needed for searching and browsing. To perform in-depth analysis of web data, or web content mining, the first and essential step is to remove the noise present in the web pages; useful information can then be extracted from them. Removing noise from a web page is a challenging task in web content mining. The main objective of this paper is to discuss the basics of web content mining, the types of noise, the techniques used for noise removal and the different models used in the literature.

Published

2017-10-31

Section

Research Articles

How to Cite

[1]
Dr. S. Vijayarani, K. Geethanjali, "Web Page Noise Removal - A Survey", International Journal of Scientific Research in Science and Technology (IJSRST), Online ISSN: 2395-602X, Print ISSN: 2395-6011, Volume 3, Issue 7, pp. 172-181, September-October 2017.