Data Partitioning Method for Mining Frequent Itemset Using MapReduce

Authors (3): R. Divya Bharathi, A. S. Karthik Kannan, E. Jai Vinitha

Existing parallel algorithms for frequent itemset mining suffer from high communication and mining overhead. To overcome this problem, a data partitioning method based on the MapReduce model is proposed. In this model, three MapReduce jobs are implemented to improve the performance of parallel frequent itemset mining. In the second MapReduce job, the mappers apply a locality-sensitive hashing (LSH) based approach that integrates the item grouping and partitioning processes, and the reducers run FP-Growth on the partitioned data to generate all frequent patterns. The main idea of data partitioning is to group relevant transactions together and thereby reduce the number of redundant transactions. Extensive experiments on IBM Quest Market Basket synthetic datasets show that the data partitioning method is efficient, robust, and scalable on Hadoop.
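The partitioning idea in the second job can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes MinHash as the LSH family, runs in plain Python rather than on Hadoop, and all names (`stable_hash`, `minhash_signature`, `assign_partition`, `lsh_partition`) and parameter values are hypothetical. Transactions that share many items tend to receive the same signature band and therefore the same partition, which is where a reducer would then run FP-Growth.

```python
import hashlib
from collections import defaultdict


def stable_hash(value, seed):
    """Deterministic 64-bit hash of a value under a given seed (illustrative helper)."""
    data = f"{seed}:{value}".encode("utf-8")
    return int.from_bytes(hashlib.md5(data).digest()[:8], "big")


def minhash_signature(transaction, num_hashes=8):
    """MinHash signature: for each seeded hash, keep the minimum over the items.

    Assumes the transaction is a non-empty collection of items.
    """
    return tuple(
        min(stable_hash(item, seed) for item in transaction)
        for seed in range(num_hashes)
    )


def assign_partition(transaction, num_partitions=4, num_hashes=8, rows_per_band=2):
    """Map a transaction to a partition id using the first band of its signature.

    Transactions with high Jaccard similarity are likely to agree on this band
    and so are likely to land in the same partition.
    """
    band = minhash_signature(transaction, num_hashes)[:rows_per_band]
    return stable_hash(repr(band), "band") % num_partitions


def lsh_partition(transactions, num_partitions=4):
    """Simulate the map and shuffle steps: group transactions by partition id."""
    partitions = defaultdict(list)
    for transaction in transactions:
        key = assign_partition(transaction, num_partitions)
        partitions[key].append(frozenset(transaction))
    return dict(partitions)


if __name__ == "__main__":
    baskets = [
        {"bread", "milk", "butter"},
        {"butter", "milk", "bread"},
        {"beer", "chips"},
        {"bread", "milk", "butter", "jam"},
    ]
    for pid, group in sorted(lsh_partition(baskets).items()):
        print(pid, [sorted(t) for t in group])
```

Each partition produced this way could be handed to an FP-Growth routine independently. Using a single band keeps every transaction in exactly one partition; using several bands would raise the chance that similar transactions meet, at the cost of replicating transactions across partitions.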

Authors and Affiliations

R. Divya Bharathi
Department of IT, M.Tech, Mepco Schlenk Engineering College, Sivakasi, Tamilnadu, India
A. S. Karthik Kannan
Department of IT, Assistant Professor, Mepco Schlenk Engineering College, Sivakasi, Tamilnadu, India
E. Jai Vinitha
Department of IT, M.Tech, Mepco Schlenk Engineering College, Sivakasi, Tamilnadu, India

Keywords: Frequent Itemset Mining, MapReduce Model, Parallel Mining, Data Partitioning

Publication Details

Published in : Volume 3 | Issue 5 | May-June 2017
Date of Publication : 2017-04-30
License:  This work is licensed under a Creative Commons Attribution 4.0 International License.
Page(s) : 146-153
Manuscript Number : ICASCT2525
Publisher : Technoscience Academy

Print ISSN : 2395-6011, Online ISSN : 2395-602X

Cite This Article :

R. Divya Bharathi, A. S. Karthik Kannan, E. Jai Vinitha, "Data Partitioning Method for Mining Frequent Itemset Using MapReduce", International Journal of Scientific Research in Science and Technology (IJSRST), Print ISSN : 2395-6011, Online ISSN : 2395-602X, Volume 3, Issue 5, pp. 146-153, May-June 2017. Available at doi : 10.32628/ICASCT2525
Journal URL : http://ijsrst.com/ICASCT2525
