A Novel Approach on Similarity Search and Similarity Joins by Metric Indexing

Authors

  • Ch. Naga Sai  PG Scholar, Dept of MCA, St. Ann’s College of Engineering and Technology, Chirala, Andhra Pradesh, India
  • M. Sarada  Assistant Professor, Dept of MCA, St. Ann’s College of Engineering and Technology, Chirala, Andhra Pradesh, India

Keywords:

Similarity Search, Similarity Join

Abstract

In this project we study similarity join and similarity search by metric indexing. Traditional methods designed for single-attribute data can prune only on individual attributes and cannot efficiently support multi-attribute data. To address this problem, we propose a prefix tree index that has holistic pruning ability across multiple attributes. We propose a cost model to evaluate a prefix tree, which can guide its construction. Based on the prefix tree, we devise a filter-verification framework to support similarity search and join by metric indexing. The filter step uses the prefix tree to prune a large number of results and identify a small set of candidates, and the verification step verifies the candidates to generate the final answers. For similarity join, we prove that constructing an optimal prefix tree is NP-complete and develop a greedy algorithm to achieve high performance. For similarity search, since a single prefix tree cannot support all possible search queries, we extend the cost model to similarity search and devise a budget-based algorithm to construct multiple high-quality prefix trees. We also devise a hybrid verification algorithm to improve the verification step. Experimental results are reported.
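
The filter-verification idea described above can be illustrated with a minimal sketch. This is not the prefix tree index of the paper: as a stand-in filter it uses a simple per-attribute length check (two strings within edit distance t must differ in length by at most t), and it assumes edit distance as the similarity function. The function names, the sample records, and the thresholds are all hypothetical.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def passes_filter(r, s, thresholds):
    """Filter step (stand-in for the prefix tree): a cheap necessary
    condition checked on every attribute. It never rejects a true match."""
    return all(abs(len(x) - len(y)) <= t
               for x, y, t in zip(r, s, thresholds))


def verify(r, s, thresholds):
    """Verification step: compute the exact distance on every attribute."""
    return all(edit_distance(x, y) <= t
               for x, y, t in zip(r, s, thresholds))


def similarity_join(records, thresholds):
    """Return index pairs (i, j), i < j, similar on all attributes."""
    candidates = [(i, j)
                  for i, j in combinations(range(len(records)), 2)
                  if passes_filter(records[i], records[j], thresholds)]
    return [(i, j) for i, j in candidates
            if verify(records[i], records[j], thresholds)]


# Hypothetical two-attribute records: (name, city).
records = [
    ("jon smith", "new york"),
    ("john smith", "new york"),
    ("jane doe", "boston"),
]
print(similarity_join(records, thresholds=(1, 0)))  # [(0, 1)]
```

In this toy input the filter discards two of the three pairs before any edit distance is computed, so only one pair reaches the expensive verification step. The sketch shows only the division of labor between the two steps; the pruning in the paper comes from the prefix tree, not from a length check.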

Published

2018-04-30

Section

Research Articles

How to Cite

[1]
Ch. Naga Sai, M. Sarada, "A Novel Approach on Similarity Search and Similarity Joins by Metric Indexing," International Journal of Scientific Research in Science and Technology (IJSRST), Online ISSN: 2395-602X, Print ISSN: 2395-6011, Volume 4, Issue 7, pp. 54-59, March-April 2018.