A Novel Approach on Similarity Search and Similarity Joins by Metric Indexing

Authors (2): Ch. Naga Sai, M. Sarada

This project studies similarity join and similarity search by metric indexing. Conventional techniques for single-attribute data prune on single attributes only and cannot efficiently support multi-attribute data. To address this issue, this project proposes a prefix tree index with holistic pruning ability across multiple attributes, together with a cost model that evaluates a prefix tree and guides its construction. Based on the prefix tree, this project devises a filter-verification framework to support similarity search and join by metric indexing. The filter step uses the prefix tree to prune a large number of dissimilar results and identify a small set of candidates; the verification step then checks the candidates to produce the final results. For similarity join, this project proves that building an optimal prefix tree is NP-complete and develops a greedy algorithm to achieve high performance. For similarity search, since one prefix tree cannot support all possible search queries, this project extends the cost model to similarity search and devises a budget-based algorithm to construct multiple high-quality prefix trees. This project also devises a hybrid verification algorithm to improve the verification step. Experimental results show…
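The filter-verification idea can be illustrated with a minimal sketch. It assumes a much simpler setting than the one in the abstract: each record is a single non-empty token set, similarity is Jaccard, and the filter is classic prefix filtering over an inverted index rather than the multi-attribute prefix tree, whose construction the abstract does not detail. The names `jaccard` and `similarity_join` are hypothetical and chosen only for this illustration.

```python
from collections import defaultdict
from math import ceil


def jaccard(a, b):
    """Exact Jaccard similarity of two token sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_join(records, threshold):
    """Return sorted index pairs (i, j), i < j, with Jaccard >= threshold.

    Assumption: records are non-empty sets of string tokens and
    0 < threshold <= 1. This is plain prefix filtering, not the
    prefix tree index described in the abstract.
    """
    # Global token order: rare tokens first, so prefixes are selective.
    freq = defaultdict(int)
    for rec in records:
        for tok in rec:
            freq[tok] += 1

    def order(tok):
        return (freq[tok], tok)

    # Filter step: two records can only reach the threshold if their
    # prefixes (under the global order) share at least one token.
    index = defaultdict(list)   # token -> ids of records with it in their prefix
    candidates = set()
    for i, rec in enumerate(records):
        tokens = sorted(rec, key=order)
        # Small epsilon guards against floating-point overshoot in ceil().
        prefix_len = len(tokens) - ceil(threshold * len(tokens) - 1e-9) + 1
        for tok in tokens[:prefix_len]:
            for j in index[tok]:
                candidates.add((j, i))
            index[tok].append(i)

    # Verification step: compute the exact similarity for candidates only.
    return sorted(
        (i, j) for i, j in candidates
        if jaccard(records[i], records[j]) >= threshold
    )
```

The filter step never drops a qualifying pair; it only reduces how many pairs reach the comparatively expensive exact check, which is the same division of labour the abstract describes for its filter and verification steps.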

Authors and Affiliations

Ch. Naga Sai

M. Sarada

Keywords: Similarity Search, Similarity Join


Publication Details

Published in : Volume 4 | Issue 7 | March-April 2018
Date of Publication : 2018-04-30
License:  This work is licensed under a Creative Commons Attribution 4.0 International License.
Page(s) : 54-59
Manuscript Number : IJSRST184559
Publisher : Technoscience Academy

Print ISSN : 2395-6011, Online ISSN : 2395-602X

Cite This Article :

Ch. Naga Sai, M. Sarada, "A Novel Approach on Similarity Search and Similarity Joins by Metric Indexing", International Journal of Scientific Research in Science and Technology (IJSRST), Print ISSN : 2395-6011, Online ISSN : 2395-602X, Volume 4, Issue 7, pp. 54-59, March-April 2018.
Journal URL : http://ijsrst.com/IJSRST184559
