Manuscript Number : IJSRST162389
Performance Analysis of an Ontology Based Crawler Operating in a Distributed Environment
Authors : Wael A. Gab-ALLAH, Ben Bella S. Tawfik, Hamed M. Nassar

Crawlers are increasingly used to retrieve information from distributed information sources such as the Web. We have implemented one that combines several new algorithms and techniques: a novel IR architecture, an efficient query expansion algorithm based on WordNet, a new crawling technique based on ontology, and a new rapid filtering algorithm based on semantic similarity. Experimental results from the implemented crawler, named the Ontology Based Distributed Information Retrieval (OBDIR) system, are superior to those obtained from systems based on the standard Breadth First (BF) search technique. In this paper we analyze the performance of the OBDIR system. We develop a probabilistic model that captures the operational dimensions of the system. The model makes heavy use of Bayes' theorem and can help establish a foundational theory for distributed information retrieval (DIR). We study performance metrics such as recall and precision, and touch on other performance tools such as accuracy and ROC space. The study shows that careful choice of keywords greatly enhances the performance of the crawler.
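As a point of reference for the metrics named in the abstract, the following is a minimal sketch of the standard textbook definitions of precision, recall, and accuracy computed from confusion-matrix counts. It is not the paper's probabilistic model; the function name `retrieval_metrics` and the page counts used below are hypothetical and chosen only for illustration.

```python
def retrieval_metrics(tp, fp, fn, tn):
    """Compute standard retrieval metrics from confusion-matrix counts.

    tp: relevant pages that were retrieved
    fp: irrelevant pages that were retrieved
    fn: relevant pages that were missed
    tn: irrelevant pages that were correctly skipped
    """
    retrieved = tp + fp          # everything the crawler returned
    relevant = tp + fn           # everything it should have returned
    total = tp + fp + fn + tn    # all pages considered

    # Guard against division by zero when a denominator is empty.
    precision = tp / retrieved if retrieved else 0.0
    recall = tp / relevant if relevant else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return {"precision": precision, "recall": recall, "accuracy": accuracy}


# Hypothetical counts, not taken from the paper's experiments.
metrics = retrieval_metrics(tp=40, fp=10, fn=20, tn=30)
print(metrics)
```

With these illustrative counts, precision is 40/50, recall is 40/60, and accuracy is 70/100. A point in ROC space would additionally use the false positive rate, fp / (fp + tn).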
Keywords : Information retrieval, Web search, Focused crawler, Ontology
Wael A. Gab-ALLAH
Faculty of Computers & Informatics, Suez Canal University, Ismailia, Egypt
Ben Bella S. Tawfik
Faculty of Computers & Informatics, Suez Canal University, Ismailia, Egypt
Hamed M. Nassar
Faculty of Computers & Informatics, Suez Canal University, Ismailia, Egypt
Publication Details
Published in : Volume 2 | Issue 3 | May-June 2016
Date of Publication : 2017-12-31
License: This work is licensed under a Creative Commons Attribution 4.0 International License.
Page(s) : 334-339
Publisher : Technoscience Academy
Journal URL : https://ijsrst.com/IJSRST162389