IETE Journal of Research Volume 69, 2023 - Issue 4


LEARNING-based Focused WEB Crawler

Naresh Kumar, Department of Computer Science and Engineering, Maharaja Surajmal Institute of Technology, New Delhi, India. Correspondence: [email protected]

ORCID: https://orcid.org/0000-0001-9984-506X

Dhruv Aggarwal, Department of Computer Science and Engineering, Maharaja Surajmal Institute of Technology, New Delhi, India

ORCID: https://orcid.org/0000-0002-9336-0309

Pages 2037-2045 | Published online: 22 Feb 2021

https://doi.org/10.1080/03772063.2021.1885312

Abstract

As the number of web pages published every day grows enormously, there is a consistent need for an efficient crawling mechanism that yields appropriate search results for everyday queries. People regularly encounter inappropriate or incorrect answers among search results, so there is a strong need for enhanced methods that return precise results to the user within an acceptable time frame. In this project, we present an effective approach to building a crawler that considers factors that have not been considered before. The main focus of the project is the design of an intelligent focused crawler that learns by itself to improve the ranking of URLs. Moreover, many existing crawlers first visit the seed URL, read the pages, and download them for indexing by search engines. The problem with this approach is that a website or page that is not updated regularly is still crawled again, even though it was already downloaded on a previous visit. This wastes considerable bandwidth, network resources, time, and storage. We aim to minimize these problems by building an effective system with a revisit policy for web crawlers. During the first crawl, websites are divided into three categories (frequently, frequent, and static); the crawler then decides when it should crawl each website again.
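The revisit policy described in the abstract can be illustrated with a minimal sketch. The abstract does not say how a site is assigned to a category or how long the crawler waits before returning, so everything concrete here is an assumption made for illustration: the `change_rate` signal, the 0.5 and 0.1 thresholds, the interval values, and the function names `classify` and `next_visit`. Only the three category labels come from the text.

```python
from datetime import datetime, timedelta

# Hypothetical revisit intervals for the three categories named in the abstract.
# The abstract gives no concrete values; these are placeholders.
REVISIT_INTERVAL = {
    "frequently": timedelta(hours=1),
    "frequent": timedelta(days=1),
    "static": timedelta(days=30),
}


def classify(change_rate: float) -> str:
    """Assign a site to a category from an estimated change rate in [0, 1].

    Both the change-rate signal and the thresholds are assumptions;
    the source does not specify the categorization criterion.
    """
    if change_rate >= 0.5:
        return "frequently"
    if change_rate >= 0.1:
        return "frequent"
    return "static"


def next_visit(last_visit: datetime, change_rate: float) -> datetime:
    """Return the earliest time at which the crawler should revisit the site."""
    return last_visit + REVISIT_INTERVAL[classify(change_rate)]


if __name__ == "__main__":
    crawled_at = datetime(2021, 2, 22, 12, 0)
    for rate in (0.8, 0.2, 0.0):
        print(classify(rate), next_visit(crawled_at, rate))
```

Under these assumptions, a site classified as static is skipped until its interval expires, which is where the savings in bandwidth and storage mentioned in the abstract would come from.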

Keywords:

Focused web crawler
KNN
Learning crawler
Lexical relation
Ontological
Pagescore
Semantic learning

Correction Statement

This article has been republished with minor changes. These changes do not impact the academic content of the article.

Additional information

Notes on contributors

Naresh Kumar

Naresh Kumar holds a PhD from Kurukshetra University, Kurukshetra, and an MTech degree in computer science and engineering from YMCA University of Science and Technology, Faridabad. He is currently an associate professor at Maharaja Surajmal Institute of Technology, New Delhi. His research interests include web crawlers, search engines, and meta search engines. He has published over 41 research papers.

Dhruv Aggarwal

Dhruv Aggarwal is currently pursuing a PGDM at the Institute of Management Technology, Ghaziabad. He holds a BTech degree in computer science and engineering from Maharaja Surajmal Institute of Technology, New Delhi. Email: [email protected]
