GitHub / yashpatel2911 / Web-Search-Engine
The web search engine was an attempt to build a miniature version of popular web search engines such as Google, Bing, or YouTube. It uses several data structures so that searches run efficiently and return accurate results. First, we collected web pages with a web crawler written in Python; the crawler fetches the pages and creates a database from them. Next, we converted each web page into a text file so that its content is easier to process. Finally, we built a database that links each word to the text files containing it. This database is an inverted index, implemented with HashMap, the Java data structure that stores key-value pairs.
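The inverted index described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the class name `InvertedIndexSketch`, the methods `addDocument` and `search`, the tokenization rule, and the sample file names are all assumptions made for the example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: names and tokenization are illustrative assumptions,
// not taken from the repository.
class InvertedIndexSketch {
    // Maps each word to the names of the text files that contain it.
    private final Map<String, Set<String>> index = new HashMap<>();

    // Splits the text into lowercase alphanumeric words and records
    // the file name under every word found.
    void addDocument(String fileName, String text) {
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (token.isEmpty()) {
                continue; // split() yields an empty token when the text starts with a separator
            }
            index.computeIfAbsent(token, k -> new TreeSet<>()).add(fileName);
        }
    }

    // Returns the files containing the word, or an empty set if it was never indexed.
    Set<String> search(String word) {
        return index.getOrDefault(word.toLowerCase(Locale.ROOT), Collections.emptySet());
    }

    public static void main(String[] args) {
        InvertedIndexSketch idx = new InvertedIndexSketch();
        idx.addDocument("page1.txt", "Java HashMap stores key-value pairs");
        idx.addDocument("page2.txt", "A web crawler written in Python");
        idx.addDocument("page3.txt", "Python and Java are both used here");

        System.out.println(idx.search("java"));    // [page1.txt, page3.txt]
        System.out.println(idx.search("crawler")); // [page2.txt]
        System.out.println(idx.search("missing")); // []
    }
}
```

A lookup is a single HashMap access on the query word, so its cost does not grow with the number of indexed files. The `TreeSet` is used here only to keep the printed results in a stable, sorted order.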
JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yashpatel2911%2FWeb-Search-Engine
Stars: 1
Forks: 0
Open issues: 0
License: None
Language: Python
Size: 40 KB
Dependencies parsed at: Pending
Created at: over 4 years ago
Updated at: almost 3 years ago
Pushed at: over 4 years ago
Last synced at: about 2 years ago
Topics: eclipse-ide, hashtable, html-to-text, inverted-index, java, web-crawler-python