An open API service providing repository metadata for many open source software ecosystems.

GitHub / yashpatel2911 / Web-Search-Engine

This web search engine is an attempt to build a miniature version of popular search engines such as Google, Bing, or YouTube. It relies on several data structures chosen to return accurate results efficiently. First, we collected web pages with a web crawler written in Python; the crawler fetches the pages to create a database. Next, we converted the web pages into text files so that they are easier to process. Lastly, we built a database that links each word to the text files containing it. We implemented this database as an inverted index, using Java's HashMap, a key-value data structure.
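The inverted index described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the class name `InvertedIndex`, the methods `addDocument` and `search`, the tokenization rule, and the sample file names are all assumptions made for the example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of a HashMap-backed inverted index.
class InvertedIndex {
    // Maps each word to the names of the text files that contain it.
    private final Map<String, Set<String>> index = new HashMap<>();

    // Splits the text into lowercase alphanumeric tokens (an assumed rule)
    // and records the file name under every token.
    void addDocument(String fileName, String text) {
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (token.isEmpty()) {
                continue; // split() can yield an empty leading token
            }
            index.computeIfAbsent(token, k -> new TreeSet<>()).add(fileName);
        }
    }

    // Returns the files containing the word, or an empty set if none do.
    Set<String> search(String word) {
        return index.getOrDefault(word.toLowerCase(Locale.ROOT), Collections.emptySet());
    }

    public static void main(String[] args) {
        InvertedIndex idx = new InvertedIndex();
        idx.addDocument("page1.txt", "Web crawler fetches web pages");
        idx.addDocument("page2.txt", "An inverted index maps words to pages");
        System.out.println(idx.search("pages"));   // files containing "pages"
        System.out.println(idx.search("crawler")); // files containing "crawler"
    }
}
```

A lookup is a single HashMap access on the query word, so search cost does not grow with the number of indexed files; the TreeSet only keeps the file names in a stable sorted order for display.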

JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yashpatel2911%2FWeb-Search-Engine

Stars: 1
Forks: 0
Open issues: 0

License: None
Language: Python
Size: 40 KB
Dependencies parsed at: Pending

Created at: over 4 years ago
Updated at: almost 3 years ago
Pushed at: over 4 years ago
Last synced at: about 2 years ago

Topics: eclipse-ide, hashtable, html-to-text, inverted-index, java, web-crawler-python
