GitHub / divithraju / divith-raju-SearchEngine-Wikipedia
search engine optimization
A complete search engine experience built on top of a 75 GB Wikipedia corpus, with subsecond search latency. Results are Wikipedia pages ranked by TF-IDF relevance to the given search word(s). From optimized code to the k-way merge sort algorithm, this project addresses latency, indexing, and big data challenges.
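The description above names two techniques: a k-way merge of sorted index pieces and TF-IDF ranking. The following is a minimal sketch of how the two might fit together, not the repository's actual code. The function names `merge_runs` and `search`, the `(term, doc_id, term_freq)` tuple layout, the log-scaled term-frequency weighting, and the tiny sample data are all assumptions made for illustration.

```python
import heapq
import math
from collections import defaultdict
from itertools import groupby
from operator import itemgetter


def merge_runs(runs):
    """K-way merge of sorted (term, doc_id, term_freq) runs into one index.

    Each run is assumed to be already sorted, as it would be if partial
    indexes were sorted before being written out (hypothetical layout).
    """
    index = {}
    merged = heapq.merge(*runs)  # lazily merges the already-sorted inputs
    for term, group in groupby(merged, key=itemgetter(0)):
        index[term] = [(doc_id, tf) for _, doc_id, tf in group]
    return index


def search(index, query, total_docs, top_k=10):
    """Rank documents by the summed TF-IDF weight of the query terms."""
    scores = defaultdict(float)
    for term in query.lower().split():
        postings = index.get(term, [])
        if not postings:
            continue  # term does not occur in the corpus
        idf = math.log(total_docs / len(postings))
        for doc_id, tf in postings:
            # Log-scaled term frequency; one common variant, assumed here.
            scores[doc_id] += (1 + math.log(tf)) * idf
    # Highest score first; ties are broken by ascending doc_id.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


# Three small sorted runs standing in for on-disk partial indexes.
runs = [
    [("engine", 1, 2), ("search", 1, 3)],
    [("engine", 2, 1), ("wiki", 2, 4)],
    [("search", 3, 1), ("wiki", 3, 1)],
]
index = merge_runs(runs)
results = search(index, "search engine", total_docs=3)
```

Because `heapq.merge` consumes its inputs lazily, the same pattern could in principle be applied to file iterators, so that only one entry per run is held in memory at a time. That property is what makes a k-way merge attractive for a corpus too large to index in RAM.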
Stars: 2
Forks: 0
Open issues: 0
License: None
Language: Python
Size: 16.6 KB
Dependencies parsed at: Pending
Created at: over 2 years ago
Updated at: 10 months ago
Pushed at: over 2 years ago
Last synced at: 8 days ago
Topics: algorithms, data, dataengineering, inverted-index, linux, merge-sort, nlp, project, project-repository, python3, serchengine, software-engineering, ubuntu, wikipedia