Ranked-File-Search

Information retrieval (IR) is concerned with finding material (e.g., documents) of an unstructured nature (usually text) in large collections in response to an information need (e.g., a query). One approach to identifying relevant documents is to compute scores based on matches between the terms in the query and the terms in each document. For example, a document containing words such as ball, team, score, and championship is likely to be about sports. It is helpful to define a weight for each term in a document that is meaningful for computing such a score. I use popular information retrieval metrics to define these term weights: term frequency, inverse document frequency, and their product, term frequency-inverse document frequency (TF-IDF).
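As a rough illustration of how such weights can rank documents, here is a minimal single-machine sketch in Java. It is not taken from this repository. The class name `TfIdfSketch`, the method names, and the toy corpus are hypothetical, and the weighting variant is an assumption: log-scaled term frequency `1 + log10(count)` and inverse document frequency `log10(1 + N / df)`, where `N` is the number of documents and `df` is the number of documents containing the term. Other TF-IDF variants are equally common.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not the repository's actual implementation.
final class TfIdfSketch {

    // Assumed variant: log-scaled term frequency, 0 when the term is absent.
    static double tf(String term, List<String> doc) {
        long count = doc.stream().filter(term::equals).count();
        return count == 0 ? 0.0 : 1.0 + Math.log10(count);
    }

    // Assumed variant: log10(1 + N / df), 0 when no document contains the term.
    static double idf(String term, List<List<String>> corpus) {
        long df = corpus.stream().filter(doc -> doc.contains(term)).count();
        return df == 0 ? 0.0 : Math.log10(1.0 + (double) corpus.size() / df);
    }

    // Document score for a query: sum of TF-IDF weights over the query terms.
    static double score(List<String> query, List<String> doc, List<List<String>> corpus) {
        double total = 0.0;
        for (String term : query) {
            total += tf(term, doc) * idf(term, corpus);
        }
        return total;
    }

    public static void main(String[] args) {
        // Toy corpus of three already-tokenized documents.
        List<List<String>> corpus = Arrays.asList(
            Arrays.asList("ball", "team", "score", "team"),
            Arrays.asList("market", "score"),
            Arrays.asList("team", "championship"));
        List<String> query = Arrays.asList("team", "championship");

        // Print each document's score; higher means a better match.
        for (int i = 0; i < corpus.size(); i++) {
            System.out.printf("doc %d: %.3f%n", i, score(query, corpus.get(i), corpus));
        }
    }
}
```

In this toy corpus the third document ranks highest for the query "team championship" because it matches both terms and "championship" is rare, while the second document matches neither term and scores zero. A MapReduce version would distribute the counting steps across jobs, which this sketch does not attempt.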

JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shreyas15%2FRanked-File-Search
PURL: pkg:github/shreyas15/Ranked-File-Search

Stars: 1
Forks: 0
Open issues: 0

License: None
Language: Java
Size: 974 KB
Dependencies parsed at: Pending

Created at: over 8 years ago
Updated at: almost 2 years ago
Pushed at: over 8 years ago
Last synced at: almost 2 years ago

Topics: apache, cloudera, hadoop-mapreduce, java, search-engine
