An open API service providing repository metadata for many open source software ecosystems.

GitHub / csebuetnlp / normalizer

This Python module is an easy-to-use port of the text normalization used in the paper "Not low-resource anymore: Aligner ensembling, batch filtering, and new datasets for Bengali-English machine translation". It is intended for normalizing and cleaning Bengali and English text.

JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/csebuetnlp%2Fnormalizer
PURL: pkg:github/csebuetnlp/normalizer
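
The two identifiers above can be rebuilt from the host name and the repository's full name. The following is a minimal sketch, assuming the URL pattern generalizes from the single JSON API address shown here; the helper names `repo_api_url` and `repo_purl` are hypothetical and not part of any published client.

```python
from urllib.parse import quote

# Base address copied from the JSON API line above.
API_ROOT = "http://repos.ecosyste.ms/api/v1"


def repo_api_url(host: str, full_name: str) -> str:
    """Build a repository metadata URL (hypothetical helper).

    The slash in "owner/name" is percent-encoded as %2F, which matches
    the address listed for this repository.
    """
    return (
        f"{API_ROOT}/hosts/{quote(host, safe='')}"
        f"/repositories/{quote(full_name, safe='')}"
    )


def repo_purl(full_name: str, purl_type: str = "github") -> str:
    """Build a package URL of the form pkg:<type>/<owner>/<name> (hypothetical helper)."""
    return f"pkg:{purl_type}/{full_name}"


print(repo_api_url("GitHub", "csebuetnlp/normalizer"))
print(repo_purl("csebuetnlp/normalizer"))
```

Passing `safe=''` to `quote` is what forces the slash to be encoded; with the default arguments it would be left as-is and the path would gain an extra segment.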

Stars: 28
Forks: 5
Open issues: 0

License: None
Language: Python
Size: 15.6 KB
Dependencies parsed at: Pending

Created at: almost 4 years ago
Updated at: almost 2 years ago
Pushed at: almost 2 years ago
Last synced at: almost 2 years ago

Topics: bangla-text-normalization, bengali-text-normalization, text-normalization, text-preprocessing, text-processing
