Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub / csebuetnlp / normalizer

This Python module is an easy-to-use port of the text normalization used in the paper "Not low-resource anymore: Aligner ensembling, batch filtering, and new datasets for Bengali-English machine translation". It is intended for normalizing and cleaning Bengali and English text.

JSON API: https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/csebuetnlp%2Fnormalizer
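The JSON API URL above appears to follow the pattern `/hosts/{host}/repositories/{full name}`, with the slash in the repository's full name percent-encoded as `%2F`. A minimal Python sketch of building and fetching such a URL follows. The helper names `repository_url` and `fetch_repository` are hypothetical, the pattern is inferred from this single URL, and no assumption is made about which fields the response contains.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# API root taken from the JSON API URL listed on this page.
API_ROOT = "https://repos.ecosyste.ms/api/v1"


def repository_url(host: str, full_name: str) -> str:
    """Build the JSON API URL for a repository (hypothetical helper).

    safe="" makes quote() encode "/" as %2F, matching the URL on this page.
    """
    return (
        f"{API_ROOT}/hosts/{quote(host, safe='')}"
        f"/repositories/{quote(full_name, safe='')}"
    )


def fetch_repository(host: str, full_name: str, timeout: float = 10.0) -> dict:
    """Fetch and decode the repository metadata (hypothetical helper).

    Performs a network request; the shape of the returned JSON is not
    assumed here, so inspect the keys before relying on any of them.
    """
    with urlopen(repository_url(host, full_name), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Prints the URL only; call fetch_repository(...) to hit the network.
    print(repository_url("GitHub", "csebuetnlp/normalizer"))
```

Calling `fetch_repository("GitHub", "csebuetnlp/normalizer")` should return the same metadata summarized on this page, assuming the service is reachable.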

Stars: 28
Forks: 5
Open Issues: 0

License: None
Language: Python
Repo Size: 15.6 KB
Dependencies: 3

Created: over 2 years ago
Updated: 9 months ago
Last pushed: 9 months ago
Last synced: 9 months ago

Topics: bangla-text-normalization, bengali-text-normalization, text-normalization, text-preprocessing, text-processing

Dependency manifests: setup.py (pypi)