An open API service providing repository metadata for many open source software ecosystems.

GitHub / fabriziosalmi / text-boundaries

A Python-based tool for preprocessing, cleaning, and analyzing text datasets. It filters, deduplicates, and sorts data and generates statistical insights.

JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fabriziosalmi%2Ftext-boundaries
PURL: pkg:github/fabriziosalmi/text-boundaries
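
The JSON API address above percent-encodes the slash in the repository's full name (`%2F`). A minimal sketch of how such a URL and the PURL could be built, assuming the path pattern is exactly the one shown above; the helper names `repository_api_url` and `github_purl` are hypothetical and not part of any ecosyste.ms client:

```python
from urllib.parse import quote

# Base path taken from the JSON API line above.
API_ROOT = "http://repos.ecosyste.ms/api/v1"


def repository_api_url(host: str, full_name: str) -> str:
    """Build the repository endpoint URL (hypothetical helper).

    safe="" makes quote() encode "/" as "%2F", so "owner/name"
    stays a single path segment.
    """
    return (
        f"{API_ROOT}/hosts/{quote(host, safe='')}"
        f"/repositories/{quote(full_name, safe='')}"
    )


def github_purl(full_name: str) -> str:
    """Mirror the PURL format shown above (hypothetical helper)."""
    return f"pkg:github/{full_name}"


url = repository_api_url("GitHub", "fabriziosalmi/text-boundaries")
purl = github_purl("fabriziosalmi/text-boundaries")
print(url)
print(purl)
```

The sketch only constructs the identifiers; fetching the URL and the shape of the returned JSON are not covered here.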

Stars: 0
Forks: 1
Open issues: 0

License: None
Language: Python
Size: 6.94 MB
Dependencies parsed at: Pending

Created at: about 1 year ago
Updated at: 10 months ago
Pushed at: 10 months ago
Last synced at: about 2 months ago

Topics: data-automation, data-deduplication, data-preprocessing, data-sorting, data-statistics-generation, data-validation, dataset-boundaries, dataset-cleaning, machine-learning, natural-language-processing, text-data-analysis
