An open API service providing repository metadata for many open source software ecosystems.

Topic: "data-for-robots"

david-smejkal/wiki2txt

A tool to extract plain (unformatted), language-agnostic text, redirects, links, and categories from Wikipedia backups (dumps). Designed to prepare clean training data for AI and machine learning software.

Language: Python - Size: 215 KB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 6 - Forks: 1
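
To make the description above concrete, here is a minimal sketch of this kind of extraction using only the Python standard library. It is not wiki2txt's implementation or interface: the inline `SAMPLE` document, the `parse_page` helper, and the regular expression are hypothetical, and the sample omits the XML namespace and the templates, tables, and references that a real Wikipedia dump contains.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified stand-in for a Wikipedia dump.
# Real dumps declare an XML namespace and are far too large to load at once.
SAMPLE = """<mediawiki>
  <page>
    <title>Python</title>
    <revision><text>'''Python''' is a [[programming language|language]]. [[Category:Languages]]</text></revision>
  </page>
  <page>
    <title>Py</title>
    <redirect title="Python" />
    <revision><text>#REDIRECT [[Python]]</text></revision>
  </page>
</mediawiki>"""

# Matches [[target]] and [[target|label]] wiki links.
LINK_RE = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]*))?\]\]")


def parse_page(page):
    """Return title, redirect target, plain text, links and categories for one <page>."""
    result = {
        "title": page.findtext("title"),
        "redirect": None,
        "text": "",
        "links": [],
        "categories": [],
    }
    redirect = page.find("redirect")
    if redirect is not None:
        # Redirect pages carry no article text of their own.
        result["redirect"] = redirect.get("title")
        return result

    raw = page.findtext("revision/text") or ""

    def replace_link(match):
        target, label = match.group(1).strip(), match.group(2)
        if target.startswith("Category:"):
            result["categories"].append(target[len("Category:"):])
            return ""  # category tags are metadata, not prose
        result["links"].append(target)
        return label if label is not None else target

    text = LINK_RE.sub(replace_link, raw)
    text = re.sub(r"'{2,}", "", text)  # drop bold/italic quote markup
    result["text"] = re.sub(r"\s+", " ", text).strip()
    return result


pages = [parse_page(p) for p in ET.fromstring(SAMPLE).iter("page")]
for page in pages:
    print(page)
```

For a real multi-gigabyte dump, a streaming parser such as `xml.etree.ElementTree.iterparse` would be a more realistic choice than `ET.fromstring`, since it avoids holding the whole document in memory.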