An open API service providing repository metadata for many open source software ecosystems.

GitHub / shamspias / gpt3-data-preprocessing

This repository contains code for preprocessing text data from PDF and DOCX files for use with GPT-3. It includes steps such as tokenization, removal of stop words and punctuation, and formatting for GPT-3 input.
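The steps named in the description can be illustrated with a minimal sketch. This is not the repository's code: the function name `preprocess`, the tiny `STOP_WORDS` set, and the choice to rejoin tokens with single spaces are all assumptions made for illustration, and text extraction from PDF or DOCX files is left out entirely.

```python
import re

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch;
# the repository's actual list is not shown on this page).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "for", "with"}


def preprocess(text: str) -> str:
    """Lowercase, tokenize, drop punctuation and stop words, then rejoin."""
    # \w+ matches runs of letters, digits, and underscores, so punctuation
    # never becomes a token.
    tokens = re.findall(r"\w+", text.lower())
    kept = [token for token in tokens if token not in STOP_WORDS]
    # Assumed output format: a single space-separated string.
    return " ".join(kept)
```

For example, `preprocess("The cat, and the dog!")` yields `"cat dog"`. A real pipeline would likely use a fuller stop-word list and a tokenizer that handles contractions, which this regex splits apart.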

JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shamspias%2Fgpt3-data-preprocessing
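The URL above percent-encodes the slash in `owner/name` as `%2F` so the full repository name occupies a single path segment. A small sketch of building such a URL follows; the helper name `repository_api_url` and the `BASE` constant are hypothetical, and the path layout is inferred only from the one URL shown here.

```python
from urllib.parse import quote

# Base path copied from the JSON API URL listed on this page.
BASE = "http://repos.ecosyste.ms/api/v1/hosts"


def repository_api_url(host: str, full_name: str) -> str:
    """Build the JSON API URL for a repository (hypothetical helper)."""
    # safe="" makes quote() encode "/" as %2F, keeping owner/name in one segment.
    encoded_host = quote(host, safe="")
    encoded_name = quote(full_name, safe="")
    return f"{BASE}/{encoded_host}/repositories/{encoded_name}"
```

Calling `repository_api_url("GitHub", "shamspias/gpt3-data-preprocessing")` reproduces the URL listed above.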

Stars: 6
Forks: 1
Open issues: 0

License: None
Language: Python
Size: 11.7 KB
Dependencies parsed at: Pending

Created at: over 2 years ago
Updated at: 2 months ago
Pushed at: over 2 years ago
Last synced at: 27 days ago

Topics: artificial-intelligence, data-preprocessing, data-preprocessing-pipelines, data-science, gpt-3, machine-learning
