Ecosyste.ms: Repos
An open API service providing repository metadata for many open source software ecosystems.
GitHub / P0L3 / PDF2TXT
Repository for content extraction from PDF and HTML files.
JSON API: https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/P0L3%2FPDF2TXT
Stars: 0
Forks: 0
Open Issues: 0
License: gpl-3.0
Language: HTML
Repo Size: 97.1 MB
Dependencies: 14
Created: 5 months ago
Updated: 24 days ago
Last pushed: 24 days ago
Last synced: 23 days ago
Topics: beautifulsoup4, nltk-python, pdfminersix
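The JSON API URL listed above percent-encodes the slash in the repository's full name (`P0L3/PDF2TXT` becomes `P0L3%2FPDF2TXT`). A minimal sketch of building that URL and fetching the record follows, assuming only the Python standard library. The helper names `repo_api_url` and `fetch_repo_metadata` are hypothetical, and the sketch makes no assumption about which fields the response contains.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Root of the API, taken from the JSON API URL shown on this page.
API_ROOT = "https://repos.ecosyste.ms/api/v1"


def repo_api_url(host: str, full_name: str) -> str:
    """Build the repository URL (hypothetical helper, not part of any client library).

    safe="" makes quote() encode "/" as %2F, matching the URL listed above.
    """
    return (
        f"{API_ROOT}/hosts/{quote(host, safe='')}"
        f"/repositories/{quote(full_name, safe='')}"
    )


def fetch_repo_metadata(host: str, full_name: str, timeout: float = 10.0):
    """Fetch and decode the JSON record; requires network access."""
    with urlopen(repo_api_url(host, full_name), timeout=timeout) as response:
        return json.load(response)
```

Calling `repo_api_url("GitHub", "P0L3/PDF2TXT")` reproduces the URL shown above; `fetch_repo_metadata("GitHub", "P0L3/PDF2TXT")` would return the decoded JSON for this repository.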
Dependencies
Dockerfile (docker)
- python 3.8.0 build
docker-compose.yml (docker)
- pdf_txt 1.4.0
requirements.txt (pypi)
- beautifulsoup4 ==4.12.2
- doi2bib ==0.4.0
- nltk ==3.8.1
- pandas ==2.0.3
- pdfminer.six ==20231228
- tqdm ==4.66.1