GitHub / lizfischer / document-segmentation
Browser-based app for segmenting and OCRing PDF pages based on whitespace rules. It is intended to help researchers (especially in the humanities) turn their materials into machine-actionable datasets.
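The description does not say how the whitespace rules work, so the following is only a minimal sketch of one common approach: scan a grayscale page row by row and treat any sufficiently tall run of blank rows as a separator. The function names, the `min_gap` and `white_threshold` parameters, and the use of plain nested lists in place of an image array are all assumptions made for illustration; none of them come from the repository.

```python
def find_whitespace_gaps(page, min_gap=3, white_threshold=250):
    """Return (start, end) row ranges where every pixel is near-white.

    page: list of rows, each a list of grayscale values (0 = black, 255 = white).
    min_gap: hypothetical rule, the minimum height in rows for a gap to count.
    """
    # A row is blank when even its darkest pixel is at least white_threshold.
    blank = [min(row) >= white_threshold for row in page]
    gaps = []
    start = None
    for y, is_blank in enumerate(blank):
        if is_blank and start is None:
            start = y  # a blank run begins here
        elif not is_blank and start is not None:
            if y - start >= min_gap:
                gaps.append((start, y))
            start = None
    # Close a blank run that reaches the bottom of the page.
    if start is not None and len(blank) - start >= min_gap:
        gaps.append((start, len(blank)))
    return gaps


def split_on_gaps(page, gaps):
    """Return the (top, bottom) row ranges that lie between the gaps."""
    segments = []
    top = 0
    for start, end in gaps:
        if start > top:
            segments.append((top, start))
        top = end
    if top < len(page):
        segments.append((top, len(page)))
    return segments


# Synthetic 20-row page: two dark bands separated by five blank rows.
WHITE, BLACK = 255, 0
page = [[WHITE] * 10 for _ in range(20)]
for y in list(range(2, 6)) + list(range(11, 15)):
    page[y] = [BLACK] * 10

gaps = find_whitespace_gaps(page, min_gap=3)
segments = split_on_gaps(page, gaps)
```

In this sketch the two blank rows at the top of the page are shorter than `min_gap`, so they stay attached to the first segment instead of being treated as a separator. A real pipeline would presumably run the same idea on rasterized PDF pages and pass each segment to OCR.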
JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lizfischer%2Fdocument-segmentation
PURL: pkg:github/lizfischer/document-segmentation
Stars: 0
Forks: 1
Open issues: 21
License: gpl-3.0
Language: Python
Size: 1.13 GB
Dependencies parsed at:
Dependencies: 19
Created at: almost 3 years ago
Updated at: over 1 year ago
Pushed at: over 1 year ago
Last synced at: over 1 year ago
Topics: digital-humanities, image-processing, machine-vision
- python 3.8 build
- redis latest
- SQLAlchemy *
- Werkzeug *
- alembic *
- celery *
- flask *
- flask-migrate *
- flask-sqlalchemy *
- flower *
- numpy *
- opencv-python *
- pandas *
- pdf2image *
- pillow *
- pytesseract *
- redis *
- tqdm *
- wtforms *
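The JSON API URL and PURL listed above both embed the repository's full name, with the slash percent-encoded in the API path. As a small sketch of how those identifiers could be rebuilt from a host and a repository name (the helper names here are hypothetical, and the PURL helper is a simplification that only lowercases its inputs):

```python
from urllib.parse import quote

API_BASE = "http://repos.ecosyste.ms/api/v1"


def repository_api_url(host, full_name):
    """Build the JSON API URL for a repository on a given host."""
    # The full name is a single path segment, so "/" is escaped as %2F.
    return (
        f"{API_BASE}/hosts/{quote(host, safe='')}"
        f"/repositories/{quote(full_name, safe='')}"
    )


def repository_purl(host, full_name):
    """Build a simple package URL (assumes lowercasing is all that is needed)."""
    return f"pkg:{host.lower()}/{full_name.lower()}"


api_url = repository_api_url("GitHub", "lizfischer/document-segmentation")
purl = repository_purl("GitHub", "lizfischer/document-segmentation")
```

The resulting URL can be fetched with any HTTP client; the response format is not shown in this listing, so it is not reproduced here.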