itext-pdfocr-java

pdfOCR is an iText 7 add-on that recognizes and extracts text from scanned documents and images. It can also convert them into fully ISO-compliant PDF or PDF/A-3u files that are accessible, searchable, and suitable for archiving.

JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/itext%2Fitext-pdfocr-java
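The JSON API URL above percent-encodes the slash in the repository's full name (`itext/itext-pdfocr-java` becomes `itext%2Fitext-pdfocr-java`). A minimal sketch of building such a URL follows. The helper name `repo_api_url` and the constant `API_ROOT` are hypothetical, and the path layout is inferred only from the single URL shown here, not from the API's documentation.

```python
from urllib.parse import quote

# Assumption: root and path layout are taken from the one URL listed above.
API_ROOT = "http://repos.ecosyste.ms/api/v1"


def repo_api_url(host: str, full_name: str) -> str:
    """Build a repository URL, percent-encoding every reserved character.

    safe="" makes quote() encode "/" as well, so "owner/name" stays a
    single path segment.
    """
    return (
        f"{API_ROOT}/hosts/{quote(host, safe='')}"
        f"/repositories/{quote(full_name, safe='')}"
    )


url = repo_api_url("GitHub", "itext/itext-pdfocr-java")
print(url)
```

The sketch only constructs the URL. Fetching it, for example with `urllib.request.urlopen`, needs network access and is left out.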

Stars: 36
Forks: 9
Open issues: 7

License: other
Language: Java
Size: 266 MB
Dependencies parsed at: Pending

Created at: almost 5 years ago
Updated at: 24 days ago
Pushed at: 25 days ago
Last synced at: 21 days ago

Topics: archival, character, data, diacritic, extractable, glyphs, hindi, image, iso-compliant, ligatures, mandarin, ocr, optical, pdf, portuguese, recognition, scan, searchable, spanish, tesseract
