GitHub / itext / itext-pdfocr-java
pdfOCR is an iText 7 add-on that recognizes and extracts text from scanned documents and images. It can also convert them into fully ISO-compliant PDF or PDF/A-3u files that are accessible, searchable, and suitable for archiving.
JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/itext%2Fitext-pdfocr-java
Stars: 35
Forks: 8
Open issues: 5
License: other
Language: Java
Size: 266 MB
Dependencies parsed at: Pending
Created at: almost 5 years ago
Updated at: about 2 months ago
Pushed at: about 2 months ago
Last synced at: 3 days ago
Topics: archival, character, data, diacritic, extractable, glyphs, hindi, image, iso-compliant, ligatures, mandarin, ocr, optical, pdf, portuguese, recognition, scan, searchable, spanish, tesseract