An open API service providing repository metadata for many open source software ecosystems.

Topic: "tesseract"

tesseract-ocr/tesseract

Tesseract Open Source OCR Engine (main repository)

Language: C++ - Size: 51.1 MB - Last synced at: 6 days ago - Pushed at: 9 days ago - Stars: 66,586 - Forks: 9,858

naptha/tesseract.js

Pure JavaScript OCR for more than 100 Languages 📖🎉🖥

Language: JavaScript - Size: 104 MB - Last synced at: 6 days ago - Pushed at: 20 days ago - Stars: 36,479 - Forks: 2,290

ocrmypdf/OCRmyPDF

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched

Language: Python - Size: 63.6 MB - Last synced at: 6 days ago - Pushed at: 13 days ago - Stars: 28,638 - Forks: 1,943

pymupdf/PyMuPDF

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

Language: Python - Size: 328 MB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 7,114 - Forks: 600

tesseract-ocr/tessdata

Trained models with fast variant of the "best" LSTM models + legacy models

Size: 3.1 GB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 6,783 - Forks: 2,291

aisingapore/TagUI

Free RPA tool by AI Singapore

Language: JavaScript - Size: 75.9 MB - Last synced at: 13 days ago - Pushed at: 2 months ago - Stars: 5,915 - Forks: 613

tebelorg/RPA-Python

Python package for doing RPA

Language: Python - Size: 323 KB - Last synced at: about 2 hours ago - Pushed at: 2 months ago - Stars: 5,190 - Forks: 699

thiagoalessio/tesseract-ocr-for-php

A wrapper to work with Tesseract OCR inside PHP.

Language: PHP - Size: 1.09 MB - Last synced at: 3 days ago - Pushed at: about 2 months ago - Stars: 2,961 - Forks: 552

otiai10/gosseract

Go package for OCR (Optical Character Recognition), by using Tesseract C++ library

Language: Go - Size: 1.08 MB - Last synced at: 17 days ago - Pushed at: about 2 months ago - Stars: 2,845 - Forks: 293

Dicklesworthstone/llm_aided_ocr

Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections.

Language: Python - Size: 1.41 MB - Last synced at: 4 days ago - Pushed at: 2 months ago - Stars: 2,628 - Forks: 179

rmtheis/android-ocr 📦

Experimental optical character recognition app

Language: Java - Size: 16.3 MB - Last synced at: 4 months ago - Pushed at: about 7 years ago - Stars: 2,230 - Forks: 895

sirfz/tesserocr

A Python wrapper for the tesseract-ocr API

Language: Python - Size: 536 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 2,092 - Forks: 257

ianzhao05/textshot

Python tool for grabbing text via screenshot

Language: Python - Size: 64.5 KB - Last synced at: 28 days ago - Pushed at: 5 months ago - Stars: 1,762 - Forks: 260

Pulover/PuloversMacroCreator

Automation Utility - Recorder & Script Generator

Language: AutoHotkey - Size: 142 MB - Last synced at: about 2 months ago - Pushed at: almost 3 years ago - Stars: 1,760 - Forks: 244

Akylas/OSS-DocumentScanner

Android document scanning app

Language: C++ - Size: 31.8 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 1,148 - Forks: 38

ryfeus/lambda-packs

Precompiled packages for AWS Lambda

Language: Python - Size: 1.76 GB - Last synced at: 6 days ago - Pushed at: over 1 year ago - Stars: 1,121 - Forks: 239

GauravSingh9356/J.A.R.V.I.S

Personal Assistant built using python libraries. It does almost anything which includes sending emails, Optical Text Recognition, Dynamic News Reporting at any time with API integration, Todo list generator, Opens any website with just a voice command, Plays Music, Wikipedia searching, Dictionary with Intelligent Sensing i.e. auto spell checking, Weather Reporting i.e. temp, wind speed, humidity, YouTube searching, Google Map searching, Youtube Downloading, etc.

Language: Python - Size: 9.59 MB - Last synced at: about 1 month ago - Pushed at: 12 months ago - Stars: 885 - Forks: 211

dannnylo/rtesseract

Ruby library for working with the Tesseract OCR.

Language: Ruby - Size: 1.37 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 855 - Forks: 86

CCExtractor/ccextractor

CCExtractor - Official version maintained by the core team

Language: C - Size: 122 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 771 - Forks: 461

adaptech-cz/Tesseract4Android

Fork of tess-two rewritten from scratch to support latest version of Tesseract OCR.

Language: C - Size: 32.4 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 756 - Forks: 117

tesseract-ocr/tesstrain

Train Tesseract LSTM with make

Language: Python - Size: 13.2 MB - Last synced at: 29 days ago - Pushed at: 11 months ago - Stars: 663 - Forks: 204

simplezhli/Tesseract-OCR-Scanner 📦

[No longer maintained] Automatic scanning and recognition of mobile phone numbers, based on Tesseract-OCR

Language: Java - Size: 3.12 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 649 - Forks: 160

ushelp/EasyOCR 📦

Java OCR recognition component (based on the Tesseract OCR engine). Automatically cleans up images and recognizes the content of CAPTCHA verification code images in one integrated workflow.

Language: Java - Size: 629 KB - Last synced at: 21 days ago - Pushed at: almost 4 years ago - Stars: 617 - Forks: 245

jonathanpalma/react-native-tesseract-ocr

Tesseract OCR wrapper for React Native

Language: Java - Size: 5.83 MB - Last synced at: 2 days ago - Pushed at: 15 days ago - Stars: 580 - Forks: 174

junhoyeo/BetterOCR

🔍 Better text detection by combining multiple OCR engines (EasyOCR, Tesseract, and Pororo) with 🧠 LLM.

Language: Python - Size: 10.6 MB - Last synced at: 20 days ago - Pushed at: 3 months ago - Stars: 538 - Forks: 32

tesseract-ocr/tessdata_fast

Fast integer versions of trained LSTM models

Size: 328 MB - Last synced at: 27 days ago - Pushed at: 9 months ago - Stars: 529 - Forks: 151

SubhamTyagi/android-ocr

Tesseract based OCR for android

Language: Java - Size: 19.2 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 468 - Forks: 72

isee15/Card-Ocr

ID card recognition OCR

Language: Jupyter Notebook - Size: 187 MB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 442 - Forks: 133

prabhakar267/image2text

📋 Python wrapper to grab text from images and save as text files using Tesseract Engine

Language: Python - Size: 5.42 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 406 - Forks: 140

silenceper/qanswer 📦

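[Deprecated] 🥇🥇🥇 Answer assistant for quiz games such as Chongding Dahui (冲顶大会), providing decision support for answering questions to help you win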

Language: Go - Size: 968 KB - Last synced at: 11 months ago - Pushed at: over 7 years ago - Stars: 324 - Forks: 62

zapolnoch/node-tesseract-ocr

A Node.js wrapper for the Tesseract OCR API

Language: JavaScript - Size: 516 KB - Last synced at: 5 days ago - Pushed at: almost 2 years ago - Stars: 311 - Forks: 38

cseas/ocr-table

Extract tables from scanned image PDFs using Optical Character Recognition.

Language: Python - Size: 12.8 MB - Last synced at: about 1 month ago - Pushed at: almost 5 years ago - Stars: 273 - Forks: 67

LeoFCardoso/pdf2pdfocr

A free tool to OCR a PDF and add a text "layer" in the original file, making a searchable PDF. Use only open source tools. Please tip!

Language: Python - Size: 625 KB - Last synced at: 9 months ago - Pushed at: over 1 year ago - Stars: 261 - Forks: 33

danpla/dpscreenocr

Program to recognize text on screen

Language: C++ - Size: 3.07 MB - Last synced at: about 13 hours ago - Pushed at: about 14 hours ago - Stars: 256 - Forks: 18

ropensci/tesseract

Bindings to Tesseract OCR engine for R

Language: R - Size: 182 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 247 - Forks: 27

SwiftyTesseract/SwiftyTesseract 📦

A Swift wrapper around Tesseract for use in iOS, macOS, and Linux applications

Language: Swift - Size: 301 MB - Last synced at: 4 days ago - Pushed at: about 3 years ago - Stars: 243 - Forks: 78

scott0123/Tesseract-macOS

Objective C wrapper for the open source OCR Engine Tesseract (macOS)

Language: Objective-C - Size: 41.9 MB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 242 - Forks: 49

bitdata/ocrtable

Recognize tables and text from scanned images that contain tables.

Language: C++ - Size: 1.33 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 238 - Forks: 63

scherroman/mugen

A command-line music video generator based on rhythm

Language: Python - Size: 23.3 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 229 - Forks: 41

the-black-knight-01/Tabulo

Table Detection and Extraction Using Deep Learning (built in Python, using Luminoth, TensorFlow<2.0 and Sonnet).

Language: Python - Size: 10.6 MB - Last synced at: about 3 hours ago - Pushed at: over 2 years ago - Stars: 197 - Forks: 40

trekhleb/links-detector

📖 👆🏻 Links Detector makes printed links clickable via your smartphone camera. No need to type a link in, just scan and click on it.

Language: TypeScript - Size: 53.4 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 184 - Forks: 26

sushant10/HQ_Bot

📲 Bot to help solve HQ trivia

Language: Python - Size: 2.59 MB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 173 - Forks: 90

skylander86/lambda-text-extractor

AWS Lambda functions to extract text from various binary formats.

Language: Python - Size: 111 MB - Last synced at: 6 months ago - Pushed at: over 7 years ago - Stars: 173 - Forks: 42

koreader/koreader-base

Base framework offering a Lua scriptable environment for creating document readers

Language: Lua - Size: 13.4 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 167 - Forks: 114

nonwill/GoldenDict-OCR

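GoldenDict++: includes a large number of fixes for problems in the official version. Early on, a simple plugin mechanism was added, and several OCR word-capture and audio playback engines were integrated on top of it. Later, building on usability improvements, the entire implementation was refactored to improve lookup efficiency, reduce runtime CPU and memory usage, and lower the difficulty of code maintenance. The future goal is to abstract feature extensions and dictionary format handling into complete plugin implementations, further improving the application's extensibility and maintainability.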

Size: 78.1 KB - Last synced at: 23 days ago - Pushed at: 23 days ago - Stars: 164 - Forks: 4

alimranahmed/LaraOCR

Laravel Optical Character Reader (OCR) package using OCR engines (Tesseract)

Language: PHP - Size: 183 KB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 157 - Forks: 40

TDiblik/main-gate-alpr 📦

Recognize license plates (and numbers) using fine-tuned yolov8, OCR (tesseract) and Hikvision camera

Language: Python - Size: 1.96 GB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 152 - Forks: 13

writecrow/ocr2text

Convert a PDF via OCR to a TXT file in UTF-8 encoding

Language: Python - Size: 57.6 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 148 - Forks: 29

peirick/Tesseract-OCR_for_Windows

Visual Studio Projects for Tesseract and dependencies

Language: C - Size: 52.1 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 141 - Forks: 88

dilippuri/Aadhaar-Card-OCR

Extract text information from Aadhaar Card using tesseract-ocr 😎

Language: Python - Size: 421 KB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 131 - Forks: 71

kaelzhang/penteract-ocr

⭐️ The native node.js bindings to the Tesseract OCR project.

Language: C++ - Size: 1.01 MB - Last synced at: 3 days ago - Pushed at: over 6 years ago - Stars: 124 - Forks: 13

fauu/Kamite

Japanese immersion assistant for learners (Windows/Linux)

Language: Java - Size: 21.8 MB - Last synced at: 14 days ago - Pushed at: about 1 month ago - Stars: 119 - Forks: 3

ndavd/ncube

Generalized Hypercube Visualizer

Language: Rust - Size: 64 MB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 118 - Forks: 7

bweigel/aws-lambda-tesseract-layer

A layer for AWS Lambda containing the tesseract C libraries and tesseract executable.

Language: TypeScript - Size: 39.3 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 117 - Forks: 35

ilic5000/pabkvizgenerator

Anansi is a computer vision (cv2 and FFmpeg) + OCR (EasyOCR and tesseract) python-based crawler for finding and extracting questions and correct answers from video files of popular TV game shows in the Balkan region.

Language: Python - Size: 486 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 107 - Forks: 6

hertzg/tesseract-server

A small lightweight HTTP server that converts photos, images and scanned documents to text using optical character recognition by utilizing the power of Google Tesseract.

Language: TypeScript - Size: 2.16 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 105 - Forks: 23

victorqribeiro/ocr

Simple app to extract text from pictures using Tesseract

Language: HTML - Size: 256 KB - Last synced at: 2 months ago - Pushed at: almost 4 years ago - Stars: 105 - Forks: 9

Monogramm/erpnext_ocr

🐍 ⚗️ Optical Character Recognition using tesseract within Frappe.

Language: Python - Size: 938 KB - Last synced at: 4 days ago - Pushed at: 8 months ago - Stars: 100 - Forks: 53

dalelyunas/manga-translator

Automatically translates manga pages

Language: Python - Size: 1.97 MB - Last synced at: 5 days ago - Pushed at: over 4 years ago - Stars: 98 - Forks: 32

008karan/PAN_OCR

Building OCR using YOLO and Tesseract

Language: Python - Size: 909 KB - Last synced at: 20 days ago - Pushed at: over 3 years ago - Stars: 94 - Forks: 48

onmyway133/MathSolver

⌨️Camera calculator with Vision

Language: Swift - Size: 18.7 MB - Last synced at: 11 days ago - Pushed at: almost 5 years ago - Stars: 92 - Forks: 22

scribeocr/scribeocr

Web interface for recognizing text, proofreading OCR, and creating fully-digitized documents.

Language: JavaScript - Size: 222 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 91 - Forks: 13

shelfio/aws-lambda-tesseract

6 MB Tesseract (with English training data) to fit inside AWS Lambda

Language: Shell - Size: 41 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 90 - Forks: 16

SkeathyTomas/genshin_artifact_auxiliary

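A Genshin Impact artifact rater that overlays artifacts inside the game window. Keqing's Desk | Genshin Impact | Artifact rating. An artifact export and rating tool integrated on top of the game window: no need to switch back and forth between the game and outside tools to compare, and results can be computed and looked up quickly in-game.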

Language: Python - Size: 5.25 MB - Last synced at: 9 months ago - Pushed at: 12 months ago - Stars: 89 - Forks: 7

testica/text-scanner

OCR Android app using tesseract

Language: Java - Size: 15.2 MB - Last synced at: about 1 month ago - Pushed at: 12 months ago - Stars: 87 - Forks: 45

lexmartinez/ocr-electron-vue

📇 A Simple OCR Application built on Electron, Vue.js & Tesseract.js

Language: JavaScript - Size: 25.8 MB - Last synced at: about 1 month ago - Pushed at: almost 7 years ago - Stars: 81 - Forks: 23

Hirato/lamiae

Lamiae - A Most Prestigious RPG Engine/Simulator derived from Cube 2 (Sauerbraten) and friends

Language: C++ - Size: 505 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 80 - Forks: 16

aryaminus/memento

Organize your meme image collection within a directory: sort memes by the text extracted from them via OCR with Tesseract, and edit memes by segmenting them with OpenCV

Language: Python - Size: 144 KB - Last synced at: 19 days ago - Pushed at: over 1 year ago - Stars: 80 - Forks: 6

Shreeshrii/tess5train-fonts

Files and Scripts to run Tesseract 5 LSTM Training using fonts

Language: HTML - Size: 82.8 MB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 80 - Forks: 21

Arthelon/imgclip

Command line utility that extracts text from an image into the system clipboard.

Language: JavaScript - Size: 32.2 KB - Last synced at: 1 day ago - Pushed at: about 6 years ago - Stars: 80 - Forks: 10

vjgpt/Vehicle-Number-Plate-Reading

Read Vehicle Number Plate and store the data in a CSV file with date and time.

Language: Python - Size: 331 KB - Last synced at: about 1 month ago - Pushed at: about 5 years ago - Stars: 79 - Forks: 57

farhanchoudhary/PAN_Card_OCR_Project

To extract details from Indian National Identification Cards such as PAN (completed) & Aadhar, Passport, Driving License (WIP) in a structured format

Language: Python - Size: 650 KB - Last synced at: 5 months ago - Pushed at: about 5 years ago - Stars: 79 - Forks: 66

hansalemaos/cyandroemu

Android Automation Framework for Python on emulators (BlissOs, BlueStacks, LDPlayer, Memu, Mumu, Android Studio ...) and rooted devices WITHOUT ADB!

Language: Cython - Size: 42.3 MB - Last synced at: 23 days ago - Pushed at: 23 days ago - Stars: 77 - Forks: 2

JTinkers/ScribeBot 📦

A highly scriptable automation system full of cool features. Automate everything with a little bit of Lua.

Language: C# - Size: 14.3 MB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 77 - Forks: 21

gustavomazzoni/cordova-plugin-tesseract

Cordova Plugin for OCR process using Tesseract

Language: Java - Size: 1.97 MB - Last synced at: 2 days ago - Pushed at: about 5 years ago - Stars: 77 - Forks: 34

tesseract-ocr/tesseract-ocr.github.io

Tesseract documentation

Language: Ruby - Size: 105 MB - Last synced at: 18 days ago - Pushed at: over 3 years ago - Stars: 75 - Forks: 64

marcincichocki/breach-protocol-autosolver

Solve breach protocol minigame in second(s). Windows/Linux/GeForce Now/Google Stadia. Every language.

Language: TypeScript - Size: 128 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 74 - Forks: 5

nmapx/revolut-stocks-list 📦

Extract Revolut stocks list from the list screenshot(s).

Language: Go - Size: 1.95 MB - Last synced at: 6 days ago - Pushed at: over 3 years ago - Stars: 74 - Forks: 22

a943512/PyAibote

Python package for doing RPA

Language: Python - Size: 4.41 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 73 - Forks: 16

Lynnesbian/OCRbot

An OCR (Optical Character Recognition) bot for Mastodon (and compatible) instances

Language: Python - Size: 216 KB - Last synced at: 3 days ago - Pushed at: almost 4 years ago - Stars: 72 - Forks: 10

Makstein/SnowbreakGachaExport

Snowbreak Gacha Log Exporter (gacha record export tool for Snowbreak: Containment Zone), WIP

Language: C# - Size: 64.5 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 64 - Forks: 5

hhhrrrttt222111/handReacting

Text to Handwriting converter made using React.

Language: JavaScript - Size: 1.3 MB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 64 - Forks: 19

lucab85/PDFtoTXT

Python code to read text from a PDF file (OCR).

Language: Python - Size: 8.79 KB - Last synced at: about 1 year ago - Pushed at: almost 5 years ago - Stars: 63 - Forks: 20

deajan/pmOCR 📦

A wrapper for tesseract / abbyyOCR11 ocr4linux finereader cli that can perform batch operations or monitor a directory and launch an OCR conversion on file activity

Language: Shell - Size: 1.21 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 60 - Forks: 15

maddevsio/idmatch

Match faces on id cards with OCR capabilities.

Language: Python - Size: 3.99 MB - Last synced at: 2 months ago - Pushed at: about 2 years ago - Stars: 60 - Forks: 15

SwiftyTesseract/SwiftyTesseractRTE 📦

SwiftyTesseract Real-Time Engine

Language: Swift - Size: 195 MB - Last synced at: 4 days ago - Pushed at: almost 5 years ago - Stars: 60 - Forks: 23

doxakis/How-to-use-tesseract-ocr-4.0-with-csharp

How to use Tesseract OCR 4.0 with C#

Size: 45.3 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 60 - Forks: 31

sivakumar-mahalingam/fastmrz

⚡Extracting the Machine Readable Zone (MRZ) from passport or any document images

Language: Python - Size: 67.7 MB - Last synced at: about 1 month ago - Pushed at: 2 months ago - Stars: 59 - Forks: 13

mftnakrsu/Comparison-of-OCR

Comparison-of-OCR (KerasOCR, PyTesseract,EasyOCR)

Language: Python - Size: 47.9 KB - Last synced at: about 1 month ago - Pushed at: about 3 years ago - Stars: 59 - Forks: 5

nikhilkumarsingh/tesseract-python

Examples implementing OCR (Optical Character Recognition) with Tesseract in Python

Language: Python - Size: 57.6 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 57 - Forks: 51

khurram18/SceneTextRecognitioniOS

A scene text recognition demo app using Vision framework and tesseract

Language: Objective-C - Size: 27.2 MB - Last synced at: almost 2 years ago - Pushed at: over 6 years ago - Stars: 57 - Forks: 15

dannnylo/tesseract-ocr-elixir

This package is a wrapper for Tesseract OCR that helps read characters in an image.

Language: Elixir - Size: 41 KB - Last synced at: 7 days ago - Pushed at: almost 3 years ago - Stars: 56 - Forks: 10

ropensci/rtika

R Interface to Apache Tika

Language: R - Size: 133 MB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 54 - Forks: 8

GoogleCloudPlatform/dlp-pdf-redaction

This solution provides an automated, serverless way to redact sensitive data from PDF files using Google Cloud Services like Data Loss Prevention (DLP), Cloud Workflows, and Cloud Run.

Language: HCL - Size: 284 KB - Last synced at: 21 days ago - Pushed at: about 2 months ago - Stars: 53 - Forks: 27

t0mer/ocr-docker

ocr-docker is a small, Flask-powered web app that helps extract text from images and PDF documents using OCR

Language: CSS - Size: 96.1 MB - Last synced at: about 1 month ago - Pushed at: 2 months ago - Stars: 53 - Forks: 13

aryaminus/saram

Get OCR output in txt form from images or PDF files, supporting multiple files from a directory, using pytesseract with automatic rotation to correct wrong orientation. PYPI:

Language: Python - Size: 34.2 KB - Last synced at: 18 days ago - Pushed at: over 2 years ago - Stars: 52 - Forks: 17

alexschultz/ReadToMe

Language: Jupyter Notebook - Size: 213 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 51 - Forks: 14

StabRise/spark-pdf

PDF DataSource for Apache Spark that allows reading PDF files directly into a DataFrame and running OCR on them

Language: Scala - Size: 5.72 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 49 - Forks: 3

sudonitin/Audio-book-generator

Convert your ebooks to audiobooks. 📖->🎧

Language: Python - Size: 3.96 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 49 - Forks: 14

onmyway133/BigBigNumbers

🔢Say the number out loud

Language: Swift - Size: 24.4 MB - Last synced at: 4 days ago - Pushed at: almost 5 years ago - Stars: 49 - Forks: 13