Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: scanned-documents

Udayraj123/OMRChecker

Evaluate OMR sheets fast and accurately using a scanner 🖨 or your phone 🤳.

Language: Python - Size: 29.2 MB - Last synced: about 3 hours ago - Pushed: about 4 hours ago - Stars: 671 - Forks: 293

papermerge/papermerge-core

This repository contains the source code of the Papermerge DMS backend core, REST API server, and frontend UI

Language: Python - Size: 11.9 MB - Last synced: about 9 hours ago - Pushed: 1 day ago - Stars: 257 - Forks: 47

ispras/dedoc

Dedoc is a library (service) for automated document parsing and conversion to a uniform format. It automatically extracts content, logical structure, tables, and meta information from textual electronic documents. (Parse document; Document content extraction; Document logical extraction; PDF parser; Scanned document parser; DOCX parser; HTML parser)

Language: Python - Size: 221 MB - Last synced: 3 days ago - Pushed: 4 days ago - Stars: 77 - Forks: 12

ciur/papermerge

Open Source Document Management System for Digital Archives (Scanned Documents)

Language: Python - Size: 25.1 MB - Last synced: 17 days ago - Pushed: about 1 month ago - Stars: 2,340 - Forks: 246

ad-si/awesome-scanning

A curated list of awesome projects to simplify and improve paper and document scanning.

Size: 83 KB - Last synced: 17 days ago - Pushed: 30 days ago - Stars: 347 - Forks: 21

ahmetozlu/signature_extractor

A super lightweight image processing algorithm for detection and extraction of overlapped handwritten signatures on scanned documents using OpenCV and scikit-image.

Language: Python - Size: 3.9 MB - Last synced: 22 days ago - Pushed: about 1 year ago - Stars: 427 - Forks: 132

papermerge/papermerge-cli

Papermerge DMS command line utility

Language: Python - Size: 106 KB - Last synced: 25 days ago - Pushed: 3 months ago - Stars: 4 - Forks: 3

MaxineXiong/Scraping-Scanned-PDF-Docs-using-OCR-with-RPA

This repository contains automation solutions that efficiently extract text from scanned PDF documents with consistent layouts. Utilizing the Tesseract OCR engine, the UiPath RPA robot achieves nearly 90% accuracy, streamlining the process and significantly reducing manual workload.

Size: 4.96 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 0 - Forks: 0

papermerge/documentation

Documentation for Papermerge DMS - Installation, Help, User Manual, REST API

Language: HTML - Size: 38.5 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 13 - Forks: 5

karolzak/boxdetect

BoxDetect is a Python package based on OpenCV which allows you to easily detect rectangular shapes like character or checkbox boxes on scanned forms.

Language: Python - Size: 7.43 MB - Last synced: 3 days ago - Pushed: over 1 year ago - Stars: 95 - Forks: 20

baltpeter/scanprep

Small utility to prepare scanned documents. Supports separating PDF files by separator pages and removing blank pages.

Language: Python - Size: 512 KB - Last synced: 12 days ago - Pushed: 13 days ago - Stars: 28 - Forks: 9

paulocressoni/scanned_pdf_ocr

Apply OCR on scanned PDF files to extract text from the PDF images.

Language: Shell - Size: 5.86 KB - Last synced: about 2 months ago - Pushed: over 4 years ago - Stars: 1 - Forks: 1

rohanrav/document-scanner

Document scanner created using OpenCV and Python.

Language: Python - Size: 623 KB - Last synced: 2 months ago - Pushed: about 5 years ago - Stars: 0 - Forks: 0

brakmic/OpenCV

📷 Computer-Vision Demos

Language: C# - Size: 1.28 MB - Last synced: about 2 months ago - Pushed: over 8 years ago - Stars: 263 - Forks: 60

4lex4/scantailor-advanced

ScanTailor Advanced is the version that merges the features of the ScanTailor Featured and ScanTailor Enhanced versions and adds new features and fixes.

Language: C++ - Size: 7.82 MB - Last synced: 3 months ago - Pushed: 8 months ago - Stars: 1,087 - Forks: 128

atgreen/paperless

Emacs-assisted PDF document filing

Language: Emacs Lisp - Size: 215 KB - Last synced: 23 days ago - Pushed: 4 months ago - Stars: 129 - Forks: 11

apurvmishra99/pdf-to-scan

Make your PDFs look like they were scanned

Language: Python - Size: 9.77 KB - Last synced: 2 days ago - Pushed: about 4 years ago - Stars: 81 - Forks: 7

susam/tucl

The first-ever paper on the Unix shell, written by Ken Thompson in 1976, scanned, transcribed, and redistributed with permission

Language: Makefile - Size: 4.4 MB - Last synced: 4 months ago - Pushed: over 1 year ago - Stars: 350 - Forks: 21

maxim2266/go-ocr 📦

A tool for extracting text from scanned documents (via OCR), with user-defined post-processing.

Language: Go - Size: 41 KB - Last synced: 2 months ago - Pushed: about 4 years ago - Stars: 35 - Forks: 8

vijayengineer/PDFTextSpeechConverter

Converts scanned and ordinary documents into MP3 speech using Amazon Polly

Language: Python - Size: 1.18 MB - Last synced: 7 months ago - Pushed: over 3 years ago - Stars: 4 - Forks: 1

deckerego/docidx

A document indexing daemon that can populate Elasticsearch indexes with the contents and metadata of a number of document types, including PDFs, image scans, etc. Used to power Facile Search, but can be reused for anything that requires search indexing of scanned documents.

Language: Java - Size: 1.63 MB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 1 - Forks: 2

MaxineXiong/Scraping-Scanned-Docs-using-OCR-with-RPA

This repository contains an automation solution that efficiently extracts text from scanned PDF documents with consistent layouts. Utilizing OCR-based screen scraping, the UiPath RPA robot achieves nearly 98% accuracy, streamlining the process and significantly reducing manual workload.

Size: 1.37 MB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 0 - Forks: 0

alucic2/cluster_htrc

Identifying the boundaries of main content of fiction and non-fiction works in the HathiTrust Extracted Features dataset.

Language: Jupyter Notebook - Size: 240 KB - Last synced: 9 months ago - Pushed: about 2 years ago - Stars: 0 - Forks: 0

svitlana1209/OCR-search

Text search using OCR, plus detection and recognition of tables in scanned documents.

Language: Python - Size: 43 KB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 3 - Forks: 1

PDFTron/pdftron-android-ocr-scanner-sample

Android Scanner with OCR support using PDFTron

Language: Kotlin - Size: 123 MB - Last synced: 10 months ago - Pushed: almost 3 years ago - Stars: 24 - Forks: 7

BlackStar1313/ICDAR-2019-RRC-SROIE

ICDAR 2019 Robust Reading Challenge on Scanned Receipts OCR and Information Extraction

Language: Python - Size: 32.9 MB - Last synced: 7 months ago - Pushed: almost 2 years ago - Stars: 29 - Forks: 9

binDebug3/scanner_automation

A program by Dallin Stewart to automate simple and repetitive tasks while scanning documents

Language: Python - Size: 124 KB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 1 - Forks: 0

beast/react-native-scan-doc

A document scanner that automatically trims the edges using a perspective transform

Language: Java - Size: 278 KB - Last synced: about 1 year ago - Pushed: almost 6 years ago - Stars: 41 - Forks: 9

hnjm/papermerge (fork of ciur/papermerge)

Open Source Document Management System for Digital Archives (Scanned Documents)

Size: 24.6 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 2 - Forks: 0

skconan/Scanned-Document-Rotation-Correction

The project provides models and a service API for predicting the rotation angle of scanned document images, in the range -90° to 90° from the vertical.

Language: Python - Size: 5.11 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 1 - Forks: 1

deckerego/docmag

The web UI for Facile Search. Together with DocIndex, this UI can help you search the myriad of scanned documents you have been accumulating over the years. Using the power of Docker & Elasticsearch you can run a powerful search engine that lets you convert scanned (image-based) PDFs to searchable text, group documents by letterhead, run fuzzy searches by date and view document metadata.

Language: Groovy - Size: 2.46 MB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 3 - Forks: 0

Hawk453/OCR_FOR_PDFS

Optical Character Recognition for Scanned Documents

Language: Python - Size: 6.84 KB - Last synced: about 1 year ago - Pushed: over 3 years ago - Stars: 2 - Forks: 0

goodday451999/Character-Segmentation-of-Scanned-Text

Segmentation of Scanned Text up to Character Level

Language: Python - Size: 603 KB - Last synced: about 1 year ago - Pushed: over 4 years ago - Stars: 7 - Forks: 6

timberger/Searchable-Image-PDF-Creat-O-Mat

This batch script creates a searchable PDF from a PDF with one or more scanned pages that contain images.

Language: Batchfile - Size: 28.3 KB - Last synced: 12 months ago - Pushed: over 1 year ago - Stars: 6 - Forks: 0

imakashsahu/Images-or-Scanned-Documents-into-Searchable-PDFs

This is a Flask-based project to convert images, scanned documents, or multi-page PDFs into searchable PDFs

Language: CSS - Size: 14.5 MB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 4 - Forks: 1

paulveillard/cybersecurity-internet-scanning

An ongoing, curated collection of awesome software best practices and techniques, libraries and frameworks, e-books and videos, websites, blog posts, links to GitHub repositories, technical guidelines, and important resources about Internet scanning in cybersecurity

Size: 20.5 KB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 2 - Forks: 0

milahu/document-photo-auto-threshold

Auto-correct contrast and brightness of photographed documents

Language: Python - Size: 6.84 KB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 1 - Forks: 0

dsabarinathan/DocumentTableSeg

Implementation of scanned document table segmentation with U-net

Language: Python - Size: 57.6 KB - Last synced: about 1 year ago - Pushed: almost 5 years ago - Stars: 6 - Forks: 2

legenscandary/scan

Automatic scan server software for scanners with a document feeder. It creates multi-page PDFs with selectable text (OCR) at the press of a single button.

Language: Shell - Size: 161 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0

Viscomsoft/Scanner-Pro-SDK-ActiveX-x64

TWAIN Scanning SDK for 64-bit and 32-bit MS Access, VB.NET, C#, Delphi, and Visual C++, and for 32-bit Visual Basic 6 and VFP.

Language: C# - Size: 172 KB - Last synced: 12 months ago - Pushed: almost 2 years ago - Stars: 0 - Forks: 1

drxwat/scanned_digits_recognition

Scanned digits detector and classifier (CNN, OpenCV)

Language: Jupyter Notebook - Size: 27.3 MB - Last synced: about 1 year ago - Pushed: about 7 years ago - Stars: 1 - Forks: 1

bearrundr/scantailor-custom

A ScanTailor customization that adds some new functions

Language: C++ - Size: 2.88 MB - Last synced: about 1 year ago - Pushed: over 4 years ago - Stars: 0 - Forks: 0

Viscomsoft/Scanner-TWAIN-SDK-ActiveX

For Windows developers who need to capture images from a scanner, digital camera, or capture card that has a TWAIN device driver, using C++, C#, VB.NET, VB, Delphi, VFP, or MS Access.

Language: Visual Basic .NET - Size: 490 KB - Last synced: 12 months ago - Pushed: 12 months ago - Stars: 0 - Forks: 1

rbrito/pkg-pdfbeads

Debian packaging of pdfbeads

Language: Ruby - Size: 76.2 KB - Last synced: about 1 year ago - Pushed: about 4 years ago - Stars: 1 - Forks: 0