GitHub topics: pdf2text
zhangshi0512/DevTools
A lightweight Python-based Software Package for daily use
Language: Python - Size: 5.13 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

modesty/pdf2json
Converts binary PDF to JSON and text, for server-side PDF processing and command-line use.
Language: Java - Size: 121 MB - Last synced at: 1 day ago - Pushed at: 4 months ago - Stars: 2,094 - Forks: 382

NikhilTeja21/Audio-Books
This project converts PDF files into audiobooks with synchronized subtitles in .vtt format. It uses FastAPI for the backend and Microsoft's Edge TTS for text-to-speech conversion.
Language: Python - Size: 7.81 KB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

seinecle/nocodefunctions-web-app
The code base of the front-end of nocodefunctions.com
Language: CSS - Size: 37.7 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 39 - Forks: 7

TheLime1/CheatoMate
A collection of scripts to "help" you with your programming exams and assignments.
Language: Python - Size: 214 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 17 - Forks: 1

AzozzALFiras/Pdf-OCR
A simple, free tool for extracting text from scanned PDFs and images using OCR, and converting images to PDFs. It processes files locally in the browser, ensuring privacy and security while enabling users to effortlessly convert documents and images into editable text or PDF format.
Language: HTML - Size: 16.6 KB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 3 - Forks: 1

sahil352005/ChatWithPdf-Images
A Streamlit-based app that allows users to upload PDFs or images, extract text, and engage in interactive Q&A. Using Google Generative AI, this app enables insightful conversations based on document contents. Ideal for those seeking quick answers from their files in a simple, intuitive interface.
Language: Python - Size: 34.2 KB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

seinecle/nocodefunctions-io
io for nocodefunctions: csv, txt, pdf, and xlsx so far
Language: Java - Size: 174 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

worldbank/wb-nlp-tools
Natural language processing tools developed by the World Bank's DECAT unit. A suite of text preprocessing and cleaning algorithms for NLP analysis and modeling.
Language: Python - Size: 2.73 MB - Last synced at: 27 days ago - Pushed at: almost 3 years ago - Stars: 10 - Forks: 7

andrealenzi11/py-poppleract
Python library and Web service based on Poppler Pdftotext utility and Tesseract OCR for extracting text from PDF documents
Language: Python - Size: 202 KB - Last synced at: about 2 months ago - Pushed at: 7 months ago - Stars: 10 - Forks: 2

davibusanello/pdf2txt
A simple CLI to convert PDF files into TXT using OCR
Language: Python - Size: 23.4 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

yakovypg/Ypdf
We present Ypdf, a PDF document processing application that combines the best features of existing solutions and provides the most popular and requested functionality for free to its users.
Language: C# - Size: 4.17 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 19 - Forks: 5

FastPDFTeam/pdf-to-word-converter
Fast PDF to Word Converter is a fast batch PDF converter that converts PDF to fully editable Office Word, Text, RTF, HTML, and more
Size: 1.95 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

chiraag-kakar/PyAutomation
Simple and useful automation tools built with Python modules published on PyPI.
Language: Python - Size: 1.03 MB - Last synced at: 14 days ago - Pushed at: over 4 years ago - Stars: 11 - Forks: 1

DrMcCoy/pdftextorizer
Interactively extract text from multi-column PDFs
Language: Python - Size: 178 KB - Last synced at: about 2 months ago - Pushed at: 10 months ago - Stars: 2 - Forks: 0

johbar/go-poppler (fork of timsat/go-poppler)
Limited, yet memory-leak-free Go wrapper for a Poppler PDF library
Language: Go - Size: 35.2 KB - Last synced at: 7 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 1

BinhQuocLy/Pdf2Quiz (fork of thejungwon/Pdf2Quiz)
A Pdf2Quiz NLP model.
Language: Python - Size: 4.16 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

TanishqChamoli/Newspaper_Mining
Newspaper mining and analysis of the results using Python. Cleaning the text using OCR.
Language: Python - Size: 16.5 MB - Last synced at: 10 months ago - Pushed at: over 4 years ago - Stars: 4 - Forks: 0

StephanyBatista/ExtractOcrApi
An API in .NET Core that extracts text from documents via OCR using several Linux libraries
Language: C# - Size: 25.4 KB - Last synced at: almost 2 years ago - Pushed at: over 6 years ago - Stars: 6 - Forks: 4

fer-aguirre/pdf-2-ner
Web application for information extraction and named entity recognition for PDF files (work-in-progress).
Language: Jupyter Notebook - Size: 294 KB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

1994nikunj/textify-pdf
Textify-PDF: Extracting Text from PDF Files
Language: Python - Size: 105 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

Isaccseven/pdf2text
Extract text from PDF using OCR
Language: Python - Size: 7.81 KB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

imesut/PdfReg
PdfReg is a web tool that extracts text from selected regions of a PDF document.
Language: JavaScript - Size: 1.28 MB - Last synced at: almost 2 years ago - Pushed at: over 6 years ago - Stars: 3 - Forks: 1

senavs/pdfto
A Python Flask API to manage PDF files.
Language: Python - Size: 17.6 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 0

views63/pdf2text
PDF to text
Language: Rust - Size: 5.86 KB - Last synced at: about 2 months ago - Pushed at: about 6 years ago - Stars: 2 - Forks: 0
