An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: pdf2text

zhangshi0512/DevTools

A lightweight Python-based Software Package for daily use

Language: Python - Size: 5.13 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

modesty/pdf2json

Converts binary PDF to JSON and text, for server-side PDF processing and command-line use.

Language: Java - Size: 121 MB - Last synced at: 1 day ago - Pushed at: 4 months ago - Stars: 2,094 - Forks: 382

NikhilTeja21/Audio-Books

This project converts PDF files into audiobooks with synchronized subtitles in .vtt format. It uses FastAPI for the backend and Microsoft's Edge TTS for text-to-speech conversion.

Language: Python - Size: 7.81 KB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

seinecle/nocodefunctions-web-app

The code base of the front-end of nocodefunctions.com

Language: CSS - Size: 37.7 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 39 - Forks: 7

TheLime1/CheatoMate

A collection of scripts to "help" you with your programming exams and assignments.

Language: Python - Size: 214 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 17 - Forks: 1

AzozzALFiras/Pdf-OCR

A simple, free tool for extracting text from scanned PDFs and images using OCR, and converting images to PDFs. It processes files locally in the browser, ensuring privacy and security while enabling users to effortlessly convert documents and images into editable text or PDF format.

Language: HTML - Size: 16.6 KB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 3 - Forks: 1

sahil352005/ChatWithPdf-Images

A Streamlit-based app that allows users to upload PDFs or images, extract text, and engage in interactive Q&A. Using Google Generative AI, this app enables insightful conversations based on document contents. Ideal for those seeking quick answers from their files in a simple, intuitive interface.

Language: Python - Size: 34.2 KB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

seinecle/nocodefunctions-io

I/O for nocodefunctions: CSV, TXT, PDF, and XLSX so far

Language: Java - Size: 174 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

worldbank/wb-nlp-tools

Natural language processing tools developed by the World Bank's DECAT unit. A suite of text preprocessing and cleaning algorithms for NLP analysis and modeling.

Language: Python - Size: 2.73 MB - Last synced at: 27 days ago - Pushed at: almost 3 years ago - Stars: 10 - Forks: 7

andrealenzi11/py-poppleract

Python library and Web service based on Poppler Pdftotext utility and Tesseract OCR for extracting text from PDF documents

Language: Python - Size: 202 KB - Last synced at: about 2 months ago - Pushed at: 7 months ago - Stars: 10 - Forks: 2

davibusanello/pdf2txt

A simple CLI to convert PDF files into TXT using OCR

Language: Python - Size: 23.4 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

yakovypg/Ypdf

We present Ypdf, a PDF document processing application that combines the best features of existing solutions and provides the most popular and requested functionality for free to its users.

Language: C# - Size: 4.17 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 19 - Forks: 5

FastPDFTeam/pdf-to-word-converter

Fast PDF to Word Converter is a fast batch PDF converter that converts PDF to fully editable Office Word, Text, RTF, HTML, and more

Size: 1.95 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

chiraag-kakar/PyAutomation

Simple and useful automation tools built with Python modules published on PyPI.

Language: Python - Size: 1.03 MB - Last synced at: 14 days ago - Pushed at: over 4 years ago - Stars: 11 - Forks: 1

DrMcCoy/pdftextorizer

Interactively extract text from multi-column PDFs

Language: Python - Size: 178 KB - Last synced at: about 2 months ago - Pushed at: 10 months ago - Stars: 2 - Forks: 0

johbar/go-poppler (fork of timsat/go-poppler)

Limited, yet memory-leak-free Go wrapper for the Poppler PDF library

Language: Go - Size: 35.2 KB - Last synced at: 7 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 1

BinhQuocLy/Pdf2Quiz (fork of thejungwon/Pdf2Quiz)

A Pdf2Quiz NLP model.

Language: Python - Size: 4.16 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

TanishqChamoli/Newspaper_Mining

Newspaper mining and analysis of the results using Python. Cleaning the text using OCR.

Language: Python - Size: 16.5 MB - Last synced at: 10 months ago - Pushed at: over 4 years ago - Stars: 4 - Forks: 0

StephanyBatista/ExtractOcrApi

An API in .NET Core to extract text from documents via OCR using several Linux libraries

Language: C# - Size: 25.4 KB - Last synced at: almost 2 years ago - Pushed at: over 6 years ago - Stars: 6 - Forks: 4

fer-aguirre/pdf-2-ner

Web application for information extraction and named entity recognition for PDF files (work-in-progress).

Language: Jupyter Notebook - Size: 294 KB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

1994nikunj/textify-pdf

Textify-PDF: Extracting Text from PDF Files

Language: Python - Size: 105 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

Isaccseven/pdf2text

Extract text from PDF using OCR

Language: Python - Size: 7.81 KB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

imesut/PdfReg

PdfReg is a web tool that extracts text from selected regions of a PDF document.

Language: JavaScript - Size: 1.28 MB - Last synced at: almost 2 years ago - Pushed at: over 6 years ago - Stars: 3 - Forks: 1

senavs/pdfto

A Python Flask API to manage PDF files.

Language: Python - Size: 17.6 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 0

views63/pdf2text

pdf to text

Language: Rust - Size: 5.86 KB - Last synced at: about 2 months ago - Pushed at: about 6 years ago - Stars: 2 - Forks: 0