GitHub topics: pymupdf | Ecosyste.ms: Repos

pymupdf/PyMuPDF

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

Language: Python - Size: 329 MB - Last synced at: about 20 hours ago - Pushed at: about 21 hours ago - Stars: 7,127 - Forks: 601

Krasjet/pdf.tocgen

A CLI toolset to generate table of contents for PDF files automatically.

Language: Python - Size: 430 KB - Last synced at: 2 days ago - Pushed at: over 1 year ago - Stars: 737 - Forks: 25

jdonohue44/NOAA-Weather-Modification-Forms-LLM-Extractor

Extract key information from 1,000s of NOAA Form 17-4 (Initial Report On Weather Modification Activities) using an LLM.

Language: Python - Size: 982 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 2 - Forks: 0

(eBook, PDF translation) A multilingual eBook processing tool supporting all eBook formats. Features online and offline translation while preserving original layouts. Compatible with both scanned and digital PDFs. Elegant user interface. The world's highest-performing open-source layout-preserving eBook translator.

Language: Python - Size: 104 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 1,989 - Forks: 275

HemalDholakiya12/PDFChat

A web app that allows users to upload PDFs and interact with them through a Q&A interface. The application extracts text from PDFs, generates embeddings, stores them in a FAISS database, and retrieves relevant information to provide context-aware answers using a large language model.

Language: JavaScript - Size: 119 KB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 2 - Forks: 0

dipanshudhage/Crop-and-Fertiliser-Recommendation-System

The Crop and Fertilizer Recommendation System leverages machine learning to assist farmers in selecting the best crops and fertilizers based on soil nutrient data. By analyzing soil test reports (images/PDFs), the system provides AI-driven recommendations for optimal crop growth and fertilizer use, tailored to the farmer’s specific soil conditions.

Language: Python - Size: 4.12 MB - Last synced at: 6 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

lefkovitzj/PyPdfApp

A PDF manipulation and access application developed in Python, built predominantly with the PyMuPDF and CustomTkinter modules.

Language: Python - Size: 1.05 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

JoseLVillaronga/teccam_pdf

Teccam PDF is a Python/Flask web application that extracts text from PDF documents and web pages, automatically converts it to Markdown, and stores it in MongoDB. It offers a responsive interface with light/dark mode, permission management (public/private), reading-position bookmarks, and deployment as a systemd service.

Language: HTML - Size: 41 KB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

vikas-kashyap97/Resume-Screening

AI-Powered Research Summarizer is a web app that uses Google’s Gemini 1.5 Pro to generate tailored, clear summaries of research papers. It supports PDF uploads, multiple summary styles, and exports to DOCX or PDF.

Language: Python - Size: 109 KB - Last synced at: 6 days ago - Pushed at: 13 days ago - Stars: 1 - Forks: 0

AhmedTrb/PDF_highlight_extractor

A Python application built with PySide6 and PyMuPDF that extracts highlighted text from PDF files and categorizes it by color, allowing users to save and organize highlighted content in a Markdown file.

Language: Python - Size: 35.2 KB - Last synced at: about 21 hours ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

pawankumar94/graphscribe-table-extractor

Graphscribe is an intelligent, LLM-powered document understanding system designed to extract structured insights from complex visual content such as statistical diagrams, charts, and graphs.

Language: Python - Size: 19.6 MB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 0 - Forks: 0

code-418-dpr/SportHub-parser

Parser for the PDF file of the Russian Ministry of Sport's Unified Calendar Plan (ЕКП), for the SportHub project

Language: Python - Size: 2.26 MB - Last synced at: 23 days ago - Pushed at: 24 days ago - Stars: 0 - Forks: 0

genieincodebottle/parsemypdf

A collection of PDF parsing libraries, such as the AI-based docling, claude, openai, llama-vision, and unstructured-io, as well as pdfminer, pymupdf, pdfplumber, etc., for efficient snapshot, text, table, and metadata extraction.

Language: Python - Size: 2.75 MB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 61 - Forks: 20

NaS-Research/knowledge-model

Our knowledge system systematically ingests, processes, and indexes open-access life science publications. It supports internal research by providing precise question-answering and efficient retrieval from a continuously updated repository of scientific literature

Language: Python - Size: 95.4 MB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 0 - Forks: 0

GokulGowthamS/AskDocs_GEN-AI

AskDocs Generative AI

Language: Python - Size: 2.1 MB - Last synced at: 25 days ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

vb64/markdown-pdf

Markdown to pdf renderer

Language: Python - Size: 539 KB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 78 - Forks: 6

alexandertiopan1212/SmartScan-AI

SmartScan-AI is a Streamlit app for invoice & PO extraction, matching, and AI-powered document Q&A.

Language: Python - Size: 0 Bytes - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

ArtifexSoftware/pdf2docx

Open source Python library for converting PDF to DOCX.

Language: Python - Size: 21.9 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 2,867 - Forks: 406

esnanta/docu-query

This project is an early prototype of an AI-based chatbot designed to present information related to regulations.

Language: Jupyter Notebook - Size: 1.7 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

cloudy-sfu/TOC-to-bookmarks

Automatically create bookmarks from the "table of contents" for *.pdf books

Language: Python - Size: 609 KB - Last synced at: 2 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

germabyte/pdf-ocr-remover

This program helps you remove the invisible text layer (also known as the OCR layer) from PDF files.

Language: Python - Size: 8.79 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

lucasrla/remarks

Extract annotations (highlights and scribbles) from PDF, EPUB, and notebooks marked with reMarkable tablets. Export to Markdown, PDF, PNG, SVG

Language: Python - Size: 3.8 MB - Last synced at: about 1 month ago - Pushed at: 12 months ago - Stars: 369 - Forks: 25

errejotaeme/diagrama

Tool for generating diagrams

Language: Python - Size: 6.36 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

Ananthakrishnan12/Resume-Analyzer-Using-BERT

Resume Analyzer Using BERT

Language: Python - Size: 8.4 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

Zain-Bin-Arshad/pdf-viewer

A pure-Python PDF viewer that provides the same functionality as other well-known PDF viewers.

Language: Python - Size: 338 KB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 82 - Forks: 21

shushilgirish/BigData_DataProcess_andMarkDownViewer Fork of khavnekar-y/AI-Information-Extractor

Automated Document Processing and Markdown Generation System

Language: Python - Size: 2.44 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

shefreenkaur/NLP_Query_Documents

This repository contains two implementations of an NLP document query system that processes PDF documents and ranks them based on relevance to user queries.

Language: Python - Size: 0 Bytes - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

jfriedlein/h2aFreeplane_pdf-highlightedText_to_Freeplane_synch

Freeplane script to organise highlighted text and notes from pdf files as Freeplane mindmap

Language: Tcl - Size: 113 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 2 - Forks: 0

boyac/pyGamgee

PyGamgee runs DeepSeek LLM with Ollama, using PyMuPDF for PDF extraction and FAISS for fast vector search. With LangChain RAG and conversation memory, it enables efficient, private document understanding—fully offline.

Language: Python - Size: 1.21 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

jfriedlein/h2a_pdf-highlightedText_to_annotation

Python tool to extract highlighted text from a pdf file and write this text into the content of each annotation

Language: Python - Size: 5.21 MB - Last synced at: about 1 month ago - Pushed at: 2 months ago - Stars: 2 - Forks: 0

marek-jakub/siters

A simple .pdf file reader, written in Python.

Language: Python - Size: 12.7 KB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

Lingesh81051/Similar-Template-Document-Matching-and-Fraud-Detection

An automated system for a health insurance company to streamline document processing, including template matching and fraud detection, resulting in reduction of processing time.

Language: Python - Size: 1.37 MB - Last synced at: 2 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

StarodubovAV/Python_Projects

This is a repo for various Python projects

Language: Jupyter Notebook - Size: 6.99 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

DioCrafts/ai-book-summarizer

📚 AI-Powered Book PDF Knowledge Extractor & Summarizer. Transform your PDF books into structured knowledge effortlessly! This tool leverages AI to analyze books page by page, extracting key insights, definitions, and concepts, and organizes them into Markdown summaries for easier study.

Language: Python - Size: 29.6 MB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

openandclose/pdfslash

Crop PDF margins from interactive interpreter

Language: Python - Size: 1.02 MB - Last synced at: 3 days ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

shefreenkaur/Web-Scraping-and-Word-Frequencies

This project analyzes word frequencies in BC Legislative documents using Stanford CoreNLP and Python. The program extracts text from PDF documents, processes it using natural language processing techniques, and generates a comprehensive word frequency analysis.

Language: Python - Size: 3.11 MB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

Al-shwaib/Book-Preparation-for-Printing

A web application for preparing books and magazines for offset printing. Automatically arranges PDF pages for commercial A3 printing, supporting both Arabic (RTL) and English (LTR) books.

Language: Python - Size: 40 KB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 2 - Forks: 0

elias-jhsph/scienceai

An AI-powered scientific literature search engine that uses OpenAI's language models to analyze research papers. It enables users to extract data, ask complex questions, and perform ad hoc literature reviews, handling hundreds of papers simultaneously without needing metadata.

Language: Python - Size: 144 KB - Last synced at: 21 days ago - Pushed at: 11 months ago - Stars: 2 - Forks: 0

xxao/pero

Unified Python drawing API

Language: Python - Size: 5.54 MB - Last synced at: 4 days ago - Pushed at: 3 months ago - Stars: 34 - Forks: 4

BigDataIA-Spring2025-4/DAMG7245_Assignment01

A Streamlit-based app with a FastAPI backend for extracting structured data (text, images, tables) from websites and PDFs. Processed data is stored in AWS S3 and rendered in a markdown-standardized format. APIs are deployed on Google Cloud Run Service

Language: Jupyter Notebook - Size: 90.7 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

nature-of-eu-rules/data-preprocessing

Document preprocessing scripts for the Nature of EU Rules project

Language: Python - Size: 123 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

amirlogic/pymupdf-webapp

PyMuPDF webapp based on CherryPy

Language: Python - Size: 20.5 KB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

Muneeb1030/FineTune-Tiny-Llama

Fine-tuning the Tiny Llama model to mimic my professor's writing style using the Llama Factory. The project involves data collection, preprocessing, preparation, fine-tuning, and evaluation.

Language: Jupyter Notebook - Size: 390 KB - Last synced at: 2 months ago - Pushed at: 10 months ago - Stars: 2 - Forks: 0

nsourlos/kindle_to_pdf

Transfer your Kindle highlights and notes (mobi or PDF) to PDF files

Language: Python - Size: 4.88 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

Prakshal0809/RAG-Chatbot

Developed a RAG-based chatbot for seamless integration with an e-hospital platform, enhancing response accuracy by 30% through reliable, trusted medical data sources. Processed 500+ pages of medical data, enabling real-time symptom analysis and disease suggestions.

Language: TypeScript - Size: 147 KB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

ks6088ts-labs/extractor-python 📦

A data extract tool written in Python

Language: Python - Size: 159 KB - Last synced at: 3 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

Guo-dalu/pdf-helper

This Python tool enables batch processing of PDFs using PyMuPDF, offering OCR text extraction and compression for handling multiple image-based PDFs efficiently.

Language: Python - Size: 26.7 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

philippe2023/RAG-Question-Answering-App

An AI-powered Question Answering application that uses Retrieval-Augmented Generation (RAG) to provide accurate and context-aware answers from uploaded PDF documents.

Language: Python - Size: 20.5 KB - Last synced at: about 2 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

Tech-C-P/ConversAI

ConversAI is an innovative conversational AI framework designed for intelligent text extraction and querying across various document formats and web content, leveraging advanced natural language processing techniques.

Language: Python - Size: 1.02 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 1 - Forks: 0

hase3b/SCPRAG

This repository implements a Retrieval-Augmented Generation (RAG) system for the Supreme Court of Pakistan, utilizing different LLMs, embedding models, and retrieval and generation enhancement strategies. It processes SCP judgments, applies chunking, and generates legal summaries and answers based on relevant case data.

Language: Jupyter Notebook - Size: 57.4 MB - Last synced at: about 2 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

OrenGrinker/pdfLLM

The PDF Question Answering App uses Streamlit for a user-friendly interface where users can upload PDFs and ask questions. It employs LlamaIndex to index PDF content and PyMuPDF4LLM to parse files, enabling efficient, accurate answers based on the document’s text.

Language: Python - Size: 6.84 KB - Last synced at: about 2 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

kezb90/PDF_To_Word

A Python-based tool that converts PDF files into editable Word documents, preserving text, images, and layout. Uses PyPDF2, PyMuPDF (fitz), python-docx, and Pillow to accurately transfer content from PDF to .docx. Ideal for transforming complex PDFs into Word format for easy editing.

Language: Python - Size: 8.79 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

venkatarangan/ProductsDigest

A Python-based web scraper that fetches details from specified product webpages, especially Amazon product pages.

Language: Python - Size: 962 KB - Last synced at: 2 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

ChristophWenk/PDFSorter

Sort and rename PDFs according to their content

Language: Python - Size: 56.6 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

olonok69/Nim_LlamaIndex

LlamaIndex integration with NVIDIA NIM

Language: Jupyter Notebook - Size: 2.42 MB - Last synced at: about 2 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

Ap6pack/PDF-Search-Plus

A Python application that extracts text and images from PDFs, applies OCR to images using Tesseract, and stores the results in a SQLite database. The application features a GUI for searching both text and OCR-extracted content and previewing PDF files.

Language: Python - Size: 39.1 KB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

vickypandey14/Convert-PDF-into-Image-By-Python

This Python script converts each page of a PDF document into separate image files. It utilizes the PyMuPDF library (fitz) to handle PDF operations and the Python Imaging Library (PIL) for image processing.

Language: Python - Size: 248 KB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

renan-siqueira/python-pdf-tool

This project facilitates the extraction of text from PDF files using various Python libraries. It is designed to be flexible, allowing a choice among different text extraction libraries and supporting both a single PDF file and a directory containing multiple PDF files.

Language: Python - Size: 7.81 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 5 - Forks: 1

Srivacthi/Acronym-List-Generator

Generates an Acronym List for your PDF quickly and locally for over 200 pages of text

Language: Python - Size: 13.7 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

benitomartin/scraping-to-sql

Open Source Contribution to Justicio Project

Language: Jupyter Notebook - Size: 6.46 MB - Last synced at: 3 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

timothy-bartlett/PyMuPDF

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

Language: Python - Size: 288 MB - Last synced at: about 2 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

mriffey1/vendor-hall-exhibitors

Converts a PDF map of Gen Con's exhibitors and their booth numbers to Google Sheets

Language: Python - Size: 2.76 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

zayigo/BUL-Insight

Processing and storage of data from the Banda Ultra Larga (ultra-broadband) plan

Language: Python - Size: 153 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

gustavo-bordin/fdp

FDP is a programming language created to make PDF text extraction easy

Language: Python - Size: 115 KB - Last synced at: 10 months ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

FlorianLD/invoice_data_extraction

POC for an automated system extracting invoice data from mail attachments using computer vision, and sending the extracted data to a Google Sheet for further analysis by business teams.

Language: Jupyter Notebook - Size: 3.19 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

petalaleite/boomer_pdf_scraping

This app scrapes specific PDF data and extracts it to a new spreadsheet using Pandas.

Language: Python - Size: 476 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

paolpal/PDFWizard

Toolkit for pdf editing.

Language: Python - Size: 45.9 KB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

jonalfarlinga/pdiff

A simple utility for diffing PDFs.

Language: JavaScript - Size: 289 KB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

chettiargautam/PDF-Utilities

A repository that contains some personal and shared code for PDF Processing Utilities. This is only for educational purposes please do not redistribute.

Language: Python - Size: 12.7 KB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

Erdos1729/automated-snapshot-of-annotated-content-in-pdfs

This repository will automate the process of saving snapshots of highlighted content within multiple pdf files.

Language: Python - Size: 2.62 MB - Last synced at: 5 months ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

coycs/pdf-streamlit

PDF tools, written with Python, deployed on Streamlit

Language: Python - Size: 15.6 KB - Last synced at: about 2 months ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

s4shreya/abc-ask-me-anything

A full-stack web application where users can upload a PDF document and ask questions about its content.

Language: JavaScript - Size: 175 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

omn1vor/omni_pdf_to_png

A simple wrapper for PyMuPDF

Language: Python - Size: 1.95 KB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

pymupdf/PyMuPDF-Utilities

Demos, examples and utilities using PyMuPDF

Language: Jupyter Notebook - Size: 163 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 450 - Forks: 130

pymupdf/PyMuPDF-Optional-Material

Help file downloads, early ZIP binaries, wheels for retired Python 2.7, 3.5.

Size: 2.76 GB - Last synced at: about 1 year ago - Pushed at: about 3 years ago - Stars: 14 - Forks: 3

lefkovitzj/PySimplePDF

A simple PDF Viewing application written in Python using PyMuPDF, Pillow and CustomTkinter.

Language: Python - Size: 8.79 KB - Last synced at: 10 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

afeefa-qureshi/Encrypt-Decrypt-PDF-files-using-Python

This Python project provides a simple yet powerful tool to encrypt and decrypt PDF files. It utilizes the PyPDF2 and PyMuPDF libraries to perform encryption and decryption operations, making it easy to secure sensitive PDF documents or access password-protected files.

Language: Python - Size: 4.88 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

TheWatcherMultiversal/pdfgui_tools

pdfgui_tools is a user interface tool developed in Qt and Python that integrates with poppler-utils and PyPDF2 for PDF document management. It's a simple and user-friendly tool that includes various utilities.

Language: Python - Size: 3.31 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 28 - Forks: 2

Sazizi2025/PDF-Founder

Are you short on time?! Can't you search all the PDFs one by one for the content you want?! Well, PDF-Founder is here...

Language: Python - Size: 517 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

STROAD/Merge2PDF

Merge to PDF

Language: Python - Size: 142 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

RomyJr/Retrocession_Detector

This application facilitates the comparison of two PDF files. Differences are presented in a table, color-coded as red (deletions), green (additions), and orange (moved text). Users can save the results in Excel format. It is designed to check whether annotations have been taken into account during the comparison process.

Language: Python - Size: 140 KB - Last synced at: 7 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

RomyJr/PDF_TXT_Word_research

This application simplifies PDF keyword searches, allowing users to easily find specific terms in files or folders. Results are displayed clearly, and the history feature enables quick review and filtering of past searches. Users can click on document links in the history to open them directly in the default PDF viewer.

Language: Python - Size: 105 KB - Last synced at: 7 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

bilalhameed248/PDF-Document-Extraction

Python PDF-to-HTML Converter: Transforming PDF Documents into Structured HTML Tags. - Feb 2022 - Jun 2023

Language: Python - Size: 73.2 KB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

proviveknayan/document-keyword-search

Search PDF for specific keywords using Python 3. A simple Python program that searches all PDF documents in a folder for a set of keywords and lists all documents along with the keywords present in them.

Language: Jupyter Notebook - Size: 2.35 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

wh01Samyak/PDF_proofreader

Open source PDF proofreader

Language: Python - Size: 71.3 KB - Last synced at: almost 2 years ago - Pushed at: almost 6 years ago - Stars: 0 - Forks: 0

antoniotejada/srdine

Generates enhanced Dungeons and Dragons 5e SRD pdf

Language: Python - Size: 48.8 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

gautam132002/invoice-pdf-data-extraction

Automated extraction of specific information from invoices, achieving over 95% accuracy.

Language: Python - Size: 4.03 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

TeoJJss/image-playground

Language: HTML - Size: 6.84 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

aphp/edspdf-mupdf

MuPDF extension for EDS-PDF

Language: Python - Size: 1.56 MB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

devxzh/PDFTools

A small tool built on pyqt5 and pymupdf for batch-adding table-of-contents bookmarks, enhancing PDFs, and splitting and merging PDFs

Language: Python - Size: 40.1 MB - Last synced at: almost 2 years ago - Pushed at: almost 4 years ago - Stars: 39 - Forks: 6

Pranay-03/pdf-compression

This project accesses all the files in a Google Drive folder and compresses them without loss of quality

Language: Jupyter Notebook - Size: 0 Bytes - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

stroblme/UNote

Fills the lack of an open-source PDF Editor with the capability to draw and add notes

Language: Python - Size: 79.6 MB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 10 - Forks: 1

MaarkNassef/PDFGame

This project is a web application built using the Flask framework that allows users to upload a PDF file containing text and converts it into a new PDF file where each page of the original PDF is represented as an image. The application will use the PyMuPDF library to read and convert the text pages into images and also to write the new PDF file.

Language: Python - Size: 5.86 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0