An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: pymupdf-fitz

pawankumar94/graphscribe-table-extractor

Graphscribe is an intelligent, LLM-powered document understanding system designed to extract structured insights from complex visual content such as statistical diagrams, charts, and graphs.

Language: Python - Size: 19.6 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

Jatin-s16/Resume-check-portal-for-candidates

A Streamlit-based application that enables job seekers to evaluate and enhance their resumes by analyzing alignment with specific job descriptions, providing actionable insights for improvement.

Language: Jupyter Notebook - Size: 267 MB - Last synced at: 17 days ago - Pushed at: 21 days ago - Stars: 0 - Forks: 0

atthharvva/PDF-Form-Reader

This Python script extracts information from PDF forms using OCR (Optical Character Recognition) and saves the extracted data into an Excel file. It is particularly designed for processing forms with checkboxes and textual fields. The script can handle variations in form structure and allows for easy customization to accommodate other PDF form type

Language: Python - Size: 4.53 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

ParthaPRay/pdf_text_extraction_json_section_subsection

Code for extracting PDF text into JSON, capturing each section's number, title, body content, and footnotes.

Language: Python - Size: 2.01 MB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

FrancisLauriano/chatsoftex

A platform developed in Python that aims to automate and speed up the evaluation of technological innovation projects, using artificial intelligence and standardized criteria based on the Lei do Bem.

Language: Python - Size: 87.2 MB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

das-amlan/PDF_Image_Extractor_Web_App

This is a simple web app that allows users to upload a PDF file, extract images from the PDF, and display the images in the web app.

Language: Python - Size: 14.5 MB - Last synced at: 25 days ago - Pushed at: 5 months ago - Stars: 2 - Forks: 2

LSH-1082/HyoPy

Creates a folder for each person and splits PDFs per person, using Python web crawling.

Language: Python - Size: 26.4 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

devbm7/QGen

Question Generator System

Language: Python - Size: 107 KB - Last synced at: 15 days ago - Pushed at: 6 months ago - Stars: 1 - Forks: 1

vickypandey14/Convert-PDF-into-Image-By-Python

This Python script converts each page of a PDF document into separate image files. It utilizes the PyMuPDF library (fitz) to handle PDF operations and the Python Imaging Library (PIL) for image processing.

Language: Python - Size: 248 KB - Last synced at: 20 days ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

OtenMoten/pdf-alchemist

It's designed for transmuting PDFs into HTML. Harness the power of OCR, image processing, and web technologies to unlock the secrets within your PDF documents.

Language: Python - Size: 2.22 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

IglesiasT/comparador-pdfs

Language: Python - Size: 82.7 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

raju-2003/KSP-DATATHON-24

Data Privacy in Law Enforcement - KSP DATATHON - 2024 - FIR Redactor

Language: Python - Size: 792 KB - Last synced at: 25 days ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

ashutosh6500/Resume-Parser-AWS-Event-Driven-Workflow

A simple event-driven mini project built on AWS services such as Lambda, EC2, DynamoDB, S3, and SNS.

Language: Python - Size: 20.3 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

mcagriaksoy/diff_merge_pdf

A tool to compare and merge PDFs, display their differences, and run OCR on them.

Language: Python - Size: 1.3 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Sazizi2025/PDF-Founder

Short on time? Can't search every PDF one by one for the content you want? PDF-Founder is here...

Language: Python - Size: 517 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

RomyJr/Retrocession_Detector

This application facilitates the comparison of two PDF files. Differences are presented in a table, color-coded as red (deletions), green (additions), and orange (moved text). Users can save the results in Excel format. It is designed to check whether annotations have been taken into account during the comparison process.

Language: Python - Size: 140 KB - Last synced at: 6 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

RomyJr/PDF_TXT_Word_research

This application simplifies PDF keyword searches, allowing users to easily find specific terms in files or folders. Results are displayed clearly, and the history feature enables quick review and filtering of past searches. Users can click on document links in the history to open them directly in the default PDF viewer.

Language: Python - Size: 105 KB - Last synced at: 6 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

bilalhameed248/PDF-Document-Extraction

Python PDF-to-HTML Converter: Transforming PDF Documents into Structured HTML Tags. - Feb 2022 - Jun 2023

Language: Python - Size: 73.2 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

helgesander02/TKFruitMG

An ERP system with a customtkinter GUI and a PostgreSQL database, built with reportlab, win32print, and pymupdf-fitz.

Language: Python - Size: 39.9 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0