GitHub topics: pdf-parser
FlazeFy/Gudangku-Laravel
GudangKu helps you manage your belongings, from home supplies and food stock to furniture. Set reminders for cleaning or for restocking your home supplies. The app can also generate reports to create shopping or maintenance lists. Start organizing your inventory with GudangKu’s features. Created using Laravel.
Language: PHP - Size: 1.31 MB - Last synced at: about 2 hours ago - Pushed at: about 3 hours ago - Stars: 3 - Forks: 0

Aumlo123/pdfdoom
DOOM in a PDF (as ascii art)
Size: 1000 Bytes - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 1 - Forks: 0

iamarunbrahma/vision-parse
Parse PDFs into markdown using Vision LLMs
Language: Python - Size: 374 KB - Last synced at: 1 day ago - Pushed at: 3 months ago - Stars: 361 - Forks: 50

dromara/yft-design
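An open-source version of Gaoding Design (稿定设计) built on fabric.js. A beautiful and powerful online design tool with poster design and image editing features, suited to scenarios such as poster generation, e-commerce product images, long-form article images, and video or WeChat official account cover editing.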
Language: TypeScript - Size: 50.8 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1,256 - Forks: 251

Besthope-Official/predoc
Document preprocessing service for RAG (Retrieval-Augmented Generation)
Language: Python - Size: 23.4 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 1 - Forks: 2

Stranger123444/u
An interactive command-line tool designed to quickly navigate directories and perform various file operations efficiently. Its simple syntax and intuitive commands make it a favorite among developers for streamlining workflow tasks.
Size: 1000 Bytes - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

opendatalab/MinerU
A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF into Markdown and JSON formats.
Language: Python - Size: 124 MB - Last synced at: 4 days ago - Pushed at: 10 days ago - Stars: 32,851 - Forks: 2,616

py-pdf/pypdf
A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
Language: Python - Size: 21 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 9,012 - Forks: 1,461

oidlabs-com/Lexoid
Multimodal document parser for high quality data understanding and extraction
Language: Python - Size: 46.7 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 44 - Forks: 6

Stravah/eosin
Custom Bank Statement Parsing based on pure text positioning.
Language: Python - Size: 5.22 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 3 - Forks: 1

aescarias/pdfnaut
A Python library for exploring PDFs with ease.
Language: Python - Size: 773 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

diegoabeltran16/OpenPages-pipeline
Open-source tool for turning technical documents into AI-ready formats. Built for better access to knowledge.
Language: Python - Size: 1.78 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

sylphxltd/pdf-reader-mcp
An MCP server built with Node.js/TypeScript that allows AI agents to securely read PDF files (local or URL) and extract text, metadata, or page counts. Uses pdf-parse.
Language: TypeScript - Size: 474 KB - Last synced at: 7 days ago - Pushed at: 12 days ago - Stars: 15 - Forks: 2

chinmaymisra/personal-finance-tracker
Upload Axis Bank statements as PDFs, automatically parse transactions, and view them cleanly in a modern UI. Handles invalid files and non-supported banks gracefully. Built using React (Vite) and FastAPI.
Language: Python - Size: 143 KB - Last synced at: 9 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

drmingler/smart-llm-loader
smart-llm-loader is a lightweight yet powerful Python package that transforms any document into LLM-ready chunks. Spend less time on preprocessing headaches and more time building what matters. From RAG systems to chatbots to document Q&A, SmartLLMLoader handles the heavy lifting so you can focus on creating exceptional AI applications.
Language: Python - Size: 1.09 MB - Last synced at: 7 days ago - Pushed at: 3 months ago - Stars: 65 - Forks: 2

code-418-dpr/SportHub-parser
Parser for the PDF file of the Russian Ministry of Sport's EKP (Unified Calendar Plan) for the SportHub project
Language: Python - Size: 2.26 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 0 - Forks: 0

ispras/dedoc
Dedoc is a library (service) for automating document parsing and converting documents to a uniform format. It automatically extracts content, logical structure, tables, and meta information from textual electronic documents. (Parse document; Document content extraction; Logical structure extraction; PDF parser; Scanned document parser; DOCX parser; HTML parser
Language: Python - Size: 235 MB - Last synced at: 21 days ago - Pushed at: 22 days ago - Stars: 233 - Forks: 27

liweiphys/layra
LAYRA is a ready-to-use visual RAG system with a complete web UI built with Next.js and FastAPI, preserving document layout, tables, paragraphs, and graphical elements without any structural fragmentation.
Language: TypeScript - Size: 2.61 MB - Last synced at: 22 days ago - Pushed at: 23 days ago - Stars: 427 - Forks: 42

lazyFrogLOL/llmdocparser
A package for parsing PDFs and analyzing their content using LLMs.
Language: Python - Size: 1.21 MB - Last synced at: 9 days ago - Pushed at: 9 months ago - Stars: 269 - Forks: 9

sankeer28/PDF-Searcher
Live website to parse multiple PDFs using PDF.js to find matching text
Language: JavaScript - Size: 29.3 KB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 0 - Forks: 0

drmingler/docling-api
Easily deployable and scalable backend server that efficiently converts various document formats (pdf, docx, pptx, html, images, etc.) into Markdown. With support for both CPU and GPU processing, it is ideal for large-scale workflows and offers text/table extraction, OCR, and batch processing with sync/async endpoints.
Language: Python - Size: 3.48 MB - Last synced at: 21 days ago - Pushed at: 2 months ago - Stars: 502 - Forks: 54

datalogics/apdfl-cplusplus-samples
Sample code for the Datalogics C++ interface of the Adobe PDF Library
Language: C++ - Size: 11.1 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 8 - Forks: 7

datalogics/apdfl-csharp-dotnet-samples
Sample code for the Datalogics .NET interface of the Adobe PDF Library
Language: C# - Size: 298 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 8 - Forks: 9

datalogics/apdfl-csharp-dotnet-framework-samples
Sample code for the Datalogics .NET Framework interface of the Adobe PDF Library
Language: C# - Size: 562 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 3 - Forks: 9

datalogics/apdfl-java-maven-samples
Sample code for the Datalogics Java interface of the Adobe PDF Library, set up to build with Maven
Language: Java - Size: 1.16 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 4 - Forks: 11

titipata/scipdf_parser
Python PDF parser for scientific publications: content and figures
Language: Python - Size: 29.2 MB - Last synced at: 23 days ago - Pushed at: about 1 year ago - Stars: 402 - Forks: 64

michelcrypt4d4mus/pdfalyzer
Analyze PDFs. With colors. And Yara.
Language: Python - Size: 93.5 MB - Last synced at: 23 days ago - Pushed at: 5 months ago - Stars: 260 - Forks: 19

yobix-ai/extractous
Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.
Language: Rust - Size: 2.88 MB - Last synced at: 28 days ago - Pushed at: 5 months ago - Stars: 1,051 - Forks: 43

adithya-s-k/marker-api
Easily deployable 🚀 API to convert PDF to markdown quickly with high accuracy.
Language: Python - Size: 35 MB - Last synced at: 26 days ago - Pushed at: 7 months ago - Stars: 833 - Forks: 92

BitMiracle/Docotic.Pdf.Samples
C# and VB.NET samples for Docotic.Pdf library
Language: Visual Basic .NET - Size: 53.5 MB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 78 - Forks: 39

luccaHirae/invoice-extract-server
API for extracting data from invoices
Language: TypeScript - Size: 75.2 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

VishwaGauravIn/pdf-parser-client-side
A lightweight, easy-to-use package to parse text from PDF files on the client side without any server dependency.
Language: TypeScript - Size: 26.4 KB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 12 - Forks: 0

ashutoshvarma/pyxpdf
Fast and memory-efficient Python PDF Parser based on xpdf sources
Language: Cython - Size: 12.2 MB - Last synced at: 21 days ago - Pushed at: over 1 year ago - Stars: 42 - Forks: 17

aleff-github/PDF-Parser-VirusTotal-Based 📦
PDF Parser based on VirusTotal API
Language: Python - Size: 709 KB - Last synced at: 11 days ago - Pushed at: about 2 years ago - Stars: 4 - Forks: 0

eli64s/pdflex
CLI for merging PDF contexts.
Language: Python - Size: 465 KB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 3 - Forks: 0

FayazK/Document-Metadata-Extractor
A Python tool that uses Google's Gemini AI to automatically extract structured metadata from PDF and DOCX documents, saving results to Excel for easy analysis and organizing raw responses as JSON files.
Language: Python - Size: 11.7 KB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

codereverser/casparser
Parser for Consolidated Account Statements (CAS) generated from CAMS/Karvy/Kfintech
Language: Python - Size: 7.85 MB - Last synced at: 11 days ago - Pushed at: 2 months ago - Stars: 142 - Forks: 66

aidayang/MinerU-OneClick
All-in-one MinerU bundle with one-click launch, no installation or deployment required
Size: 49.8 KB - Last synced at: about 2 months ago - Pushed at: 2 months ago - Stars: 7 - Forks: 0

ridi/content-parser
Content data parser for Ridibooks services
Language: JavaScript - Size: 49.2 MB - Last synced at: 20 days ago - Pushed at: almost 2 years ago - Stars: 23 - Forks: 7

cuiyuheng/docling Fork of docling-project/docling
🥚 Transform PDF to JSON or Markdown with ease and speed 🐣
Size: 28.5 MB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

datalogics/apdfl-vb-dotnet-samples
Adobe PDF Library Samples in Visual Basic for .NET
Language: Visual Basic .NET - Size: 174 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 4

cuiyuheng/olmocr Fork of allenai/olmocr
Toolkit for linearizing PDFs for LLM datasets/training
Size: 30.9 MB - Last synced at: about 1 month ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

datalogics/apdfl-kotlin-samples
Adobe PDF Library Samples in Kotlin
Language: Kotlin - Size: 135 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 6

minjunk/welstory-menu-pdf-parser 📦
Welstory menu PDF parser
Language: TypeScript - Size: 130 KB - Last synced at: 4 days ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 2

k16shikano/hpdft
Tools to poke PDFs using Haskell
Language: Haskell - Size: 403 KB - Last synced at: 7 days ago - Pushed at: 3 months ago - Stars: 43 - Forks: 0

tarfin-labs/easy-pdf
PDF wrapper for Laravel
Language: PHP - Size: 204 KB - Last synced at: 22 days ago - Pushed at: 2 months ago - Stars: 17 - Forks: 3

seinecle/nocodefunctions-io
io for nocodefunctions: csv, txt, pdf, and xlsx so far
Language: Java - Size: 174 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

dills122/cardboard-crack
Web app for parsing/viewing Soccer Card Checklists
Language: JavaScript - Size: 1.3 MB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

judaicalink/rdf_generator
A library to generate RDF files in Turtle format for Judaicalink.
Language: Python - Size: 27.3 KB - Last synced at: 5 days ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

ishaangupta-YB/nextjs-pdf-parser
Next.js template for seamless PDF parsing using pdf2json and a custom drag-and-drop file uploader. Ideal for developers seeking a ready-to-use solution for PDF content extraction in their Next.js projects.
Language: TypeScript - Size: 200 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 4 - Forks: 3

sypht-team/sypht-java-client
A Java client for the Sypht API
Language: Java - Size: 108 KB - Last synced at: 29 days ago - Pushed at: almost 4 years ago - Stars: 87 - Forks: 1

sypht-team/sypht-python-client
A Python client for the Sypht API
Language: Python - Size: 165 KB - Last synced at: about 1 month ago - Pushed at: 10 months ago - Stars: 162 - Forks: 5

RiccardoSenica/pdf-text-parsing
PDF-parsing demo
Language: TypeScript - Size: 167 KB - Last synced at: 2 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

Alapipapi/MinerU Fork of opendatalab/MinerU
A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF into Markdown and JSON formats.
Language: Python - Size: 103 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

Daniel-Alvarenga/Boot Fork of VitorCarvalho67/Boot
Digital platform tailored for the educational environment, designed to facilitate the dissemination of internship opportunities and promote student engagement
Language: Vue - Size: 8.16 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 8 - Forks: 0

SimpleApp/PDFParser
Swift PDFParser for PDF parsing and text mining. Includes a TrueType font parser
Language: Swift - Size: 146 KB - Last synced at: 5 months ago - Pushed at: almost 6 years ago - Stars: 37 - Forks: 10

adrienjoly/HsbcStatementParser
Transforms PDF bank statements from HSBC into a list of operations in JSON or TSV format.
Language: JavaScript - Size: 21.5 KB - Last synced at: 16 days ago - Pushed at: over 9 years ago - Stars: 17 - Forks: 6

J-sephB-lt-n/pdf-bank-statement-parser
Tool for converting First National Bank (FNB) bank statement PDFs into useful structured data
Language: Python - Size: 65.4 KB - Last synced at: 8 days ago - Pushed at: 6 months ago - Stars: 1 - Forks: 1

easonlai/chat_with_pdf_table
The contents of this repository showcase how to extract table data from a PDF file and preprocess it to facilitate word embedding. This preprocessing step enhances the readability of table data for language models and enables us to extract more contextual information from the tables.
Language: Jupyter Notebook - Size: 85.9 KB - Last synced at: 14 days ago - Pushed at: over 1 year ago - Stars: 9 - Forks: 4

bansalsahab/Parser
pdf heading parser
Language: Python - Size: 12.7 KB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 2 - Forks: 0

clarekang/form-pdf2json
NodeJS library to convert JSON to PDF or vice versa
Language: JavaScript - Size: 2.67 MB - Last synced at: 8 days ago - Pushed at: almost 2 years ago - Stars: 9 - Forks: 2

ashot-israelyan/nextjs-pdf-openai-chat
A demo AI application for uploading PDF files and chatting with ChatGPT about their content
Language: TypeScript - Size: 2.42 MB - Last synced at: 6 months ago - Pushed at: 11 months ago - Stars: 2 - Forks: 0

aqiftekhar/OpenAIChatBot
A healthcare chatbot implemented using OpenAI that also receives PDF documents and images and prescribes based on a summary
Language: TypeScript - Size: 73.2 KB - Last synced at: 2 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

dunso/pdf-parser
Convert PDF content and layout information with pdf.js
Language: JavaScript - Size: 2.18 MB - Last synced at: 9 days ago - Pushed at: over 5 years ago - Stars: 21 - Forks: 7

yintellect/auto-law-review
Automate the case review on legal case documents.
Language: Jupyter Notebook - Size: 30.5 MB - Last synced at: 4 months ago - Pushed at: about 4 years ago - Stars: 11 - Forks: 3

race-tech/f1-data-updater
A repository made to automatically update the F1 database used in the f1-api.
Language: Rust - Size: 194 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

lucasjvds/Scanipy
Scanipy stands for "scan it with Python"—it's your smart Python library for scanning and parsing complex PDF files like books, reports, articles, and academic papers. Utilizing cutting-edge Deep Learning algorithms, Scanipy transforms your PDFs into a treasure trove of extractable information: tables, images, equations, and text.
Language: Python - Size: 273 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 19 - Forks: 1

davendw49/sciparser
PDF parsing toolkit for preparing academic text corpus
Language: Python - Size: 113 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 45 - Forks: 2

yvnggodemis/pdf-parse
PDF Parser built in Rust
Language: Rust - Size: 146 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

antea-p/flashcard_maker
Flashcard maker written in TypeScript, utilizing OpenAI API to create great cloze flashcards.
Language: TypeScript - Size: 25.4 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

greatjourney589/rogu-platform
React and Firebase platform for e-commerce and games
Language: JavaScript - Size: 176 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

patrixshah/ResumeScreening
Resume Screening: An AI Driven User Profile Screening Tool
Language: TypeScript - Size: 340 KB - Last synced at: 3 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

ashutoshvarma/libxpdf
Static library built from the source of www.xpdfreader.com, with most dependencies built in
Language: C++ - Size: 613 KB - Last synced at: 21 days ago - Pushed at: over 1 year ago - Stars: 8 - Forks: 4

Kanchii/avenue-brokerage-to-excel
A simple script to convert Avenue's brokerage statements to excel, extracting some data
Language: Python - Size: 11.7 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

nlitsme/pyPdfCrack
Investigation into PDF encryption
Language: Python - Size: 34.2 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 16 - Forks: 7

lesterchan/linkedin-pdf-resume-parser
Parse a LinkedIn PDF resume and extract name, email, education, and work experience.
Language: PHP - Size: 289 KB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 25 - Forks: 11

devleejb/pdf-parser
PDF to JSON in my computer!
Language: JavaScript - Size: 1000 Bytes - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

mehmet-kozan/pdf-parse
Pure JavaScript cross-platform module to extract text from PDFs.
Language: JavaScript - Size: 7.78 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

shawakash/alphaFreq 📦
Assignment for Probability and Random Process
Language: TypeScript - Size: 16.7 MB - Last synced at: 6 days ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Siddhantsingh1230/SnapCV
A Simple NLP Web App to create summaries of your CVs
Language: CSS - Size: 343 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

CORDEA/pdf_image_extractor
Extract images from PDF
Language: Dart - Size: 182 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

knands42/TextProcessor-Regex 📦
Explore the regex world with FluentAPI pattern
Language: TypeScript - Size: 178 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

MattLondon101/NLP-Parser
Extract form input from PDFs and group keywords into subtopics with Latent Dirichlet Allocation (LDA).
Language: Python - Size: 660 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

sypht-team/sypht-node-client
A Node.js client for the Sypht API
Language: JavaScript - Size: 62.5 KB - Last synced at: 26 days ago - Pushed at: about 2 years ago - Stars: 13 - Forks: 4

tuffstuff9/nextjs-pdf-parser
Next.js template for seamless PDF parsing using pdf2json and FilePond. Ideal for developers seeking a ready-to-use solution for PDF content extraction in Next.js projects.
Language: TypeScript - Size: 44.9 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 17 - Forks: 2

Siddhantsingh1230/SnapCV_Backend
A Node Backend Server for SnapCV
Language: HTML - Size: 32.2 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

PeterMosmans/apdfhelper
Fix links in PDF files, rewrite links, extract text annotations, remove pages
Language: Python - Size: 98.6 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

sypht-team/sypht-ruby-client
A Ruby client for the Sypht API
Language: Ruby - Size: 53.7 KB - Last synced at: about 2 months ago - Pushed at: over 5 years ago - Stars: 4 - Forks: 0

tomludlow2/php_nhs_payslip_parser
Uses the https://github.com/smalot/pdfparser parser to open NHS payslips in PHP, then parses them to extract the relevant contents into a PHP associative array
Language: PHP - Size: 11.7 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

sypht-team/sypht-elixir-client
An Elixir client for the Sypht API https://sypht.com
Language: Elixir - Size: 47.9 KB - Last synced at: 26 days ago - Pushed at: about 5 years ago - Stars: 6 - Forks: 0

datalogics/adobe-pdf-library-samples
Sample code for the Datalogics C++, Java, and .NET interfaces of the Adobe PDF Library
Size: 43.3 MB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 77 - Forks: 62

bratergit/hacktoberfest2020
Hacktoberfest 2020 - Build a desktop program that runs in the terminal and, given a PDF from Toro Investimentos with the day's brokerage transactions, shows the income tax calculation for day trading the mini dollar and mini index on Bovespa.
Language: JavaScript - Size: 92.8 KB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 2

lulucasalves/lumi-back
Backend application test
Language: TypeScript - Size: 790 KB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

GrigorisLionis/ika-stats-parser
PDF parser of IKA work related statistics data
Language: Python - Size: 134 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

CodeTrace-MY/PDF-Text-Extraction
Algorithm to extract labels and readings from industrial engineering drawings
Language: Python - Size: 643 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

jogemu/pdf2tree
Parse PDF and group elements based on enclosing lines. A node.js module that promisifies the pdf2json parser and structures the data in a way that is suitable for tables with merged cells.
Language: JavaScript - Size: 12.7 KB - Last synced at: 2 days ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

AnyaChickenMcnuggets/PrimoRPAPdfToCsv
Size: 228 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

BuildmodeOne/canisius-parser
A pdf parser to extract the meal plan from the "Katholische Canisiusstiftung" in Ingolstadt
Language: TypeScript - Size: 634 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

cschen1205/spring-pdf-search-engine
PDF Search Engine implemented in Java and Spring Boot
Language: Java - Size: 77 MB - Last synced at: about 1 month ago - Pushed at: about 7 years ago - Stars: 3 - Forks: 5

leandroroser/prettyparser
Parallel processing and parsing of PDF and TXT files, and of Python objects containing text (str, list), using rules (regular expressions).
Language: Python - Size: 106 KB - Last synced at: 27 days ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0
