GitHub topics: document-image-analysis
Unstructured-IO/unstructured
Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise-grade Platform product for production-grade workflows, partitioning, enrichments, chunking, and embedding.
Language: HTML - Size: 192 MB - Last synced at: about 10 hours ago - Pushed at: about 10 hours ago - Stars: 11,768 - Forks: 972

deepdoctection/deepdoctection
A Repo For Document AI
Language: Python - Size: 29 MB - Last synced at: about 10 hours ago - Pushed at: about 11 hours ago - Stars: 2,868 - Forks: 161

athallahaiqal/document-ai
A simple FastAPI application that lets users upload PDF or DOCX documents to a database, get a summary generated by a local LLM via Ollama, and ask natural language questions about their content.
Language: Python - Size: 63.5 KB - Last synced at: 1 day ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

enoch3712/ExtractThinker
ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows.
Language: Python - Size: 20.4 MB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 1,274 - Forks: 129

chulwoopack/docstrum
Language: Jupyter Notebook - Size: 97.1 MB - Last synced at: 3 months ago - Pushed at: about 7 years ago - Stars: 69 - Forks: 21

hpanwar08/detectron2 (fork of facebookresearch/detectron2)
Detectron2 for Document Layout Analysis
Language: Python - Size: 4.53 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 178 - Forks: 62

chulwoopack/voronoi_based_docu_complexity_analysis
Language: Jupyter Notebook - Size: 267 KB - Last synced at: almost 2 years ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 0

chulwoopack/document_complexity
Analyze document image complexity based on segmentation results
Language: Python - Size: 2.93 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

chulwoopack/Mask_RCNN_SegDog
Language: Jupyter Notebook - Size: 609 MB - Last synced at: almost 2 years ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 0

chulwoopack/gravity-map
Visual Domain Knowledge-based Multimodal Zoning for Textual Region Localization in Noisy Historical Document Images
Language: C++ - Size: 1.2 GB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 0

huyhoang17/kuzushiji_recognition
[Late Submission] Solution for Kuzushiji recognition (Kaggle competition)
Language: Python - Size: 90 MB - Last synced at: over 2 years ago - Pushed at: about 4 years ago - Stars: 17 - Forks: 2
