An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: document-image-analysis

Unstructured-IO/unstructured

Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise-grade Platform product for production-grade workflows, partitioning, enrichments, chunking, and embedding.

Language: HTML - Size: 192 MB - Last synced at: about 10 hours ago - Pushed at: about 10 hours ago - Stars: 11,768 - Forks: 972

deepdoctection/deepdoctection

A Repo For Document AI

Language: Python - Size: 29 MB - Last synced at: about 10 hours ago - Pushed at: about 11 hours ago - Stars: 2,868 - Forks: 161

athallahaiqal/document-ai

A simple FastAPI application that allows users to upload PDF or DOCX documents to a database, get a summary generated by a local LLM via Ollama, and ask natural language questions about their content.

Language: Python - Size: 63.5 KB - Last synced at: 1 day ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

enoch3712/ExtractThinker

ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows.

Language: Python - Size: 20.4 MB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 1,274 - Forks: 129

chulwoopack/docstrum

Language: Jupyter Notebook - Size: 97.1 MB - Last synced at: 3 months ago - Pushed at: about 7 years ago - Stars: 69 - Forks: 21

hpanwar08/detectron2 (fork of facebookresearch/detectron2)

Detectron2 for Document Layout Analysis

Language: Python - Size: 4.53 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 178 - Forks: 62

chulwoopack/voronoi_based_docu_complexity_analysis

Language: Jupyter Notebook - Size: 267 KB - Last synced at: almost 2 years ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 0

chulwoopack/document_complexity

Analyze document image complexity based on segmentation results

Language: Python - Size: 2.93 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

chulwoopack/Mask_RCNN_SegDog

Language: Jupyter Notebook - Size: 609 MB - Last synced at: almost 2 years ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 0

chulwoopack/gravity-map

Visual Domain Knowledge-based Multimodal Zoning Textual Region Localization in Noisy Historical Document Images

Language: C++ - Size: 1.2 GB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 0

huyhoang17/kuzushiji_recognition

[Late Submission] Solution for Kuzushiji recognition (Kaggle competition)

Language: Python - Size: 90 MB - Last synced at: over 2 years ago - Pushed at: about 4 years ago - Stars: 17 - Forks: 2