An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: document-image-analysis

ERIK2012MIAO/chunk-data

📦 Split buffers and streams into smaller chunks for smooth HTTP uploads and accurate progress tracking.

Language: JavaScript - Size: 1.3 MB - Last synced at: about 21 hours ago - Pushed at: about 22 hours ago - Stars: 0 - Forks: 0

deepdoctection/deepdoctection

A Repo For Document AI

Language: Python - Size: 30.2 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 3,068 - Forks: 182

Unstructured-IO/unstructured

Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise-grade Platform product for production-grade workflows, partitioning, enrichments, chunking, and embedding.

Language: HTML - Size: 194 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 13,204 - Forks: 1,082

athallahaiqal/document-ai

A simple FastAPI application that lets users upload PDF or DOCX documents to a database, get a summary generated by a local LLM via Ollama, and ask natural-language questions about their content.

Language: Python - Size: 63.5 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 1 - Forks: 1

iheb-brini/SegClarity

SegClarity: An attribution-based XAI workflow for layer-wise interpretability in semantic segmentation

Language: Jupyter Notebook - Size: 17.3 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 4 - Forks: 0

enoch3712/ExtractThinker

ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows.

Language: Python - Size: 20.5 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1,378 - Forks: 134

chulwoopack/docstrum

Language: Jupyter Notebook - Size: 97.1 MB - Last synced at: 8 months ago - Pushed at: over 7 years ago - Stars: 69 - Forks: 21

hpanwar08/detectron2 (fork of facebookresearch/detectron2)

Detectron2 for Document Layout Analysis

Language: Python - Size: 4.53 MB - Last synced at: almost 2 years ago - Pushed at: about 5 years ago - Stars: 178 - Forks: 62

chulwoopack/voronoi_based_docu_complexity_analysis

Language: Jupyter Notebook - Size: 267 KB - Last synced at: over 2 years ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

chulwoopack/document_complexity

Analyze document image complexity based on segmentation results

Language: Python - Size: 2.93 KB - Last synced at: over 2 years ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

chulwoopack/Mask_RCNN_SegDog

Language: Jupyter Notebook - Size: 609 MB - Last synced at: over 2 years ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

chulwoopack/gravity-map

Visual Domain Knowledge-based Multimodal Zoning for Textual Region Localization in Noisy Historical Document Images

Language: C++ - Size: 1.2 GB - Last synced at: over 2 years ago - Pushed at: almost 4 years ago - Stars: 3 - Forks: 0

huyhoang17/kuzushiji_recognition

[Late Submission] Solution for Kuzushiji recognition (Kaggle competition)

Language: Python - Size: 90 MB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 17 - Forks: 2