An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: document-understanding

infiniflow/ragflow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

Language: Python - Size: 76.6 MB - Last synced at: about 3 hours ago - Pushed at: about 5 hours ago - Stars: 61,057 - Forks: 6,160

deepdoctection/deepdoctection

A Repo For Document AI

Language: Python - Size: 29.1 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 2,891 - Forks: 164

tstanislawek/awesome-document-understanding

A curated list of resources on the topic of Document Understanding (DU)

Size: 5.56 MB - Last synced at: 5 days ago - Pushed at: about 2 years ago - Stars: 1,440 - Forks: 161

microsoft/CompHRDoc

Datasets and Evaluation Scripts for CompHRDoc

Language: Python - Size: 1.39 MB - Last synced at: 1 day ago - Pushed at: 5 months ago - Stars: 46 - Forks: 6

GoogleCloudPlatform/document-ai-samples

Sample applications and demos for Document AI, the end-to-end document processing platform on Google Cloud

Language: Jupyter Notebook - Size: 143 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 282 - Forks: 112

huggingface/chug

Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.

Language: Python - Size: 146 KB - Last synced at: 10 days ago - Pushed at: over 1 year ago - Stars: 159 - Forks: 11

MathamPollard/awesome-table-structure-recognition

A Curated List of Awesome Table Structure Recognition (TSR) Research, including models, papers, datasets, and code. Continuously updating.

Size: 45.9 KB - Last synced at: 14 days ago - Pushed at: 11 months ago - Stars: 197 - Forks: 10

callbacked/smoldocling256M-webgpu

Document Understanding in the Browser!

Language: TypeScript - Size: 68.4 KB - Last synced at: 19 days ago - Pushed at: 20 days ago - Stars: 0 - Forks: 0

SCUT-DLVCLab/Document-AI-Recommendations

Algorithms, papers, datasets, performance comparisons for Document AI. Continuously updating.

Size: 7.24 MB - Last synced at: 24 days ago - Pushed at: 5 months ago - Stars: 194 - Forks: 7

docling-project/docling4j

Docling4j brings the functionalities of Docling in document understanding to Java® projects

Language: Java - Size: 32.2 KB - Last synced at: 28 days ago - Pushed at: 4 months ago - Stars: 12 - Forks: 0

ExtrieveTechnologies/QuickCapture_Android

QuickCapture Mobile Scanning SDK from Extrieve, specially designed for native Android

Language: Java - Size: 395 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

Haruhiyuki/yuque-rag

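Connects a Yuque knowledge base to large language models to build an intelligent question-answering system based on RAG (retrieval-augmented generation); supports FastAPI and is compatible with the OpenAI API and local Ollama models.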

Language: Python - Size: 27.3 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 3 - Forks: 0

AlibabaResearch/AdvancedLiterateMachinery

A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.

Language: C++ - Size: 104 MB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 1,727 - Forks: 194

OpenBMB/VisRAG

Parsing-free RAG supported by VLMs

Language: Python - Size: 14.7 MB - Last synced at: about 2 months ago - Pushed at: 5 months ago - Stars: 725 - Forks: 57

bwnyasse/dart-documentai-samples

A hands-on CLI tool sample showcasing the integration of Dart with Google Cloud's DocumentAI.

Language: Dart - Size: 605 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 0

X-PLUG/mPLUG-DocOwl

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

Language: Python - Size: 105 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 2,177 - Forks: 126

andreagemelli/doc2graph

Doc2Graph transforms documents into graphs and exploits a GNN to solve several tasks.

Language: Jupyter Notebook - Size: 466 MB - Last synced at: 2 months ago - Pushed at: about 2 years ago - Stars: 121 - Forks: 20

Alpha-Innovator/DocGenome

DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Models

Language: Jupyter Notebook - Size: 15.2 MB - Last synced at: 2 months ago - Pushed at: 7 months ago - Stars: 132 - Forks: 6

wenwenyu/PICK-pytorch

Code for the paper "PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks" (ICPR 2020)

Language: Python - Size: 9.72 MB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 563 - Forks: 192

PAIR-Systems-Inc/little-dorrit-editor

Multimodal benchmark for evaluating handwritten editorial correction in printed text.

Language: Python - Size: 13.9 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

ZeningLin/PEneo

[MM'2024] PEneo, an effective algorithm for key-value pair extraction from form-like documents, designed for real-world applications.

Language: Python - Size: 10.1 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 29 - Forks: 7

phong-lt/LiGT_VQA

This repository includes the ReceiptVQA dataset and the PyTorch implementation of the LiGT method and other evaluated baselines.

Language: Python - Size: 45.9 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

LynnHaDo/Checkbox-Detection

Checkbox Detection Model for Scanned Documents

Language: Jupyter Notebook - Size: 3.09 MB - Last synced at: 4 months ago - Pushed at: 5 months ago - Stars: 65 - Forks: 3

LynnHaDo/Document-Layout-Analysis

Object Detection Model for Scanned Documents

Language: Jupyter Notebook - Size: 2.86 MB - Last synced at: 4 months ago - Pushed at: 5 months ago - Stars: 90 - Forks: 14

doc-analysis/ReadingBank

ReadingBank: A Benchmark Dataset for Reading Order Detection

Size: 1.21 MB - Last synced at: 4 months ago - Pushed at: 11 months ago - Stars: 104 - Forks: 3

uakarsh/TiLT-Implementation

Implementation of the paper: Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer.

Language: Jupyter Notebook - Size: 396 KB - Last synced at: 4 months ago - Pushed at: over 2 years ago - Stars: 17 - Forks: 0

SCUT-DLVCLab/RFUND

[MM'2024] Official release of RFUND introduced in the MM'2024 paper "PEneo: Unifying Line Extraction, Line Grouping, and Entity Linking for End-to-end Document Pair Extraction"

Size: 723 KB - Last synced at: 5 months ago - Pushed at: 8 months ago - Stars: 19 - Forks: 0

NExTplusplus/TAT-DQA

TAT-DQA: Towards Complex Document Understanding By Discrete Reasoning

Size: 1.01 MB - Last synced at: 6 months ago - Pushed at: 10 months ago - Stars: 22 - Forks: 1

TomQuez/LLM_document_understanding

Language: HTML - Size: 2.39 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 1 - Forks: 0

ExtrieveTechnologies/QuickCapture_IOS

QuickCapture Mobile Scanning SDK, specially designed for native iOS

Language: Objective-C - Size: 80.1 KB - Last synced at: 4 months ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

Lucas-Jeanniot/LAISA

LAISA (Local AI Search Application) is a desktop app which allows you to run completely local, private, and free LLM inference. LAISA supports basic RAG with pre-configured OpenSearch Databases, and local document parsing with PDFs.

Language: HTML - Size: 4.8 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

mycielski/textract_study

Analysing expense reports/invoices with AWS Textract and boto3.

Language: Python - Size: 25.4 KB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

jacobmarks/pytesseract-ocr-plugin

Run optical character recognition with PyTesseract from the FiftyOne App!

Language: Python - Size: 23.4 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 8 - Forks: 0

jpWang/LiLT

Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding (ACL 2022)

Language: Python - Size: 1.36 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 282 - Forks: 34

javier-marti-isasi/OCR-free-Document-Understanding-with-Donut-Transformer

This project tackles a real-world challenge of automating client document processing, with a focus on enhancing document classification, error detection, data extraction, and validation.

Language: Jupyter Notebook - Size: 9.48 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

dhorvay/document-understanding-ebook

(WIP) ✨ A comprehensive resource for understanding the world of software used in the Document Understanding field. 🧙✨

Language: Markdown - Size: 8.88 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 2 - Forks: 0

irgroup/labelstudio-to-fonduer

This small module connects Label Studio with Fonduer by creating a Fonduer labeling function for gold labels from a Label Studio export. Documentation: https://irgroup.github.io/labelstudio-to-fonduer/

Language: Python - Size: 1.73 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 0

Related Keywords
document-understanding (37), ocr (11), document-ai (9), key-information-extraction (6), nlp (6), deep-learning (5), python (5), ai (4), computer-vision (4), document-analysis (4), table-structure-recognition (3), table-detection (3), rag (3), document-parser (3), visual-information-extraction (3), document-intelligence (3), machine-learning (3), pdf (3), pytorch (2), java (2), document-classification (2), pdf-converter (2), document-scanning-sdk (2), document-scanner-app (2), information-extraction (2), natural-language-processing (2), question-answering (2), samples (2), document-layout-analysis (2), retrieval-augmented-generation (2), multimodal (2), multimodal-deep-learning (2), llm (2), vision-language-model (2), ai-search (2), yolov8 (2), object-detection (2), label-studio (1), vietnamese-language (1), graph-convolutional-network (1), llm-evaluation (1), benchmark (1), graph-neural-networks (1), graph-learning (1), paper-annotation (1), layout-analysis (1), gnn (1), geometric-deep-learning (1), table-understanding (1), multimodal-large-language-models (1), mllm (1), chart-understanding (1), google-cloud (1), dartlang (1), dart (1), retrieval (1), multi-modality (1), multi-modal (1), datasets (1), knowledge-base-construction (1), fonduer (1), data-annotation (1), ebook (1), awesome-document-understanding (1), multimodal-pre-trained-model (1), multilingual-models (1), tesseract-ocr (1), tesseract (1), plugin (1), fiftyone (1), textract (1), shell (1), script (1), invoices (1), expenses (1), boto3 (1), aws-cli (1), aws (1), swift (1), objective-c (1), ios (1), html (1), benchmarking (1), vqa (1), transformers (1), pytorch-lightning (1), pytorch-implementation (1), copy-paste (1), visual-question-answering (1), dataloading (1), rag-related (1), document-structure-analysis (1), unstructured-data (1), rpa (1), robotic-process-automation (1), pdf-documents (1), intelligent-processing (1), awesome-list (1), awesome (1), tensorflow (1)