An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: document-understanding

ExtrieveTechnologies/QuickCapture_Android

QuickCapture mobile scanning SDK from Extrieve, designed specifically for native Android

Language: Java - Size: 395 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1 - Forks: 0

infiniflow/ragflow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

Language: Python - Size: 75.2 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 57,975 - Forks: 5,713

deepdoctection/deepdoctection

A Repo For Document AI

Language: Python - Size: 28.9 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 2,859 - Forks: 161

huggingface/chug

Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.

Language: Python - Size: 146 KB - Last synced at: 8 days ago - Pushed at: about 1 year ago - Stars: 158 - Forks: 11

MathamPollard/awesome-table-structure-recognition

A curated list of awesome Table Structure Recognition (TSR) research, including models, papers, datasets, and code. Continuously updated.

Size: 45.9 KB - Last synced at: 12 days ago - Pushed at: 10 months ago - Stars: 190 - Forks: 9

Haruhiyuki/yuque-rag

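Connects a Yuque knowledge base to large language models to build an intelligent question-answering system based on RAG (retrieval-augmented generation). Supports FastAPI and is compatible with the OpenAI API and local Ollama models.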

Language: Python - Size: 27.3 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 3 - Forks: 0

tstanislawek/awesome-document-understanding

A curated list of resources on the topic of Document Understanding (DU)

Size: 5.56 MB - Last synced at: 16 days ago - Pushed at: about 2 years ago - Stars: 1,416 - Forks: 160

AlibabaResearch/AdvancedLiterateMachinery

A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.

Language: C++ - Size: 104 MB - Last synced at: 21 days ago - Pushed at: 3 months ago - Stars: 1,727 - Forks: 194

OpenBMB/VisRAG

Parsing-free RAG supported by VLMs

Language: Python - Size: 14.7 MB - Last synced at: 21 days ago - Pushed at: 4 months ago - Stars: 725 - Forks: 57

microsoft/CompHRDoc

Datasets and Evaluation Scripts for CompHRDoc

Language: Python - Size: 1.39 MB - Last synced at: 3 days ago - Pushed at: 4 months ago - Stars: 44 - Forks: 7

bwnyasse/dart-documentai-samples

A hands-on CLI tool sample showcasing the integration of Dart with Google Cloud's DocumentAI.

Language: Dart - Size: 605 KB - Last synced at: 9 days ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 0

docling-project/docling4j

Docling4j brings the functionalities of Docling in document understanding to Java® projects

Language: Java - Size: 32.2 KB - Last synced at: 15 days ago - Pushed at: 3 months ago - Stars: 10 - Forks: 0

X-PLUG/mPLUG-DocOwl

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

Language: Python - Size: 105 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 2,177 - Forks: 126

GoogleCloudPlatform/document-ai-samples

Sample applications and demos for Document AI, the end-to-end document processing platform on Google Cloud

Language: Jupyter Notebook - Size: 142 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 277 - Forks: 111

andreagemelli/doc2graph

Doc2Graph transforms documents into graphs and exploits a GNN to solve several tasks.

Language: Jupyter Notebook - Size: 466 MB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 121 - Forks: 20

Alpha-Innovator/DocGenome

DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Models

Language: Jupyter Notebook - Size: 15.2 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 132 - Forks: 6

wenwenyu/PICK-pytorch

Code for the paper "PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks" (ICPR 2020)

Language: Python - Size: 9.72 MB - Last synced at: 2 months ago - Pushed at: 11 months ago - Stars: 563 - Forks: 192

PAIR-Systems-Inc/little-dorrit-editor

Multimodal benchmark for evaluating handwritten editorial correction in printed text.

Language: Python - Size: 13.9 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

ZeningLin/PEneo

[MM'2024] PEneo, an effective algorithm for key-value pair extraction from form-like documents, designed for real-world applications.

Language: Python - Size: 10.1 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 29 - Forks: 7

phong-lt/LiGT_VQA

This repository includes the ReceiptVQA dataset and the PyTorch implementation of the LiGT method and other evaluated baselines.

Language: Python - Size: 45.9 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

LynnHaDo/Checkbox-Detection

Checkbox Detection Model for Scanned Documents

Language: Jupyter Notebook - Size: 3.09 MB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 65 - Forks: 3

LynnHaDo/Document-Layout-Analysis

Object Detection Model for Scanned Documents

Language: Jupyter Notebook - Size: 2.86 MB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 90 - Forks: 14

doc-analysis/ReadingBank

ReadingBank: A Benchmark Dataset for Reading Order Detection

Size: 1.21 MB - Last synced at: 3 months ago - Pushed at: 10 months ago - Stars: 104 - Forks: 3

SCUT-DLVCLab/Document-AI-Recommendations

Algorithms, papers, datasets, performance comparisons for Document AI. Continuously updating.

Size: 7.24 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 181 - Forks: 7

uakarsh/TiLT-Implementation

Implementation of the paper: Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer.

Language: Jupyter Notebook - Size: 396 KB - Last synced at: 3 months ago - Pushed at: about 2 years ago - Stars: 17 - Forks: 0

SCUT-DLVCLab/RFUND

[MM'2024] Official release of RFUND introduced in the MM'2024 paper "PEneo: Unifying Line Extraction, Line Grouping, and Entity Linking for End-to-end Document Pair Extraction"

Size: 723 KB - Last synced at: 4 months ago - Pushed at: 7 months ago - Stars: 19 - Forks: 0

NExTplusplus/TAT-DQA

TAT-DQA: Towards Complex Document Understanding By Discrete Reasoning

Size: 1.01 MB - Last synced at: 5 months ago - Pushed at: 10 months ago - Stars: 22 - Forks: 1

TomQuez/LLM_document_understanding

Language: HTML - Size: 2.39 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

ExtrieveTechnologies/QuickCapture_IOS

QuickCapture mobile scanning SDK, designed specifically for native iOS

Language: Objective-C - Size: 80.1 KB - Last synced at: 3 months ago - Pushed at: 12 months ago - Stars: 2 - Forks: 0

Lucas-Jeanniot/LAISA

LAISA (Local AI Search Application) is a desktop app which allows you to run completely local, private, and free LLM inference. LAISA supports basic RAG with pre-configured OpenSearch Databases, and local document parsing with PDFs.

Language: HTML - Size: 4.8 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

mycielski/textract_study

Analysing expense reports/invoices with AWS Textract and boto3.

Language: Python - Size: 25.4 KB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

jacobmarks/pytesseract-ocr-plugin

Run optical character recognition with PyTesseract from the FiftyOne App!

Language: Python - Size: 23.4 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 8 - Forks: 0

jpWang/LiLT

Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding (ACL 2022)

Language: Python - Size: 1.36 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 282 - Forks: 34

javier-marti-isasi/OCR-free-Document-Understanding-with-Donut-Transformer

This project tackles a real-world challenge of automating client document processing, with a focus on enhancing document classification, error detection, data extraction, and validation.

Language: Jupyter Notebook - Size: 9.48 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

dhorvay/document-understanding-ebook

(WIP) ✨ A comprehensive resource for understanding the world of software used in the Document Understanding field. 🧙✨

Language: Markdown - Size: 8.88 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 2 - Forks: 0

irgroup/labelstudio-to-fonduer

This small module connects Label Studio with Fonduer by creating a Fonduer labeling function for gold labels from a Label Studio export. Documentation: https://irgroup.github.io/labelstudio-to-fonduer/

Language: Python - Size: 1.73 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 0

Related Keywords
document-understanding (36), ocr (11), document-ai (9), nlp (7), key-information-extraction (6), python (5), deep-learning (5), document-analysis (4), computer-vision (4), visual-information-extraction (3), ai (3), rag (3), pdf (3), table-detection (3), machine-learning (3), table-structure-recognition (3), document-parser (3), document-intelligence (3), document-classification (2), pdf-converter (2), pytorch (2), samples (2), vision-language-model (2), information-extraction (2), question-answering (2), natural-language-processing (2), yolov8 (2), object-detection (2), multimodal (2), multimodal-deep-learning (2), document-scanning-sdk (2), ai-search (2), document-layout-analysis (2), document-scanner-app (2), llm (2), retrieval-augmented-generation (2), java (2), paper-annotation (1), graph-convolutional-network (1), graph-learning (1), fonduer (1), graph-neural-networks (1), benchmark (1), llm-evaluation (1), data-annotation (1), vietnamese-language (1), visual-question-answering (1), layout-analysis (1), gnn (1), geometric-deep-learning (1), table-understanding (1), multimodal-large-language-models (1), mllm (1), chart-understanding (1), pdf-to-json (1), knowledge-base-construction (1), documents (1), document-parsing (1), docling (1), label-studio (1), script (1), invoices (1), expenses (1), shell (1), textract (1), boto3 (1), aws-cli (1), aws (1), swift (1), objective-c (1), ios (1), fiftyone (1), html (1), benchmarking (1), vqa (1), transformers (1), pytorch-lightning (1), plugin (1), tesseract (1), pytorch-implementation (1), tesseract-ocr (1), multilingual-models (1), multimodal-pre-trained-model (1), awesome-document-understanding (1), ebook (1), copy-paste (1), google-cloud (1), table-functional-analysis (1), table-extraction (1), webdataset (1), pdf-document (1), multi-modal-learning (1), distributed-training (1), datasets (1), dataloading (1), tensorflow (1), table-recognition (1), pubtabnet (1), publaynet (1), layoutlm (1)