GitHub / opendatalab / MinerU
A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON.
JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/opendatalab%2FMinerU
Stars: 31,097
Forks: 2,485
Open issues: 149
License: agpl-3.0
Language: Python
Size: 124 MB
Dependencies parsed at: Pending
Created at: about 1 year ago
Updated at: 3 days ago
Pushed at: 3 days ago
Last synced at: 3 days ago
Commit Stats
Commits: 1390
Authors: 37
Mean commits per author: 37.57
Development Distribution Score: 0.404
More commit stats: https://commits.ecosyste.ms/hosts/GitHub/repositories/opendatalab/MinerU
Topics: ai4science, document-analysis, extract-data, layout-analysis, ocr, parser, pdf, pdf-converter, pdf-extractor-llm, pdf-extractor-pretrain, pdf-extractor-rag, pdf-parser, python