An open API service providing repository metadata for many open source software ecosystems.

GitHub / MBAigner / PDFSegmenter

This library builds a graph representation of the content of PDFs. The graph is then clustered, and the resulting page segments are classified and returned. Tables are returned formatted as CSV.
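
The pipeline described above (build a graph, cluster it, classify the segments, export tables as CSV) can be illustrated with a small self-contained sketch. This is not PDFSegmenter's API or its actual algorithm: the `Box` type, the distance thresholds, the use of connected components as the clustering step, and the row-count rule for spotting a table are all assumptions chosen to keep the example short.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical text element with its top-left position on the page."""
    text: str
    x: float
    y: float


def build_graph(boxes, max_dx=120.0, max_dy=20.0):
    """Connect boxes that lie close together (thresholds are assumptions)."""
    edges = {i: set() for i in range(len(boxes))}
    for i, a in enumerate(boxes):
        for j in range(i + 1, len(boxes)):
            b = boxes[j]
            if abs(a.x - b.x) <= max_dx and abs(a.y - b.y) <= max_dy:
                edges[i].add(j)
                edges[j].add(i)
    return edges


def cluster(edges):
    """Group nodes into connected components, a simple stand-in for clustering."""
    seen, segments = set(), []
    for start in edges:
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for nxt in edges[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        segments.append(sorted(component))
    return segments


def rows_of(segment, boxes):
    """Group a segment's boxes into rows by y, each row sorted left to right."""
    rows = {}
    for i in segment:
        rows.setdefault(boxes[i].y, []).append(boxes[i])
    return [sorted(rows[y], key=lambda b: b.x) for y in sorted(rows)]


def classify(segment, boxes):
    """Label a segment 'table' if it has at least two rows of equal width > 1."""
    rows = rows_of(segment, boxes)
    widths = {len(row) for row in rows}
    if len(rows) >= 2 and len(widths) == 1 and widths.pop() > 1:
        return "table"
    return "text"


def to_csv(segment, boxes):
    """Serialize a table segment row by row as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for row in rows_of(segment, boxes):
        writer.writerow(b.text for b in row)
    return buf.getvalue()


# Made-up page content: one heading and a 2x2 table further down the page.
boxes = [
    Box("Quarterly report", 50, 40),
    Box("Item", 50, 200), Box("Price", 150, 200),
    Box("Pen", 50, 215), Box("1.20", 150, 215),
]
segments = cluster(build_graph(boxes))
labels = [classify(s, boxes) for s in segments]
tables = [to_csv(s, boxes) for s, label in zip(segments, labels) if label == "table"]
```

With the made-up input, the heading ends up in its own segment and the four table cells form a second segment that is exported as two CSV rows. A real implementation would work on elements extracted from a PDF and would likely use richer features than position alone.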

JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MBAigner%2FPDFSegmenter
PURL: pkg:github/MBAigner/PDFSegmenter
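
The two identifiers above follow a visible pattern: the API path percent-encodes the slash in the repository's full name, and the PURL prefixes the full name with `pkg:github/`. A minimal sketch of building both strings follows. The helper names `repository_api_url` and `github_purl` are hypothetical, and the URL layout is inferred only from the single example on this page.

```python
from urllib.parse import quote

# Root taken from the JSON API link on this page.
API_ROOT = "http://repos.ecosyste.ms/api/v1"


def repository_api_url(host, full_name):
    """Build the JSON API URL; safe='' makes quote() encode '/' as %2F."""
    return (
        f"{API_ROOT}/hosts/{quote(host, safe='')}"
        f"/repositories/{quote(full_name, safe='')}"
    )


def github_purl(full_name):
    """Mirror the PURL string shown on this page for a GitHub repository."""
    return f"pkg:github/{full_name}"


api_url = repository_api_url("GitHub", "MBAigner/PDFSegmenter")
purl = github_purl("MBAigner/PDFSegmenter")
```

The resulting `api_url` can be passed to any HTTP client to retrieve the metadata as JSON.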

Stars: 23
Forks: 3
Open issues: 0

License: MIT
Language: Python
Size: 399 KB
Dependencies parsed at: Pending

Created at: about 5 years ago
Updated at: 3 months ago
Pushed at: almost 5 years ago
Last synced at: 4 days ago

Topics: annotations, cluster-analysis, csv, detection-model, document-processing, layout-analysis, page-segmentation, pdf, python, table
