GitHub / BobLd / PdfPigMLNetBlockClassifier
Proof of concept for training a simple region classifier using PdfPig and ML.NET (LightGBM). The objective is to classify each text block on a PDF document page as one of: title, text, list, table, or image.
JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BobLd%2FPdfPigMLNetBlockClassifier
Stars: 28
Forks: 6
Open issues: 0
License: None
Language: C#
Size: 1.1 MB
Dependencies parsed at: Pending
Created at: over 5 years ago
Updated at: 9 days ago
Pushed at: about 5 years ago
Last synced at: 6 days ago
Topics: classifier, csharp, document-layout, document-layout-analysis, layout-analysis, lightgbm, machine-learning, ml-net, pdf, pdf-document, pdf-document-processor, pdfpig, publaynet
Funding Links: https://github.com/sponsors/BobLd