An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: pdf-txt

davaryancha/pdf-splitter

PDF Splitter is a Python tool that takes a multi-page PDF file and splits it into individual PDF files, one for each page of the original document.

Language: Python - Size: 4.88 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

jmxt3/pdf_to_txt_converter

A Python script that converts PDF files to text using the docling library. This tool is designed to batch process PDF files, making it easy to extract text content from multiple documents at once.

Language: Python - Size: 1.77 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

Cheereus/PdfSplitter

Converts PDF to TXT, then performs word segmentation and word-frequency counting.

Language: Python - Size: 570 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 31 - Forks: 3

ErykDarnowski/ts-test-extractor

A simple script that extracts questions, answers, and related content from test PDFs (for a university subject called TS) into a more usable format.

Language: Python - Size: 44.9 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0