GitHub topics: pdf-scraping
scottgriv/python-pdf_web_scraper
Scrape a web page for PDF files and download them all locally.
Language: Python - Size: 375 KB - Last synced at: 6 days ago - Pushed at: about 2 months ago - Stars: 11 - Forks: 2

edoardottt/multi-pdf-finder
Looking for a word in many PDF files? Search them all at once. ⚡
Language: Shell - Size: 26.4 KB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 16 - Forks: 3

casychow/pdf_scraper_extract_largest_num
Python module to scrape information from a PDF file containing different data types (e.g., tables, graphs) and extract the largest number it can find.
Language: Jupyter Notebook - Size: 11.3 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

gwu-libraries/uriscrape
Scrape URIs from Telegram channel transcripts in PDF files
Language: Python - Size: 69.3 KB - Last synced at: 2 months ago - Pushed at: about 6 years ago - Stars: 3 - Forks: 0

Spyrosigma/ResuMeme
Upload your Resume and see yourself getting roasted.
Language: Python - Size: 57.6 KB - Last synced at: 3 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

tam0w/poverty_data
Attempting to analyse and estimate poverty indicators at the Indian district level. First-ever district-level dataset with a poverty indicator.
Language: Jupyter Notebook - Size: 184 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

prak112/esg-profile
Assessing stock-price fluctuations of companies based on their ESG-profiles
Language: Jupyter Notebook - Size: 2.11 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 1

ibotsuft/scripts
Scripts written by iBots team.
Language: Python - Size: 13.7 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

mattkerlogue/google-covid-mobility-scrape 📦
Script for scraping Google's COVID-19 Community Mobility Reports [ARCHIVED]
Language: R - Size: 17.5 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 33 - Forks: 14

kaigg96/Driving-Towards-Efficiency
Using Python and the Natural Resources Canada Fuel Consumption Ratings to view and predict vehicle efficiency.
Language: Jupyter Notebook - Size: 24.1 MB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 1

ethanpbrooks/Schwab-PDF-Scraper
PDF Statement Data Extractor and Analyzer. A Python script for extracting and analyzing financial data from PDF statements, with a focus on Schwab statements.
Language: Python - Size: 452 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 1

SteadyGiant/scrape-naic 📦
Scraping tables from the PDFs of NAIC Model Laws, Regulations, and Guidelines.
Language: R - Size: 1.68 MB - Last synced at: over 1 year ago - Pushed at: almost 7 years ago - Stars: 1 - Forks: 0

hellpanderrr/pypdfscraper
Lightweight PDF scraper
Language: Python - Size: 782 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

iamcjt922/Funding-Analysis
A custom GUI application that uses Python and the PyPDF2 library to scrape, scan, and evaluate a person's funding capacity based on their PDF credit report.
Language: Python - Size: 160 KB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

TomasHubelbauer/pdf-scrape
Demonstrating PDF text and image extraction with correct bounds
Language: JavaScript - Size: 1.54 MB - Last synced at: 3 months ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

fayrose/MiddleEgyptianDataset
Parses 3 dictionaries from PDFs, reconstructs lost formatting using N-gram and visual computing methods, and serializes to a database for web display.
Language: C# - Size: 70.9 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 5 - Forks: 1

TomasHubelbauer/globus
Scrapes the Globus PDF catalogue using Puppeteer
Language: JavaScript - Size: 25.3 MB - Last synced at: 3 months ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

gra-vel/covid-pichincha
Visualization of reported cases of COVID-19 in Pichincha, Ecuador
Language: Python - Size: 13.8 MB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 1

chris-bbrs/pdf-merging-and-scraping
PDF merging and scraping for NLP use
Language: Jupyter Notebook - Size: 8.79 KB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

coelicidium/marpl-project
A free (as in freedom), modular, flexible, customizable all-in-one suite for all your open science needs.
Size: 15.6 KB - Last synced at: 10 months ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

zach-hunt/PDFParsing
Data extraction from PDF tables
Language: Python - Size: 1.95 KB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0

GGSIPUResultTracker/ggsipu_results_extractor
Python module to extract and dump results data from GGSIPU results PDFs
Language: Python - Size: 5.7 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 1
