GitHub topics: pdf-scraping

Repositories

NotAMadTheorist/GC-MS-of-Ginger-Oil-via-PDF-Scraping

This repository contains data files and Python 3.13 programs that extract relevant GC-MS data from the text of an instrument-output PDF file. It was used for an experiment in CHEM 133.02 LAB.

Language: Python - Size: 7.86 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

scottgriv/python-pdf_web_scraper

Scrape a web page for PDF files and download them all locally.

Language: Python - Size: 375 KB - Last synced at: 7 days ago - Pushed at: 3 months ago - Stars: 11 - Forks: 2
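
As a rough illustration of the first half of the task described for scottgriv/python-pdf_web_scraper (finding the PDF links on a page), here is a minimal sketch using only the Python standard library. It is not the repository's code: the names `PdfLinkParser` and `find_pdf_links` are hypothetical, and it assumes the page's HTML has already been fetched and that PDF links are plain `<a href>` anchors whose path ends in `.pdf`.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PdfLinkParser(HTMLParser):
    """Hypothetical helper: collect absolute URLs of <a href> links ending in .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.pdf_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page URL.
        absolute = urljoin(self.base_url, href)
        # Check only the path, so query strings such as ?dl=1 are ignored.
        if urlparse(absolute).path.lower().endswith(".pdf"):
            self.pdf_links.append(absolute)


def find_pdf_links(html, base_url):
    """Return the PDF links found in an HTML string, in document order."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.pdf_links
```

Each returned URL could then be saved locally, for example with `urllib.request.urlretrieve`. Links that serve PDFs without a `.pdf` suffix would be missed by this sketch.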

edoardottt/multi-pdf-finder

Are you looking for a word in many PDF files? Search them all in one go. ⚡

Language: Shell - Size: 26.4 KB - Last synced at: 2 months ago - Pushed at: 7 months ago - Stars: 16 - Forks: 3

casychow/pdf_scraper_extract_largest_num

Python module to scrape information from a PDF file containing different data types (e.g., tables, graphs) and extract the largest number it can find.

Language: Jupyter Notebook - Size: 11.3 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0
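
The "largest number" step described for casychow/pdf_scraper_extract_largest_num can be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's implementation: it assumes the PDF text has already been extracted to a string by some PDF library, that numbers use `,` as a thousands separator and `.` as the decimal point, and the names `NUMBER_RE` and `largest_number` are hypothetical.

```python
import re

# Matches comma-grouped numbers (1,234.5) first, then plain integers or decimals.
NUMBER_RE = re.compile(r"-?\d{1,3}(?:,\d{3})+(?:\.\d+)?|-?\d+(?:\.\d+)?")


def largest_number(text):
    """Return the largest number found in text as a float, or None if there is none."""
    values = [float(match.replace(",", "")) for match in NUMBER_RE.findall(text)]
    return max(values) if values else None
```

Numbers inside graphs or scanned images would not appear in extracted text at all, so a real tool would need a separate step (such as OCR) for those.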

gwu-libraries/uriscrape

Scrape URIs from Telegram channel transcripts in PDF files

Language: Python - Size: 69.3 KB - Last synced at: 3 months ago - Pushed at: over 6 years ago - Stars: 3 - Forks: 0

Spyrosigma/ResuMeme

Upload your Resume and see yourself getting roasted.

Language: Python - Size: 57.6 KB - Last synced at: 29 days ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

tam0w/poverty_data

Attempting to analyse and estimate poverty indicators at the Indian district level. First-ever district-level dataset with a poverty indicator.

Language: Jupyter Notebook - Size: 184 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 1 - Forks: 0

prak112/esg-profile

Assessing stock-price fluctuations of companies based on their ESG-profiles

Language: Jupyter Notebook - Size: 2.11 MB - Last synced at: 15 days ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 1

ibotsuft/scripts

Scripts written by iBots team.

Language: Python - Size: 13.7 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

mattkerlogue/google-covid-mobility-scrape 📦

Script for scraping Google's COVID-19 Community Mobility Reports [ARCHIVED]

Language: R - Size: 17.5 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 33 - Forks: 14

kaigg96/Driving-Towards-Efficiency

Using Python and the Natural Resources Canada Fuel Consumption Ratings to view and predict vehicle efficiency.

Language: Jupyter Notebook - Size: 24.1 MB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 1

ethanpbrooks/Schwab-PDF-Scraper

PDF Statement Data Extractor and Analyzer. A Python script for extracting and analyzing financial data from PDF statements, with a focus on Schwab statements.

Language: Python - Size: 452 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 1

SteadyGiant/scrape-naic 📦

Scraping tables from the PDFs of NAIC Model Laws, Regulations, and Guidelines.

Language: R - Size: 1.68 MB - Last synced at: almost 2 years ago - Pushed at: almost 7 years ago - Stars: 1 - Forks: 0

hellpanderrr/pypdfscraper

Lightweight PDF scraper

Language: Python - Size: 782 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

iamcjt922/Funding-Analysis

A custom-built GUI application that uses Python and the PyPDF2 library to scrape, scan, and evaluate a person's funding capacity based on their PDF credit report.

Language: Python - Size: 160 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

TomasHubelbauer/pdf-scrape

Demonstrating PDF text and image extraction with correct bounds

Language: JavaScript - Size: 1.54 MB - Last synced at: 16 days ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

fayrose/MiddleEgyptianDataset

Parses 3 dictionaries from PDFs, reconstructs lost formatting using N-gram and visual computing methods, and serializes to a database for web display.

Language: C# - Size: 70.9 MB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 5 - Forks: 1