Topic: "git-scraping"
jstrieb/github-stats
Better GitHub statistics images for your profile, with stats from private repos too
Language: Python - Size: 2.38 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 3,270 - Forks: 695
alex/nyt-2020-election-scraper
Language: HTML - Size: 681 MB - Last synced at: 6 months ago - Pushed at: over 2 years ago - Stars: 1,762 - Forks: 292
factbook/factbook.json
World Factbook Country Profiles in JSON - Free Open Public Domain Data - No API Key Required ;-)
Size: 61.3 MB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 1,043 - Forks: 128
mackorone/spotify-playlist-archive
Daily snapshots of public Spotify playlists
Size: 92.6 GB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 460 - Forks: 315
femueller/cloud-ip-ranges
An up-to-date export of cloud provider IP address ranges
Size: 26 MB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 340 - Forks: 48
simonw/csv-diff
Python CLI tool and library for diffing CSV and JSON files
Language: Python - Size: 37.1 KB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 321 - Forks: 51
vinayak-mehta/conrad
Track conferences and meetups on your terminal.
Language: Python - Size: 1.32 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 255 - Forks: 58
swyxio/gh-action-data-scraping
Shows how to use GitHub Actions to do periodic data scraping
Language: JavaScript - Size: 20.2 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 235 - Forks: 37
simonw/ca-fires-history
Tracking fire data from www.fire.ca.gov
Size: 5.8 MB - Last synced at: 6 months ago - Pushed at: over 1 year ago - Stars: 210 - Forks: 46
endoflife-date/release-data
Common Release Data for various projects in a consumable format, automatically updated.
Language: Python - Size: 18.7 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 184 - Forks: 56
mary-ext/atproto-scraping
Git scraping of AT Protocol/Bluesky instances
Language: TypeScript - Size: 2.4 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 154 - Forks: 7
simonw/git-scraper-template
Template repository for setting up a new git scraper
Language: Shell - Size: 18.6 KB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 118 - Forks: 9
tobilg/public-cloud-provider-ip-ranges
Unified datasets for public cloud provider IP ranges. Providers include AWS, Azure, CloudFlare, DigitalOcean, Fastly, Google Cloud and Oracle Cloud.
Language: Shell - Size: 498 MB - Last synced at: about 19 hours ago - Pushed at: about 21 hours ago - Stars: 92 - Forks: 13
vitorbaptista/google-covid19-mobility-reports
Data extraction of Google's COVID-19 Mobility Reports
Language: HTML - Size: 125 MB - Last synced at: 8 months ago - Pushed at: over 5 years ago - Stars: 81 - Forks: 11
mary-ext/bluesky-labeler-scraping
Git scraping of Bluesky labelers/label providers
Language: TypeScript - Size: 3.78 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 68 - Forks: 3
simonw/scrape-hacker-news-by-domain
Scrape HN to track links from specific domains
Language: JavaScript - Size: 2.27 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 67 - Forks: 11
tobilg/aws-iam-data
This repository contains the full dataset of AWS IAM data (services, actions, resource types and condition keys). It's updated daily at 4AM UTC.
Language: TypeScript - Size: 425 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 65 - Forks: 0
datadesk/california-coronavirus-scrapers
The open-source web scrapers that feed the Los Angeles Times California coronavirus tracker.
Language: Jupyter Notebook - Size: 9.89 GB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 59 - Forks: 6
simonw/pge-outages-pre-2024
Tracking PG&E outages
Language: Python - Size: 134 MB - Last synced at: about 2 months ago - Pushed at: over 3 years ago - Stars: 56 - Forks: 7
pl4nty/intune-change-tracking
Track changes to Microsoft Intune with git and RSS
Language: Python - Size: 31.6 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 51 - Forks: 6
simonw/scrape-doge-gov
Track changes to doge.gov
Language: Python - Size: 126 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 51 - Forks: 2
simonw/disaster-scrapers
Scrapers for disaster data - writes to https://github.com/simonw/disaster-data
Language: Python - Size: 27.3 KB - Last synced at: 7 months ago - Pushed at: almost 2 years ago - Stars: 49 - Forks: 3
simonw/help-scraper
Record a history of --help for various commands
Language: Python - Size: 204 MB - Last synced at: 6 months ago - Pushed at: 8 months ago - Stars: 47 - Forks: 8
simonw/sf-tree-history
Tracking the history of trees in San Francisco
Size: 460 MB - Last synced at: 29 days ago - Pushed at: about 1 month ago - Stars: 46 - Forks: 6
captn3m0/india-isin-data
International Securities Identification Numbers for various Indian Securities
Language: Shell - Size: 128 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 44 - Forks: 6
pkmn/randbats
Pokémon Showdown's Random Battle sets
Language: JavaScript - Size: 47 MB - Last synced at: about 7 hours ago - Pushed at: about 8 hours ago - Stars: 42 - Forks: 13
captn3m0/mf.captnemo.in
Get information about Indian Mutual Funds from their ISIN numbers.
Language: JavaScript - Size: 69.2 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 39 - Forks: 4
tigger0jk/ark-invest-scraper
Pulling a history of the holdings for ARK Invest funds https://ark-funds.com/
Language: Python - Size: 5.68 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 39 - Forks: 11
pkmn/smogon
Wrapper around Smogon's analyses and usage statistics
Language: TypeScript - Size: 1.11 GB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 37 - Forks: 7
simonw/scrape-open-data
Scrape various open data directories to create an index of what's available out there
Language: Python - Size: 5.9 GB - Last synced at: 6 months ago - Pushed at: 9 months ago - Stars: 37 - Forks: 3
mikepqr/real-estate-scrape-eg
A repository demonstrating the use of real-estate-scrape to store the estimated value of a property on Redfin and Zillow every night using GitHub Actions.
Size: 75.5 MB - Last synced at: about 18 hours ago - Pushed at: about 20 hours ago - Stars: 36 - Forks: 6
simonw/disaster-data
Data scraped by https://github.com/simonw/disaster-scrapers
Size: 191 MB - Last synced at: 7 months ago - Pushed at: almost 3 years ago - Stars: 36 - Forks: 10
captn3m0/historical-mf-data
Historical Mutual Funds data
Language: Python - Size: 7.46 GB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 33 - Forks: 11
ifoukarakis/jobscrapper
An automated job scraper
Language: Python - Size: 708 KB - Last synced at: 8 months ago - Pushed at: almost 3 years ago - Stars: 32 - Forks: 11
fedora-python/portingdb
Database & tools to track Python 2 removal from Fedora
Language: Python - Size: 34.7 MB - Last synced at: 6 months ago - Pushed at: about 1 year ago - Stars: 31 - Forks: 37
AyrtonB/EveryNoise-Watch
Spotify genre attributes from EveryNoise
Language: Jupyter Notebook - Size: 1.82 MB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 31 - Forks: 1
MaxHalford/bike-sharing-history
🚲 Git scraping for bike sharing APIs
Language: Python - Size: 48.9 GB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 30 - Forks: 4
bobek/masscan_as_a_service
masscan as a service
Language: Python - Size: 60.5 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 29 - Forks: 4
simonw/graphql-scraper
Track changes to GraphQL APIs by git scraping their schemas
Size: 1.08 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 28 - Forks: 4
jandinter/gesetze-im-internet
Archive of German legal acts (weekly archive of gesetze-im-internet.de)
Language: Ruby - Size: 414 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 27 - Forks: 1
iandees/usps-collection-boxes
US Postal Service collection box locations.
Language: Python - Size: 694 MB - Last synced at: about 14 hours ago - Pushed at: about 15 hours ago - Stars: 26 - Forks: 2
sarojbelbase/nepstonks
An automated bot that scrapes the latest upcoming issues, news, and investment opportunities announced in Nepal and sends them to a Telegram channel.
Language: Python - Size: 6.24 MB - Last synced at: 8 days ago - Pushed at: 19 days ago - Stars: 26 - Forks: 5
ahmedshahriar/depression-tweets-scraper
A scraper that collects '#depression' tweets daily, powered by GitHub Actions and snscrape (stopped on June 30, 2023)
Language: Python - Size: 124 MB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 25 - Forks: 14
rdmurphy/actblue-ticker-tracker
Keeps tabs on the ticking donation amount found on ActBlue's home page.
Language: Shell - Size: 589 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 24 - Forks: 6
simonw/cdc-vaccination-history 📦
A git scraper recording the CDC's Covid Data Tracker numbers on number of vaccinations per state.
Language: Python - Size: 302 MB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 24 - Forks: 13
quacs/quacs-data
A repository holding all the data used on QuACS.org
Language: Rust - Size: 646 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 20 - Forks: 6
punchagan/playo-find-venue
Find good Playo venues in convenient locations
Language: JavaScript - Size: 14 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 19 - Forks: 4
beatrizmilz/mananciais
Database of operational volume in public water-supply reservoirs in the São Paulo Metropolitan Region (SP, Brazil).
Language: R - Size: 2.75 GB - Last synced at: 4 months ago - Pushed at: about 1 year ago - Stars: 19 - Forks: 2
knudmoeller/berlin_corona_cases
Scraper for the official dashboard with current Corona case numbers, traffic light indicators ("Corona-Ampel") and vaccination situation for Berlin.
Language: Ruby - Size: 2.27 MB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 19 - Forks: 1
simonw/scrape-fediverse
Git scrapers for scraping the fediverse
Size: 70.6 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 17 - Forks: 1
Jonty/uk_petitions_data
An up-to-date archive of the data from https://petition.parliament.uk & http://petitions.number10.gov.uk
Language: Python - Size: 6.63 GB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 16 - Forks: 5
captn3m0/india-mutual-fund-ter-tracker
Tracking Total Expense Ratios of Indian Mutual Funds. Automatically updated daily.
Language: Python - Size: 3.56 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 16 - Forks: 4
lfk-im/lfk.im 📦
🍽 Lawrence, Kansas curbside takeout and delivery for local COVID-19 impacted businesses
Language: CSS - Size: 1.71 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 16 - Forks: 7
simonw/pge-outages
Tracking PG&E power outages
Size: 22.6 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 15 - Forks: 3
givefood/data
Latest data on UK food banks from Give Food scraped from our API and republished in various formats.
Size: 1.11 GB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 15 - Forks: 8
mary-ext/bluesky-verifier-scraping
Git scraping of Bluesky trusted verifiers
Language: TypeScript - Size: 60.5 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 15 - Forks: 1
dbreunig/git-scraper-extractor
Pull out versions of specific files from a git-scraping repo into individual files.
Language: Ruby - Size: 10.7 KB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 15 - Forks: 0
palewire/noaa-hurricane-gis-scraper
Automated downloads of geographic information system data posted by the National Oceanic and Atmospheric Administration's National Hurricane Center and Central Pacific Hurricane Center
Language: Python - Size: 4.1 GB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 14 - Forks: 0
richardsondev/pse-outages
Tracking Puget Sound Energy outage history since March 2021
Size: 139 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 14 - Forks: 0
simonw/fara-history
Tracking the history of the FARA data from https://www.justice.gov/nsd-fara
Language: Python - Size: 180 MB - Last synced at: 7 months ago - Pushed at: over 2 years ago - Stars: 14 - Forks: 6
nguqtruong/tiki-price-watch 📦
Track TIKI product price changes with GitHub Actions
Language: Shell - Size: 1.31 MB - Last synced at: over 2 years ago - Pushed at: almost 4 years ago - Stars: 14 - Forks: 4
simonw/conditional-get
CLI tool for fetching data using HTTP conditional get
Language: Python - Size: 17.6 KB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 14 - Forks: 0
MrFlynn/mcbroken-archive
📥 Archive for data from mcbroken.com.
Language: Python - Size: 4.85 GB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 13 - Forks: 2
matchilling/hmrc-exchange-rates
🇬🇧 HMRC Exchange Rates API for Customs & VAT 💸
Language: Shell - Size: 889 KB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 13 - Forks: 1
openclimatedata/paris-agreement-entry-into-force
Data Package of ratification status of the Paris Climate Agreement and the emissions shares used for entry into force
Language: HTML - Size: 338 KB - Last synced at: 8 months ago - Pushed at: almost 3 years ago - Stars: 13 - Forks: 4
tobilg/aws-iam-managed-policies
Automatically populated repository of AWS IAM Managed Policies
Language: TypeScript - Size: 15.2 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 12 - Forks: 0
blr-today/ingest
Ingestion pipeline for blr.today
Language: Python - Size: 24.1 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 12 - Forks: 1
beatrizmilz/noticiasgov
Data scraping of government news portals
Language: R - Size: 45.8 MB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 12 - Forks: 0
outages/aws-outages
Track AWS outages via Git History
Size: 1.28 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 12 - Forks: 3
OSUKED/Crown-Estate-Watch
This repository includes code for retrieving the latest data on UK offshore wind production and speeds, including spatial data at an individual turbine level.
Language: Python - Size: 94.8 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 11 - Forks: 1
HDRUK/papers (fork of susheel/papers)
Extract of publications that mention HDR-UK
Language: Python - Size: 3.93 GB - Last synced at: about 22 hours ago - Pushed at: 1 day ago - Stars: 10 - Forks: 3
hueyy/lacuna-db
legal data in machine-readable form
Language: Clojure - Size: 555 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 10 - Forks: 5
pl4nty/web-admx-tool
Windows group policy editor in your browser, preloaded with popular ADMX files
Language: Vue - Size: 15.4 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 10 - Forks: 0
outages/bchydro-outages
Track BCHydro Outages via Git history
Size: 160 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 10 - Forks: 4
openclimatedata/ndcs
Data Package with Nationally Determined Contributions (NDCs)
Language: Python - Size: 434 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 10 - Forks: 4
outages/vultr-outages
Track Vultr outages via Git History
Size: 1.91 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 10 - Forks: 1
simonw/irma-scrapers
Screen scrapers relating to natural disasters. See their output in https://github.com/simonw/disaster-data/
Language: Python - Size: 63.5 KB - Last synced at: 7 months ago - Pushed at: over 2 years ago - Stars: 10 - Forks: 6
simonw/coronavirus-data-gov-archive 📦
Backing up https://coronavirus.data.gov.uk/ to a git repository
Size: 59.3 MB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 10 - Forks: 3
data-liberation-project/fema-daily-ops-email-to-rss
FEMA Daily Operations Briefing: Email → RSS (→ CSV)
Language: Python - Size: 1.11 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 8 - Forks: 0
radames/google-fonts-analytics-archive
Archiving Google Fonts analytics data for fun https://fonts.google.com/analytics
Size: 309 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 8 - Forks: 1
mary-ext/atproto-lexicon-scraping
Git scraping of AT Protocol lexicon schemas
Language: TypeScript - Size: 815 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 8 - Forks: 0
ianmuchina/HashflagArchive
[automated] Archive of Twitter/X hashflags
Language: TypeScript - Size: 69.2 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 8 - Forks: 1
danp/nspoweroutages
Git scraping of the Nova Scotia Power Outage Map
Language: Go - Size: 220 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 8 - Forks: 1
schwanksta/irs-bmf-changelog
Creates a changelog for the IRS' exempt org business master file
Size: 563 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 8 - Forks: 2
maliayas/SublimeText_Documentation
Daily unofficial mirror of the ST documentation
Language: HTML - Size: 1.27 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 8 - Forks: 0
sgraaf/openapi-scraper
Track changes to RESTful APIs by git scraping their OpenAPI descriptions
Size: 18 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 7 - Forks: 0
thejeshgn/karnataka-eletricity-generation
Karnataka State Electricity Generation and Load data.
Language: HTML - Size: 21.1 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 7 - Forks: 0
simonw/scrape-fema-shelters
Size: 19.2 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 7 - Forks: 0
beardicus/scrape-nws-alerts
Scraping weather alerts from the US National Weather Service's XML feed
Size: 1.73 GB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 7 - Forks: 0
jeremiak/noaa-cpc-map-scraper
Archive of map images from NOAA's Climate Prediction Center
Language: Shell - Size: 338 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 7 - Forks: 0
rdmurphy/tx-covid-vaccine-data
Tracking data on the progress of vaccine distribution and administration in Texas.
Language: Python - Size: 107 MB - Last synced at: 7 months ago - Pushed at: almost 3 years ago - Stars: 7 - Forks: 0
rafguns/doaj-history
Tracking the history of journals in the Directory of Open Access Journals
Size: 120 MB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 6 - Forks: 0
palewire/nyc-open-data-monitor
Automated monitoring of new and updated datasets posted to New York City's data portal
Language: Python - Size: 2.69 GB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 6 - Forks: 0
JosephTLucas/CISA_KNOWN_EXPLOITED_VULNERABILITIES_CATALOG
Git Scraping project for CISA Known Exploited Vulnerability Catalog
Size: 1010 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 6 - Forks: 2
raylas/sbc-reservoirs-history
Logging reservoir level data from https://rain.cosbpw.net
Size: 39 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 6 - Forks: 1
PatMyron/forecasting
Language: Python - Size: 6.23 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 6 - Forks: 2
Joel-hanson/Iceberg-locations
Current Antarctic large iceberg positions derived from ASCAT and OSCAT-2
Language: Python - Size: 222 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 6 - Forks: 1
wldtyp/starlist.dev
A GitHub Pages site about GitHub, hosted on GitHub, that updates itself using GitHub search data collected by GitHub Actions!
Language: Python - Size: 19.6 MB - Last synced at: 29 days ago - Pushed at: about 1 month ago - Stars: 6 - Forks: 0
fasiha/finviz-git-scraper
FinViz map of sectors and sub-sectors (until 2025 Mar 27)
Language: JavaScript - Size: 187 MB - Last synced at: 22 days ago - Pushed at: 7 months ago - Stars: 6 - Forks: 1
ohbarye/git-scraping-template
A template for git scraping
Language: Ruby - Size: 3.91 KB - Last synced at: 5 months ago - Pushed at: 11 months ago - Stars: 6 - Forks: 0