An open API service providing repository metadata for many open source software ecosystems.

Topic: "git-scraping"

jstrieb/github-stats

Better GitHub statistics images for your profile, with stats from private repos too

Language: Python - Size: 2.29 MB - Last synced at: about 16 hours ago - Pushed at: about 18 hours ago - Stars: 3,187 - Forks: 667

alex/nyt-2020-election-scraper

Language: HTML - Size: 681 MB - Last synced at: 21 days ago - Pushed at: about 2 years ago - Stars: 1,762 - Forks: 292

factbook/factbook.json

World Factbook Country Profiles in JSON - Free Open Public Domain Data - No API Key Required ;-)

Size: 53.3 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 1,028 - Forks: 122

mackorone/spotify-playlist-archive

Daily snapshots of public Spotify playlists

Size: 77 GB - Last synced at: about 15 hours ago - Pushed at: about 16 hours ago - Stars: 443 - Forks: 276

femueller/cloud-ip-ranges

An up-to-date export of cloud provider IP address ranges

Size: 26.3 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 314 - Forks: 46

simonw/csv-diff

Python CLI tool and library for diffing CSV and JSON files

Language: Python - Size: 37.1 KB - Last synced at: 20 days ago - Pushed at: 9 months ago - Stars: 312 - Forks: 50

vinayak-mehta/conrad

Track conferences and meetups on your terminal.

Language: Python - Size: 1.35 MB - Last synced at: about 17 hours ago - Pushed at: about 18 hours ago - Stars: 254 - Forks: 58

swyxio/gh-action-data-scraping

Shows how to use GitHub Actions to do periodic data scraping

Language: JavaScript - Size: 19 MB - Last synced at: about 13 hours ago - Pushed at: about 14 hours ago - Stars: 232 - Forks: 37

simonw/ca-fires-history

Tracking fire data from www.fire.ca.gov

Size: 5.8 MB - Last synced at: 23 days ago - Pushed at: 11 months ago - Stars: 210 - Forks: 46

endoflife-date/release-data

Common Release Data for various projects in a consumable format, automatically updated.

Language: Python - Size: 13.6 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 170 - Forks: 50

mary-ext/atproto-scraping

Git scraping of AT Protocol/Bluesky instances

Language: TypeScript - Size: 3.09 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 133 - Forks: 6

simonw/git-scraper-template

Template repository for setting up a new git scraper

Language: Shell - Size: 17.6 KB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 81 - Forks: 6

vitorbaptista/google-covid19-mobility-reports

Data extraction of Google's COVID-19 Mobility Reports

Language: HTML - Size: 125 MB - Last synced at: 3 months ago - Pushed at: about 5 years ago - Stars: 81 - Forks: 11

tobilg/public-cloud-provider-ip-ranges

Unified datasets for public cloud provider IP ranges. Providers include AWS, Azure, CloudFlare, DigitalOcean, Fastly, Google Cloud and Oracle Cloud.

Language: Shell - Size: 369 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 69 - Forks: 9

mary-ext/bluesky-labeler-scraping

Git scraping of Bluesky labelers/label providers

Language: TypeScript - Size: 2.72 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 68 - Forks: 2

tobilg/aws-iam-data

This repository contains the full dataset of AWS IAM data (services, actions, resource types and conditions keys). It's updated on a daily basis at 4AM UTC.

Language: TypeScript - Size: 322 MB - Last synced at: about 14 hours ago - Pushed at: about 16 hours ago - Stars: 62 - Forks: 0

simonw/scrape-hacker-news-by-domain

Scrape HN to track links from specific domains

Language: JavaScript - Size: 1.9 MB - Last synced at: about 16 hours ago - Pushed at: about 17 hours ago - Stars: 61 - Forks: 9

datadesk/california-coronavirus-scrapers

The open-source web scrapers that feed the Los Angeles Times California coronavirus tracker.

Language: Jupyter Notebook - Size: 9.89 GB - Last synced at: 23 days ago - Pushed at: 25 days ago - Stars: 59 - Forks: 6

simonw/pge-outages-pre-2024

Tracking PG&E outages

Language: Python - Size: 134 MB - Last synced at: about 2 months ago - Pushed at: about 3 years ago - Stars: 55 - Forks: 7

simonw/scrape-doge-gov

Track changes to doge.gov

Language: Python - Size: 124 KB - Last synced at: 10 days ago - Pushed at: 11 days ago - Stars: 51 - Forks: 2

simonw/disaster-scrapers

Scrapers for disaster data - writes to https://github.com/simonw/disaster-data

Language: Python - Size: 27.3 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 49 - Forks: 3

simonw/help-scraper

Record a history of --help for various commands

Language: Python - Size: 204 MB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 47 - Forks: 8

simonw/sf-tree-history

Tracking the history of trees in San Francisco

Size: 438 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 46 - Forks: 6

pl4nty/intune-change-tracking

Track changes to Microsoft Intune via git and RSS

Language: Python - Size: 28.7 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 42 - Forks: 6

tigger0jk/ark-invest-scraper

Pulling a history of the holdings for ark invest funds https://ark-funds.com/

Language: Python - Size: 5.68 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 39 - Forks: 11

simonw/scrape-open-data

Scrape various open data directories to create an index of what's available out there

Language: Python - Size: 5.9 GB - Last synced at: 9 days ago - Pushed at: 4 months ago - Stars: 37 - Forks: 3

simonw/disaster-data

Data scraped by https://github.com/simonw/disaster-scrapers

Size: 191 MB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 36 - Forks: 10

captn3m0/india-isin-data

International Securities Identification Numbers for various Indian Securities

Language: Shell - Size: 101 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 35 - Forks: 6

pkmn/randbats

Pokémon Showdown's Random Battle sets

Language: JavaScript - Size: 42 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 35 - Forks: 13

pkmn/smogon

Wrapper around Smogon's analyses and usage statistics

Language: TypeScript - Size: 1.04 GB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 34 - Forks: 4

captn3m0/mf.captnemo.in

Get information about Indian Mutual Funds from their ISIN numbers.

Language: Ruby - Size: 69.2 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 33 - Forks: 3

mikepqr/real-estate-scrape-eg

A repository demonstrating the use of real-estate-scrape to store the estimated value of a property on Redfin and Zillow every night using GitHub Actions.

Size: 67 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 33 - Forks: 6

ifoukarakis/jobscrapper

An automated job scraper

Language: Python - Size: 708 KB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 32 - Forks: 11

fedora-python/portingdb

Database & tools to track Python 2 removal from Fedora

Language: Python - Size: 34.7 MB - Last synced at: 24 days ago - Pushed at: 7 months ago - Stars: 31 - Forks: 37

AyrtonB/EveryNoise-Watch

Spotify genre attributes from EveryNoise

Language: Jupyter Notebook - Size: 1.82 MB - Last synced at: 2 days ago - Pushed at: about 4 years ago - Stars: 30 - Forks: 1

simonw/graphql-scraper

Track changes to GraphQL APIs by git scraping their schemas

Size: 1.08 MB - Last synced at: about 2 months ago - Pushed at: 2 months ago - Stars: 28 - Forks: 4

jandinter/gesetze-im-internet

Archive of German legal acts (weekly archive of gesetze-im-internet.de)

Language: Ruby - Size: 414 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 27 - Forks: 1

MaxHalford/bike-sharing-history

🚲 Git scraping for bike sharing APIs

Language: Python - Size: 35.4 GB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 26 - Forks: 5

captn3m0/historical-mf-data

Historical Mutual Funds data

Language: Python - Size: 6.55 GB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 26 - Forks: 8

iandees/usps-collection-boxes

US Postal Service collection box locations.

Language: Python - Size: 665 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 26 - Forks: 2

sarojbelbase/nepstonks

An automated bot that scrapes the latest upcoming issues, news, and investment opportunities that are announced inside Nepal and sends them to a telegram channel.

Language: Python - Size: 6.12 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 26 - Forks: 5

bobek/masscan_as_a_service

masscan as a service

Language: Python - Size: 46.9 KB - Last synced at: 1 day ago - Pushed at: 5 months ago - Stars: 26 - Forks: 2

ahmedshahriar/depression-tweets-scraper

A scraper that collects '#depression' tweets daily, powered by GitHub Actions and snscrape (stopped on June 30, 2023)

Language: Python - Size: 124 MB - Last synced at: about 2 months ago - Pushed at: almost 2 years ago - Stars: 25 - Forks: 13

rdmurphy/actblue-ticker-tracker

Keeps tabs on the ticking donation amount found on ActBlue's home page.

Language: Shell - Size: 416 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 24 - Forks: 6

simonw/cdc-vaccination-history 📦

A git scraper recording the CDC's Covid Data Tracker figures on the number of vaccinations per state.

Language: Python - Size: 302 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 24 - Forks: 13

quacs/quacs-data

A repository holding all the data used on QuACS.org

Language: Rust - Size: 611 MB - Last synced at: about 17 hours ago - Pushed at: about 18 hours ago - Stars: 20 - Forks: 5

punchagan/playo-find-venue

Find good Playo venues in convenient locations

Language: JavaScript - Size: 12 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 20 - Forks: 5

beatrizmilz/mananciais

Database on operational volume in public water supply reservoirs in the São Paulo Metropolitan Region (SP, Brazil).

Language: R - Size: 2.41 GB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 19 - Forks: 2

knudmoeller/berlin_corona_cases

Scraper for the official dashboard with current Corona case numbers, traffic light indicators ("Corona-Ampel") and vaccination situation for Berlin.

Language: Ruby - Size: 2.27 MB - Last synced at: 11 months ago - Pushed at: almost 2 years ago - Stars: 19 - Forks: 1

simonw/scrape-fediverse

Git scrapers for scraping the fediverse

Size: 58 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 17 - Forks: 1

Jonty/uk_petitions_data

An up-to-date archive of the data from https://petition.parliament.uk & http://petitions.number10.gov.uk

Language: Python - Size: 6.15 GB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 16 - Forks: 5

lfk-im/lfk.im 📦

🍽 Lawrence, Kansas curbside takeout and delivery for local COVID-19 impacted businesses

Language: CSS - Size: 1.71 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 16 - Forks: 7

givefood/data

Latest data on UK food banks from Give Food scraped from our API and republished in various formats.

Size: 1.01 GB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 15 - Forks: 7

dbreunig/git-scraper-extractor

Pull out versions of specific files from a gitscraping repo into individual files.

Language: Ruby - Size: 10.7 KB - Last synced at: 2 days ago - Pushed at: almost 4 years ago - Stars: 15 - Forks: 0

captn3m0/india-mutual-fund-ter-tracker

Tracking Total Expense Ratios of Indian Mutual Funds. Automatically updated daily.

Language: Python - Size: 3.6 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 14 - Forks: 3

simonw/fara-history

Tracking the history of the FARA data from https://www.justice.gov/nsd-fara

Language: Python - Size: 180 MB - Last synced at: about 2 months ago - Pushed at: almost 2 years ago - Stars: 14 - Forks: 6

nguqtruong/tiki-price-watch 📦

Track TIKI product price changes with GitHub Actions

Language: Shell - Size: 1.31 MB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 14 - Forks: 4

simonw/conditional-get

CLI tool for fetching data using HTTP conditional get

Language: Python - Size: 17.6 KB - Last synced at: 18 days ago - Pushed at: almost 4 years ago - Stars: 14 - Forks: 0

simonw/pge-outages

Tracking PG&E power outages

Size: 21.9 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 13 - Forks: 2

MrFlynn/mcbroken-archive

📥 Archive for data from mcbroken.com.

Size: 4.44 GB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 13 - Forks: 1

openclimatedata/paris-agreement-entry-into-force

Data Package of ratification status of the Paris Climate Agreement and the emissions shares used for entry into force

Language: HTML - Size: 338 KB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 13 - Forks: 4

matchilling/hmrc-exchange-rates

🇬🇧 HMRC Exchange Rates API for Customs & VAT 💸

Language: Shell - Size: 846 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 12 - Forks: 2

palewire/noaa-hurricane-gis-scraper

Automated downloads of geographic information system data posted by the National Oceanic and Atmospheric Administration's National Hurricane Center and Central Pacific Hurricane Center

Language: Python - Size: 3.15 GB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 12 - Forks: 0

beatrizmilz/noticiasgov

Data scraping of government news portals

Language: R - Size: 45.8 MB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 12 - Forks: 0

outages/aws-outages

Track AWS outages via Git History

Size: 1.28 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 12 - Forks: 3

OSUKED/Crown-Estate-Watch

This repository includes code for retrieving the latest data on UK offshore wind production and speeds, including spatial data at an individual turbine level.

Language: Python - Size: 89.1 MB - Last synced at: 3 days ago - Pushed at: 4 days ago - Stars: 11 - Forks: 1

tobilg/aws-iam-managed-policies

Automatically populated repository of AWS IAM Managed Policies

Language: TypeScript - Size: 13.3 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 11 - Forks: 0

richardsondev/pse-outages

Tracking Puget Sound Energy outage history since March 2021

Size: 123 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 10 - Forks: 0

outages/bchydro-outages

Track BCHydro Outages via Git history

Size: 148 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 10 - Forks: 4

HDRUK/papers (fork of susheel/papers)

Extract of publications that mention HDR-UK

Language: Python - Size: 3.88 GB - Last synced at: 8 days ago - Pushed at: 9 days ago - Stars: 10 - Forks: 3

simonw/irma-scrapers

Screen scrapers relating to natural disasters. See their output in https://github.com/simonw/disaster-data/

Language: Python - Size: 63.5 KB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 10 - Forks: 6

simonw/coronavirus-data-gov-archive 📦

Backing up https://coronavirus.data.gov.uk/ to a git repository

Size: 59.3 MB - Last synced at: about 1 year ago - Pushed at: over 4 years ago - Stars: 10 - Forks: 3

hueyy/lacuna-db

legal data in machine-readable form

Language: Clojure - Size: 459 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 9 - Forks: 4

blr-today/ingest

Ingestion pipeline for blr.today

Language: Python - Size: 18.6 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 9 - Forks: 1

outages/vultr-outages

Track Vultr outages via Git History

Size: 1.78 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 9 - Forks: 1

openclimatedata/ndcs

Data Package with Nationally Determined Contributions (NDCs)

Language: Python - Size: 401 KB - Last synced at: 24 days ago - Pushed at: 24 days ago - Stars: 9 - Forks: 3

radames/google-fonts-analytics-archive

Archiving Google Fonts analytics data for fun https://fonts.google.com/analytics

Size: 274 MB - Last synced at: about 17 hours ago - Pushed at: 1 day ago - Stars: 8 - Forks: 1

danp/nspoweroutages

Git scraping of the Nova Scotia Power Outage Map

Language: Go - Size: 215 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 8 - Forks: 1

mary-ext/bluesky-verifier-scraping

Git scraping of Bluesky trusted verifiers

Language: TypeScript - Size: 20.5 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 8 - Forks: 0

data-liberation-project/fema-daily-ops-email-to-rss

FEMA Daily Operations Briefing: Email → RSS (→ CSV)

Language: Python - Size: 3.19 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 8 - Forks: 0

ianmuchina/HashflagArchive

[automated] Archive of Twitter/X hashflags

Language: TypeScript - Size: 64.8 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 8 - Forks: 1

schwanksta/irs-bmf-changelog

Creates a changelog for the IRS' exempt org business master file

Size: 598 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 8 - Forks: 2

maliayas/SublimeText_Documentation

Daily unofficial mirror of the ST documentation

Language: HTML - Size: 1.27 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 8 - Forks: 0

jeremiak/noaa-cpc-map-scraper

Archive of map images from NOAA's Climate Prediction Center

Language: Shell - Size: 376 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 7 - Forks: 0

thejeshgn/karnataka-eletricity-generation

Karnataka State Electricity Generation and Load data.

Language: HTML - Size: 18.5 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 7 - Forks: 0

beardicus/scrape-nws-alerts

Scraping weather alerts from the US National Weather Service's XML feed

Size: 1.55 GB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 7 - Forks: 0

simonw/scrape-fema-shelters

Size: 16.8 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 7 - Forks: 0

sgraaf/openapi-scraper

Track changes to RESTful APIs by git scraping their OpenAPI descriptions

Size: 17.7 MB - Last synced at: 8 days ago - Pushed at: 9 days ago - Stars: 7 - Forks: 0

rdmurphy/tx-covid-vaccine-data

Tracking data on the progress of vaccine distribution and administration in Texas.

Language: Python - Size: 107 MB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 0

palewire/nyc-open-data-monitor

Automated monitoring of new and updated datasets posted to New York City's data portal

Language: Python - Size: 2.1 GB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 6 - Forks: 0

JosephTLucas/CISA_KNOWN_EXPLOITED_VULNERABILITIES_CATALOG

Git Scraping project for CISA Known Exploited Vulnerability Catalog

Size: 1.89 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 6 - Forks: 2

rafguns/doaj-history

Tracking the history of journals in the Directory of Open Access Journals

Size: 96 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 6 - Forks: 0

raylas/sbc-reservoirs-history

Logging reservoir level data from https://rain.cosbpw.net

Size: 35.2 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 6 - Forks: 1

fasiha/finviz-git-scraper

FinViz map of sectors and sub-sectors (until 2025 Mar 27)

Language: JavaScript - Size: 187 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 6 - Forks: 1

ohbarye/git-scraping-template

A template of a git scraping

Language: Ruby - Size: 3.91 KB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 6 - Forks: 0

Joel-hanson/Iceberg-locations

Current Antarctic large iceberg positions derived from ASCAT and OSCAT-2

Language: Python - Size: 147 KB - Last synced at: 2 days ago - Pushed at: almost 2 years ago - Stars: 6 - Forks: 1

ahmedshahriar/burnout-tweets-scraper

A scraper that collects '#burnout' tweets daily, powered by GitHub Actions and snscrape (stopped on June 30, 2023)

Language: Python - Size: 39 MB - Last synced at: about 2 months ago - Pushed at: almost 2 years ago - Stars: 6 - Forks: 2

brian14708/wh-briefings

The White House Briefing Room

Language: Python - Size: 73.8 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 5 - Forks: 1

iris-hep/analysis-community-summary

Summary report on community interactions and contributions with IRIS-HEP Analysis Systems related tools

Language: Python - Size: 1.82 MB - Last synced at: 3 days ago - Pushed at: 4 days ago - Stars: 5 - Forks: 0

simonw/scrape-github-actions-package-versions

Git scraper recording the package versions installed on the default GitHub Actions ubuntu-latest worker

Size: 230 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 5 - Forks: 0