An open API service providing repository metadata for many open source software ecosystems.

GitHub / dylanpicart / excel_api_access

Multithreaded web scraping and API access using httpx, Selenium, asyncio, and concurrent.futures. Automates downloading and storing Excel datasets from publicly available NYC data sources. Features robust logging, error handling, virus scanning, and parallel processing for uninterrupted execution. Includes a CI/CD pipeline for testing and packaging.
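
The parallel-download pattern named in the description can be illustrated roughly as follows. This is a minimal sketch and not the repository's actual code: the function `fetch_all` and its `fetch` parameter are hypothetical names chosen for illustration, and the only library used is the standard `concurrent.futures` module. Errors are collected per URL so that one failed download does not stop the others.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable, Tuple


def fetch_all(
    urls: Iterable[str],
    fetch: Callable[[str], bytes],
    max_workers: int = 4,
) -> Tuple[Dict[str, bytes], Dict[str, Exception]]:
    """Run `fetch` for every URL in a thread pool (hypothetical helper).

    Returns two dicts: successful payloads and raised exceptions, both keyed by URL.
    """
    results: Dict[str, bytes] = {}
    errors: Dict[str, Exception] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to the URL it was submitted for.
        futures = {pool.submit(fetch, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as exc:  # record the failure and keep going
                errors[url] = exc
    return results, errors
```

With httpx installed, `fetch` could be something like `lambda url: httpx.get(url).content`; any callable that takes a URL and returns bytes will work.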

JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dylanpicart%2Fexcel_api_access
PURL: pkg:github/dylanpicart/excel_api_access
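
The JSON API URL above percent-encodes the slash in the `owner/name` pair as `%2F`. The sketch below shows one way to build that URL and the PURL for another repository. It assumes the path layout is exactly the one visible in the URL above; the helper names `repo_api_url` and `github_purl` are hypothetical and not part of any ecosyste.ms client.

```python
from urllib.parse import quote

# Base path taken from the JSON API URL shown on this page.
API_BASE = "http://repos.ecosyste.ms/api/v1"


def repo_api_url(host: str, full_name: str) -> str:
    """Build the repository lookup URL (hypothetical helper).

    safe="" makes quote() escape "/" as %2F, matching the URL on this page.
    """
    return (
        f"{API_BASE}/hosts/{quote(host, safe='')}"
        f"/repositories/{quote(full_name, safe='')}"
    )


def github_purl(full_name: str) -> str:
    """Reproduce the PURL form shown on this page for a GitHub repository."""
    return f"pkg:github/{full_name}"


if __name__ == "__main__":
    print(repo_api_url("GitHub", "dylanpicart/excel_api_access"))
    print(github_purl("dylanpicart/excel_api_access"))
```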

Stars: 0
Forks: 0
Open issues: 0

License: MIT
Language: Python
Size: 68.4 KB
Dependencies parsed at: Pending

Created at: 9 months ago
Updated at: about 2 months ago
Pushed at: about 2 months ago
Last synced at: about 1 month ago

Topics: asyncio, automation, ci-cd-pipeline, concurrency, data-automation, data-pipelines, education-data, excel, file-handling, http2, httpx, multithreading, nyc-data, python, selenium, tenacity, tqdm, webscraping
