GitHub / dylanpicart / excel_api_access
Multithreaded web scraping and API access using httpx, Selenium, asyncio, and concurrent.futures. Automates downloading and storing Excel datasets from publicly available NYC data sources. Features robust logging, error handling, virus scanning, and parallel processing for uninterrupted execution. Includes a CI/CD pipeline for testing and packaging.
JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dylanpicart%2Fexcel_api_access
PURL: pkg:github/dylanpicart/excel_api_access
Stars: 0
Forks: 0
Open issues: 0
License: MIT
Language: Python
Size: 68.4 KB
Dependencies parsed at: Pending
Created at: 9 months ago
Updated at: about 2 months ago
Pushed at: about 2 months ago
Last synced at: about 1 month ago
Topics: asyncio, automation, ci-cd-pipeline, concurrency, data-automation, data-pipelines, education-data, excel, file-handling, http2, httpx, multithreading, nyc-data, python, selenium, tenacity, tqdm, webscraping
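
The description above mentions parallel downloads with logging and error handling that keep a run going when individual files fail. The following is a minimal sketch of that general pattern using concurrent.futures from the standard library; it is not taken from the repository. The names download_all, fetch_one, fetch, dest_dir, and retries are hypothetical, and the HTTP call is passed in as a plain callable so the sketch runs without a network connection or third-party packages.

```python
import logging
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
from typing import Callable, Dict, Iterable

logger = logging.getLogger("excel_fetch_sketch")  # hypothetical logger name


def download_all(
    urls: Iterable[str],
    fetch: Callable[[str], bytes],  # assumption: returns the file body as bytes
    dest_dir: Path,
    max_workers: int = 4,
    retries: int = 2,
) -> Dict[str, Path]:
    """Fetch each URL on a worker thread and save the bytes under dest_dir.

    Returns a mapping of URL -> saved path for downloads that succeeded.
    Failures are logged and skipped so one bad link does not stop the run.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)

    def fetch_one(url: str) -> Path:
        last_error = None
        # One initial attempt plus up to `retries` further attempts.
        for attempt in range(1, retries + 2):
            try:
                data = fetch(url)
                # Use the last path segment of the URL as the file name.
                target = dest_dir / url.rstrip("/").rsplit("/", 1)[-1]
                target.write_bytes(data)
                return target
            except Exception as exc:  # broad on purpose: this is only a sketch
                last_error = exc
                logger.warning("Attempt %d for %s failed: %s", attempt, url, exc)
        raise RuntimeError(f"could not fetch {url}") from last_error

    saved: Dict[str, Path] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch_one, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                saved[url] = future.result()
            except Exception:
                logger.error("Giving up on %s", url)
    return saved
```

With httpx installed, fetch could be something like `lambda url: httpx.get(url, follow_redirects=True).content`. Whether the repository structures its downloads this way is an assumption; the sketch only illustrates the thread-pool-plus-retry idea named in the description.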