Topic: "ai-scraping"
firecrawl/firecrawl
🔥 The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data
Language: TypeScript - Size: 74.6 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 65,512 - Forks: 5,162
ScrapeGraphAI/Scrapegraph-ai
Python scraper based on AI
Language: Python - Size: 15.2 MB - Last synced at: 2 days ago - Pushed at: 9 days ago - Stars: 21,664 - Forks: 1,872
D4Vinci/Scrapling
🕷️ An undetectable, powerful, flexible, high-performance Python library to make Web Scraping Easy and Effortless as it should be!
Language: Python - Size: 3.94 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 8,054 - Forks: 458
any4ai/AnyCrawl
AnyCrawl: A Node.js/TypeScript crawler that turns websites into LLM-ready data and extracts structured SERP results from Google/Bing/Baidu/etc. Native multi-threading for bulk processing.
Language: TypeScript - Size: 1.58 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 2,346 - Forks: 234
itsOwen/CyberScraper-2077
A Powerful web scraper powered by LLM | OpenAI, Gemini & Ollama
Language: Python - Size: 320 KB - Last synced at: 7 days ago - Pushed at: 3 months ago - Stars: 1,879 - Forks: 174
raznem/parsera
Lightweight library for scraping web-sites with LLMs
Language: Python - Size: 2.51 MB - Last synced at: 17 days ago - Pushed at: 18 days ago - Stars: 1,229 - Forks: 69
firecrawl/firecrawl-app-examples
🔥 This repository contains complete application examples, including websites and other projects, developed using Firecrawl.
Language: Jupyter Notebook - Size: 13.6 MB - Last synced at: 12 days ago - Pushed at: 5 months ago - Stars: 574 - Forks: 179
devflowinc/firecrawl-simple Fork of firecrawl/firecrawl
Stripped down, stable version of firecrawl optimized for self-hosting and ease of contribution. Billing logic and AI features are completely removed. Crawl and convert any website into LLM-ready markdown.
Language: TypeScript - Size: 40 MB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 523 - Forks: 47
oxylabs/oxylabs-ai-studio-py
Oxylabs AI Studio python SDK
Language: Python - Size: 2.26 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 323 - Forks: 0
ArchiveBox/abx-dl
⬇️ A simple all-in-one CLI tool to download EVERYTHING from a URL (like youtube-dl/yt-dlp, forum-dl, gallery-dl, simpler ArchiveBox). Uses headless Chrome to get HTML, JS, CSS, images/video/audio/subtitles, PDFs, screenshots, article text, git repos, and more...
Language: JavaScript - Size: 185 KB - Last synced at: about 14 hours ago - Pushed at: 2 months ago - Stars: 87 - Forks: 4
WeebDataHoarder/go-away
[Mirror] Self-hosted abuse detection and rule enforcement against low-effort mass AI scraping and bots.
Language: Go - Size: 839 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 75 - Forks: 5
spider-rs/web-crawling-guides
How to guides on web-crawling or scraping
Size: 7.85 MB - Last synced at: 19 days ago - Pushed at: 6 months ago - Stars: 24 - Forks: 5
spider-rs/spider-clients
Python, Javascript, and Rust libraries for the Spider Cloud API.
Language: Rust - Size: 1.51 MB - Last synced at: about 3 hours ago - Pushed at: about 4 hours ago - Stars: 19 - Forks: 9
kaymen99/ai-web-scraper
AI web scraper built with Crawl4AI for extracting structured leads data from websites.
Language: Python - Size: 19.5 KB - Last synced at: 8 months ago - Pushed at: 9 months ago - Stars: 14 - Forks: 1
L1shed/Turbo
Fastest and cheapest distributed residential proxy network.
Language: TypeScript - Size: 17.4 MB - Last synced at: 11 days ago - Pushed at: 12 days ago - Stars: 10 - Forks: 1
kaymen99/google-maps-lead-generator
Extract Google Maps business leads and enrich contact details using AI & web scraping
Language: Python - Size: 45.9 KB - Last synced at: 29 days ago - Pushed at: 4 months ago - Stars: 5 - Forks: 5
oxylabs/oxylabs-ai-studio-js
Oxylabs AI Studio JS SDK
Language: TypeScript - Size: 1.67 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 3 - Forks: 0
nathabonfim59/md-fetch
A CLI tool and REST API that converts web content to clean Markdown, bypassing anti-scraping measures using headless browsers. Perfect for AI/LLM applications
Language: Go - Size: 1.58 MB - Last synced at: 4 months ago - Pushed at: 9 months ago - Stars: 3 - Forks: 0
OpenData4Sciece/Disneyland-Resorts-Hotels
Disneyland Resorts Hotels Investigation Study
Language: Python - Size: 46.9 KB - Last synced at: 16 days ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0
MubeenAk47/ai-ready-website
Analyze your website's AI readiness and optimize for performance with real-time scoring, recommendations, and detailed metrics.
Language: TypeScript - Size: 1.66 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0
awdsdmv/ai-ready-website
Analyze your website's AI readiness and optimization with real-time scoring, recommendations, and insights for better SEO and accessibility.
Language: TypeScript - Size: 1.65 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0
ScraperHub/web-scraper-with-gemini-ai
Web Scraper powered by Gemini AI in Python.
Language: Python - Size: 4.88 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0
azhan85/firecrawl
Firecrawl simplifies web crawling with a user-friendly interface and powerful features. Join the community and enhance your data collection today!
Language: TypeScript - Size: 54 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0
vonuyvicoo/crava
AI-powered web scraper using Javascript/Typescript.
Language: TypeScript - Size: 66.4 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0
jenslys/skrape-js
TypeScript/Node.js SDK to easily interact with the skrape.ai API
Language: TypeScript - Size: 71.3 KB - Last synced at: 2 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0