An open API service providing repository metadata for many open source software ecosystems.

Topic: "ai-scraping"

firecrawl/firecrawl

🔥 The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data

Language: TypeScript - Size: 74.6 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 65,512 - Forks: 5,162

ScrapeGraphAI/Scrapegraph-ai

Python scraper based on AI

Language: Python - Size: 15.2 MB - Last synced at: 2 days ago - Pushed at: 9 days ago - Stars: 21,664 - Forks: 1,872

D4Vinci/Scrapling

🕷️ An undetectable, powerful, flexible, high-performance Python library to make Web Scraping Easy and Effortless as it should be!

Language: Python - Size: 3.94 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 8,054 - Forks: 458

any4ai/AnyCrawl

AnyCrawl 🚀: A Node.js/TypeScript crawler that turns websites into LLM-ready data and extracts structured SERP results from Google/Bing/Baidu/etc. Native multi-threading for bulk processing.

Language: TypeScript - Size: 1.58 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 2,346 - Forks: 234

itsOwen/CyberScraper-2077

A Powerful web scraper powered by LLM | OpenAI, Gemini & Ollama

Language: Python - Size: 320 KB - Last synced at: 7 days ago - Pushed at: 3 months ago - Stars: 1,879 - Forks: 174

raznem/parsera

Lightweight library for scraping websites with LLMs

Language: Python - Size: 2.51 MB - Last synced at: 17 days ago - Pushed at: 18 days ago - Stars: 1,229 - Forks: 69

firecrawl/firecrawl-app-examples

🔥 This repository contains complete application examples, including websites and other projects, developed using Firecrawl.

Language: Jupyter Notebook - Size: 13.6 MB - Last synced at: 12 days ago - Pushed at: 5 months ago - Stars: 574 - Forks: 179

devflowinc/firecrawl-simple (fork of firecrawl/firecrawl)

➖ Stripped down, stable version of firecrawl optimized for self-hosting and ease of contribution. Billing logic and AI features are completely removed. Crawl and convert any website into LLM-ready markdown.

Language: TypeScript - Size: 40 MB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 523 - Forks: 47

oxylabs/oxylabs-ai-studio-py

Oxylabs AI Studio Python SDK

Language: Python - Size: 2.26 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 323 - Forks: 0

ArchiveBox/abx-dl

⬇️ A simple all-in-one CLI tool to download EVERYTHING from a URL (like youtube-dl/yt-dlp, forum-dl, gallery-dl, simpler ArchiveBox). 🎭 Uses headless Chrome to get HTML, JS, CSS, images/video/audio/subtitles, PDFs, screenshots, article text, git repos, and more...

Language: JavaScript - Size: 185 KB - Last synced at: about 14 hours ago - Pushed at: 2 months ago - Stars: 87 - Forks: 4

WeebDataHoarder/go-away

[Mirror] Self-hosted abuse detection and rule enforcement against low-effort mass AI scraping and bots.

Language: Go - Size: 839 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 75 - Forks: 5

spider-rs/web-crawling-guides

How-to guides on web crawling and scraping

Size: 7.85 MB - Last synced at: 19 days ago - Pushed at: 6 months ago - Stars: 24 - Forks: 5

spider-rs/spider-clients

Python, JavaScript, and Rust libraries for the Spider Cloud API.

Language: Rust - Size: 1.51 MB - Last synced at: about 3 hours ago - Pushed at: about 4 hours ago - Stars: 19 - Forks: 9

kaymen99/ai-web-scraper

AI web scraper built with Crawl4AI for extracting structured lead data from websites.

Language: Python - Size: 19.5 KB - Last synced at: 8 months ago - Pushed at: 9 months ago - Stars: 14 - Forks: 1

L1shed/Turbo

Fastest and cheapest distributed residential proxy network.

Language: TypeScript - Size: 17.4 MB - Last synced at: 11 days ago - Pushed at: 12 days ago - Stars: 10 - Forks: 1

kaymen99/google-maps-lead-generator

Extract Google Maps business leads and enrich contact details using AI & web scraping

Language: Python - Size: 45.9 KB - Last synced at: 29 days ago - Pushed at: 4 months ago - Stars: 5 - Forks: 5

oxylabs/oxylabs-ai-studio-js

Oxylabs AI Studio JS SDK

Language: TypeScript - Size: 1.67 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 3 - Forks: 0

nathabonfim59/md-fetch

A CLI tool and REST API that converts web content to clean Markdown, bypassing anti-scraping measures using headless browsers. Perfect for AI/LLM applications

Language: Go - Size: 1.58 MB - Last synced at: 4 months ago - Pushed at: 9 months ago - Stars: 3 - Forks: 0

OpenData4Sciece/Disneyland-Resorts-Hotels

Disneyland Resorts Hotels Investigation Study

Language: Python - Size: 46.9 KB - Last synced at: 16 days ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

MubeenAk47/ai-ready-website

🚀 Analyze your website's AI readiness and optimize for performance with real-time scoring, recommendations, and detailed metrics.

Language: TypeScript - Size: 1.66 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

awdsdmv/ai-ready-website

Analyze your website's AI readiness and optimization with real-time scoring, recommendations, and insights for better SEO and accessibility.

Language: TypeScript - Size: 1.65 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

ScraperHub/web-scraper-with-gemini-ai

Web Scraper powered by Gemini AI in Python.

Language: Python - Size: 4.88 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

azhan85/firecrawl

Firecrawl simplifies web crawling with a user-friendly interface and powerful features. Join the community and enhance your data collection today!

Language: TypeScript - Size: 54 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

vonuyvicoo/crava

AI-powered web scraper using JavaScript/TypeScript.

Language: TypeScript - Size: 66.4 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

jenslys/skrape-js

TypeScript/Node.js SDK to easily interact with the skrape.ai API

Language: TypeScript - Size: 71.3 KB - Last synced at: 2 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0