Topic: "web-scraping"
firecrawl/firecrawl
🔥 The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data
Language: TypeScript - Size: 74.6 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 65,512 - Forks: 5,162
scrapy/scrapy
Scrapy, a fast high-level web crawling & scraping framework for Python.
Language: Python - Size: 27.6 MB - Last synced at: 7 days ago - Pushed at: 8 days ago - Stars: 58,754 - Forks: 11,131
Mintplex-Labs/anything-llm
The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, No-code agent builder, MCP compatibility, and more.
Language: JavaScript - Size: 47.5 MB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 50,561 - Forks: 5,332
dgtlmoon/changedetection.io
Best and simplest tool for website change detection, web page monitoring, and website change alerts. Perfect for tracking content changes, price drops, restock alerts, and website defacement monitoring—all for free or enjoy our SaaS plan!
Language: Python - Size: 11.7 MB - Last synced at: about 17 hours ago - Pushed at: about 19 hours ago - Stars: 28,307 - Forks: 1,574
ScrapeGraphAI/Scrapegraph-ai
Python scraper based on AI
Language: Python - Size: 15.2 MB - Last synced at: 3 days ago - Pushed at: 10 days ago - Stars: 21,664 - Forks: 1,872
apify/crawlee
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.
Language: TypeScript - Size: 144 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 20,257 - Forks: 1,054
Evil0ctal/Douyin_TikTok_Download_API
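🚀 Douyin_TikTok_Download_API is an out-of-the-box, high-performance asynchronous data scraping tool for Douyin, Kuaishou, TikTok, and Bilibili, with support for API calls and online batch parsing and downloading.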
Language: Python - Size: 8.5 MB - Last synced at: 30 days ago - Pushed at: about 1 month ago - Stars: 14,460 - Forks: 2,134
getmaxun/maxun
⚡ Easiest no code web data extraction platform • Instantly turn any website into API or spreadsheet ⚡
Language: TypeScript - Size: 5.27 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 13,802 - Forks: 1,123
seleniumbase/SeleniumBase
Python APIs for web automation, testing, and bypassing bot-detection.
Language: Python - Size: 14 MB - Last synced at: about 22 hours ago - Pushed at: about 24 hours ago - Stars: 11,817 - Forks: 1,436
D4Vinci/Scrapling
🕷️ An undetectable, powerful, flexible, high-performance Python library to make Web Scraping Easy and Effortless as it should be!
Language: Python - Size: 3.94 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 8,054 - Forks: 458
mherrmann/helium
Lighter web automation with Python
Language: Python - Size: 39.5 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 8,038 - Forks: 501
lorien/awesome-web-scraping
List of libraries, tools and APIs for web scraping and data processing.
Language: Makefile - Size: 427 KB - Last synced at: 12 days ago - Pushed at: 20 days ago - Stars: 7,384 - Forks: 825
apify/crawlee-python
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.
Language: Python - Size: 32.3 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 7,064 - Forks: 508
alirezamika/autoscraper
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
Language: Python - Size: 132 KB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 6,937 - Forks: 709
go-rod/rod
A Chrome DevTools Protocol driver for web automation and scraping.
Language: Go - Size: 3.61 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 6,285 - Forks: 414
autoscrape-labs/pydoll
Pydoll is a library for automating chromium-based browsers without a WebDriver, offering realistic interactions.
Language: Python - Size: 15 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 5,874 - Forks: 319
adbar/trafilatura
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
Language: Python - Size: 33.8 MB - Last synced at: 9 days ago - Pushed at: about 2 months ago - Stars: 4,829 - Forks: 322
firecrawl/firecrawl-mcp-server
🔥 Official Firecrawl MCP Server - Adds powerful web scraping and search to Cursor, Claude and any other LLM clients.
Language: JavaScript - Size: 2.96 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 4,740 - Forks: 508
lexiforest/curl_cffi
Python binding for curl-impersonate fork via cffi. An HTTP client that can impersonate browser TLS/JA3/HTTP2 fingerprints.
Language: Python - Size: 1.61 MB - Last synced at: 29 days ago - Pushed at: about 1 month ago - Stars: 4,257 - Forks: 398
jaypyles/Scraperr
Self-hosted webscraper.
Language: TypeScript - Size: 2.42 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 4,100 - Forks: 184
snooppr/snoop
Snoop: a reconnaissance tool based on open data (OSINT world)
Language: Python - Size: 60.6 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 3,475 - Forks: 397
php-curl-class/php-curl-class
PHP Curl Class makes it easy to send HTTP requests and integrate with web APIs
Language: PHP - Size: 2.82 MB - Last synced at: about 13 hours ago - Pushed at: about 14 hours ago - Stars: 3,300 - Forks: 819
x4nth055/pythoncode-tutorials
The Python Code Tutorials
Language: Jupyter Notebook - Size: 321 MB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 2,918 - Forks: 1,992
lorien/grab
Web Scraping Framework
Language: Python - Size: 9.7 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 2,416 - Forks: 275
gosom/google-maps-scraper
Scrape data from Google Maps. Extracts data such as the name, address, phone number, website URL, rating, number of reviews, latitude and longitude, reviews, email and more for each place.
Language: Go - Size: 21.6 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 2,380 - Forks: 308
codingforentrepreneurs/30-Days-of-Python
Learn Python for the next 30 (or so) Days.
Language: HTML - Size: 61.4 MB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 2,179 - Forks: 1,336
yusufkaraaslan/Skill_Seekers
Single powerful tool to convert ANY documentation website into a Claude skill
Language: Python - Size: 524 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 1,938 - Forks: 192
oxylabs/amazon-scraper
Free Trial Amazon Scraper API for extracting search, product, offer listing, reviews, question and answers, best sellers and sellers data.
Language: Python - Size: 576 KB - Last synced at: 7 days ago - Pushed at: about 1 month ago - Stars: 1,813 - Forks: 64
A9T9/RPA
Ui.Vision Open-Source RPA Software with Computer Vision, OCR, Anthropic Computer Use/LLM. Selenium IDE import/export.
Language: JavaScript - Size: 13.1 MB - Last synced at: 3 days ago - Pushed at: 6 months ago - Stars: 1,759 - Forks: 364
Kaliiiiiiiiii-Vinyzu/patchright
Undetected version of the Playwright testing and automation library.
Language: JavaScript - Size: 5.09 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 1,684 - Forks: 62
justmarkham/DAT8
General Assembly's 2015 Data Science course in Washington, DC
Language: Jupyter Notebook - Size: 23 MB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 1,613 - Forks: 1,067
brightdata/brightdata-mcp
A powerful Model Context Protocol (MCP) server that provides an all-in-one solution for public web access.
Language: JavaScript - Size: 63.6 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 1,511 - Forks: 200
tidyverse/rvest
Simple web scraping for R
Language: R - Size: 12.8 MB - Last synced at: 2 days ago - Pushed at: about 2 months ago - Stars: 1,507 - Forks: 350
rushter/selectolax
Python binding to Modest and Lexbor engines. Fast HTML5 parser with CSS selectors.
Language: Cython - Size: 476 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1,432 - Forks: 81
roach-php/core
The complete web scraping toolkit for PHP.
Language: PHP - Size: 787 KB - Last synced at: 15 days ago - Pushed at: 27 days ago - Stars: 1,429 - Forks: 77
oxylabs/free-proxy-list
Claim Free proxy list with United States IP addresses and use it for your projects.
Size: 4.86 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1,244 - Forks: 8
juancarlospaco/faster-than-requests
Faster requests on Python 3
Language: Nim - Size: 20.1 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 1,127 - Forks: 92
Decodo/Decodo
HTTP(S)/SOCKS5 rotating residential proxies - code examples & general information.
Language: Java - Size: 320 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1,125 - Forks: 43
intoli/user-agents
A JavaScript library for generating random user agents with data that's updated daily.
Language: TypeScript - Size: 557 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1,103 - Forks: 54
rebrowser/rebrowser-patches
Collection of patches for puppeteer and playwright to avoid automation detection and leaks. Helps to avoid Cloudflare and DataDome CAPTCHA pages. Easy to patch/unpatch, can be enabled/disabled on demand.
Language: JavaScript - Size: 79.1 KB - Last synced at: 13 days ago - Pushed at: 6 months ago - Stars: 1,072 - Forks: 58
tinyfish-io/agentql
AgentQL is a suite of tools for connecting your AI to the web. Featuring a query language and Playwright integrations for interacting with elements and extracting data quickly, precisely, and at scale. Includes REST API, Python and JavaScript SDKs, browser debugger.
Language: Python - Size: 868 KB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 958 - Forks: 120
0x676e67/rnet
A blazing-fast Python HTTP Client with TLS fingerprint
Language: Rust - Size: 3.95 MB - Last synced at: 11 days ago - Pushed at: 12 days ago - Stars: 942 - Forks: 72
vprusso/youtube_tutorials
Collection of scripts corresponding to LucidProgramming YouTube tutorials
Language: Python - Size: 955 KB - Last synced at: 26 days ago - Pushed at: about 3 years ago - Stars: 926 - Forks: 954
saifyxpro/HeadlessX
A lightweight, self-hosted headless browser automation platform. Designed as an alternative to Browserless, built for speed, privacy, and scalability.
Language: JavaScript - Size: 3.69 MB - Last synced at: 16 days ago - Pushed at: 17 days ago - Stars: 898 - Forks: 120
Kaliiiiiiiiii-Vinyzu/patchright-python
Undetected Python version of the Playwright testing and automation library.
Language: Python - Size: 40 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 863 - Forks: 63
kaliiiiiiiiii/Selenium-Driverless
a stealthy browser automation framework
Language: Python - Size: 19.1 MB - Last synced at: 4 days ago - Pushed at: 6 months ago - Stars: 831 - Forks: 83
gildas-lormeau/single-file-cli
CLI tool for saving a faithful copy of a complete web page in a single HTML file (based on SingleFile)
Language: JavaScript - Size: 5.16 MB - Last synced at: 5 months ago - Pushed at: 7 months ago - Stars: 830 - Forks: 83
postmodern/spidr
A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.
Language: Ruby - Size: 687 KB - Last synced at: 8 days ago - Pushed at: 4 months ago - Stars: 826 - Forks: 109
DataHenHQ/till
DataHen Till is a companion tool to your existing web scraper that instantly makes it scalable, maintainable, and more unblockable, with minimal code changes on your scraper. Integrates with any scraper in 5 minutes.
Language: Go - Size: 2.04 MB - Last synced at: 8 months ago - Pushed at: almost 4 years ago - Stars: 814 - Forks: 22
lit26/finvizfinance
Finviz analysis python library.
Language: Jupyter Notebook - Size: 6.92 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 790 - Forks: 126
je-suis-tm/web-scraping
Detailed web scraping tutorials for dummies with financial data crawlers on Reddit WallStreetBets, CME (both options and futures), US Treasury, CFTC, LME, MacroTrends, SHFE and alternative data crawlers on Tomtom, BBC, Wall Street Journal, Al Jazeera, Reuters, Financial Times, Bloomberg, CNN, Fortune, The Economist
Language: Python - Size: 1.88 MB - Last synced at: 5 months ago - Pushed at: almost 4 years ago - Stars: 787 - Forks: 177
scrapfly/scrapfly-scrapers
Scalable Python web scraping scripts for 40+ popular domains
Language: Python - Size: 6.6 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 732 - Forks: 158
serpapi/google-search-results-python
Google Search Results via SERP API pip Python Package
Language: Python - Size: 237 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 710 - Forks: 116
alecxe/scrapy-fake-useragent
Random User-Agent middleware based on fake-useragent
Language: Python - Size: 54.7 KB - Last synced at: 12 days ago - Pushed at: about 2 years ago - Stars: 693 - Forks: 96
dinubs/coolqlcool
Nextjs server to query websites with GraphQL
Language: JavaScript - Size: 4.22 MB - Last synced at: 5 months ago - Pushed at: almost 3 years ago - Stars: 631 - Forks: 48
achuthasubhash/Complete-Life-Cycle-of-a-Data-Science-Project
Complete-Life-Cycle-of-a-Data-Science-Project
Size: 156 MB - Last synced at: 6 months ago - Pushed at: over 1 year ago - Stars: 609 - Forks: 250
z0m31en7/Uscrapper
Uscrapper Vanta: Dive deeper into the web with this powerful open-source tool. Extract valuable insights with ease and efficiency, from both surface and deep web sources. Empower your data mining and analysis with Vanta's advanced capabilities. Fast, reliable, and user-friendly, Uscrapper Vanta is the ultimate choice for researchers and analysts.
Language: Python - Size: 438 KB - Last synced at: 5 months ago - Pushed at: 11 months ago - Stars: 603 - Forks: 62
oxylabs/how-to-scrape-google-scholar
A guide for extracting titles, authors, and citations from Google Scholar using Python and Oxylabs SERP Scraper API.
Language: Python - Size: 295 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 587 - Forks: 8
spekulatius/PHPScraper
A universal web-util for PHP.
Language: PHP - Size: 6.53 MB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 561 - Forks: 76
programminghistorian/jekyll
Jekyll-based static site for The Programming Historian
Language: HTML - Size: 955 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 538 - Forks: 229
oxylabs/quick-start-guide
Python quick start guides to get the most out of Oxylabs' Web Scraper API free trial.
Size: 59.6 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 516 - Forks: 3
jaebradley/basketball_reference_web_scraper
NBA Stats API via Basketball Reference
Language: HTML - Size: 19.4 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 513 - Forks: 124
oxylabs/how-to-scrape-amazon-prices
Code for extracting best-selling items, search results, and currently available deals from Amazon using Python and Oxylabs E-Commerce Scraper API.
Language: Python - Size: 83 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 513 - Forks: 6
ScrapeGraphAI/scrapecraft
🤖 AI-powered web scraping editor with visual workflow builder. Build, test & deploy web scrapers using natural language. Powered by ScrapeGraphAI & LangGraph.
Language: Python - Size: 46.7 MB - Last synced at: 8 days ago - Pushed at: 3 months ago - Stars: 507 - Forks: 85
AlexMathew/scrapple
A framework for creating semi-automatic web content extractors
Language: Python - Size: 1.15 MB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 502 - Forks: 41
austinoboyle/scrape-linkedin-selenium
`scrape_linkedin` is a python package that allows you to scrape personal LinkedIn profiles & company pages - turning the data into structured json.
Language: HTML - Size: 269 KB - Last synced at: 5 months ago - Pushed at: about 3 years ago - Stars: 490 - Forks: 166
shaikhsajid1111/social-media-profile-scrapers
Fetch user's data across social media
Language: Python - Size: 4.5 MB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 478 - Forks: 77
VIDA-NYU/ache
ACHE is a web crawler for domain-specific search.
Language: Java - Size: 66.6 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 472 - Forks: 133
sangaline/wayback-machine-scraper
A command-line utility and Scrapy middleware for scraping time series data from Archive.org's Wayback Machine.
Language: Python - Size: 82 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 454 - Forks: 81
Kaliiiiiiiiii-Vinyzu/patchright-nodejs
Undetected NodeJS version of the Playwright testing and automation library.
Language: JavaScript - Size: 72.3 KB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 443 - Forks: 29
davidteather/everything-web-scraping
Learn everything web scraping with David Teather Codes on YouTube
Language: HTML - Size: 7.6 MB - Last synced at: 6 days ago - Pushed at: over 2 years ago - Stars: 431 - Forks: 86
roniemartinez/dude 📦
dude uncomplicated data extraction: A simple framework for writing web scrapers using Python decorators
Language: Python - Size: 2.49 MB - Last synced at: 3 days ago - Pushed at: 8 months ago - Stars: 428 - Forks: 19
flairNLP/fundus
A very simple news crawler with a funny name
Language: Python - Size: 22.4 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 415 - Forks: 106
yusuzech/r-web-scraping-cheat-sheet
Guide, reference and cheatsheet on web scraping using rvest, httr and Rselenium.
Language: R - Size: 2.89 MB - Last synced at: 6 months ago - Pushed at: almost 3 years ago - Stars: 394 - Forks: 104
orangecoding/fredy
❤️ Fredy - [F]ind [R]eal [E]state [D]amn Eas[y] - Fredy keeps searching for new apartments, houses, and flats in Germany on platforms like ImmoScout24, Immowelt, Immonet, eBay Kleinanzeigen, and WG-Gesucht and instantly delivers the results to you via Slack, Telegram, Email, Discord or ntfy, so you can focus on the more important things in life ;)
Language: JavaScript - Size: 6.37 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 383 - Forks: 102
lkuffo/web-scraping
More than 50 web scraping examples using: Requests | Scrapy | Selenium | LXML | BeautifulSoup
Language: Python - Size: 60.5 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 377 - Forks: 215
deedy5/primp
🪞PRIMP (Python Requests IMPersonate). The fastest python HTTP client that can impersonate web browsers
Language: Python - Size: 2.4 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 373 - Forks: 44
graphlit/graphlit-mcp-server
Model Context Protocol (MCP) Server for Graphlit Platform
Language: TypeScript - Size: 376 KB - Last synced at: 12 days ago - Pushed at: 26 days ago - Stars: 368 - Forks: 48
crwlrsoft/crawler
Library for Rapid (Web) Crawler and Scraper Development
Language: PHP - Size: 1.02 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 364 - Forks: 13
City-Bureau/city-scrapers
Scrape, standardize and share public meetings from local government websites
Language: HTML - Size: 11.1 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 362 - Forks: 316
sdl60660/letterboxd_recommendations
Scraping publicly-accessible Letterboxd data and creating a movie recommendation model with it that can generate recommendations when provided with a Letterboxd username
Language: Python - Size: 7.29 GB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 354 - Forks: 33
serpapi/nokolexbor
High-performance HTML5 parser for Ruby based on Lexbor, with support for both CSS selectors and XPath.
Language: C - Size: 665 KB - Last synced at: 8 days ago - Pushed at: 6 months ago - Stars: 347 - Forks: 6
shaikhsajid1111/twitter-scraper-selenium
Python package to scrape Twitter's front-end easily
Language: Python - Size: 127 KB - Last synced at: 5 months ago - Pushed at: about 1 year ago - Stars: 337 - Forks: 55
yaroslaff/nudecrawler
Crawl telegra.ph searching for nudes!
Language: Python - Size: 212 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 334 - Forks: 27
buyukakyuz/email-sleuth
Discover and verify professional emails using names + domains (Rust-based tool with SMTP, DNS, scraping, and scoring)
Language: Rust - Size: 155 KB - Last synced at: 22 days ago - Pushed at: 6 months ago - Stars: 333 - Forks: 42
oxylabs/oxylabs-ai-studio-py
Oxylabs AI Studio python SDK
Language: Python - Size: 2.26 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 323 - Forks: 0
walissonsilva/web-scraping-python
🌐 Repository with the content (slides, examples, code) from the YouTube video series on Web Scraping with Python.
Language: Python - Size: 913 KB - Last synced at: 7 months ago - Pushed at: over 4 years ago - Stars: 321 - Forks: 55
s0rg/crawley
The unix-way web crawler
Language: Go - Size: 216 KB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 315 - Forks: 19
web-agent-master/google-search
A Playwright-based Node.js tool that bypasses search engine anti-scraping mechanisms to execute Google searches. Local alternative to SERP APIs with MCP server integration.
Language: TypeScript - Size: 121 KB - Last synced at: about 2 months ago - Pushed at: 7 months ago - Stars: 312 - Forks: 55
infinilabs/crawler
🕷️ An easy-to-use spider written in Golang. (previously named GOPA.)
Language: Go - Size: 54.6 MB - Last synced at: 17 days ago - Pushed at: over 4 years ago - Stars: 309 - Forks: 82
scrapehero-code/amazon-scraper
A simple web scraper to extract Product Data and Pricing from Amazon
Language: Python - Size: 16.6 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 307 - Forks: 156
passivebot/facebook-marketplace-scraper 📦
This repository contains a script to scrape Facebook Marketplace data using Playwright, BeautifulSoup and Streamlit.
Language: Python - Size: 664 KB - Last synced at: 6 months ago - Pushed at: over 1 year ago - Stars: 302 - Forks: 86
oxylabs/Python-Web-Scraping-Tutorial
In this Python Web Scraping Tutorial, we will outline everything needed to get started with web scraping. We will begin with simple examples and move on to relatively more complex.
Language: Python - Size: 117 KB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 294 - Forks: 32
currentslab/extractnet
A fork of Dragnet that also extracts author, headline, date, and keywords from context, with built-in metadata extraction, all in one package
Language: HTML - Size: 421 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 294 - Forks: 24
n0kovo/fb_friend_list_scraper
OSINT tool to scrape names and usernames from large friend lists on Facebook, without being rate limited.
Language: Python - Size: 62.5 KB - Last synced at: about 1 month ago - Pushed at: almost 3 years ago - Stars: 292 - Forks: 25
vakhov/fresh-proxy-list
Provides a list of fresh, working proxy servers (HTTP, HTTPS, SOCKS4 & SOCKS5) with multiple formats available for download.
Language: PHP - Size: 45.3 GB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 289 - Forks: 25
tuhinpal/imdb-api 📦
Serverless IMDB API powered by Cloudflare Worker
Language: JavaScript - Size: 148 KB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 289 - Forks: 326
csu/quora-api
An unofficial API for Quora.
Language: Python - Size: 141 KB - Last synced at: over 1 year ago - Pushed at: about 9 years ago - Stars: 287 - Forks: 65
vdutts7/gpt4V-scraper
AI agent that can SEE 👁️, control, navigate, & do stuff for you on your browser.
Language: JavaScript - Size: 10.4 MB - Last synced at: 6 months ago - Pushed at: over 1 year ago - Stars: 284 - Forks: 24
amoudgl/short-jokes-dataset
Python scripts for building 'Short Jokes' dataset, featured on Kaggle
Language: Python - Size: 33.7 MB - Last synced at: 7 months ago - Pushed at: about 5 years ago - Stars: 275 - Forks: 74