An open API service providing repository metadata for many open source software ecosystems.

Topic: "web-scraping"

firecrawl/firecrawl

🔥 The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data

Language: TypeScript - Size: 74.6 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 65,512 - Forks: 5,162

scrapy/scrapy

Scrapy, a fast high-level web crawling & scraping framework for Python.

Language: Python - Size: 27.6 MB - Last synced at: 7 days ago - Pushed at: 8 days ago - Stars: 58,754 - Forks: 11,131

Mintplex-Labs/anything-llm

The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, No-code agent builder, MCP compatibility, and more.

Language: JavaScript - Size: 47.5 MB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 50,561 - Forks: 5,332

dgtlmoon/changedetection.io

Best and simplest tool for website change detection, web page monitoring, and website change alerts. Perfect for tracking content changes, price drops, restock alerts, and website defacement monitoring—all for free or enjoy our SaaS plan!

Language: Python - Size: 11.7 MB - Last synced at: about 17 hours ago - Pushed at: about 19 hours ago - Stars: 28,307 - Forks: 1,574

ScrapeGraphAI/Scrapegraph-ai

Python scraper based on AI

Language: Python - Size: 15.2 MB - Last synced at: 3 days ago - Pushed at: 10 days ago - Stars: 21,664 - Forks: 1,872

apify/crawlee

Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.

Language: TypeScript - Size: 144 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 20,257 - Forks: 1,054

Evil0ctal/Douyin_TikTok_Download_API

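🚀 Douyin_TikTok_Download_API is an out-of-the-box, high-performance asynchronous data scraping tool for Douyin, Kuaishou, TikTok, and Bilibili, supporting API calls and online batch parsing and downloading.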

Language: Python - Size: 8.5 MB - Last synced at: 30 days ago - Pushed at: about 1 month ago - Stars: 14,460 - Forks: 2,134

getmaxun/maxun

⚡ Easiest no code web data extraction platform • Instantly turn any website into API or spreadsheet ⚡

Language: TypeScript - Size: 5.27 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 13,802 - Forks: 1,123

seleniumbase/SeleniumBase

Python APIs for web automation, testing, and bypassing bot-detection.

Language: Python - Size: 14 MB - Last synced at: about 22 hours ago - Pushed at: about 24 hours ago - Stars: 11,817 - Forks: 1,436

D4Vinci/Scrapling

🕷️ An undetectable, powerful, flexible, high-performance Python library to make Web Scraping Easy and Effortless as it should be!

Language: Python - Size: 3.94 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 8,054 - Forks: 458

mherrmann/helium

Lighter web automation with Python

Language: Python - Size: 39.5 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 8,038 - Forks: 501

lorien/awesome-web-scraping

List of libraries, tools and APIs for web scraping and data processing.

Language: Makefile - Size: 427 KB - Last synced at: 12 days ago - Pushed at: 20 days ago - Stars: 7,384 - Forks: 825

apify/crawlee-python

Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.

Language: Python - Size: 32.3 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 7,064 - Forks: 508

alirezamika/autoscraper

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

Language: Python - Size: 132 KB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 6,937 - Forks: 709

go-rod/rod

A Chrome DevTools Protocol driver for web automation and scraping.

Language: Go - Size: 3.61 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 6,285 - Forks: 414

autoscrape-labs/pydoll

Pydoll is a library for automating chromium-based browsers without a WebDriver, offering realistic interactions.

Language: Python - Size: 15 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 5,874 - Forks: 319

adbar/trafilatura

Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML

Language: Python - Size: 33.8 MB - Last synced at: 9 days ago - Pushed at: about 2 months ago - Stars: 4,829 - Forks: 322

firecrawl/firecrawl-mcp-server

🔥 Official Firecrawl MCP Server - Adds powerful web scraping and search to Cursor, Claude and any other LLM clients.

Language: JavaScript - Size: 2.96 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 4,740 - Forks: 508

lexiforest/curl_cffi

Python binding for curl-impersonate fork via cffi. An HTTP client that can impersonate browser TLS/JA3/HTTP2 fingerprints.

Language: Python - Size: 1.61 MB - Last synced at: 29 days ago - Pushed at: about 1 month ago - Stars: 4,257 - Forks: 398

jaypyles/Scraperr

Self-hosted webscraper.

Language: TypeScript - Size: 2.42 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 4,100 - Forks: 184

snooppr/snoop

Snoop: a reconnaissance tool based on open data (OSINT world)

Language: Python - Size: 60.6 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 3,475 - Forks: 397

php-curl-class/php-curl-class

PHP Curl Class makes it easy to send HTTP requests and integrate with web APIs

Language: PHP - Size: 2.82 MB - Last synced at: about 13 hours ago - Pushed at: about 14 hours ago - Stars: 3,300 - Forks: 819

x4nth055/pythoncode-tutorials

The Python Code Tutorials

Language: Jupyter Notebook - Size: 321 MB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 2,918 - Forks: 1,992

lorien/grab

Web Scraping Framework

Language: Python - Size: 9.7 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 2,416 - Forks: 275

gosom/google-maps-scraper

Scrape data from Google Maps. Extracts data such as the name, address, phone number, website URL, rating, number of reviews, latitude and longitude, reviews, email, and more for each place.

Language: Go - Size: 21.6 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 2,380 - Forks: 308

codingforentrepreneurs/30-Days-of-Python

Learn Python for the next 30 (or so) Days.

Language: HTML - Size: 61.4 MB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 2,179 - Forks: 1,336

yusufkaraaslan/Skill_Seekers

Single powerful tool to convert ANY documentation website into a Claude skill

Language: Python - Size: 524 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 1,938 - Forks: 192

oxylabs/amazon-scraper

Free Trial Amazon Scraper API for extracting search, product, offer listing, reviews, questions and answers, best sellers, and sellers data.

Language: Python - Size: 576 KB - Last synced at: 7 days ago - Pushed at: about 1 month ago - Stars: 1,813 - Forks: 64

A9T9/RPA

Ui.Vision Open-Source RPA Software with Computer Vision, OCR, Anthropic Computer Use/LLM. Selenium IDE import/export.

Language: JavaScript - Size: 13.1 MB - Last synced at: 3 days ago - Pushed at: 6 months ago - Stars: 1,759 - Forks: 364

Kaliiiiiiiiii-Vinyzu/patchright

Undetected version of the Playwright testing and automation library.

Language: JavaScript - Size: 5.09 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 1,684 - Forks: 62

justmarkham/DAT8

General Assembly's 2015 Data Science course in Washington, DC

Language: Jupyter Notebook - Size: 23 MB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 1,613 - Forks: 1,067

brightdata/brightdata-mcp

A powerful Model Context Protocol (MCP) server that provides an all-in-one solution for public web access.

Language: JavaScript - Size: 63.6 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 1,511 - Forks: 200

tidyverse/rvest

Simple web scraping for R

Language: R - Size: 12.8 MB - Last synced at: 2 days ago - Pushed at: about 2 months ago - Stars: 1,507 - Forks: 350

rushter/selectolax

Python binding to Modest and Lexbor engines. Fast HTML5 parser with CSS selectors.

Language: Cython - Size: 476 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1,432 - Forks: 81

roach-php/core

The complete web scraping toolkit for PHP.

Language: PHP - Size: 787 KB - Last synced at: 15 days ago - Pushed at: 27 days ago - Stars: 1,429 - Forks: 77

oxylabs/free-proxy-list

Claim Free proxy list with United States IP addresses and use it for your projects.

Size: 4.86 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1,244 - Forks: 8

juancarlospaco/faster-than-requests

Faster requests on Python 3

Language: Nim - Size: 20.1 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 1,127 - Forks: 92

Decodo/Decodo

HTTP(S)/SOCKS5 rotating residential proxies - code examples & general information.

Language: Java - Size: 320 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1,125 - Forks: 43

intoli/user-agents

A JavaScript library for generating random user agents with data that's updated daily.

Language: TypeScript - Size: 557 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1,103 - Forks: 54

rebrowser/rebrowser-patches

Collection of patches for puppeteer and playwright to avoid automation detection and leaks. Helps to avoid Cloudflare and DataDome CAPTCHA pages. Easy to patch/unpatch, can be enabled/disabled on demand.

Language: JavaScript - Size: 79.1 KB - Last synced at: 13 days ago - Pushed at: 6 months ago - Stars: 1,072 - Forks: 58

tinyfish-io/agentql

AgentQL is a suite of tools for connecting your AI to the web. Featuring a query language and Playwright integrations for interacting with elements and extracting data quickly, precisely, and at scale. Includes REST API, Python and JavaScript SDKs, browser debugger.

Language: Python - Size: 868 KB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 958 - Forks: 120

0x676e67/rnet

A blazing-fast Python HTTP Client with TLS fingerprint

Language: Rust - Size: 3.95 MB - Last synced at: 11 days ago - Pushed at: 12 days ago - Stars: 942 - Forks: 72

vprusso/youtube_tutorials

Collection of scripts corresponding to LucidProgramming YouTube tutorials

Language: Python - Size: 955 KB - Last synced at: 26 days ago - Pushed at: about 3 years ago - Stars: 926 - Forks: 954

saifyxpro/HeadlessX

A lightweight, self-hosted headless browser automation platform. Designed as an alternative to Browserless, built for speed, privacy, and scalability.

Language: JavaScript - Size: 3.69 MB - Last synced at: 16 days ago - Pushed at: 17 days ago - Stars: 898 - Forks: 120

Kaliiiiiiiiii-Vinyzu/patchright-python

Undetected Python version of the Playwright testing and automation library.

Language: Python - Size: 40 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 863 - Forks: 63

kaliiiiiiiiii/Selenium-Driverless

a stealthy browser automation framework

Language: Python - Size: 19.1 MB - Last synced at: 4 days ago - Pushed at: 6 months ago - Stars: 831 - Forks: 83

gildas-lormeau/single-file-cli

CLI tool for saving a faithful copy of a complete web page in a single HTML file (based on SingleFile)

Language: JavaScript - Size: 5.16 MB - Last synced at: 5 months ago - Pushed at: 7 months ago - Stars: 830 - Forks: 83

postmodern/spidr

A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.

Language: Ruby - Size: 687 KB - Last synced at: 8 days ago - Pushed at: 4 months ago - Stars: 826 - Forks: 109

DataHenHQ/till

DataHen Till is a companion tool to your existing web scraper that instantly makes it scalable, maintainable, and more unblockable, with minimal code changes on your scraper. Integrates with any scraper in 5 minutes.

Language: Go - Size: 2.04 MB - Last synced at: 8 months ago - Pushed at: almost 4 years ago - Stars: 814 - Forks: 22

lit26/finvizfinance

Finviz analysis python library.

Language: Jupyter Notebook - Size: 6.92 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 790 - Forks: 126

je-suis-tm/web-scraping

Detailed web scraping tutorials for dummies with financial data crawlers on Reddit WallStreetBets, CME (both options and futures), US Treasury, CFTC, LME, MacroTrends, SHFE and alternative data crawlers on Tomtom, BBC, Wall Street Journal, Al Jazeera, Reuters, Financial Times, Bloomberg, CNN, Fortune, The Economist

Language: Python - Size: 1.88 MB - Last synced at: 5 months ago - Pushed at: almost 4 years ago - Stars: 787 - Forks: 177

scrapfly/scrapfly-scrapers

Scalable Python web scraping scripts for 40+ popular domains

Language: Python - Size: 6.6 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 732 - Forks: 158

serpapi/google-search-results-python

Google Search Results via SERP API pip Python Package

Language: Python - Size: 237 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 710 - Forks: 116

alecxe/scrapy-fake-useragent

Random User-Agent middleware based on fake-useragent

Language: Python - Size: 54.7 KB - Last synced at: 12 days ago - Pushed at: about 2 years ago - Stars: 693 - Forks: 96

dinubs/coolqlcool

Nextjs server to query websites with GraphQL

Language: JavaScript - Size: 4.22 MB - Last synced at: 5 months ago - Pushed at: almost 3 years ago - Stars: 631 - Forks: 48

achuthasubhash/Complete-Life-Cycle-of-a-Data-Science-Project

Complete-Life-Cycle-of-a-Data-Science-Project

Size: 156 MB - Last synced at: 6 months ago - Pushed at: over 1 year ago - Stars: 609 - Forks: 250

z0m31en7/Uscrapper

Uscrapper Vanta: Dive deeper into the web with this powerful open-source tool. Extract valuable insights with ease and efficiency, from both surface and deep web sources. Empower your data mining and analysis with Vanta's advanced capabilities. Fast, reliable, and user-friendly, Uscrapper Vanta is the ultimate choice for researchers and analysts.

Language: Python - Size: 438 KB - Last synced at: 5 months ago - Pushed at: 11 months ago - Stars: 603 - Forks: 62

oxylabs/how-to-scrape-google-scholar

A guide for extracting titles, authors, and citations from Google Scholar using Python and Oxylabs SERP Scraper API.

Language: Python - Size: 295 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 587 - Forks: 8

spekulatius/PHPScraper

A universal web-util for PHP.

Language: PHP - Size: 6.53 MB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 561 - Forks: 76

programminghistorian/jekyll

Jekyll-based static site for The Programming Historian

Language: HTML - Size: 955 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 538 - Forks: 229

oxylabs/quick-start-guide

Python quick start guides to get the most out of Oxylabs' Web Scraper API free trial.

Size: 59.6 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 516 - Forks: 3

jaebradley/basketball_reference_web_scraper

NBA Stats API via Basketball Reference

Language: HTML - Size: 19.4 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 513 - Forks: 124

oxylabs/how-to-scrape-amazon-prices

Code for extracting best-selling items, search results, and currently available deals from Amazon using Python and Oxylabs E-Commerce Scraper API.

Language: Python - Size: 83 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 513 - Forks: 6

ScrapeGraphAI/scrapecraft

🤖 AI-powered web scraping editor with visual workflow builder. Build, test & deploy web scrapers using natural language. Powered by ScrapeGraphAI & LangGraph.

Language: Python - Size: 46.7 MB - Last synced at: 8 days ago - Pushed at: 3 months ago - Stars: 507 - Forks: 85

AlexMathew/scrapple

A framework for creating semi-automatic web content extractors

Language: Python - Size: 1.15 MB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 502 - Forks: 41

austinoboyle/scrape-linkedin-selenium

`scrape_linkedin` is a python package that allows you to scrape personal LinkedIn profiles & company pages - turning the data into structured json.

Language: HTML - Size: 269 KB - Last synced at: 5 months ago - Pushed at: about 3 years ago - Stars: 490 - Forks: 166

shaikhsajid1111/social-media-profile-scrapers

Fetch user's data across social media

Language: Python - Size: 4.5 MB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 478 - Forks: 77

VIDA-NYU/ache

ACHE is a web crawler for domain-specific search.

Language: Java - Size: 66.6 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 472 - Forks: 133

sangaline/wayback-machine-scraper

A command-line utility and Scrapy middleware for scraping time series data from Archive.org's Wayback Machine.

Language: Python - Size: 82 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 454 - Forks: 81

Kaliiiiiiiiii-Vinyzu/patchright-nodejs

Undetected NodeJS version of the Playwright testing and automation library.

Language: JavaScript - Size: 72.3 KB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 443 - Forks: 29

davidteather/everything-web-scraping

Learn everything web scraping with David Teather Codes on YouTube

Language: HTML - Size: 7.6 MB - Last synced at: 6 days ago - Pushed at: over 2 years ago - Stars: 431 - Forks: 86

roniemartinez/dude 📦

dude uncomplicated data extraction: A simple framework for writing web scrapers using Python decorators

Language: Python - Size: 2.49 MB - Last synced at: 3 days ago - Pushed at: 8 months ago - Stars: 428 - Forks: 19

flairNLP/fundus

A very simple news crawler with a funny name

Language: Python - Size: 22.4 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 415 - Forks: 106

yusuzech/r-web-scraping-cheat-sheet

Guide, reference and cheatsheet on web scraping using rvest, httr and RSelenium.

Language: R - Size: 2.89 MB - Last synced at: 6 months ago - Pushed at: almost 3 years ago - Stars: 394 - Forks: 104

orangecoding/fredy

❤️ Fredy - [F]ind [R]eal [E]state [D]amn Eas[y] - Fredy keeps searching for new apartments, houses, and flats in Germany on platforms like ImmoScout24, Immowelt, Immonet, eBay Kleinanzeigen, and WG-Gesucht and instantly delivers the results to you via Slack, Telegram, Email, Discord or ntfy, so you can focus on the more important things in life ;)

Language: JavaScript - Size: 6.37 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 383 - Forks: 102

lkuffo/web-scraping

More than 50 web scraping examples using: Requests | Scrapy | Selenium | LXML | BeautifulSoup

Language: Python - Size: 60.5 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 377 - Forks: 215

deedy5/primp

🪞PRIMP (Python Requests IMPersonate). The fastest python HTTP client that can impersonate web browsers

Language: Python - Size: 2.4 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 373 - Forks: 44

graphlit/graphlit-mcp-server

Model Context Protocol (MCP) Server for Graphlit Platform

Language: TypeScript - Size: 376 KB - Last synced at: 12 days ago - Pushed at: 26 days ago - Stars: 368 - Forks: 48

crwlrsoft/crawler

Library for Rapid (Web) Crawler and Scraper Development

Language: PHP - Size: 1.02 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 364 - Forks: 13

City-Bureau/city-scrapers

Scrape, standardize and share public meetings from local government websites

Language: HTML - Size: 11.1 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 362 - Forks: 316

sdl60660/letterboxd_recommendations

Scraping publicly-accessible Letterboxd data and creating a movie recommendation model with it that can generate recommendations when provided with a Letterboxd username

Language: Python - Size: 7.29 GB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 354 - Forks: 33

serpapi/nokolexbor

High-performance HTML5 parser for Ruby based on Lexbor, with support for both CSS selectors and XPath.

Language: C - Size: 665 KB - Last synced at: 8 days ago - Pushed at: 6 months ago - Stars: 347 - Forks: 6

shaikhsajid1111/twitter-scraper-selenium

Python package to scrape Twitter's front-end easily

Language: Python - Size: 127 KB - Last synced at: 5 months ago - Pushed at: about 1 year ago - Stars: 337 - Forks: 55

yaroslaff/nudecrawler

Crawl telegra.ph searching for nudes!

Language: Python - Size: 212 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 334 - Forks: 27

buyukakyuz/email-sleuth

Discover and verify professional emails using names + domains (Rust-based tool with SMTP, DNS, scraping, and scoring)

Language: Rust - Size: 155 KB - Last synced at: 22 days ago - Pushed at: 6 months ago - Stars: 333 - Forks: 42

oxylabs/oxylabs-ai-studio-py

Oxylabs AI Studio python SDK

Language: Python - Size: 2.26 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 323 - Forks: 0

walissonsilva/web-scraping-python

🌐 Repository with the content (slides, examples, code) from the YouTube video series on Web Scraping with Python.

Language: Python - Size: 913 KB - Last synced at: 7 months ago - Pushed at: over 4 years ago - Stars: 321 - Forks: 55

s0rg/crawley

The unix-way web crawler

Language: Go - Size: 216 KB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 315 - Forks: 19

web-agent-master/google-search

A Playwright-based Node.js tool that bypasses search engine anti-scraping mechanisms to execute Google searches. Local alternative to SERP APIs with MCP server integration.

Language: TypeScript - Size: 121 KB - Last synced at: about 2 months ago - Pushed at: 7 months ago - Stars: 312 - Forks: 55

infinilabs/crawler

🕷️ An easy-to-use spider written in Golang. (Previously named GOPA.)

Language: Go - Size: 54.6 MB - Last synced at: 17 days ago - Pushed at: over 4 years ago - Stars: 309 - Forks: 82

scrapehero-code/amazon-scraper

A simple web scraper to extract Product Data and Pricing from Amazon

Language: Python - Size: 16.6 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 307 - Forks: 156

passivebot/facebook-marketplace-scraper 📦

This repository contains a script to scrape Facebook Marketplace data using Playwright, BeautifulSoup and Streamlit.

Language: Python - Size: 664 KB - Last synced at: 6 months ago - Pushed at: over 1 year ago - Stars: 302 - Forks: 86

oxylabs/Python-Web-Scraping-Tutorial

In this Python Web Scraping Tutorial, we will outline everything needed to get started with web scraping. We will begin with simple examples and move on to relatively more complex ones.

Language: Python - Size: 117 KB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 294 - Forks: 32

currentslab/extractnet

A fork of Dragnet that also extracts author, headline, date, and keywords from context, as well as built-in metadata extraction, all in one package

Language: HTML - Size: 421 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 294 - Forks: 24

n0kovo/fb_friend_list_scraper

OSINT tool to scrape names and usernames from large friend lists on Facebook, without being rate limited.

Language: Python - Size: 62.5 KB - Last synced at: about 1 month ago - Pushed at: almost 3 years ago - Stars: 292 - Forks: 25

vakhov/fresh-proxy-list

Provides a list of fresh, working proxy servers (HTTP, HTTPS, SOCKS4 & SOCKS5) with multiple formats available for download.

Language: PHP - Size: 45.3 GB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 289 - Forks: 25

tuhinpal/imdb-api 📦

Serverless IMDB API powered by Cloudflare Worker

Language: JavaScript - Size: 148 KB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 289 - Forks: 326

csu/quora-api

An unofficial API for Quora.

Language: Python - Size: 141 KB - Last synced at: over 1 year ago - Pushed at: about 9 years ago - Stars: 287 - Forks: 65

vdutts7/gpt4V-scraper

AI agent that can SEE 👁️, control, navigate, & do stuff for you on your browser.

Language: JavaScript - Size: 10.4 MB - Last synced at: 6 months ago - Pushed at: over 1 year ago - Stars: 284 - Forks: 24

amoudgl/short-jokes-dataset

Python scripts for building 'Short Jokes' dataset, featured on Kaggle

Language: Python - Size: 33.7 MB - Last synced at: 7 months ago - Pushed at: about 5 years ago - Stars: 275 - Forks: 74