Topic: "crawler-python"
cuiyuheng/crawlee-python Fork of apify/crawlee-python
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.
Size: 21 MB - Last synced at: 8 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0
Viper373/LOL-DataAnalytics
Tencent Games: comprehensive data analysis and prediction for League of Legends tournaments, 2020/21/22
Language: Jupyter Notebook - Size: 5.14 MB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0
Zepolimer/python-crawler
Python crawler - implementing Google and Bing browsers
Language: Python - Size: 6.84 KB - Last synced at: 6 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0
zezs/Ice-breaker-powered-by-LLM
Ice Breaker is a comprehensive full-stack app leveraging generative AI and LangChain to find LinkedIn profiles and generate engaging ice breakers. LangChain ReAct agents ensure accurate URL retrieval and JSON cleaning, identifying a summary, facts, topics, and ice breakers. The frontend is built with HTML/CSS, and Flask powers the backend.
Language: Python - Size: 50.8 KB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 1
Kathange/crawler_for_unsplash
Language: Python - Size: 278 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0
stylepatrick/point-staking-monitor
Point Staking Monitor with Telegram notifications, implemented as a web crawler. Can be used for every Cryptocurrency Explorer.
Language: Python - Size: 4.88 KB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0
Viper373/Chengdu-Emotion
Text clustering and sentiment analysis of comments on the song 《成都》 ("Chengdu") on NetEase Cloud Music
Language: Jupyter Notebook - Size: 21.9 MB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0
Williams-Media/Exipred-Domain-Finder
Python script to crawl a website and see if it links to any expired domains.
Language: Python - Size: 841 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0
Mr0Wido/commoncrawl.py
This Python script is a multi-threaded tool for retrieving data from the CommonCrawl index. It allows you to specify a domain or a list of domains, and it will retrieve all URLs associated with those domains that are indexed by CommonCrawl.
Language: Python - Size: 3.91 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0
LuisSanchez/python-crawler-playwright
Web crawler for company links on LinkedIn, with CSV management.
Language: Python - Size: 25.4 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0
AnthonyWu0709/Safebooru_Simple_Crawler
A simple image crawler written with Selenium
Language: Python - Size: 6.84 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0
lokhiufung/webscraping-buddy
Web scrapers for Instagram, XHS, and investor contacts
Language: Python - Size: 83 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 1
ShayaanKhan/Web-Crawler 📦
Web crawler for emails
Language: Python - Size: 6.18 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0
codeeeep/HDU-ACM
Crawl the HDU online judge (OJ) problem set with Python
Language: Python - Size: 364 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0
visiontechventures/batchdownload
Scripts for webscraping. Scrape all URLs, images and files from a website.
Language: Python - Size: 91.8 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0
Ptttt001/weather_bot
Python web crawler and LINE API application
Language: Jupyter Notebook - Size: 5.86 KB - Last synced at: over 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0
Haein34/GooglePlayStore_Crawler
GooglePlayStore Crawler
Language: Jupyter Notebook - Size: 7.81 KB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0
RanaPrince/Extracting-Movie-Database
Extracting movie data using an API key, endpoint, and Requests
Language: Python - Size: 21.5 KB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0
raylan-oliveira/jsonAnalytic
jsonAnalytic - List all keys & all values in json
Language: Python - Size: 24.4 KB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0
ZGG2016/python-crawler-tutorial-itcast
Documentation and source code for the itcast (传智播客) Python crawler tutorial
Language: Jupyter Notebook - Size: 737 KB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0
Sammeeey/schmerztin
Crawler that checks for available appointments and sends a Telegram notification when an earlier appointment becomes available
Language: Python - Size: 7.81 KB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0
EesunMoon/spam_review_detection
[Project] Capstone Design - Spam Detection
Language: Python - Size: 2.21 MB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0
KrSuma/Cafe24-Crawler
A crawler that mines all product collections and their information from web stores created on the Cafe24 platform.
Language: Python - Size: 12.7 KB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0
maancham/IMDB-Crawler
Automatic IMDB movies, ratings, and reviews crawler
Language: Python - Size: 6.43 MB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0
ga-felix/twittery
Language: Python - Size: 19 MB - Last synced at: over 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0
KruglikDev/Scrapy-Patent-Parser
Patent parser (Python + Scrapy)
Language: Python - Size: 6.84 KB - Last synced at: over 2 years ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0
hyn0027/Python-web-crawler-for-Bilibili
Python short-semester project: Bilibili crawler and website
Language: HTML - Size: 1.88 MB - Last synced at: over 2 years ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0
moj124/web_crawler
The web_crawler is an asynchronous gevent link crawler that maps all the associated local links constrained by the input webpage URL.
Language: Python - Size: 758 KB - Last synced at: 8 months ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0
snowood1/Corpora-for-Conflict-Study
Language: Jupyter Notebook - Size: 10.7 MB - Last synced at: over 2 years ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0
anonyxhappie/crawlers
This repo contains web-crawling scripts
Language: Python - Size: 48.8 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 3
Kyurenpoto/maple-inven-crawler
MapleStory Inven crawler
Language: Python - Size: 458 KB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0
liyt96/web-crawler
A web crawler that is easy to use and follows politeness policies.
Language: Python - Size: 23.4 KB - Last synced at: 2 months ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0
javiergayala/SiteGloop
A python-based web scraper/snapshot tool.
Language: Python - Size: 732 KB - Last synced at: over 2 years ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0
everbrightw/MovieWebScraper
Scraping famous movies and actors from wiki pages with BeautifulSoup, and using Flask to provide APIs
Language: Python - Size: 2.78 MB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 0 - Forks: 0