An open API service providing repository metadata for many open source software ecosystems.

Topic: "crawler-python"

cuiyuheng/crawlee-python Fork of apify/crawlee-python

Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.

Size: 21 MB - Last synced at: 8 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Viper373/LOL-DataAnalytics

Tencent Games: comprehensive analysis and prediction of League of Legends tournament data for 2020/2021/2022

Language: Jupyter Notebook - Size: 5.14 MB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Zepolimer/python-crawler

Python crawler - implementing Google and Bing browsers

Language: Python - Size: 6.84 KB - Last synced at: 6 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

zezs/Ice-breaker-powered-by-LLM

Ice Breaker is a comprehensive full-stack app leveraging generative AI and LangChain to find LinkedIn profiles and generate engaging ice breakers. LangChain ReAct agents ensure accurate URL retrieval and JSON cleaning, identifying a summary, facts, topics, and ice breakers. The frontend is built with HTML/CSS, and Flask powers the backend.

Language: Python - Size: 50.8 KB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 1

Kathange/crawler_for_unsplash

Language: Python - Size: 278 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

stylepatrick/point-staking-monitor

Point Staking Monitor with Telegram notification accomplished through a web crawler. Can be used for every Cryptocurrency Explorer.

Language: Python - Size: 4.88 KB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Viper373/Chengdu-Emotion

Text clustering and sentiment analysis of comments on the song "Chengdu" (《成都》) on NetEase Cloud Music

Language: Jupyter Notebook - Size: 21.9 MB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Williams-Media/Exipred-Domain-Finder

Python script to crawl a website and see if it links to any expired domains.

Language: Python - Size: 841 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

Mr0Wido/commoncrawl.py

This Python script is a multi-threaded tool for retrieving data from the CommonCrawl index. It allows you to specify a domain or a list of domains, and it will retrieve all URLs associated with those domains that are indexed by CommonCrawl.

Language: Python - Size: 3.91 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

LuisSanchez/python-crawler-playwright

Web crawler for company links on LinkedIn, with CSV management.

Language: Python - Size: 25.4 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

AnthonyWu0709/Safebooru_Simple_Crawler

A simple image crawler written with Selenium

Language: Python - Size: 6.84 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

lokhiufung/webscraping-buddy

Web scrapers for Instagram, XHS, and investor contacts

Language: Python - Size: 83 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 1

ShayaanKhan/Web-Crawler 📦

Web crawler for emails

Language: Python - Size: 6.18 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

codeeeep/HDU-ACM

Crawl the HDU OJ problem set using Python

Language: Python - Size: 364 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

visiontechventures/batchdownload

Scripts for webscraping. Scrape all URLs, images and files from a website.

Language: Python - Size: 91.8 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

Ptttt001/weather_bot

A Python web crawler and LINE API application

Language: Jupyter Notebook - Size: 5.86 KB - Last synced at: over 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

Haein34/GooglePlayStore_Crawler

GooglePlayStore Crawler

Language: Jupyter Notebook - Size: 7.81 KB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

RanaPrince/Extracting-Movie-Database

Extracting movie data using an API key, endpoint, and Requests

Language: Python - Size: 21.5 KB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

raylan-oliveira/jsonAnalytic

jsonAnalytic - List all keys & all values in json

Language: Python - Size: 24.4 KB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

ZGG2016/python-crawler-tutorial-itcast

Documentation and source code for the itcast (传智播客) Python crawler tutorial

Language: Jupyter Notebook - Size: 737 KB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

Sammeeey/schmerztin

Crawler that checks for available appointments and sends a Telegram notification when an earlier appointment becomes available

Language: Python - Size: 7.81 KB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

EesunMoon/spam_review_detection

[Project] Capstone Design - Spam Detection

Language: Python - Size: 2.21 MB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

KrSuma/Cafe24-Crawler

A crawler that mines all product collections and their information from web stores created on the Cafe24 platform.

Language: Python - Size: 12.7 KB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

maancham/IMDB-Crawler

Automatic IMDB movies, ratings, and reviews crawler

Language: Python - Size: 6.43 MB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

ga-felix/twittery

Language: Python - Size: 19 MB - Last synced at: over 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

KruglikDev/Scrapy-Patent-Parser

Patent parser (Python + Scrapy)

Language: Python - Size: 6.84 KB - Last synced at: over 2 years ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

hyn0027/Python-web-crawler-for-Bilibili

Python short-semester course project: Bilibili crawler and website

Language: HTML - Size: 1.88 MB - Last synced at: over 2 years ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

moj124/web_crawler

The web_crawler is an asynchronous gevent link crawler that maps all the associated local links constrained by the input webpage URL.

Language: Python - Size: 758 KB - Last synced at: 8 months ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

snowood1/Corpora-for-Conflict-Study

Language: Jupyter Notebook - Size: 10.7 MB - Last synced at: over 2 years ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

anonyxhappie/crawlers

This repo contains web-crawling scripts

Language: Python - Size: 48.8 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 3

Kyurenpoto/maple-inven-crawler

MapleStory Inven crawler

Language: Python - Size: 458 KB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

liyt96/web-crawler

A web crawler that is easy to use and follows politeness policies.

Language: Python - Size: 23.4 KB - Last synced at: 2 months ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0

javiergayala/SiteGloop

A Python-based web scraper/snapshot tool.

Language: Python - Size: 732 KB - Last synced at: over 2 years ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0

everbrightw/MovieWebScraper

Scraping famous movies and actors from a wiki page with BeautifulSoup, and using Flask to provide APIs

Language: Python - Size: 2.78 MB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 0 - Forks: 0