An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: web-spider

EDROOR27/musicer

Musicer is a small, convenient local music player: lightweight, practical, and designed for playing local audio. It supports a variety of common audio formats, so you can enjoy your music anytime, anywhere!

Language: Swift - Size: 1.89 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

kan01234/ur-web-spider

Web spider to scan UR available rooms and output them as CSV

Language: Python - Size: 53.2 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 6 - Forks: 1

thewebscraping/tls-requests

TLS Requests is a powerful Python library for secure HTTP requests, offering browser-like TLS client, fingerprinting, anti-bot page bypass, and high performance.

Language: Python - Size: 3.68 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 48 - Forks: 3

LOKESH-loky/Concurrent-Web-Crawler

The Concurrent Web Crawler is a Go-based application designed to crawl web pages efficiently using Go's powerful concurrency features.

Language: Go - Size: 12.7 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

projectdiscovery/katana

A next-generation crawling and spidering framework.

Language: Go - Size: 1.8 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 13,635 - Forks: 727

s0rg/crawley

The unix-way web crawler

Language: Go - Size: 206 KB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 293 - Forks: 16

ssssssss-team/spider-flow

A next-generation crawler platform that defines crawler workflows graphically, so crawlers can be built without writing code.

Language: Java - Size: 3.23 MB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 9,902 - Forks: 1,908

HHN/crawler4j Fork of yasserg/crawler4j

Open Source Web Crawler for Java - A fork of yasserg/crawler4j

Language: Java - Size: 1.96 MB - Last synced at: 3 days ago - Pushed at: 15 days ago - Stars: 27 - Forks: 7

postmodern/spidr

A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.

Language: Ruby - Size: 685 KB - Last synced at: 16 days ago - Pushed at: 3 months ago - Stars: 816 - Forks: 107

VIDA-NYU/ache

ACHE is a web crawler for domain-specific search.

Language: Java - Size: 66.6 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 465 - Forks: 135

xianhu/PSpider

A simple, easy-to-use Python crawler framework. QQ discussion group: 597510560

Language: Python - Size: 814 KB - Last synced at: about 1 month ago - Pushed at: almost 3 years ago - Stars: 1,835 - Forks: 502

3nock/SpiderSuite

Advanced web security spider/crawler

Size: 6.98 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 634 - Forks: 70

elliotxx/zhihu-crawler-people

A simple distributed crawler for zhihu && data analysis

Language: Python - Size: 183 KB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 192 - Forks: 89

rivermont/spidy

The simple, easy to use command line web crawler.

Language: Python - Size: 81.8 MB - Last synced at: 17 days ago - Pushed at: 9 months ago - Stars: 346 - Forks: 69

Andromeda1957/netpwn

Tool made to automate tasks of pentesting.

Language: Python - Size: 223 KB - Last synced at: 13 days ago - Pushed at: over 5 years ago - Stars: 165 - Forks: 45

Hecate2/Ignareo-ISML-auto-voter

Ignareo the Carillon, a web crawler/spider template of ultimate high concurrency built for leprechauns. Carillons as the best web spiders; Long live the golden years of leprechauns! (ISML=international saimoe; 2022 ISML is last ISML)

Language: Python - Size: 34.8 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 187 - Forks: 11

infinilabs/crawler

🕷️ An easy-to-use spider written in Golang. (previously named GOPA.)

Language: Go - Size: 54.6 MB - Last synced at: about 1 month ago - Pushed at: almost 4 years ago - Stars: 308 - Forks: 82

esfelurm/spider-web

Language: Python - Size: 102 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 13 - Forks: 2

TeeWrath/web-spider

Practice Web Scraper

Language: Python - Size: 9.77 KB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

lewisakura/spiderboi

A web crawling library written in TypeScript.

Language: TypeScript - Size: 376 KB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 1

acuciureanu/spidertrap-rs

A simple trap for web crawlers

Language: Rust - Size: 7.81 KB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 12 - Forks: 0

FearlessPeople/lianjia_spider

Crawls residential community information from Lianjia

Language: Python - Size: 1.55 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 15 - Forks: 3

waived/google-drive-crawler

Proxy-based crawler to expose public (shared) Google Drive links

Language: Python - Size: 48.8 KB - Last synced at: about 2 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

psidex/nomad

An experimental web crawler to visualise & map the connections between domains

Language: Go - Size: 1.96 MB - Last synced at: about 2 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

techguy-bhushan/Web-Spider

Multi-threaded web crawler

Language: Python - Size: 1.95 KB - Last synced at: 29 days ago - Pushed at: over 7 years ago - Stars: 3 - Forks: 2

howie6879/talospider

talospider - A simple, lightweight scraping micro-framework

Language: Python - Size: 174 KB - Last synced at: 15 days ago - Pushed at: about 6 years ago - Stars: 55 - Forks: 4

QuantumWizard888/get_jp_word_info

Parser script that gets information about a word from the https://dictionary.goo.ne.jp explanatory dictionary

Language: Python - Size: 10.7 KB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

antchfx/antch

Antch, a fast, powerful and extensible web crawling & scraping framework for Go

Language: Go - Size: 56.6 KB - Last synced at: 10 months ago - Pushed at: almost 5 years ago - Stars: 258 - Forks: 41

lindsaygelle/steamer

Go application. Crawls the Steam store and collects Game records based on search criteria.

Language: Go - Size: 128 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

some-avail/freekwensie

Website-profiler using word-frequencies; profiles all child-links of parent-website.

Language: Nim - Size: 2.16 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

lucky521/pyspider

My Web Spider

Language: Python - Size: 263 KB - Last synced at: about 1 year ago - Pushed at: almost 8 years ago - Stars: 0 - Forks: 0

rdempsey/data-gathering-and-wrangling

Code and slides for my class: Data Gathering & Wrangling

Language: Python - Size: 24.6 MB - Last synced at: about 1 year ago - Pushed at: about 10 years ago - Stars: 5 - Forks: 11

rudissaar/web-spider

A basic sample of Web Spider written in Python.

Language: Python - Size: 22.5 KB - Last synced at: 3 months ago - Pushed at: almost 6 years ago - Stars: 1 - Forks: 0

raspi/scrapy-finlex

Scrapy for finlex

Language: Python - Size: 10.7 KB - Last synced at: 3 months ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 0

genkio/spider-less

Web spider as a service, spider on serverless

Language: JavaScript - Size: 1.18 MB - Last synced at: 10 months ago - Pushed at: over 2 years ago - Stars: 186 - Forks: 25

nirjharlo/complete-google-seo-scan

WordPress Plugin with inbuilt SEO crawler

Language: PHP - Size: 967 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 3

bouxin/company-crawler 📦

Tianyancha and Qichacha crawler that scrapes company information for specified keywords

Language: Python - Size: 80.1 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 552 - Forks: 149

xiayouran/Musicer

Aims to bring together music platforms such as NetEase Cloud Music, Kugou, QQ Music, and Kuwo in one place

Language: Python - Size: 10 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 202 - Forks: 21

wmylxmj/Web-Spider-Login-Bilibili-Python3

Web spider that simulates logging in to bilibili, cracks the slider CAPTCHA, and sends danmaku (bullet comments). 2018-10-9

Language: Python - Size: 590 KB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 39 - Forks: 20

kjony/web-spider

A simple, generic implementation of a web spider, based on libraries for HTML parsing, Excel, and Word. It can extract and parse data, and produce XML documents.

Language: Python - Size: 21.5 KB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0

judemont/JdmSearchIndex-Bot

Open source public Search Engine indexation Web Crawler.

Language: Python - Size: 37.1 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 1

Srokit/table-collector

Collect numeric html tables from internet into a csv dataset

Language: Python - Size: 96.7 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

vladimanaev/web-spider

web crawler allowing full page render crawl using HtmlUnit

Language: Java - Size: 48.8 KB - Last synced at: almost 2 years ago - Pushed at: over 7 years ago - Stars: 5 - Forks: 0

holisound/haokan

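From late 2018 through the first half of 2019, used a packet-capture tool to find a parameter-signing vulnerability in 🎦 Haokan Video and successfully exploited it to farm coins and redeem them for cash, over a period of half a year.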

Language: Python - Size: 498 KB - Last synced at: almost 2 years ago - Pushed at: almost 6 years ago - Stars: 10 - Forks: 4

dremendes/getvideofromfacebook

A program that browses facebook or twitter to download video from a public post/tweet

Language: JavaScript - Size: 73.2 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

fxtack/weibo-spider

A single-file project that uses Python 3 (Jupyter Notebook) to call the Weibo mobile API to crawl, store, analyze, and display data for specific topics.

Language: Jupyter Notebook - Size: 5.12 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 1

Chisanan232/smoothcrawler

🕷🕸🌍 Building crawler humanly as different roles with different components.

Language: Python - Size: 4.45 MB - Last synced at: 21 days ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 0

othree/spider-screenshot

Web spider that takes screenshots

Language: JavaScript - Size: 15.6 KB - Last synced at: about 1 month ago - Pushed at: over 7 years ago - Stars: 2 - Forks: 0

khilnani/spidey.py

Web spiders are usually disliked by websites, but useful for recursive API/page downloads for offline analysis.

Language: Python - Size: 11.7 KB - Last synced at: 16 days ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 0

fatihyildizli/FYCrawler

🕷 Custom Web Spider | Frontend: ⚛️ React.js - Backend: ☕️ Java / NodeJS

Language: JavaScript - Size: 18.5 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 3 - Forks: 1

Chisanan232/SmoothCrawler-Cluster

🌍🕷🔗🕷🔗🕷🕸 Building crawler cluster humanly depends on SmoothCrawler.

Language: Python - Size: 18.1 MB - Last synced at: 7 days ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

caheredia/caheredia.github.io

Cristian Heredia | Data exploration expert, maximizing signal-to-noise through storytelling. Ask me how to use data to inform the business decisions.

Language: HTML - Size: 1.69 MB - Last synced at: 4 days ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 1

ayoubzulfiqar/go-scraper

This repo shows how to scrape different types of data

Language: Go - Size: 26.4 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

sourenaKhanzadeh/PixelProwler

PixelProwler is a web spider that crawls the internet for images based on user-provided prompts. It uses advanced web crawling and image recognition technology to search for and return relevant results. Use PixelProwler to quickly and easily find the images you need for your projects.

Language: Python - Size: 21.5 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

lucasxlu/JiaYuan

user profile of jiayuan.com

Language: Python - Size: 4.49 MB - Last synced at: about 2 years ago - Pushed at: about 8 years ago - Stars: 40 - Forks: 21

geoffreybauduin/website-checker

Performs useful checks against a website, such as 404 errors reporting, structured data validation...

Language: Go - Size: 20.5 KB - Last synced at: 24 days ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 0

davidrogger/trybe-project-tech-news

Data scraping project for the Trybe blog

Language: Python - Size: 782 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

tengjuilin/best-photo-spider

A web spider to scrape photos for Best Educational Organization (Best International Primary School, Kinglee High School).

Language: Jupyter Notebook - Size: 636 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

AngusMonroe/Intelligent-interrogation

Use Word2vec model and LDA model for drug recommendation

Language: Python - Size: 191 MB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 6 - Forks: 4

Chisanan232/SmoothCrawler-AppIntegration

🌍🔗🕷🕸 Building smooth crawler by application integration.

Language: Python - Size: 226 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

lucasxlu/DataHouse

a data mining and machine learning repo

Language: Python - Size: 36 MB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 5 - Forks: 5

cgallegoan/Wordpress-combinations

Web RPA to automatically create combinations of products in wordpress. Built specifically for espejoled.com products

Language: Jupyter Notebook - Size: 11.7 KB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

FedorChervyakov/sitemap-crawler

Language: Python - Size: 20.5 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

gnuns/raspa

data mining stuff

Language: JavaScript - Size: 51.8 KB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

zongdeiqianxing/GetWebSiteLinks

Gets the link addresses of all pages on a website

Language: Python - Size: 3.91 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 2 - Forks: 1

duruyao/get-more

Python spider scripts for getting music in batches from web.

Language: Python - Size: 4.2 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

xuludev/System

Internet hot topic detection and tracking system

Language: JavaScript - Size: 19.5 MB - Last synced at: about 2 years ago - Pushed at: almost 8 years ago - Stars: 3 - Forks: 4

MrThanlon/bili-comment

Bilibili (https://www.bilibili.com): read comment floors, post comments, grab specific floors, and grab the first comment

Language: Python - Size: 89.8 KB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 10 - Forks: 1

lalbert/daric

Simple and configurable PHP web spider and web scraper

Language: PHP - Size: 34.2 KB - Last synced at: 5 months ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 0

jdevelop/webspider

Open WEB spider platform

Language: Scala - Size: 509 KB - Last synced at: about 2 years ago - Pushed at: almost 8 years ago - Stars: 6 - Forks: 3

abhinuvpitale/Spiders

Contains scripts used to learn, test and explore web spiders or crawlers, which are tools used to index / explore various web sites and content.

Language: Python - Size: 9.77 KB - Last synced at: almost 2 years ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 4

ixploit/Fuzzix

A simple URL fuzzer written in Python

Language: Python - Size: 81.1 KB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

aodiwei/wb_sp

use scrapy to crawl weibo

Language: Python - Size: 1.44 MB - Last synced at: almost 2 years ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 1