An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: webmagic

zhaoweilong007/zhihuCrawler

Crawls Zhihu data with WebMagic and archives it daily on a schedule

Language: Java - Size: 618 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 43 - Forks: 6

JhonatanIT/cl-generator-ai-service

Cover letter generator with Generative AI

Language: Java - Size: 215 KB - Last synced at: 24 days ago - Pushed at: 24 days ago - Stars: 1 - Forks: 1

hemin1003/java-spider

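A hands-on Java crawler framework built on top of the webmagic framework. It can crawl news content from Tencent, Sohu, and Toutiao (a separately integrated feature), works together with Elasticsearch, implements automated crawling, and is already running in production.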

Language: Java - Size: 14.4 MB - Last synced at: about 22 hours ago - Pushed at: over 2 years ago - Stars: 338 - Forks: 151

wxynihao/baidu-search-result-crawler

A crawler that fetches the content of Baidu search results.

Language: Java - Size: 14.6 KB - Last synced at: 8 days ago - Pushed at: about 7 years ago - Stars: 12 - Forks: 5

woyumen4597/crawler

Java crawler

Language: Java - Size: 417 KB - Last synced at: 10 months ago - Pushed at: almost 7 years ago - Stars: 3 - Forks: 2

v5tech/solrj-example

solrj examples

Language: Java - Size: 5.24 MB - Last synced at: 19 days ago - Pushed at: over 9 years ago - Stars: 55 - Forks: 44

jasonnull/web-spider

Language: Java - Size: 23.3 MB - Last synced at: 12 months ago - Pushed at: almost 8 years ago - Stars: 0 - Forks: 1

liyifeng1994/webmagic-csdnblog

A small CSDN blog crawler written with WebMagic

Language: Java - Size: 6.91 MB - Last synced at: 19 days ago - Pushed at: almost 7 years ago - Stars: 92 - Forks: 57

zifangsky/WeatherSpider

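Weather crawler (automatically fetches and updates weather for cities and towns nationwide on a schedule, and exposes a RESTful query API), with a proxy IP pool that is refreshed and checked for availability on a schedule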

Language: Java - Size: 53.7 MB - Last synced at: over 1 year ago - Pushed at: almost 7 years ago - Stars: 364 - Forks: 142

ChinLong/credit-blacklist-spider

A crawler

Language: Kotlin - Size: 55.7 KB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 1

dszblackmagic/lol-spider

🚀 A small crawler example written with WebMagic (fetches official League of Legends [LOL] champion information)

Language: Java - Size: 23.4 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

xiaoshun007/webCrawler

Crawler using webMagic

Language: Java - Size: 155 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 1

xiaoyvyv/AndroidCrawlerEngine

A dynamic crawler plug-in for the Android platform based on Dex dynamic loading. It can load and execute dex plug-in packages at runtime, enabling real-time updates to crawlers and other functions.

Language: Kotlin - Size: 936 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 2

jinhx128/springboot-demo

Integrates a range of common development tools on SpringBoot 2.x, including but not limited to Redis, MyBatisPlus, RocketMQ, RabbitMQ, Elasticsearch, Quartz, Xxl-Job, and Kafka.

Language: Java - Size: 453 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 176 - Forks: 91

youngkuan/subtitle

Subtitle downloader using webmagic (crawls movie subtitles and related information from subtitle websites)

Language: Java - Size: 20.5 KB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 2

zglong182/java-spring-applets

Small Java programs

Language: Java - Size: 50.8 KB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

hooyantsing/webmagic-job

A distributed crawler platform built on a springboot base, the webmagic crawler core, and xxl-job scheduled task dispatching

Language: Java - Size: 1.76 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 2 - Forks: 1

soberqian/Java-Carwler-Technology

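Web data collection techniques: Java web crawlers (complete code for the book manuscript, covering the various techniques and topics involved in web crawling)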

Language: Java - Size: 31.3 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 61 - Forks: 20

casolxia/TwitterCrawler

A Twitter crawler that scrapes Twitter data, filterable by time, topic, username, and other criteria

Language: Java - Size: 245 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 24 - Forks: 8

CR553/Project01

An epidemic data visualization website built on springboot+mybatis+echarts+webmagic

Language: Java - Size: 52.8 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 85 - Forks: 14

Qyl097731/crawl

Multithreaded crawler

Language: Java - Size: 54.7 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

Yodeser/Crawler

Crawler (poetry, songs)

Language: Java - Size: 130 KB - Last synced at: 6 months ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 2

99246255/SpringBoot-Solr

SpringBoot + Solr + webmagic: crawls JD product data and loads it into Solr for search, as an exercise in learning Solr

Language: Java - Size: 7.49 MB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 42 - Forks: 30

shangjing105/spray-module

spray module architecture separation

Language: Java - Size: 1.34 MB - Last synced at: 7 months ago - Pushed at: over 6 years ago - Stars: 24 - Forks: 14

kxinds/GraduationProject

Graduation project: crawler with dynamic data analysis and management

Language: JavaScript - Size: 7.37 MB - Last synced at: about 2 years ago - Pushed at: almost 8 years ago - Stars: 1 - Forks: 2

fengsam6/webmagic-learn

Uses springboot, spring-data-jpa, webmagic, and related technologies to crawl iQiyi and 360 videos on a schedule

Language: Java - Size: 76.8 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 5 - Forks: 2

sutra/webmagic-delayed-proxy

WebMagic ProxyProvider implementation using DelayQueue.

Language: Java - Size: 43.9 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 1

FrankCy/spring-boot-frank-spider

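Java e-commerce crawler; swap in your own dynamic proxies! Crawl targets: JD, Kaola, Sephora. Tools used: HtmlUnit (single-threaded; most sites can be fetched through a proxy, but sites with multi-layer anti-crawling JS cannot), ChromeDriver (multi-process; needs a teardown mechanism), and others (the rest do not work very well). This project only studies the strengths and weaknesses of each tool and does not support commercial use.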

Language: Java - Size: 229 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 6 - Forks: 9

gsabbih6/GHNewsCrawler

A crawling and scraping project for news content built on top of Webmagic

Language: Java - Size: 16.6 KB - Last synced at: almost 2 years ago - Pushed at: almost 7 years ago - Stars: 1 - Forks: 1

Jasonandy/Skeleton-X

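:tada: An SSM scaffold based on Springboot that currently integrates spring-security, websocket, docker, echarts, mybatis, Elasticsearch, logback, ehcache, redis, kafka, jwt, and more. It aims to work out of the box and simplify setup. It also bundles a crawler project, an OpenCV project, and a WebSocket project.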

Language: JavaScript - Size: 6.49 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 6

TGhoul/spider914j

91 web spider for java.

Language: Java - Size: 45.9 KB - Last synced at: about 1 month ago - Pushed at: over 7 years ago - Stars: 6 - Forks: 2

uptonking/newsfeed-crawler

crawler for news articles using webmagic

Language: Java - Size: 20.5 KB - Last synced at: about 1 year ago - Pushed at: about 7 years ago - Stars: 0 - Forks: 1

uptonking/dataspiderman

data crawler based on webmagic. forked from webmagic

Language: Java - Size: 432 KB - Last synced at: about 1 year ago - Pushed at: about 7 years ago - Stars: 0 - Forks: 1

FelixMundial/simple-crawler

Crawler on Zhihu/Bilibili/Weibo/Baidu/Douban trending items, powered by WebMagic

Language: Java - Size: 250 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 1

leonGravel/ip-spider

A small crawler program that uses webmagic+springboot to scrape data from proxy IP websites and persist it locally

Language: Java - Size: 23.4 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 8 - Forks: 5

yongzhuo/JavaLearning

A Java learning project: webmagic, mongo, arango, redis, mysql

Language: Java - Size: 1.59 MB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 4 - Forks: 11

phinehasz/bilibiliCrawl

A video spider for bilibili based on WebMagic

Language: Java - Size: 97.7 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 3 - Forks: 2

DanielLin07/baidu-map-crawler

:beetle: A Baidu Maps data crawler based on WebMagic

Language: Java - Size: 50.8 KB - Last synced at: 2 months ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

wxynihao/hudong-pedia-crawler

A Hudong Baike crawler based on webmagic, springboot, and mongodb

Language: Java - Size: 28.3 KB - Last synced at: 8 days ago - Pushed at: about 7 years ago - Stars: 1 - Forks: 2

weldon-h/book-crawler

Crawls book websites with webmagic

Language: Java - Size: 24.4 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

ThreadNew/EncycProject

A backend demo that crawls pet encyclopedia content.

Language: JavaScript - Size: 9.31 MB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 1

gglinux/doucha

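A campus recruitment information service platform, with data drawn from all campus recruitment postings at five universities in Hunan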

Language: PHP - Size: 5.85 MB - Last synced at: about 2 years ago - Pushed at: about 8 years ago - Stars: 3 - Forks: 1

her-cat/novel-spider

A novel crawler based on webmagic

Language: Java - Size: 638 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 3 - Forks: 1

abinbao/chinabook_crawler

Language: JavaScript - Size: 3.8 MB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 0

AnyListen/edps-spider

Language: CSS - Size: 451 KB - Last synced at: about 2 months ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 1

leonGravel/musicSpider

✏️ A webmagic+springboot crawler for NetEase Cloud Music song comments

Language: Java - Size: 61.5 KB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 1 - Forks: 0

MrGlaucus/CobWeibo

A project that crawls social relationships on Sina Weibo

Language: Kotlin - Size: 11.7 KB - Last synced at: almost 2 years ago - Pushed at: about 7 years ago - Stars: 0 - Forks: 2

shinelon/webmagic

webmagic demo

Language: Java - Size: 59.6 KB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 0 - Forks: 0

INotWant/ZhiHuCrawler

Crawls Zhihu

Language: Java - Size: 337 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 1

MccreeFei/WebSpider

A Java crawler project based on webmagic

Language: Java - Size: 126 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 1

w41ter/SCUCrawler

A crawler for the SCU website, covering content such as career talks and recruitment notices.

Language: Java - Size: 61.5 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 2

jumpjumpbean/FoodsSpider

A Webmagic+Springboot+Mybatis crawler that scrapes food ingredient data

Language: Java - Size: 12.7 KB - Last synced at: about 2 years ago - Pushed at: about 8 years ago - Stars: 2 - Forks: 1

xdCao/Java_Spider

Contains some goodies, heiheihei~

Language: Java - Size: 30.3 KB - Last synced at: about 2 years ago - Pushed at: almost 8 years ago - Stars: 1 - Forks: 1

wangdamu/SpiderApplication

Language: Java - Size: 12.7 KB - Last synced at: about 2 years ago - Pushed at: about 8 years ago - Stars: 2 - Forks: 1

timelessmemory/WebMagicNetEaseMusicCrawler

NetEase Cloud Music spider, implemented with webmagic, a spider framework written in Java.

Language: Java - Size: 6.06 MB - Last synced at: about 2 years ago - Pushed at: about 8 years ago - Stars: 1 - Forks: 3

JiangWenqi/CellPhone_JD

Scrapes product information for mobile phones on JD.com

Language: Java - Size: 7.81 KB - Last synced at: about 2 years ago - Pushed at: about 8 years ago - Stars: 0 - Forks: 3