Ecosyste.ms: Repos
An open API service providing repository metadata for many open source software ecosystems.
GitHub topics: stormcrawler
apache/incubator-stormcrawler
A scalable, mature and versatile web crawler based on Apache Storm
Language: HTML - Size: 6.41 MB - Last synced: 27 days ago - Pushed: about 1 month ago - Stars: 856 - Forks: 252
DigitalPebble/stormcrawler-docker
Resources for running StormCrawler with Docker services
Language: Dockerfile - Size: 16.6 KB - Last synced: 14 days ago - Pushed: 4 months ago - Stars: 7 - Forks: 2
DigitalPebble/benchmark
StormCrawler topology to evaluate the performance of different backends and configurations
Language: Shell - Size: 43 KB - Last synced: about 2 months ago - Pushed: 4 months ago - Stars: 0 - Forks: 0
DigitalPebble/ansible-storm
Ansible playbook for deploying a Storm cluster
Size: 27.3 KB - Last synced: about 2 months ago - Pushed: 5 months ago - Stars: 7 - Forks: 1
ngramp/stormcrawlnlp
Language: Java - Size: 35.6 MB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 0 - Forks: 0
sebastian-nagel/warc-crawler
Process web archives (WARC format) with StormCrawler and index content into Elasticsearch or Solr
Language: FLUX - Size: 44.9 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 6 - Forks: 1