Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

Package Usage: maven: it.unimi.dsi:mg4j

MG4J (Managing Gigabytes for Java) is a free full-text search engine for large document collections written in Java.
1 version
Latest release: about 12 years ago
2 dependent packages

View more package details: https://packages.ecosyste.ms/registries/repo1.maven.org/packages/it.unimi.dsi:mg4j

View more repository details: https://repos.ecosyste.ms/hosts/GitHub/repositories/vigna%2FMG4J

Dependent Repos 20

netarchivesuite/netarchivesuite
Netarchivesuite 5.X development
  • harvester/harvester-test/pom.xml
  • 1.0.1 pom.xml

Size: 182 MB - Last synced: about 1 month ago - Pushed: 6 months ago

cckwzmc/myLearning
  • 1.0.1 heritrix3/heritrix-3.1.1/commons/pom.xml
  • 1.0.1 heritrix3/heritrix-3.1.1/commons/target/classes/META-INF/maven/org.archive.heritrix/heritrix-commons/pom.xml

Size: 122 MB - Last synced: about 1 month ago - Pushed: almost 10 years ago

Paxle/Paxle
Main-Repository of the Paxle Project
  • 2.1.2 sandbox/IndexMG4J/pom.xml

Size: 65.1 MB - Last synced: about 1 month ago - Pushed: over 12 years ago

borowiak/pwa-technologies
Automatically exported from code.google.com/p/pwa-technologies
  • 1.0.1 PwaArchive-access/projects/wayback/wayback-core/pom.xml

Size: 148 MB - Last synced: about 1 year ago - Pushed: about 9 years ago

claudiouzelac/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 8.66 MB - Last synced: about 1 month ago - Pushed: over 1 year ago

acidburn0zzz/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 9.62 MB - Last synced: about 1 month ago - Pushed: 4 months ago

truemped/heritrix3 📦
Mirror of Heritrix 3 (the Internet Archive's crawler)
  • 1.0.1 commons/pom.xml

Size: 3.6 MB - Last synced: about 1 month ago - Pushed: over 13 years ago

BertrandDechoux/Heritrix-3
Personal clone of "Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project"
  • 1.0.1 commons/pom.xml

Size: 3.24 MB - Last synced: about 1 month ago - Pushed: over 13 years ago

WebCuratorTool/heritrix-1-14-adjust
An adjusted version of heritrix-1.14.1 to work with updated dependencies.
  • 2.0.1 pom.xml

Size: 42.6 MB - Last synced: about 1 year ago - Pushed: over 1 year ago

hurt-gong/heritrix
  • 1.0.1 commons/pom.xml

Size: 5.91 MB - Last synced: about 1 year ago - Pushed: over 1 year ago

internetarchive/archive-commons
  • 1.0.1 archive-commons/pom.xml

Size: 7.5 MB - Last synced: 15 days ago - Pushed: about 11 years ago

panjf2000/MySpider 📦
A custom crawler framework modeled on the open-source crawler framework Heritrix, implemented in Java.
  • 1.0.1 pom.xml

Size: 329 KB - Last synced: 22 days ago - Pushed: over 1 year ago

chpooo/crawler
  • 1.0.1 commons/pom.xml

Size: 125 MB - Last synced: about 1 year ago - Pushed: over 10 years ago

shriphani/sleipnir-heritrix3
Mirror of Heritrix3 used by sleipnir
  • 1.0.1 commons/pom.xml

Size: 10.7 MB - Last synced: about 1 year ago - Pushed: about 9 years ago

netarchivesuite/netarchivesuite-svngit-migration
Git conversion of Subversion repository.
  • 1.0.1 m2-build/netarchivesuite-wayback/pom.xml

Size: 225 MB - Last synced: about 1 month ago - Pushed: about 9 years ago

magicdog/urlfilter
Filter URLs by bdb, bloom and tair
  • 1.0.1 pom.xml

Size: 2.03 MB - Last synced: 12 months ago - Pushed: almost 11 years ago

aaronbinns/heritrix3
Local hacks and patches to IA Heritrix3
  • 1.0.1 commons/pom.xml

Size: 4.24 MB - Last synced: about 1 month ago - Pushed: almost 11 years ago

hubeizcl/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Last synced: over 1 year ago

chengjinqian/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 11.7 MB - Last synced: about 1 year ago - Pushed: over 8 years ago

cmiles74/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 5.48 MB - Last synced: about 1 year ago - Pushed: about 12 years ago

WebCuratorTool/webcurator-v2-legacy
The Web Curator Tool is a tool for managing the selective web harvesting process. (moved from SourceForge). https://webcurator.slack.com https://webcuratortool.readthedocs.io
  • 2.0.1 wct-core/pom.xml
  • 2.0.1 wct-store/pom.xml

Size: 169 MB - Last synced: 5 months ago - Pushed: over 1 year ago

ConnectionMaster/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 9.76 MB - Last synced: about 1 month ago - Pushed: about 1 month ago

wyrover/heritrix3 Fork of zhengyouxiang/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 7.68 MB - Last synced: about 1 year ago - Pushed: over 10 years ago

whwhzzz/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 7.3 MB - Last synced: 10 months ago - Pushed: almost 11 years ago

zhanzf/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 6.62 MB - Last synced: 11 months ago - Pushed: about 11 years ago

yangshangshe/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 12.6 MB - Last synced: 10 months ago - Pushed: over 9 years ago

quixey/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 6.65 MB - Last synced: about 1 month ago - Pushed: over 10 years ago

ezhouyang/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 7.89 MB - Last synced: about 1 month ago - Pushed: over 10 years ago

gischen/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 11.6 MB - Last synced: 10 months ago - Pushed: almost 9 years ago

zjulib/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 9.39 MB - Last synced: 6 months ago - Pushed: almost 10 years ago

kangzhenkang/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 11.6 MB - Last synced: about 2 months ago - Pushed: almost 9 years ago

xuxiaosahn/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 11.7 MB - Last synced: 8 months ago - Pushed: almost 9 years ago

MjAbuz/heritrix3 Fork of iipc/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 6.54 MB - Last synced: about 1 year ago - Pushed: over 11 years ago

fendouai/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 11.6 MB - Last synced: about 1 year ago - Pushed: almost 9 years ago

zhaochl/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 11.7 MB - Last synced: about 1 year ago - Pushed: over 8 years ago

VSaliy/webcurator Fork of WebCuratorTool/webcurator-v2-legacy
The Web Curator Tool is a tool for managing the selective web harvesting process. (moved from SourceForge)
  • 2.0.1 wct-core/pom.xml
  • 2.0.1 wct-harvest-agent/pom.xml
  • 2.0.1 wct-store/pom.xml

Size: 135 MB - Last synced: 5 months ago - Pushed: over 8 years ago

amberfan/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 12.1 MB - Last synced: 10 months ago - Pushed: over 8 years ago

huwenqiang/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 12.1 MB - Last synced: about 1 month ago - Pushed: over 8 years ago

JiroHuang/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 12.1 MB - Last synced: about 1 year ago - Pushed: over 8 years ago

EvanSky/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 12.1 MB - Last synced: about 1 year ago - Pushed: over 8 years ago

Nihal2211/heritrix
  • 1.0.1 commons/pom.xml

Size: 2.55 MB - Last synced: about 1 year ago - Pushed: over 8 years ago

guoyu07/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 9.41 MB - Last synced: 10 months ago - Pushed: over 8 years ago

vampireshj2013/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 9.45 MB - Last synced: 5 months ago - Pushed: over 8 years ago

sivaakurati/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 11.7 MB - Last synced: about 1 year ago - Pushed: over 8 years ago

oraclexbw/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 11.7 MB - Last synced: 10 months ago - Pushed: over 8 years ago

templefox/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 12.8 MB - Last synced: 10 months ago - Pushed: over 8 years ago

songfj/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 11.7 MB - Last synced: 10 months ago - Pushed: over 8 years ago

codev777/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 11.7 MB - Last synced: 10 months ago - Pushed: over 8 years ago

sunjue-heavyrain/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 11.7 MB - Last synced: 2 months ago - Pushed: over 8 years ago

yykui/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 11.7 MB - Last synced: about 1 year ago - Pushed: over 8 years ago

travisfw/archive-commons Fork of internetarchive/archive-commons
  • 1.0.1 archive-commons/pom.xml

Size: 7.01 MB - Last synced: about 1 year ago - Pushed: over 11 years ago

mrt/ia-web-commons Fork of commoncrawl/ia-web-commons
  • 1.0.1 pom.xml

Size: 7.56 MB - Last synced: 16 days ago - Pushed: almost 10 years ago

shriphani/heritrix-3.2.0
Personal mirror of Heritrix 3.2.0 (stable) version.
  • 1.0.1 commons/pom.xml

Size: 1.61 MB - Last synced: about 1 year ago - Pushed: almost 10 years ago

lydonchandra/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 12.1 MB - Last synced: about 1 year ago - Pushed: over 8 years ago

huangxk/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 9.41 MB - Last synced: about 1 year ago - Pushed: over 8 years ago

TruthHun/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 12.1 MB - Last synced: about 1 year ago - Pushed: over 8 years ago

YuanAttach/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 12.1 MB - Last synced: about 1 year ago - Pushed: over 8 years ago

martinsbalodis/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 8.6 MB - Last synced: about 1 year ago - Pushed: about 10 years ago

prayagverma/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 9.44 MB - Last synced: 12 months ago - Pushed: over 8 years ago

zbhlove100/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 6.17 MB - Last synced: about 1 year ago - Pushed: over 11 years ago

AlbertYou/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 9.39 MB - Last synced: about 1 year ago - Pushed: almost 10 years ago

lemurproject/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 6.43 MB - Last synced: about 2 months ago - Pushed: over 11 years ago

InfernoJJ/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 6.41 MB - Last synced: about 1 year ago - Pushed: over 11 years ago

lingchant/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 6.42 MB - Last synced: about 1 year ago - Pushed: about 11 years ago

rugby110/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 9.88 MB - Last synced: about 1 year ago - Pushed: over 8 years ago

caofangkun/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 9.9 MB - Last synced: about 1 month ago - Pushed: over 8 years ago

lengyubing/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 12.1 MB - Last synced: about 1 month ago - Pushed: over 8 years ago

zhoujg/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 10.2 MB - Last synced: 10 months ago - Pushed: over 8 years ago

bo729892905/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 10.2 MB - Last synced: about 1 year ago - Pushed: over 8 years ago

Lokihjl/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 10.3 MB - Last synced: about 1 year ago - Pushed: over 8 years ago

noscripter/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 10.3 MB - Last synced: 10 months ago - Pushed: over 8 years ago

mfirdausharun/Heritrix_Experiment
  • 1.0.1 commons/pom.xml

Size: 2.23 MB - Last synced: 9 months ago - Pushed: over 6 years ago

mfirdausharun/heritrix3
  • 1.0.1 commons/pom.xml

Size: 5.65 MB - Last synced: 9 months ago - Pushed: over 6 years ago

gfthr/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 9.85 MB - Last synced: about 1 month ago - Pushed: about 9 years ago

MarQuisKnox/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 11.5 MB - Last synced: 10 months ago - Pushed: about 9 years ago

mouqi123/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 11.5 MB - Last synced: about 1 year ago - Pushed: about 9 years ago

vitgou/heritrix3 Fork of arquivo/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 9.64 MB - Last synced: 11 months ago - Pushed: over 3 years ago

friesper/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 9.7 MB - Last synced: 10 months ago - Pushed: almost 8 years ago

praveenmunagapati/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
  • 1.0.1 commons/pom.xml

Size: 9.07 MB - Last synced: about 1 month ago - Pushed: over 6 years ago

ahamblyn/webcurator-upgrade-poc
Webcurator 2.0 upgrade proof of concept
  • 2.0.1 pom.xml

Size: 60.2 MB - Last synced: 5 months ago - Pushed: almost 6 years ago