Ecosyste.ms: Repos
An open API service providing repository metadata for many open source software ecosystems.
Package Usage: maven: it.unimi.dsi:mg4j
MG4J (Managing Gigabytes for Java) is a free full-text search engine for large document collections written in Java.
1 version
Latest release: about 12 years ago
2 dependent packages
View more package details: https://packages.ecosyste.ms/registries/repo1.maven.org/packages/it.unimi.dsi:mg4j
View more repository details: https://repos.ecosyste.ms/hosts/GitHub/repositories/vigna%2FMG4J
Dependent Repos 20
netarchivesuite/netarchivesuite
Netarchivesuite 5.X development
- harvester/harvester-test/pom.xml
- 1.0.1 pom.xml
Size: 182 MB - Last synced: about 1 month ago - Pushed: 6 months ago
cckwzmc/myLearning
- 1.0.1 heritrix3/heritrix-3.1.1/commons/pom.xml
- 1.0.1 heritrix3/heritrix-3.1.1/commons/target/classes/META-INF/maven/org.archive.heritrix/heritrix-commons/pom.xml
Size: 122 MB - Last synced: about 1 month ago - Pushed: almost 10 years ago
Paxle/Paxle
Main-Repository of the Paxle Project
- 2.1.2 sandbox/IndexMG4J/pom.xml
Size: 65.1 MB - Last synced: about 1 month ago - Pushed: over 12 years ago
borowiak/pwa-technologies
Automatically exported from code.google.com/p/pwa-technologies
- 1.0.1 PwaArchive-access/projects/wayback/wayback-core/pom.xml
Size: 148 MB - Last synced: about 1 year ago - Pushed: about 9 years ago
claudiouzelac/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 8.66 MB - Last synced: about 1 month ago - Pushed: over 1 year ago
acidburn0zzz/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 9.62 MB - Last synced: about 1 month ago - Pushed: 4 months ago
truemped/heritrix3 📦
Mirror of Heritrix 3 (the Internet Archive's crawler)
- 1.0.1 commons/pom.xml
Size: 3.6 MB - Last synced: about 1 month ago - Pushed: over 13 years ago
BertrandDechoux/Heritrix-3
Personal clone of "Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project"
- 1.0.1 commons/pom.xml
Size: 3.24 MB - Last synced: about 1 month ago - Pushed: over 13 years ago
WebCuratorTool/heritrix-1-14-adjust
An adjusted version of heritrix-1.14.1 to work with updated dependencies.
- 2.0.1 pom.xml
Size: 42.6 MB - Last synced: about 1 year ago - Pushed: over 1 year ago
hurt-gong/heritrix
- 1.0.1 commons/pom.xml
Size: 5.91 MB - Last synced: about 1 year ago - Pushed: over 1 year ago
internetarchive/archive-commons
- 1.0.1 archive-commons/pom.xml
Size: 7.5 MB - Last synced: 15 days ago - Pushed: about 11 years ago
panjf2000/MySpider 📦
A custom crawler framework modeled on the open-source crawler framework Heritrix, implemented in Java.
- 1.0.1 pom.xml
Size: 329 KB - Last synced: 22 days ago - Pushed: over 1 year ago
chpooo/crawler
- 1.0.1 commons/pom.xml
Size: 125 MB - Last synced: about 1 year ago - Pushed: over 10 years ago
shriphani/sleipnir-heritrix3
Mirror of Heritrix3 used by sleipnir
- 1.0.1 commons/pom.xml
Size: 10.7 MB - Last synced: about 1 year ago - Pushed: about 9 years ago
netarchivesuite/netarchivesuite-svngit-migration
Git conversion of Subversion repository.
- 1.0.1 m2-build/netarchivesuite-wayback/pom.xml
Size: 225 MB - Last synced: about 1 month ago - Pushed: about 9 years ago
magicdog/urlfilter
filter url by bdb, bloom and tair
- 1.0.1 pom.xml
Size: 2.03 MB - Last synced: 12 months ago - Pushed: almost 11 years ago
aaronbinns/heritrix3
Local hacks and patches to IA Heritrix3
- 1.0.1 commons/pom.xml
Size: 4.24 MB - Last synced: about 1 month ago - Pushed: almost 11 years ago
hubeizcl/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Last synced: over 1 year ago
chengjinqian/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 11.7 MB - Last synced: about 1 year ago - Pushed: over 8 years ago
cmiles74/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 5.48 MB - Last synced: about 1 year ago - Pushed: about 12 years ago
WebCuratorTool/webcurator-v2-legacy
The Web Curator Tool is a tool for managing the selective web harvesting process. (moved from SourceForge). https://webcurator.slack.com https://webcuratortool.readthedocs.io
- 2.0.1 wct-core/pom.xml
- 2.0.1 wct-store/pom.xml
Size: 169 MB - Last synced: 5 months ago - Pushed: over 1 year ago
ConnectionMaster/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 9.76 MB - Last synced: about 1 month ago - Pushed: about 1 month ago
wyrover/heritrix3 Fork of zhengyouxiang/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 7.68 MB - Last synced: about 1 year ago - Pushed: over 10 years ago
whwhzzz/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 7.3 MB - Last synced: 10 months ago - Pushed: almost 11 years ago
zhanzf/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 6.62 MB - Last synced: 11 months ago - Pushed: about 11 years ago
yangshangshe/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 12.6 MB - Last synced: 10 months ago - Pushed: over 9 years ago
quixey/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 6.65 MB - Last synced: about 1 month ago - Pushed: over 10 years ago
ezhouyang/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 7.89 MB - Last synced: about 1 month ago - Pushed: over 10 years ago
gischen/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 11.6 MB - Last synced: 10 months ago - Pushed: almost 9 years ago
zjulib/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 9.39 MB - Last synced: 6 months ago - Pushed: almost 10 years ago
kangzhenkang/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 11.6 MB - Last synced: about 2 months ago - Pushed: almost 9 years ago
xuxiaosahn/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 11.7 MB - Last synced: 8 months ago - Pushed: almost 9 years ago
MjAbuz/heritrix3 Fork of iipc/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 6.54 MB - Last synced: about 1 year ago - Pushed: over 11 years ago
fendouai/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 11.6 MB - Last synced: about 1 year ago - Pushed: almost 9 years ago
zhaochl/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 11.7 MB - Last synced: about 1 year ago - Pushed: over 8 years ago
VSaliy/webcurator Fork of WebCuratorTool/webcurator-v2-legacy
The Web Curator Tool is a tool for managing the selective web harvesting process. (moved from SourceForge)
- 2.0.1 wct-core/pom.xml
- 2.0.1 wct-harvest-agent/pom.xml
- 2.0.1 wct-store/pom.xml
Size: 135 MB - Last synced: 5 months ago - Pushed: over 8 years ago
amberfan/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 12.1 MB - Last synced: 10 months ago - Pushed: over 8 years ago
huwenqiang/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 12.1 MB - Last synced: about 1 month ago - Pushed: over 8 years ago
JiroHuang/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 12.1 MB - Last synced: about 1 year ago - Pushed: over 8 years ago
EvanSky/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 12.1 MB - Last synced: about 1 year ago - Pushed: over 8 years ago
Nihal2211/heritrix
- 1.0.1 commons/pom.xml
Size: 2.55 MB - Last synced: about 1 year ago - Pushed: over 8 years ago
guoyu07/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 9.41 MB - Last synced: 10 months ago - Pushed: over 8 years ago
vampireshj2013/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 9.45 MB - Last synced: 5 months ago - Pushed: over 8 years ago
sivaakurati/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 11.7 MB - Last synced: about 1 year ago - Pushed: over 8 years ago
oraclexbw/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 11.7 MB - Last synced: 10 months ago - Pushed: over 8 years ago
templefox/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 12.8 MB - Last synced: 10 months ago - Pushed: over 8 years ago
songfj/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 11.7 MB - Last synced: 10 months ago - Pushed: over 8 years ago
codev777/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 11.7 MB - Last synced: 10 months ago - Pushed: over 8 years ago
sunjue-heavyrain/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 11.7 MB - Last synced: 2 months ago - Pushed: over 8 years ago
yykui/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 11.7 MB - Last synced: about 1 year ago - Pushed: over 8 years ago
travisfw/archive-commons Fork of internetarchive/archive-commons
- 1.0.1 archive-commons/pom.xml
Size: 7.01 MB - Last synced: about 1 year ago - Pushed: over 11 years ago
mrt/ia-web-commons Fork of commoncrawl/ia-web-commons
- 1.0.1 pom.xml
Size: 7.56 MB - Last synced: 16 days ago - Pushed: almost 10 years ago
shriphani/heritrix-3.2.0
Personal mirror of Heritrix 3.2.0 (stable) version.
- 1.0.1 commons/pom.xml
Size: 1.61 MB - Last synced: about 1 year ago - Pushed: almost 10 years ago
lydonchandra/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 12.1 MB - Last synced: about 1 year ago - Pushed: over 8 years ago
huangxk/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 9.41 MB - Last synced: about 1 year ago - Pushed: over 8 years ago
TruthHun/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 12.1 MB - Last synced: about 1 year ago - Pushed: over 8 years ago
YuanAttach/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 12.1 MB - Last synced: about 1 year ago - Pushed: over 8 years ago
martinsbalodis/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 8.6 MB - Last synced: about 1 year ago - Pushed: about 10 years ago
prayagverma/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 9.44 MB - Last synced: 12 months ago - Pushed: over 8 years ago
zbhlove100/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 6.17 MB - Last synced: about 1 year ago - Pushed: over 11 years ago
AlbertYou/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 9.39 MB - Last synced: about 1 year ago - Pushed: almost 10 years ago
lemurproject/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 6.43 MB - Last synced: about 2 months ago - Pushed: over 11 years ago
InfernoJJ/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 6.41 MB - Last synced: about 1 year ago - Pushed: over 11 years ago
lingchant/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 6.42 MB - Last synced: about 1 year ago - Pushed: about 11 years ago
rugby110/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 9.88 MB - Last synced: about 1 year ago - Pushed: over 8 years ago
caofangkun/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 9.9 MB - Last synced: about 1 month ago - Pushed: over 8 years ago
lengyubing/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 12.1 MB - Last synced: about 1 month ago - Pushed: over 8 years ago
zhoujg/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 10.2 MB - Last synced: 10 months ago - Pushed: over 8 years ago
bo729892905/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 10.2 MB - Last synced: about 1 year ago - Pushed: over 8 years ago
Lokihjl/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 10.3 MB - Last synced: about 1 year ago - Pushed: over 8 years ago
noscripter/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 10.3 MB - Last synced: 10 months ago - Pushed: over 8 years ago
mfirdausharun/Heritrix_Experiment
- 1.0.1 commons/pom.xml
Size: 2.23 MB - Last synced: 9 months ago - Pushed: over 6 years ago
mfirdausharun/heritrix3
- 1.0.1 commons/pom.xml
Size: 5.65 MB - Last synced: 9 months ago - Pushed: over 6 years ago
gfthr/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 9.85 MB - Last synced: about 1 month ago - Pushed: about 9 years ago
MarQuisKnox/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 11.5 MB - Last synced: 10 months ago - Pushed: about 9 years ago
mouqi123/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 11.5 MB - Last synced: about 1 year ago - Pushed: about 9 years ago
vitgou/heritrix3 Fork of arquivo/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 9.64 MB - Last synced: 11 months ago - Pushed: over 3 years ago
friesper/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 9.7 MB - Last synced: 10 months ago - Pushed: almost 8 years ago
praveenmunagapati/heritrix3 Fork of internetarchive/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
- 1.0.1 commons/pom.xml
Size: 9.07 MB - Last synced: about 1 month ago - Pushed: over 6 years ago
ahamblyn/webcurator-upgrade-poc
Webcurator 2.0 upgrade proof of concept
- 2.0.1 pom.xml
Size: 60.2 MB - Last synced: 5 months ago - Pushed: almost 6 years ago