Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub / netarchivesuite 53 repositories

netarchivesuite/jwat

Java Web Archive Toolkit

Language: Java - Size: 22.1 MB - Last synced: 14 days ago - Pushed: 6 months ago - Stars: 3 - Forks: 2

netarchivesuite/solrwayback

A search interface and wayback machine for the UKWA Solr-based warc-indexer framework.

Language: Java - Size: 27.1 MB - Last synced: 19 days ago - Pushed: 25 days ago - Stars: 95 - Forks: 18

netarchivesuite/heatmap

A GitHub-inspired graph for visualising activity

Language: JavaScript - Size: 216 KB - Last synced: about 1 month ago - Pushed: over 2 years ago - Stars: 27 - Forks: 2

netarchivesuite/shine Fork of ukwa/shine

Prototype SOLR-powered web archive exploration UI.

Language: Java - Size: 7.79 MB - Last synced: about 1 month ago - Pushed: over 7 years ago - Stars: 1 - Forks: 0

netarchivesuite/so-me

Social Media harvests

Language: Shell - Size: 241 KB - Last synced: 29 days ago - Pushed: over 1 year ago - Stars: 8 - Forks: 0

netarchivesuite/netarchivesuite

Netarchivesuite 5.X development

Language: Java - Size: 182 MB - Last synced: about 1 month ago - Pushed: 6 months ago - Stars: 17 - Forks: 23

netarchivesuite/jwarc-cdx-indexer-workflow

Processes all WARC files listed in a text file with JWARC and sends them to a CDX server (Outback CDX etc.). If the process is stopped and restarted, it continues from where it left off.

Language: Java - Size: 4.08 MB - Last synced: about 1 month ago - Pushed: 6 months ago - Stars: 1 - Forks: 0

netarchivesuite/jwarc Fork of iipc/jwarc

Java library for reading and writing WARC files with a typed API

Language: Java - Size: 693 KB - Last synced: about 1 month ago - Pushed: 8 months ago - Stars: 0 - Forks: 0

netarchivesuite/netarchivesuite-docker-compose

Testbed for some netarchivesuite docker experiments

Language: Jinja - Size: 18.5 MB - Last synced: about 1 month ago - Pushed: about 2 years ago - Stars: 2 - Forks: 5

netarchivesuite/NAS-research Fork of jolf/NAS-research

Research tools for Netarkivet.dk

Language: Java - Size: 2.07 MB - Last synced: about 1 month ago - Pushed: almost 6 years ago - Stars: 1 - Forks: 0

netarchivesuite/cdx-summarize-warc-indexer Fork of ymaurer/cdx-summarize-warc-indexer

Summarize Web Archive holdings using an existing SOLR index

Language: Shell - Size: 23.4 KB - Last synced: about 1 month ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0

netarchivesuite/jwat-tools

JWAT Tools

Language: Java - Size: 971 KB - Last synced: 13 days ago - Pushed: 6 months ago - Stars: 4 - Forks: 2

netarchivesuite/solrwaybackrootproxy

Using the solrwaybackrootproxy improves playback; it can redirect and fix leaked resources.

Language: Java - Size: 31.3 KB - Last synced: about 1 month ago - Pushed: over 1 year ago - Stars: 1 - Forks: 0

netarchivesuite/browsertrix-cloud Fork of webrecorder/browsertrix-cloud

Danish Royal Library customisations and modifications

Language: TypeScript - Size: 859 KB - Last synced: about 1 month ago - Pushed: almost 2 years ago - Stars: 0 - Forks: 0

netarchivesuite/crawlrss Fork of Landsbokasafn/crawlrss

Crawl RSS - Heritrix 3 add-on

Language: Java - Size: 140 KB - Last synced: about 1 month ago - Pushed: almost 2 years ago - Stars: 0 - Forks: 0

netarchivesuite/heritrix3 Fork of Landsbokasafn/heritrix3

Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.

Language: Java - Size: 10.4 MB - Last synced: about 1 month ago - Pushed: 9 months ago - Stars: 0 - Forks: 1

netarchivesuite/heritrix3-scripts

Some heritrix3 scripts, for use in the h3 console.

Language: Groovy - Size: 11.7 KB - Last synced: about 1 month ago - Pushed: almost 8 years ago - Stars: 1 - Forks: 1

netarchivesuite/openwayback-netarkivet-overlay

Project to create a customised openwayback for netarkivet using maven overlays.

Language: Java - Size: 48.8 KB - Last synced: about 1 month ago - Pushed: about 3 years ago - Stars: 0 - Forks: 0

netarchivesuite/openwayback-netarchivesuite Fork of iipc/openwayback

NetarchiveSuite fork of OpenWayback

Language: Java - Size: 26.8 MB - Last synced: about 1 month ago - Pushed: about 3 years ago - Stars: 1 - Forks: 0

netarchivesuite/heritrix3-wrapper

Small wrapper to start/stop and communicate with Heritrix 3.

Language: Java - Size: 108 KB - Last synced: about 1 month ago - Pushed: almost 2 years ago - Stars: 3 - Forks: 1

netarchivesuite/vagrant-hadoop-hive-spark Fork of martinprobson/vagrant-hadoop-hive-spark

Vagrant project to spin up a single node VM running current versions of Hadoop, Hive and Spark

Language: Shell - Size: 162 KB - Last synced: about 1 month ago - Pushed: over 3 years ago - Stars: 0 - Forks: 1

netarchivesuite/common-datastructures Fork of antiaction/common-datastructures

Library of common data structures implemented in Java.

Size: 148 KB - Last synced: about 1 month ago - Pushed: almost 9 years ago - Stars: 0 - Forks: 0

netarchivesuite/netsearch

Merged search-arctika and search-achon into a multi-module project

Language: Java - Size: 90.2 MB - Last synced: about 1 month ago - Pushed: about 2 years ago - Stars: 11 - Forks: 2

netarchivesuite/hadoopifications Fork of det-kgl-bibliotek/hadoopifications

Attempts to create hadoop jobs from other processes

Language: Java - Size: 20.4 MB - Last synced: about 1 month ago - Pushed: about 4 years ago - Stars: 0 - Forks: 0

netarchivesuite/logtrix Fork of iipc/logtrix

Java library/tool for parsing and summarising Heritrix crawl logs

Language: Java - Size: 65.4 KB - Last synced: about 1 month ago - Pushed: about 5 years ago - Stars: 1 - Forks: 0

netarchivesuite/umbra Fork of internetarchive/umbra

A queue-controlled browser automation tool for improving web crawl quality

Language: Python - Size: 249 KB - Last synced: about 1 month ago - Pushed: about 5 years ago - Stars: 0 - Forks: 0

netarchivesuite/netarchivesuite-umbra-docker

Language: Shell - Size: 24.4 KB - Last synced: about 1 month ago - Pushed: about 5 years ago - Stars: 2 - Forks: 1

netarchivesuite/fits-wrapper Fork of nclarkekb/fits-wrapper

Small FITS wrapper to run it using a custom classloader and provide some basic JAXB (un)marshalling of the XML output.

Language: Java - Size: 18.6 KB - Last synced: about 1 month ago - Pushed: over 5 years ago - Stars: 0 - Forks: 0

netarchivesuite/lap-writer-warc

WARC writer for INA's Live Archiving Proxy

Language: Java - Size: 268 KB - Last synced: about 1 month ago - Pushed: over 5 years ago - Stars: 0 - Forks: 1

netarchivesuite/bitrepository-rest-client

Java REST client to interact with the Bitrepository software.

Language: Java - Size: 20.5 KB - Last synced: about 1 month ago - Pushed: over 8 years ago - Stars: 0 - Forks: 1

netarchivesuite/jwat-wayback-resourcestore

Wayback resourcestore using JWAT

Language: Java - Size: 8.79 KB - Last synced: about 1 month ago - Pushed: over 5 years ago - Stars: 0 - Forks: 1

netarchivesuite/retro

Conversion of old data to WARC format as of 2012

Language: Java - Size: 47.9 KB - Last synced: about 1 month ago - Pushed: over 5 years ago - Stars: 0 - Forks: 1

netarchivesuite/jwat-tools-gui

JWAT Tools minimal GUI version

Language: Java - Size: 75.2 KB - Last synced: about 1 month ago - Pushed: 6 months ago - Stars: 0 - Forks: 2

netarchivesuite/webdanica

System for finding Danish webpages outside the .dk domain

Language: Java - Size: 127 MB - Last synced: about 1 month ago - Pushed: almost 2 years ago - Stars: 1 - Forks: 3

netarchivesuite/batchJobs

Batch jobs for NetarchiveSuite

Language: Java - Size: 5.4 MB - Last synced: about 1 month ago - Pushed: almost 6 years ago - Stars: 0 - Forks: 0

netarchivesuite/dvenabler

Adds DocValues to Solr index fields without full re-index

Language: Java - Size: 103 KB - Last synced: about 1 month ago - Pushed: almost 3 years ago - Stars: 8 - Forks: 1

netarchivesuite/compression

Scripts related to the workflow for compressing an arc/warc repository

Language: Arc - Size: 56.2 MB - Last synced: about 1 month ago - Pushed: about 6 years ago - Stars: 0 - Forks: 0

netarchivesuite/jenkins-jobs

Defines the NetarchiveSuite jobs to run on SBForge through the Jenkins DSL plugin

Language: Groovy - Size: 14.6 KB - Last synced: about 1 month ago - Pushed: almost 8 years ago - Stars: 1 - Forks: 1

netarchivesuite/netarchivesuite.github.io

Size: 145 KB - Last synced: about 1 month ago - Pushed: almost 10 years ago - Stars: 1 - Forks: 0

netarchivesuite/docker-netarchivesuite

Language: Groovy - Size: 160 KB - Last synced: about 1 month ago - Pushed: almost 10 years ago - Stars: 1 - Forks: 1

netarchivesuite/netarchivesuite-svngit-migration

Git conversion of Subversion repository.

Language: Java - Size: 225 MB - Last synced: about 1 month ago - Pushed: about 9 years ago - Stars: 4 - Forks: 0

netarchivesuite/webdanica-extractlinks

Tools for extracting links from ARC and WARC files

Language: Java - Size: 8.79 KB - Last synced: about 1 month ago - Pushed: about 7 years ago - Stars: 0 - Forks: 0

netarchivesuite/nas_ansible

Start of an Ansible deployment framework for NetarchiveSuite

Language: Shell - Size: 9.77 KB - Last synced: about 1 month ago - Pushed: about 7 years ago - Stars: 0 - Forks: 1

netarchivesuite/bacon Fork of aaronbinns/bacon

Experimenting with Apache Pig.

Language: Java - Size: 22.3 MB - Last synced: about 1 month ago - Pushed: about 11 years ago - Stars: 0 - Forks: 0

netarchivesuite/language-detector Fork of optimaize/language-detector

Language Detection Library for Java

Language: Java - Size: 1.94 MB - Last synced: about 1 month ago - Pushed: over 6 years ago - Stars: 0 - Forks: 0

netarchivesuite/xml-formatter Fork of acegi/xml-format-maven-plugin

Fork of http://code.google.com/p/xml-formatter/ with enhancements

Language: Java - Size: 124 KB - Last synced: about 1 month ago - Pushed: almost 10 years ago - Stars: 0 - Forks: 0

netarchivesuite/tika Fork of apache/tika

Mirror of Apache Tika

Language: Java - Size: 101 MB - Last synced: about 1 month ago - Pushed: about 8 years ago - Stars: 0 - Forks: 0

netarchivesuite/webarchive-discovery Fork of ukwa/webarchive-discovery

WARC and ARC indexing and discovery tools.

Language: Java - Size: 13.1 MB - Last synced: about 1 month ago - Pushed: 10 months ago - Stars: 0 - Forks: 0

netarchivesuite/jbs Fork of internetarchive/jbs

Builds Lucene/Solr indexes out of NutchWAX segments and revisit records via Hadoop.

Language: Java - Size: 102 MB - Last synced: about 1 month ago - Pushed: over 10 years ago - Stars: 0 - Forks: 0

netarchivesuite/netarkivet-4.4-openwayback

Netarkivet openwayback linked with NAS 4.4 jars.

Language: Java - Size: 3.27 MB - Last synced: about 1 month ago - Pushed: about 9 years ago - Stars: 0 - Forks: 0

netarchivesuite/webarchive-commons Fork of iipc/webarchive-commons

Common web archive utility code.

Language: Java - Size: 7.88 MB - Last synced: about 1 month ago - Pushed: over 9 years ago - Stars: 0 - Forks: 0

netarchivesuite/heritrix3-client-testbed

A little testbed/playground/prototype for exploring the h3 REST interface.

Language: Java - Size: 145 KB - Last synced: about 1 month ago - Pushed: over 9 years ago - Stars: 0 - Forks: 0

netarchivesuite/hadoop-tools

Language: Java - Size: 117 KB - Last synced: about 1 month ago - Pushed: almost 11 years ago - Stars: 0 - Forks: 0