GitHub topics: warc-format
commoncrawl/arc2warc-conversion
Experiences converting Common Crawl's ARC files from the 2008-2012 crawls to the WARC format
Size: 24.4 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

toimik/WarcProtocol
Parser for WARC (aka WebArchive) files
Language: C# - Size: 181 KB - Last synced at: 13 days ago - Pushed at: 10 months ago - Stars: 13 - Forks: 3

edgi-govdata-archiving/eis-WARC-archiver 📦
ARCHIVED: Docker app to crawl URLs and generate WARCs
Language: Python - Size: 28.1 MB - Last synced at: about 1 year ago - Pushed at: about 8 years ago - Stars: 10 - Forks: 5

pierlauro/MDBubing
From WARC records to MongoDB documents
Language: Java - Size: 145 KB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0
