html-scraper | Topic | Ecosyste.ms: Repos

Topic: "html-scraper"

BetaHuhn/metadata-scraper

🏷️ A JavaScript library for scraping/parsing metadata from a web page.

Language: TypeScript - Size: 857 KB - Last synced at: 29 days ago - Pushed at: about 1 month ago - Stars: 121 - Forks: 20

CompileInc/hodor

🕷 Configuration-based HTML scraper

Language: Python - Size: 62.5 KB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 23 - Forks: 3

julleboi/fast-wasm-scraper

Faster HTML scraper with WebAssembly

Language: Rust - Size: 38.1 KB - Last synced at: 23 days ago - Pushed at: almost 5 years ago - Stars: 17 - Forks: 2

imelgrat/feed-finder

A PHP class for extracting the URLs of RSS (1.0 and 2.0) and Atom feeds associated with a page, as well as OPML outline documents.

Language: PHP - Size: 646 KB - Last synced at: 6 months ago - Pushed at: almost 4 years ago - Stars: 9 - Forks: 3

yaylinda/serebii-parser

Python scripts to parse Pokemon info from serebii.net

Language: Python - Size: 1.01 MB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 8 - Forks: 1

marcomontalbano/html-miner

A powerful miner that will scrape HTML pages for you. `HTML Scraper`

Language: JavaScript - Size: 2.62 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 5 - Forks: 2

Atia-Farha/HTML-Fetcher-Script

A Python script that lets users fetch and optionally save the HTML content of a specified URL using the `requests` library.

Language: Python - Size: 83 KB - Last synced at: 8 months ago - Pushed at: 9 months ago - Stars: 4 - Forks: 0

anshu-krishna/HTML-Scraper

A PHP class to simplify data extraction from HTML.

Language: HTML - Size: 54.7 KB - Last synced at: 4 months ago - Pushed at: almost 3 years ago - Stars: 3 - Forks: 0

SandeepKundalwal/Automated-Plagiarism-Detector

An automated plagiarism detector that handles unzipping, generates plagiarism reports, and scrapes those reports for plagiarism above a threshold.

Language: Java - Size: 101 KB - Last synced at: 3 months ago - Pushed at: about 2 years ago - Stars: 2 - Forks: 0

phatpham9/scraper

An HTML scraper microservice based on x-ray & micro

Language: JavaScript - Size: 792 KB - Last synced at: 2 months ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 0

httpanand/Web-html-scraper

Scrape a website's HTML code with Python

Language: Python - Size: 24.4 KB - Last synced at: over 2 years ago - Pushed at: about 4 years ago - Stars: 2 - Forks: 1

kgruiz/stealth-crawler

Asynchronous headless-Chrome web crawler that discovers internal links and optionally saves HTML, Markdown, screenshots, or PDFs. Built for scripting, inspection, and automation.

Language: Python - Size: 1.28 MB - Last synced at: 25 days ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

BaseMax/kashan-university-phone-directory

This repository contains a scraper and dataset for extracting and publishing the phone directory of employees and other personnel from the University of Kashan. It includes tools to scrape, parse, and export data from a given HTML file into JSON format.

Language: HTML - Size: 128 KB - Last synced at: 26 days ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

reverse-developer/Moodle-Rearrange-exam-questions

A program that cuts down several Moodle exam answers and arranges them for correct answering

Language: Python - Size: 2.93 KB - Last synced at: over 2 years ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 0

MattMoradi/HTML_ScraperPro

An HTML Downloader Client For Scraping HTML Code From Websites

Language: C# - Size: 30.3 KB - Last synced at: over 2 years ago - Pushed at: about 5 years ago - Stars: 1 - Forks: 2

Payal1225/offline_site_mirror

🐙 Offline-Site-Mirror is a tiny Tkinter app that uses wget2 to mirror static websites for offline browsing, with link rewriting and resumable downloads.

Language: Python - Size: 10.7 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

T3rr0rS0ck3/home-assistant-enpal-website

Custom Integration for scraping Enpal device data

Language: Python - Size: 41 KB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

lsegg/scraper-api-challenge

Data extraction package which supports CLI and API requests.

Language: JavaScript - Size: 110 KB - Last synced at: 9 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

pakelcomedy/SiteMirror

Python tool for advanced web scraping and site mirroring. It downloads entire websites, including HTML, CSS, JS, images, and other assets, while preserving site structure and updating links for offline use. Ideal for developers needing detailed and customizable website backups.

Language: Python - Size: 12.7 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

martapanc/PlayBooks-Notes-to-MD

Extract highlighted notes from a Google Play Books notes file

Language: Python - Size: 112 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

stopsopa/docker-puppeteer-html-scraper

(Deprecated: use the better https://github.com/stopsopa/html-scraper-browserless instead) Microservice tool for scraping HTML from "any" page

Language: JavaScript - Size: 1.32 MB - Last synced at: 5 months ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

PoppingXanax/justclone

Automate website scraping and resource extraction with this Go script, effortlessly downloading CSS, JS, and image files while preserving website structure and providing scraping statistics.

Language: Go - Size: 4.28 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0