Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: html-parser

rajatomar788/pywebcopy

Locally saves webpages to your hard disk with images, css, js & links as is.

Language: Python - Size: 1.73 MB - Last synced: about 10 hours ago - Pushed: 5 months ago - Stars: 499 - Forks: 102

philss/floki

Floki is a simple HTML parser that enables search for nodes using CSS selectors.

Language: Elixir - Size: 1.49 MB - Last synced: 1 day ago - Pushed: 1 day ago - Stars: 1,998 - Forks: 153

vborovikov/brackets

Resilient markup parser library

Language: HTML - Size: 1010 KB - Last synced: 2 days ago - Pushed: 2 days ago - Stars: 0 - Forks: 0

rajatxs/fivemin-block-parser

Simple Editor.js block data parser for Fivemin platform

Language: TypeScript - Size: 15.6 KB - Last synced: 3 days ago - Pushed: 6 months ago - Stars: 1 - Forks: 0

miso-belica/jusText

Heuristic based boilerplate removal tool

Language: Python - Size: 1.01 MB - Last synced: about 12 hours ago - Pushed: 5 days ago - Stars: 683 - Forks: 78

rusterlium/html5ever_elixir

NIF wrapper of html5ever using Rustler

Language: HTML - Size: 423 KB - Last synced: 4 days ago - Pushed: 4 days ago - Stars: 79 - Forks: 71

Prettyhtml/prettyhtml

💅 The formatter for the modern web https://prettyhtml.netlify.com/

Language: JavaScript - Size: 4.12 MB - Last synced: 1 day ago - Pushed: over 1 year ago - Stars: 281 - Forks: 21

alphanome-ai/sec-parser

Parse SEC EDGAR HTML documents into a tree of elements that correspond to the visual (semantic) structure of the document.

Language: Python - Size: 2.45 MB - Last synced: 5 days ago - Pushed: about 1 month ago - Stars: 90 - Forks: 36

tid-kijyun/Kanna

Kanna(鉋) is an XML/HTML parser for Swift.

Language: Swift - Size: 468 KB - Last synced: 5 days ago - Pushed: 16 days ago - Stars: 2,384 - Forks: 223

jgarber623/micromicro

A Ruby gem for extracting microformats2-encoded data from HTML documents.

Language: Ruby - Size: 354 KB - Last synced: 6 days ago - Pushed: 6 days ago - Stars: 2 - Forks: 2

ispras/dedoc

Dedoc is a library (service) for automated document parsing and conversion to a uniform format. It automatically extracts content, logical structure, tables, and meta information from textual electronic documents. (Parse document; Document content extraction; Document logical extraction; PDF parser; Scanned document parser; DOCX parser; HTML parser)

Language: Python - Size: 221 MB - Last synced: about 14 hours ago - Pushed: about 15 hours ago - Stars: 77 - Forks: 11

csonchen/wxParse

Rich text parsing for WeChat Mini Programs

Language: JavaScript - Size: 1.79 MB - Last synced: 7 days ago - Pushed: over 1 year ago - Stars: 271 - Forks: 41

mykolaharmash/hyntax

Straightforward HTML parser for JavaScript

Language: JavaScript - Size: 2.24 MB - Last synced: 3 days ago - Pushed: 7 months ago - Stars: 137 - Forks: 8

Imangazaliev/DiDOM

Simple and fast HTML and XML parser

Language: PHP - Size: 457 KB - Last synced: 9 days ago - Pushed: 2 months ago - Stars: 2,176 - Forks: 206

danny1113/html-parser-builder

A result builder that builds an HTML parser and transforms HTML elements into a strongly-typed result, inspired by RegexBuilder.

Language: Swift - Size: 28.3 KB - Last synced: 9 days ago - Pushed: 9 days ago - Stars: 4 - Forks: 0

b-fuze/deno-dom

Browser DOM & HTML parser in Deno

Language: HTML - Size: 5.41 MB - Last synced: 10 days ago - Pushed: 10 days ago - Stars: 384 - Forks: 43

antchfx/htmlquery

htmlquery is a Go XPath package for querying HTML.

Language: Go - Size: 135 KB - Last synced: 9 days ago - Pushed: about 1 month ago - Stars: 702 - Forks: 70

skrapeit/skrape.it

A Kotlin-based testing/scraping/parsing library providing the ability to analyze and extract data from HTML (server & client-side rendered). It places particular emphasis on ease of use and a high level of readability by providing an intuitive DSL. It aims to be a testing lib, but can also be used to scrape websites in a convenient fashion.

Language: Kotlin - Size: 5.02 MB - Last synced: 11 days ago - Pushed: 11 days ago - Stars: 754 - Forks: 57

orottier/webpage-rs

Small Rust library to fetch info about a web page: title, description, language, HTTP info, RSS feeds, Opengraph, Schema.org, and more

Language: Rust - Size: 112 KB - Last synced: 12 days ago - Pushed: 12 days ago - Stars: 47 - Forks: 11

graetz23/xmlcc

an ANSI C++ XML library keeping SAX interface and XML / DOM tree

Language: C++ - Size: 418 KB - Last synced: 12 days ago - Pushed: 12 days ago - Stars: 3 - Forks: 2

dagmawig/html-parser

An HTML parser that uses the SAX parsing method.

Language: JavaScript - Size: 39.1 KB - Last synced: 12 days ago - Pushed: 13 days ago - Stars: 0 - Forks: 0

psharanda/Atributika

Convert text with HTML tags, links, hashtags, mentions into NSAttributedString. Make them clickable with UILabel drop-in replacement.

Language: Swift - Size: 856 KB - Last synced: 11 days ago - Pushed: about 2 months ago - Stars: 1,354 - Forks: 152

cezheng/Fuzi

A fast & lightweight XML & HTML parser in Swift with XPath & CSS support

Language: Swift - Size: 630 KB - Last synced: 6 days ago - Pushed: about 1 year ago - Stars: 1,060 - Forks: 149

zzzprojects/html-agility-pack

Html Agility Pack (HAP) is a free and open-source HTML parser written in C# to read/write DOM and supports plain XPATH or XSLT. It is a .NET code library that allows you to parse "out of the web" HTML files.

Language: C# - Size: 1.94 MB - Last synced: 13 days ago - Pushed: 14 days ago - Stars: 2,553 - Forks: 369

ocramz/twelve

Like @11ty, but this goes up to 12

Language: Haskell - Size: 70.3 KB - Last synced: 13 days ago - Pushed: over 3 years ago - Stars: 5 - Forks: 1

Bystroushaak/pyDHTMLParser 📦

Lightweight HTML/XML parser for quick and dirty web scraping.

Language: Python - Size: 252 KB - Last synced: 12 days ago - Pushed: over 1 year ago - Stars: 6 - Forks: 3

romagny13/html-parser

TypeScript/JavaScript HTML Parser

Language: TypeScript - Size: 22.5 KB - Last synced: 13 days ago - Pushed: about 7 years ago - Stars: 4 - Forks: 2

clj-commons/hickory

HTML as data

Language: Clojure - Size: 317 KB - Last synced: about 1 month ago - Pushed: about 2 months ago - Stars: 619 - Forks: 51

ioBroker/ioBroker.parser

Parse web-site or file and extract data from it.

Language: JavaScript - Size: 1.23 MB - Last synced: 12 days ago - Pushed: 14 days ago - Stars: 21 - Forks: 12

vborovikov/readability

A C# port of standalone version of the readability lib

Language: HTML - Size: 6.73 MB - Last synced: 15 days ago - Pushed: 16 days ago - Stars: 0 - Forks: 0

OwenOrcan/YiraBot-Crawler

YiraBot: Simplifying Web Scraping for All. A user-friendly tool for developers and enthusiasts, offering command-line ease and Python integration. Ideal for research, SEO, and data collection.

Language: Python - Size: 207 KB - Last synced: 10 days ago - Pushed: 2 months ago - Stars: 13 - Forks: 0

1623311678/html-parser-server

Parses HTML into JSON; usable on both the server side and the front end

Language: JavaScript - Size: 24.4 KB - Last synced: 15 days ago - Pushed: 15 days ago - Stars: 4 - Forks: 1

hawa1222/data-stream-etl

Python-MySQL ETL pipeline to centralise personal data from sources like YouTube and Apple into a structured database, enabling advanced data analysis and application development.

Language: Python - Size: 246 KB - Last synced: 16 days ago - Pushed: 16 days ago - Stars: 0 - Forks: 0

yorickpeterse/oga

Oga is an XML/HTML parser written in Ruby.

Language: Ruby - Size: 8.24 MB - Last synced: 12 days ago - Pushed: 11 months ago - Stars: 1,163 - Forks: 43

nuzulul/telegram-scraper

A simple Telegram channel scraper

Language: JavaScript - Size: 28.3 KB - Last synced: 10 days ago - Pushed: 3 months ago - Stars: 8 - Forks: 4

snapframework/xmlhtml

XML parser and renderer with HTML 5 quirks mode

Language: Haskell - Size: 987 KB - Last synced: 18 days ago - Pushed: 18 days ago - Stars: 21 - Forks: 22

Azq2/perl-html5-dom

⚡ Super fast html5 DOM library with css selectors (based on Modest/MyHTML)

Language: Perl - Size: 445 KB - Last synced: 19 days ago - Pushed: 19 days ago - Stars: 10 - Forks: 2

genius257/php-html

simple html parser in PHP

Language: PHP - Size: 38.1 KB - Last synced: 13 days ago - Pushed: 7 months ago - Stars: 0 - Forks: 0

akaanuzman/landing_page

My landing page 💣💣💣

Language: HTML - Size: 1.83 MB - Last synced: 21 days ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0

khomin/meta-tag-parser-qml

meta-tag html parser qml

Language: C++ - Size: 3.95 MB - Last synced: 21 days ago - Pushed: about 3 years ago - Stars: 0 - Forks: 0

jsdf/react-native-htmlview

A React Native component which renders HTML content as native views

Language: JavaScript - Size: 816 KB - Last synced: 9 days ago - Pushed: 6 months ago - Stars: 2,682 - Forks: 465

fb55/htmlparser2

The fast & forgiving HTML and XML parser

Language: TypeScript - Size: 5.54 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 4,273 - Forks: 365

Mercanuis/PythonHTMLScraper

A small HTML Parser/Scraper for Python

Language: Python - Size: 36.1 KB - Last synced: 21 days ago - Pushed: about 5 years ago - Stars: 1 - Forks: 0

YashBhajbhuje67/Bot-Automation

🤖 Automation Bot which can do textual extraction from a URL and perform text analysis and store into a file.

Language: Python - Size: 160 KB - Last synced: 22 days ago - Pushed: over 1 year ago - Stars: 1 - Forks: 0

jiangxianli/SimpleHtml

A PHP package for HTML DOM parsing with jQuery-like syntax

Language: PHP - Size: 17.6 KB - Last synced: 22 days ago - Pushed: over 6 years ago - Stars: 3 - Forks: 0

duzun/hQuery.php

An extremely fast web scraper that parses megabytes of invalid HTML in a blink of an eye. PHP5.3+, no dependencies.

Language: PHP - Size: 3.32 MB - Last synced: 24 days ago - Pushed: 24 days ago - Stars: 351 - Forks: 75

cactoes/cpp-html-parser

A simple HTML parser written in C++

Language: C++ - Size: 11.7 KB - Last synced: 25 days ago - Pushed: 25 days ago - Stars: 1 - Forks: 1

rni-l/pure-js-html-parser

Implement an HTML parser in pure JS(TS) without relying on any libraries.

Language: TypeScript - Size: 249 KB - Last synced: 25 days ago - Pushed: 25 days ago - Stars: 0 - Forks: 0

Sub6Resources/flutter_html

A Flutter widget for rendering static html as Flutter widgets (Will render over 80 different html tags!)

Language: Dart - Size: 2.99 MB - Last synced: 25 days ago - Pushed: about 1 month ago - Stars: 1,742 - Forks: 797

bel-framework/bel-dom

DOM (Document Object Model) API for Erlang

Language: Erlang - Size: 13.7 KB - Last synced: 26 days ago - Pushed: 27 days ago - Stars: 0 - Forks: 0

Hengyu/HTMLParser

A Swift package built upon `libxml2`

Language: Swift - Size: 13.7 KB - Last synced: 27 days ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0

bel-framework/bel-html

HTML utilities for Erlang

Language: Erlang - Size: 31.3 KB - Last synced: 26 days ago - Pushed: 27 days ago - Stars: 0 - Forks: 0

lexhouk/simplehtmldom

A copy of the PHP Simple HTML DOM Parser project.

Language: PHP - Size: 15.6 KB - Last synced: 28 days ago - Pushed: over 6 years ago - Stars: 0 - Forks: 0

kannan-ar/MariGold.OpenXHTML

MariGold.OpenXHTML is a wrapper library for Open XML SDK to convert HTML documents into Open XML word documents.

Language: C# - Size: 470 KB - Last synced: 15 days ago - Pushed: over 1 year ago - Stars: 50 - Forks: 21

leogachimu/scraping_a_website_with_pagination_and_popups

The task involves scraping therapists' data from psychologytoday.com. Therapists for each state and category are spread across multiple pages, and the script should click the view button to load and parse each therapist's page.

Language: Jupyter Notebook - Size: 197 KB - Last synced: 28 days ago - Pushed: 29 days ago - Stars: 0 - Forks: 0

ZhgChgLi/ZMarkupParser

ZMarkupParser is a pure-Swift library that helps you convert HTML strings into NSAttributedString with customized styles and tags.

Language: Swift - Size: 33.7 MB - Last synced: 21 days ago - Pushed: about 2 months ago - Stars: 264 - Forks: 19

sangupta/lhtml

Lenient HTML parser for Go.

Language: Go - Size: 138 KB - Last synced: 29 days ago - Pushed: about 2 months ago - Stars: 0 - Forks: 1

markhuge/hrefs

A golang library for extracting URIs from HTML elements

Language: Go - Size: 6.84 KB - Last synced: 29 days ago - Pushed: over 6 years ago - Stars: 0 - Forks: 0

hashi7412/HTML-parser-in-Golang

Package goquery implements features similar to jQuery, including the chainable syntax, to manipulate and query an HTML document.

Language: Go - Size: 3.91 KB - Last synced: 29 days ago - Pushed: 10 months ago - Stars: 0 - Forks: 0

brucificus/html-antlr4-typescript

HTML lexer & parser written in TypeScript using ANTLR 4 & ANTLR4TS

Language: TypeScript - Size: 1.31 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 0 - Forks: 0

leonamtv/html-parser

A little HTML parser that accepts an HTML file as input and generates an HTML tree data structure.

Language: Python - Size: 18.6 KB - Last synced: 30 days ago - Pushed: over 3 years ago - Stars: 1 - Forks: 0

markuspoerschke/extractum

Extractum is a PHP library that extracts information from web pages.

Language: PHP - Size: 1.04 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 3 - Forks: 1

esign-consulting/postdenuncia

Software project for citizens to report urban problems.

Language: Java - Size: 3.09 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 5 - Forks: 0

vitkarpov/fast-xml-parser

🚀 Is a fast XML parser in TypeScript with zero dependencies

Language: TypeScript - Size: 103 KB - Last synced: about 1 month ago - Pushed: about 5 years ago - Stars: 4 - Forks: 2

pagescrape/toks.rs

Language: HTML - Size: 23.4 KB - Last synced: about 1 month ago - Pushed: about 2 months ago - Stars: 1 - Forks: 0

oblac/jodd

Jodd! Lightweight. Java. Zero dependencies. Use what you like.

Language: Java - Size: 40.8 MB - Last synced: about 1 month ago - Pushed: 6 months ago - Stars: 4,057 - Forks: 725

voku/simple_html_dom (fork of samacs/simple_html_dom)

📜 Modern Simple HTML DOM Parser for PHP

Language: PHP - Size: 1.07 MB - Last synced: about 1 month ago - Pushed: 2 months ago - Stars: 812 - Forks: 109

posthtml/posthtml

PostHTML is a tool to transform HTML/XML with JS plugins

Language: JavaScript - Size: 1.08 MB - Last synced: about 1 month ago - Pushed: 7 months ago - Stars: 2,924 - Forks: 114

GeReV/NSoup

NSoup is a .NET port of the jsoup (http://jsoup.org) HTML parser and sanitizer originally written in Java

Language: C# - Size: 517 KB - Last synced: about 1 month ago - Pushed: about 5 years ago - Stars: 154 - Forks: 47

SoftCircuits/HtmlMonkey

Lightweight HTML/XML parser written in C#.

Language: C# - Size: 436 KB - Last synced: 27 days ago - Pushed: about 2 months ago - Stars: 48 - Forks: 9

Swaagie/minimize

Minimize HTML

Language: JavaScript - Size: 771 KB - Last synced: 3 days ago - Pushed: almost 4 years ago - Stars: 163 - Forks: 18

justinwilaby/sax-wasm

The first streamable, fixed memory XML, HTML, and JSX parser for WebAssembly.

Language: JavaScript - Size: 1.05 MB - Last synced: 8 days ago - Pushed: 7 months ago - Stars: 161 - Forks: 8

mbahmodin/youtube-watch-history-to-csv

This project allows you to convert your YouTube watch history HTML file from Google Takeout into a CSV file that can be used by the universalscrobbler.com to Scrobble manually in bulk.

Language: Python - Size: 6.84 KB - Last synced: 28 days ago - Pushed: 28 days ago - Stars: 11 - Forks: 3

polskieai/link-reporter

Plugin for many PHP applications, for extracting links from html files.

Language: PHP - Size: 4.88 KB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 0 - Forks: 0

ShiftHackZ/Joy-Reactor-Android

Android client app for website joyreactor.cc

Language: Kotlin - Size: 423 KB - Last synced: 12 days ago - Pushed: 3 months ago - Stars: 2 - Forks: 1

inversoft/prime-transformer

Fast Java8 BBCode & HTML parser and transformation library.

Language: Java - Size: 499 KB - Last synced: about 1 month ago - Pushed: about 1 year ago - Stars: 14 - Forks: 4

kata198/AdvancedHTMLParser

Fast Indexed python HTML parser which builds a DOM node tree, providing common getElementsBy* functions for scraping, testing, modification, and formatting. Also XPath.

Language: Python - Size: 1.11 MB - Last synced: about 1 month ago - Pushed: 10 months ago - Stars: 97 - Forks: 26

zadean/htmerl

HTML Parser in Erlang

Language: Erlang - Size: 60.5 KB - Last synced: 27 days ago - Pushed: over 1 year ago - Stars: 14 - Forks: 2

Bystroushaak/DHTMLParser 📦

D HTML Parser, similar to python BeautifulSoup

Language: D - Size: 158 KB - Last synced: 13 days ago - Pushed: about 2 months ago - Stars: 17 - Forks: 2

esign-consulting/qualidade-ar-smac

Air quality data collected from the Rio de Janeiro city government's Secretaria Municipal de Meio Ambiente (SMAC).

Language: HTML - Size: 225 KB - Last synced: about 2 months ago - Pushed: 4 months ago - Stars: 0 - Forks: 0

rohanasan/rohanasantml

Rohanasantml: an easy alternative to HTML!

Language: Rust - Size: 29.3 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 3 - Forks: 0

mathiversen/html-parser

A simple and general purpose html/xhtml parser, using Pest.

Language: HTML - Size: 224 KB - Last synced: 26 days ago - Pushed: 5 months ago - Stars: 72 - Forks: 17

sunshineplan/node

HTML parsing library, the alternative to BeautifulSoup in Golang.

Language: Go - Size: 67.4 KB - Last synced: 27 days ago - Pushed: about 1 month ago - Stars: 3 - Forks: 0

levlabs/php-html-parser

Lenient event-driven HTML parser written in PHP.

Language: PHP - Size: 16.6 KB - Last synced: 26 days ago - Pushed: 27 days ago - Stars: 0 - Forks: 0

silenzzz/RuTracker4j

RuTracker java library

Language: Java - Size: 81.1 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 2 - Forks: 0

MohamedRejeb/Ksoup

Ksoup is a lightweight Kotlin Multiplatform library for parsing HTML, extracting HTML tags, attributes, and text, and encoding and decoding HTML entities.

Language: Kotlin - Size: 442 KB - Last synced: about 2 months ago - Pushed: 4 months ago - Stars: 305 - Forks: 7

cacing69/cquery

Cquery stands for Crawl Query. It is a PHP scraper with an expression language that can be used to scrape data from websites that use JavaScript or Ajax.

Language: PHP - Size: 662 KB - Last synced: about 2 months ago - Pushed: 8 months ago - Stars: 2 - Forks: 3

punit-naik/html-parser

A Clojure library designed to parse HTML string and return any errors and warnings while parsing

Language: Clojure - Size: 21.5 KB - Last synced: about 2 months ago - Pushed: almost 2 years ago - Stars: 0 - Forks: 0

saurindashadia/simple_html_dom

Fork copy of simple html dom from https://sourceforge.net/projects/simplehtmldom/

Language: PHP - Size: 15.6 KB - Last synced: about 2 months ago - Pushed: almost 7 years ago - Stars: 0 - Forks: 0

bupt1987/html-parser

PHP HTML parser, similar to PHP Simple HTML DOM Parser but several times faster

Language: PHP - Size: 78.1 KB - Last synced: about 2 months ago - Pushed: over 4 years ago - Stars: 526 - Forks: 192

KTBsomen/httl-s

html but templating language, hyper text templating language

Size: 3.91 KB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 0 - Forks: 0

pejman-hkh/gdp

GoLang Dom Parser

Language: Go - Size: 306 KB - Last synced: 29 days ago - Pushed: about 2 months ago - Stars: 0 - Forks: 0

acrazing/html5parser

A super tiny and fast html5 AST parser.

Language: TypeScript - Size: 929 KB - Last synced: about 2 months ago - Pushed: over 1 year ago - Stars: 170 - Forks: 26

huozhi/html2any

🌀 parse and convert html string to anything

Language: JavaScript - Size: 309 KB - Last synced: about 1 month ago - Pushed: over 1 year ago - Stars: 66 - Forks: 2

AndreasPizsa/hypertag

🏎 The fastest HTML tag and attributes parser for JavaScript

Language: HTML - Size: 770 KB - Last synced: 12 days ago - Pushed: about 2 years ago - Stars: 30 - Forks: 1

kevinhermawan/markup2json

A library for converting HTML and XML into JSON

Language: TypeScript - Size: 384 KB - Last synced: 8 days ago - Pushed: 7 months ago - Stars: 3 - Forks: 1

i-e-b/HNode

A very small HTML / XML parser library

Language: C# - Size: 13.7 KB - Last synced: about 1 month ago - Pushed: 6 months ago - Stars: 0 - Forks: 0

RealityRipple/Satellite-Restriction-Tracker

🛰️ Monitor and record your ViaSat Satellite network usage.

Language: Visual Basic .NET - Size: 1.86 MB - Last synced: 28 days ago - Pushed: 28 days ago - Stars: 1 - Forks: 0

Vectorized/Aris

Aris - A fast and powerful tool to write HTML in JS easily. Includes syntax highlighting, templates, SVG, CSS autofixing, debugger support and more...

Language: JavaScript - Size: 107 KB - Last synced: 26 days ago - Pushed: over 1 year ago - Stars: 88 - Forks: 16

ariary/JSextractor

Quickly gather all JavaScript from a URL (CLI + TUI)

Language: Go - Size: 6.76 MB - Last synced: 27 days ago - Pushed: about 1 year ago - Stars: 6 - Forks: 1