Topic: "article-extracting"
fivefilters/ftr-site-config
Site-specific article extraction rules to aid content extractors, feed readers, and 'read later' applications.
Size: 4.69 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 395 - Forks: 269

Strumenta/SmartReader
SmartReader is a library to extract the main content of a web page, based on a port of the Readability library by Mozilla
Language: C# - Size: 27.8 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 166 - Forks: 37

artiomn/markdown_articles_tool
Parse a Markdown article, download its images, and replace image URLs with local paths
Language: Python - Size: 302 KB - Last synced at: 5 days ago - Pushed at: 12 months ago - Stars: 122 - Forks: 25

myifeng/article-parser
Extract an article or news item from a URL or HTML, parse the title and content, and output it in Markdown format.
Language: Python - Size: 85 KB - Last synced at: 7 months ago - Pushed at: 9 months ago - Stars: 49 - Forks: 7

johnbumgarner/newshound
This Python package can be used to systematically extract multiple data elements (e.g., title, keywords, text) from news sources around the world in over 50 languages.
Size: 28.3 KB - Last synced at: 11 days ago - Pushed at: about 2 years ago - Stars: 33 - Forks: 3

woojubb/html-article-extractor
A web page content extractor
Language: JavaScript - Size: 22.5 KB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 20 - Forks: 1

lord-alfred/dnlp
📚 A collection of useful Natural Language Processing tools: detecting the language of a text, splitting text into sentences, and extracting the main content from an HTML document
Language: Python - Size: 43 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 17 - Forks: 5

mitica/ascrape-js
Extracts article content from a web page.
Language: JavaScript - Size: 225 KB - Last synced at: 1 day ago - Pushed at: about 8 years ago - Stars: 10 - Forks: 5

EmailThis/readability Fork of keepcosmos/readability
Readability is an Elixir library for extracting and curating articles.
Language: Elixir - Size: 694 KB - Last synced at: about 2 years ago - Pushed at: about 8 years ago - Stars: 9 - Forks: 2

Sathish-Vasudev/Article-Scraper
The program scrapes the content of a web article, given either a single URL or a text file containing a set of URLs. This project uses the newspaper3k and python-docx libraries. The output is a neatly formatted Word document in '.docx' format with the contents of the article.
Language: Python - Size: 38.1 KB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 8 - Forks: 4

ghostdogpr/readability4s
Scala library to extract relevant content from an article's HTML
Language: Scala - Size: 31.3 KB - Last synced at: 7 days ago - Pushed at: almost 7 years ago - Stars: 7 - Forks: 0

KashmereLabs/permalink_web_archiver
Allows any article on the web to be parsed into a readable format and archived into the permanent web
Language: JavaScript - Size: 1.84 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 2

trmckay/article-export
Export Pocket list as CSV!
Language: Python - Size: 30.3 KB - Last synced at: 2 months ago - Pushed at: almost 5 years ago - Stars: 3 - Forks: 0

kl/the-daily-stallman
Read the news like Stallman would. No JavaScript required.
Language: HTML - Size: 380 KB - Last synced at: 18 days ago - Pushed at: about 2 years ago - Stars: 2 - Forks: 2

ivanovishado/NewsScraper
Article scraper for Mexican news websites. My terminal project at Universidad de Guadalajara - CUCEI 2018.
Language: Python - Size: 276 KB - Last synced at: 10 months ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

MrKioZ/Mawdoo3Picker 📦
A simple article picker: it scrapes the website http://mawdoo3.com and picks a random article to show you
Language: Python - Size: 10.7 KB - Last synced at: about 2 months ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 0

0x01h/yozdil-article-scraper-generator
Scrape Yılmaz Özdil articles and create a Markov model to generate newspaper articles in the style of Yılmaz Özdil. Turkish text dataset creator for data science and NLP projects.
Language: Python - Size: 19.5 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 2 - Forks: 2

absingh31/MercuryAPI_Client
Python wrapper for the Mercury API that returns the JSON and HTML output using your key. With it, anyone can denoise an online article and view it without any ads, external links, or external content.
Language: HTML - Size: 27.3 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 2 - Forks: 2

KennethOnuorah/next-small-project
A web app that returns Wikipedia data based on a given search query.
Language: TypeScript - Size: 420 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

Xeven777/Ada-7x
GPT-4 powered summarizer for articles, news, URLs, and blogs, within seconds! 😇😍
Language: JavaScript - Size: 90.8 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

Krishnadhruv/Creating-summary-from-a-news-article-in-Python
An attempt to create a summary from a news article using an object-oriented Python programming approach
Language: Jupyter Notebook - Size: 3.91 KB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 1 - Forks: 1

acrylian/featured_image 📦
A Zenphoto plugin to attach an image to Zenpage items
Language: PHP - Size: 43 KB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

kamilmustecep/ArticleDetector
This automation, which provides automatic font, size, line spacing, page margin, and paragraph indent information along with citation checks, was developed using the "DocumentFormat.OpenXml" library.
Language: C# - Size: 26.8 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

DonaldKLee/Ardio
Ardio is a web application that converts CTV News articles into mp3 files. Currently, this only works for CTV, but in the future, I am planning on expanding it to more news sources such as Global News or CNN.
Language: Python - Size: 503 KB - Last synced at: about 1 year ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

peterasorensen/Newsdaq-Backend Fork of Nasdaq/hackathons
A New Way to Visualize the Markets (Created in 24 hours @ CalHacks)
Language: Python - Size: 163 KB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

melphi/article-extractor
Extracts the article content from web pages. Runs as a standalone REST service.
Language: HTML - Size: 15 MB - Last synced at: about 2 years ago - Pushed at: almost 8 years ago - Stars: 0 - Forks: 1

absingh31/Article_Smart Fork of little-endian-0x01/Article_Smart
A Python project (with NLP integration) to denoise any news article and strip images and advertisements from it, giving a basic and hassle-free article. It provides a 'smart view' for web views on mobile devices with heading, keywords, and text. Powered by newspaper3k.
Language: Python - Size: 21.5 KB - Last synced at: about 2 years ago - Pushed at: almost 8 years ago - Stars: 0 - Forks: 2

radumpopescu/newspaper-api
Python Newspaper API
Language: Python - Size: 1000 Bytes - Last synced at: almost 2 years ago - Pushed at: over 8 years ago - Stars: 0 - Forks: 0
