An open API service providing repository metadata for many open source software ecosystems.

Topic: "article-extracting"

fivefilters/ftr-site-config

Site-specific article extraction rules to aid content extractors, feed readers, and 'read later' applications.

Size: 4.69 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 395 - Forks: 269

Strumenta/SmartReader

SmartReader is a library to extract the main content of a web page, based on a port of the Readability library by Mozilla

Language: C# - Size: 27.8 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 166 - Forks: 37

artiomn/markdown_articles_tool

Parse a Markdown article, download its images, and replace image URLs with local paths

Language: Python - Size: 302 KB - Last synced at: 5 days ago - Pushed at: 12 months ago - Stars: 122 - Forks: 25

myifeng/article-parser

Extract an article or news story from a URL or HTML, parse the title and content, and output it in Markdown format.

Language: Python - Size: 85 KB - Last synced at: 7 months ago - Pushed at: 9 months ago - Stars: 49 - Forks: 7

johnbumgarner/newshound

This Python package can be used to systematically extract multiple data elements (e.g., title, keywords, text) from news sources around the world in over 50 languages.

Size: 28.3 KB - Last synced at: 11 days ago - Pushed at: about 2 years ago - Stars: 33 - Forks: 3

woojubb/html-article-extractor

A web page content extractor

Language: JavaScript - Size: 22.5 KB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 20 - Forks: 1

lord-alfred/dnlp

📚 A collection of useful Natural Language Processing tools: detecting the language of a text, splitting text into sentences, and extracting the main content from an HTML document

Language: Python - Size: 43 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 17 - Forks: 5

mitica/ascrape-js

Extracts article content from a web page.

Language: JavaScript - Size: 225 KB - Last synced at: 1 day ago - Pushed at: about 8 years ago - Stars: 10 - Forks: 5

EmailThis/readability Fork of keepcosmos/readability

Readability is an Elixir library for extracting and curating articles.

Language: Elixir - Size: 694 KB - Last synced at: about 2 years ago - Pushed at: about 8 years ago - Stars: 9 - Forks: 2

Sathish-Vasudev/Article-Scraper

Scrapes the content of web articles, given either a single URL or a text file containing a set of URLs. This project uses the newspaper3k and python-docx libraries. The output is a neatly formatted Word document in '.docx' format with the contents of the article.

Language: Python - Size: 38.1 KB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 8 - Forks: 4

ghostdogpr/readability4s

Scala library to extract relevant content from an article HTML

Language: Scala - Size: 31.3 KB - Last synced at: 7 days ago - Pushed at: almost 7 years ago - Stars: 7 - Forks: 0

KashmereLabs/permalink_web_archiver

Allows any article on the web to be parsed into a readable format and archived into the permanent web

Language: JavaScript - Size: 1.84 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 2

trmckay/article-export

Export Pocket list as CSV!

Language: Python - Size: 30.3 KB - Last synced at: 2 months ago - Pushed at: almost 5 years ago - Stars: 3 - Forks: 0

kl/the-daily-stallman

Read the news like Stallman would. No JavaScript required.

Language: HTML - Size: 380 KB - Last synced at: 18 days ago - Pushed at: about 2 years ago - Stars: 2 - Forks: 2

ivanovishado/NewsScraper

Article scraper for Mexican news websites. My terminal project at Universidad de Guadalajara - CUCEI 2018.

Language: Python - Size: 276 KB - Last synced at: 10 months ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

MrKioZ/Mawdoo3Picker 📦

A simple article picker: it scrapes the website http://mawdoo3.com and picks a random article to show you

Language: Python - Size: 10.7 KB - Last synced at: about 2 months ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 0

0x01h/yozdil-article-scraper-generator

Scrape Yılmaz Özdil articles and create a Markov model to generate newspaper articles in the style of Yılmaz Özdil. Turkish text dataset creator for data science and NLP projects.

Language: Python - Size: 19.5 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 2 - Forks: 2

absingh31/MercuryAPI_Client

Python wrapper for the Mercury API that returns JSON and HTML output, using your own API key. With it, anyone can denoise an online article and view it without any ads, external links, or extraneous content.

Language: HTML - Size: 27.3 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 2 - Forks: 2

KennethOnuorah/next-small-project

A web app that returns Wikipedia data based on a given search query.

Language: TypeScript - Size: 420 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

Xeven777/Ada-7x

GPT-4 powered summarizer for articles, news, URLs, and blogs, within seconds! 😇😍

Language: JavaScript - Size: 90.8 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

Krishnadhruv/Creating-summary-from-a-news-article-in-Python

An attempt to create a summary from a news article using an object-oriented Python programming approach

Language: Jupyter Notebook - Size: 3.91 KB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 1 - Forks: 1

acrylian/featured_image 📦

A Zenphoto plugin to attach an image to Zenpage items

Language: PHP - Size: 43 KB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

kamilmustecep/ArticleDetector

An automation tool that reports font, size, line spacing, page margin, and paragraph indent information and performs citation checks, developed using the "DocumentFormat.OpenXml" library.

Language: C# - Size: 26.8 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

DonaldKLee/Ardio

Ardio is a web application that converts CTV News articles into mp3 files. Currently, this only works for CTV, but in the future, I am planning on expanding it to more news sources such as Global News or CNN.

Language: Python - Size: 503 KB - Last synced at: about 1 year ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

peterasorensen/Newsdaq-Backend Fork of Nasdaq/hackathons

A New Way to Visualize the Markets (Created in 24 hours @ CalHacks)

Language: Python - Size: 163 KB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

melphi/article-extractor

Extracts the article content from web pages. Runs as a standalone Rest service.

Language: HTML - Size: 15 MB - Last synced at: about 2 years ago - Pushed at: almost 8 years ago - Stars: 0 - Forks: 1

absingh31/Article_Smart Fork of little-endian-0x01/Article_Smart

A Python project (with NLP integration) to denoise any news article and strip out images and advertisements, leaving a basic, hassle-free article. It provides a 'smart view' for web views on mobile devices with heading, keywords, and text. Powered by newspaper3k.

Language: Python - Size: 21.5 KB - Last synced at: about 2 years ago - Pushed at: almost 8 years ago - Stars: 0 - Forks: 2

radumpopescu/newspaper-api

Python Newspaper API

Language: Python - Size: 1000 Bytes - Last synced at: almost 2 years ago - Pushed at: over 8 years ago - Stars: 0 - Forks: 0