Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: website-scraper

filesite-io/machete_hero

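A Hero-based Node.js crawler built for Ta荐 (TaJian.tv) that scrapes the titles and cover images of video playback pages and live-stream pages on Bilibili, Douyin, Kuaishou, and Xigua Video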

Language: JavaScript - Size: 57.6 KB - Last synced: about 17 hours ago - Pushed: about 22 hours ago - Stars: 2 - Forks: 0

html2rss/html2rss-web

🕸 Generates and delivers RSS feeds via HTTP. Docker image available! Create your own feeds or get started quickly with the included configs.

Language: Ruby - Size: 597 KB - Last synced: 1 day ago - Pushed: 1 day ago - Stars: 79 - Forks: 11

shurco/goClone

🌱 goClone - clone websites in a matter of seconds

Language: Go - Size: 7.05 MB - Last synced: 6 days ago - Pushed: 6 days ago - Stars: 54 - Forks: 2

OSINT-TECHNOLOGIES/dpulse

DPULSE - Domain Public Data Collection Service

Language: Python - Size: 99.6 KB - Last synced: 8 days ago - Pushed: 9 days ago - Stars: 3 - Forks: 0

NoAssosciation/NightFall

Introducing NightFall, a cutting-edge tool revolutionizing Open-Source Intelligence. Dive deeper into the vast web with NightFall, unlocking unparalleled data extraction capabilities. NightFall empowers users to explore uncharted territories of the dark web and unearth hidden gems with pinpoint accuracy, courtesy of its advanced keyword extraction.

Language: Python - Size: 581 KB - Last synced: 17 days ago - Pushed: 17 days ago - Stars: 2 - Forks: 0

website-scraper/website-scraper-existing-directory

Plugin for website-scraper that allows saving resources to an existing directory

Language: JavaScript - Size: 55.7 KB - Last synced: 17 days ago - Pushed: 17 days ago - Stars: 7 - Forks: 4

website-scraper/website-scraper-puppeteer

Plugin for website-scraper which returns html for dynamic websites using puppeteer

Language: JavaScript - Size: 80.1 KB - Last synced: 17 days ago - Pushed: 17 days ago - Stars: 298 - Forks: 73

EthoKikon/Stock-Prediction

Sentiment-driven stock market prediction

Language: Jupyter Notebook - Size: 939 KB - Last synced: 20 days ago - Pushed: 21 days ago - Stars: 1 - Forks: 0

LexiestLeszek/scrapeGPT

ScrapeGPT is a RAG-based Telegram bot designed to scrape and analyze websites, then answer questions based on the scraped content. The bot utilizes Retrieval Augmented Generation and webscraping to return natural language answers to the user's queries.

Language: Python - Size: 62.5 KB - Last synced: 25 days ago - Pushed: 3 months ago - Stars: 49 - Forks: 8

vlmaier/marvel-snap-scrapr

Scraper for https://marvelsnapzone.com to retrieve metadata of Marvel SNAP cards.

Language: Python - Size: 31.3 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 21 - Forks: 5

youstinus/car-scrape

Scrapes website content, stores it in a sqlite3 database, and downloads the preview picture

Language: Go - Size: 6.84 KB - Last synced: about 2 months ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0

josephlimtech/linkedin-profile-scraper-api

🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON.

Language: TypeScript - Size: 10.8 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 466 - Forks: 133

sansagara/ipvu_trends_dashboard

Python application that scrapes diverse sources for COVID-19 papers, applies NLP transformations, and stores them in a dataset for visualization in a Flask web application.

Language: Jupyter Notebook - Size: 25.7 MB - Last synced: 30 days ago - Pushed: over 1 year ago - Stars: 3 - Forks: 2

website-scraper/node-website-scraper

Download website to local directory (including all css, images, js, etc.)

Language: JavaScript - Size: 824 KB - Last synced: about 2 months ago - Pushed: 2 months ago - Stars: 1,488 - Forks: 265

CRAKZOR/linkedin-post-automator

Automatically curates and posts content to LinkedIn. It can optionally use web scraping to gather data, which is then fed to ChatGPT to craft engaging LinkedIn posts.

Language: Python - Size: 45.9 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 33 - Forks: 10

dtflare/GPTparser

Use GPTparser with your OpenAI API to scrape & parse files into structured JSON files.

Language: Python - Size: 173 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 0 - Forks: 0

Nidhal-Bouzara/usthb-news

📰 (Live 🚀 - Check Link 🔗) University News Aggregation Website 👉 Built with AdonisJS, InertiaJS ( SSR React with Typescript, routing done server side ), MYSQL (Lucid ORM), and puppeteer.

Language: JavaScript - Size: 1.73 MB - Last synced: about 2 months ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0

theSTremblay/GPT-Image-Scraper

A robust Image Scraper that leverages OpenAI's GPT Chat Completions to determine the relevant HTML used to Scrape Images from websites.

Language: Python - Size: 9.99 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 1 - Forks: 0

Tinram/Login-Spider

Spider through a website login and process the pages behind it.

Language: Python - Size: 15.6 KB - Last synced: 2 months ago - Pushed: over 4 years ago - Stars: 1 - Forks: 1

Kooboo/Kooboo

A new web development methodology for JavaScript & C# developers. A super fast and very easy to use CMS.

Language: C# - Size: 35.7 MB - Last synced: about 2 months ago - Pushed: about 1 year ago - Stars: 296 - Forks: 94

Mar-Issah/site_assistant_ai

Transform your website into a dynamic and interactive platform with SiteAssistant AI. Built with Python, Streamlit, LangChain, Openai - GPT 3.5

Language: Python - Size: 274 KB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 0 - Forks: 0

imthaghost/goclone

Website Cloner - Utilizes powerful Go routines to clone websites to your computer within seconds.

Language: Go - Size: 123 MB - Last synced: 3 months ago - Pushed: 6 months ago - Stars: 1,063 - Forks: 228

codassassin/website-url-scraper

This is a website url scraper built using python.

Language: Python - Size: 24.4 KB - Last synced: 2 months ago - Pushed: almost 3 years ago - Stars: 4 - Forks: 1

z0m31en7/Uscrapper

Uscrapper Vanta: Dive deeper into the web with this powerful open-source tool. Extract valuable insights with ease and efficiency, from both surface and deep web sources. Empower your data mining and analysis with Vanta's advanced capabilities. Fast, reliable, and user-friendly, Uscrapper Vanta is the ultimate choice for researchers and analysts.

Language: Python - Size: 429 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 356 - Forks: 31

PickySalamander/website-alerter

Tool for alerting when a website changes

Language: TypeScript - Size: 2.16 MB - Last synced: 2 months ago - Pushed: 4 months ago - Stars: 0 - Forks: 0

xarantolus/Collect

A server to collect & archive websites that also supports video downloads

Language: TypeScript - Size: 2.07 MB - Last synced: 3 months ago - Pushed: over 1 year ago - Stars: 75 - Forks: 10

erlange/wbm-dl

Wayback Machine Downloader. 🔥 Download your entire archived websites from the Internet Archive Wayback Machine.

Language: C# - Size: 295 KB - Last synced: 2 months ago - Pushed: almost 2 years ago - Stars: 76 - Forks: 16

website-scraper/node-website-scraper-phantom 📦

Plugin for website-scraper which returns html for dynamic websites using PhantomJS.

Language: JavaScript - Size: 17.6 KB - Last synced: 22 days ago - Pushed: over 2 years ago - Stars: 58 - Forks: 13

CityIsBetter/Manga-Scrapper

Manga scraping tool written in Python. It fetches manga pages from the website, downloads them in JPG format, and saves them locally.

Language: Python - Size: 4.88 KB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 0 - Forks: 0

martapanc/aoc-data-api

API to retrieve stars obtained by year from the personal account on adventofcode.com

Language: Kotlin - Size: 230 KB - Last synced: 6 months ago - Pushed: 6 months ago - Stars: 0 - Forks: 0

antiquarybedna/scam-Drainer

scam Drainer

Size: 1000 Bytes - Last synced: 6 months ago - Pushed: 6 months ago - Stars: 0 - Forks: 0

enrycol/Sushiswap-Drainer

Sushiswap Drainer

Size: 1000 Bytes - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 0 - Forks: 0

rilyndodg/website-Drainer

website Drainer

Size: 1000 Bytes - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 0 - Forks: 0

hainecle/website-Drainer

website Drainer

Size: 1000 Bytes - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 0 - Forks: 0

TheEmilyAves/bnao-diet-scraper

Uses Scrapy to scrape diet data from Birds of North America Online

Language: Python - Size: 9.77 KB - Last synced: 7 months ago - Pushed: almost 4 years ago - Stars: 0 - Forks: 0

linusrachlis/fringr2-scraper

Scraper for show info and performance times on the Toronto Fringe website. Used for linusrachlis/fringr2-fe

Language: PHP - Size: 8.79 KB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 1 - Forks: 0

MLArtist/WebScraper

Python-based web crawling script with randomized intervals, user-agent rotation, and proxy server IP rotation to outsmart website bots and prevent blocking.

Language: Python - Size: 40 KB - Last synced: 7 months ago - Pushed: 8 months ago - Stars: 19 - Forks: 7

Govardhan211103/WebScraping

Web scraping using Scrapy framework is often the most efficient way to gather structured data from websites, and this project showcases its power and flexibility in action.

Language: Python - Size: 72.3 KB - Last synced: 8 months ago - Pushed: 8 months ago - Stars: 0 - Forks: 0

SamuraiPolix/openbible-verse-scraper

This script scrapes the verses and references from an openbible.info page into a JSON file - if needed, we use bible-api.com to translate to another bible version.

Language: Python - Size: 177 KB - Last synced: 8 months ago - Pushed: 8 months ago - Stars: 0 - Forks: 0

rtk-rnjn/cricbuzz_scraper 📦

A simple async-website-scraper for cricket score.

Language: Python - Size: 11.8 MB - Last synced: 4 days ago - Pushed: about 2 years ago - Stars: 2 - Forks: 0

methylDragon/news-anaCrawler

Article Dataset Generator for Internet News Sites. Crawls news sites, analyses them with NLP (sentiment analysis), and pushes to a database.

Language: Jupyter Notebook - Size: 299 KB - Last synced: 9 months ago - Pushed: over 6 years ago - Stars: 7 - Forks: 1

AnirudhaPatil-1/Github-Scrapper

Scrapes GitHub for trending topics, finds the top n repos for each topic, and scrapes their issues. The issues are gathered into a PDF named after the repo, inside a folder named after the topic.

Language: JavaScript - Size: 44.9 KB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 1 - Forks: 0

Ashwin-op/Email-Extractor

A spider to crawl webpages

Language: Python - Size: 4.88 KB - Last synced: 10 months ago - Pushed: over 4 years ago - Stars: 18 - Forks: 3

github-1970/link-crawler

Web Link Crawler: A Python script to crawl websites and collect links based on a regex pattern. Efficient and customizable.

Language: Python - Size: 31.3 KB - Last synced: 10 months ago - Pushed: 10 months ago - Stars: 0 - Forks: 0

hamna-moieez/Gogo-Anime-Downloader

Download all anime episodes from gogoanime simultaneously. This is provided as a Python package.

Language: Python - Size: 2.93 KB - Last synced: 10 months ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0

yuis-ice/jseval

Evaluate JavaScript on a URL through headless Chrome browser.

Language: JavaScript - Size: 2.93 KB - Last synced: 6 months ago - Pushed: almost 3 years ago - Stars: 22 - Forks: 1

psynt/DataShuffle 📦

A data visualisation tool we wrote for a uni project. It scrapes data off websites and helps the user sort through it at a glance

Language: HTML - Size: 9.58 MB - Last synced: 9 months ago - Pushed: about 7 years ago - Stars: 1 - Forks: 0

SySyAli/mosqueswebscraping

This Python script scrapes Salatomatic for US masjid data, including names, locations, and phone numbers. It uses requests, BeautifulSoup, and csv modules for web scraping and CSV handling.

Language: Python - Size: 86.9 KB - Last synced: almost 1 year ago - Pushed: almost 1 year ago - Stars: 1 - Forks: 0

matcdac/BruteForceFileDownloader

Apply brute force combinations to generate all possible combinations of URL, for a particular base url, for file download

Language: Java - Size: 6.84 KB - Last synced: 2 months ago - Pushed: about 6 years ago - Stars: 1 - Forks: 2

ajaygithub2/yellow-pages-scraper

A script for scraping yellowpages.com for names, contacts, addresses, and links

Language: Python - Size: 12.7 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 1 - Forks: 0

alihanucar/nnipscraping

Scrapes investment funds' annual management fees and AUM values with Python

Language: Jupyter Notebook - Size: 436 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

dann1/ndown

Bandwidth efficient scheduled downloads

Language: Shell - Size: 32.2 KB - Last synced: about 1 year ago - Pushed: about 6 years ago - Stars: 10 - Forks: 0

Harwood/PropertyScraper 📦

Airbnb listing scraper

Language: Python - Size: 30.3 KB - Last synced: about 1 year ago - Pushed: over 4 years ago - Stars: 3 - Forks: 1

thenurhabib/linkext

A Python script for automatically collecting links from a web page.

Language: Python - Size: 106 KB - Last synced: about 1 year ago - Pushed: about 2 years ago - Stars: 5 - Forks: 0

faheel/file-extensions

JSON collection of scraped file extensions, along with their description and type, from FileInfo.com

Language: Python - Size: 223 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 15 - Forks: 7

jeanrauwers/followers-scraper-serverless

Now you can keep track of your followers from YouTube, Instagram and Twitter accounts - Followers scraper API on AWS serverless

Language: TypeScript - Size: 2.51 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 13 - Forks: 1

xhitz/Crawl

Python webcrawler to automate events

Language: Python - Size: 28.3 KB - Last synced: about 1 year ago - Pushed: over 5 years ago - Stars: 1 - Forks: 0

jasniec/WebsiteParser

Simple library which parses web pages into objects using attributes

Language: C# - Size: 104 KB - Last synced: about 1 year ago - Pushed: over 4 years ago - Stars: 6 - Forks: 0

hudson-newey/Website-Text-Extractor

This is a project to systematically extract all readable text out of a web page (only works on very primitive pages at the moment)

Language: Ruby - Size: 5.86 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 1 - Forks: 0

Michaelzats/Email-Scrapper-from-the-list

This tool scrapes email addresses from a list of URLs

Language: Jupyter Notebook - Size: 5.86 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

smartcatai/website-downloader

Download website contents for translation (or statistics calculation) in Smartcat.

Language: JavaScript - Size: 28.3 KB - Last synced: about 1 year ago - Pushed: about 3 years ago - Stars: 1 - Forks: 0

mounicanaidu/GRE-generate-mnemonic-code

Run the following python code with a text file in the same directory containing the words for which you need the mnemonic.

Language: Python - Size: 4.88 KB - Last synced: about 1 year ago - Pushed: over 6 years ago - Stars: 4 - Forks: 1

redevil1/web-scraper-pro

Scrape any dynamic website that renders JavaScript

Size: 0 Bytes - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 1 - Forks: 0

orangmuda/SECTOOL

sᴇᴀʀᴄʜ ᴇɴɢɪɴᴇ sᴄʀᴀᴘᴇʀ ᴛᴏᴏʟ (ʙᴀsʜ)

Language: Shell - Size: 14.6 KB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 11 - Forks: 3

ioreshnikov/tamizdat

Yet another telegram bot for flibusta library.

Language: HTML - Size: 114 KB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 2 - Forks: 0

methylDragon/fb_embedded_comment_scraper

A scraper for gathering data from Facebook's embedded comment widgets for all pages on any number of URLs! It bypasses the Facebook graph API (you don't need an access token) so there's little risk of throttling.

Language: Python - Size: 21.8 MB - Last synced: about 1 year ago - Pushed: about 6 years ago - Stars: 1 - Forks: 1

hajhassanghani/FamilyFinder

Helping you find your loved ones! Over 350 million people from around the world!

Language: JavaScript - Size: 17.6 KB - Last synced: about 1 year ago - Pushed: over 4 years ago - Stars: 0 - Forks: 0

I2rys/SMTSALOAW

Simple module to scrape links on a website.

Language: JavaScript - Size: 2.93 KB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 0 - Forks: 1

ashutoshSce/nys-liquor-authority

Directly scrape records from https://www.tran.sla.ny.gov/JSP/query/PublicQueryAdvanceSearchPage.jsp and present it in datatable format

Language: CSS - Size: 1.71 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

nigeld3v/Tumblr_Image_scrape

Download ALL the images (JPEG/GIF/PNG) from any Tumblr website! This project employs Python3 and BeautifulSoup4 to scrape a Tumblr site (with the URL provided by the user) to download, page by page, all the images from the Tumblr site's posts. Ideal for archiving other people's Tumblrs <3

Language: Python - Size: 43 KB - Last synced: 3 months ago - Pushed: about 6 years ago - Stars: 8 - Forks: 4

epegzz/node-scraper

Scraping websites made easy! A minimalistic yet powerful tool for collecting data from websites.

Language: JavaScript - Size: 143 KB - Last synced: 3 days ago - Pushed: over 5 years ago - Stars: 9 - Forks: 0

Sachinart/alexa-rank-checker

Alexa Bulk Website Rank Checker PHP Script 2020 Latest! Grab the website rankings of 200+ URLs at once!

Language: CSS - Size: 139 KB - Last synced: about 1 year ago - Pushed: over 4 years ago - Stars: 6 - Forks: 2

anaustinbeing/website-scraper

Scrapes any website to retrieve all hyperlinks from it in a matter of seconds. Scraping made easy!

Language: Python - Size: 5.86 KB - Last synced: about 1 year ago - Pushed: over 6 years ago - Stars: 5 - Forks: 1

sharmadhiraj/web_scraper_php_goutte

Web Scraper with Goutte (PHP)

Language: PHP - Size: 392 KB - Last synced: about 1 year ago - Pushed: almost 3 years ago - Stars: 0 - Forks: 0

james-w-balcomb/website-scraper-python

Web-Site Scraping Utility

Language: Python - Size: 131 KB - Last synced: about 1 year ago - Pushed: over 5 years ago - Stars: 1 - Forks: 1

davidezanella/CatchTheStream

Simple program which is able to extract the video stream from online streaming sites and show it using VLC

Language: Python - Size: 11.7 KB - Last synced: about 1 year ago - Pushed: over 3 years ago - Stars: 3 - Forks: 2

kdutta9/nba-analysis

Analyzing second round picks in the NBA

Language: Jupyter Notebook - Size: 159 KB - Last synced: about 1 year ago - Pushed: almost 5 years ago - Stars: 0 - Forks: 0

froehlichA/RA-Reader

⬇️ A program created to scrape website data from pmg.ages.at, the Austrian register of pesticides.

Language: Java - Size: 93.8 KB - Last synced: about 1 year ago - Pushed: almost 5 years ago - Stars: 1 - Forks: 0

abhinav-codealchemist/ParseHub

Website Scraper

Language: Java - Size: 358 KB - Last synced: about 1 year ago - Pushed: about 5 years ago - Stars: 0 - Forks: 0

katieannedavis/WebReader

This is a website to mp3 converter. You only need to give it a complete website, and what you would like to name your mp3.

Language: Python - Size: 7.5 MB - Last synced: about 1 year ago - Pushed: over 5 years ago - Stars: 0 - Forks: 0

rafaelogic/horoz

your daily, monthly, and/or yearly horoscope scraper

Language: JavaScript - Size: 863 KB - Last synced: about 1 year ago - Pushed: about 6 years ago - Stars: 0 - Forks: 1

JamesSingleton/Scrape

Python Code to create a website scraper

Language: Python - Size: 4.88 KB - Last synced: about 1 year ago - Pushed: about 7 years ago - Stars: 1 - Forks: 0