GitHub topics: scraped-data
CUNY-CL/wikipron
Massively multilingual pronunciation mining
Language: Python - Size: 172 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 338 - Forks: 73

devicemanager/randomNames
Language: JavaScript - Size: 2.51 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 0 - Forks: 0

emibcn/covid-data
Store and serve daily collected data from https://dadescovid.org for sibling app at https://emibcn.github.io/covid/
Language: JavaScript - Size: 15.9 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 0 - Forks: 0

warifp/Shopee-Scrape
Shopee Scrape is a tool for collecting the data you need, such as photos, prices, names, store locations, and more.
Language: PHP - Size: 559 KB - Last synced at: about 18 hours ago - Pushed at: over 3 years ago - Stars: 90 - Forks: 27

DavidBellamy/visa_dates
Web scraper for US visa bulletins
Language: Python - Size: 8.19 MB - Last synced at: 8 days ago - Pushed at: 10 days ago - Stars: 9 - Forks: 1

racinmat/mal-analysis
GitHub repo for MyAnimeList analysis. Also links to the MAL dataset.
Language: Jupyter Notebook - Size: 684 MB - Last synced at: 15 days ago - Pushed at: almost 2 years ago - Stars: 34 - Forks: 8

tangible-idea/BitUtils
Systematic coin price notifier, Telegram public channel history parser, and trading tool in Python
Language: Python - Size: 39.1 KB - Last synced at: 12 days ago - Pushed at: 3 months ago - Stars: 53 - Forks: 21

Ephellon/game-store-catalog
Catalog of PlayStation, Xbox, Nintendo, and Steam games
Size: 33.7 MB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 5 - Forks: 0

meowabyte/upload-systems-archive 📦
RIP Upload.Systems
Size: 165 MB - Last synced at: 6 days ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 0

fernandod1/ProductHunt-scraper
Scraper script for the well-known Producthunt.com website. Scrapes all offers and saves them to an Excel spreadsheet file.
Language: Python - Size: 9.77 KB - Last synced at: 15 days ago - Pushed at: almost 3 years ago - Stars: 24 - Forks: 8

joelbarmettlerUZH/Scrapeasy
Scraping in python made easy - receive the content you like in just one line of code
Language: Python - Size: 63.5 KB - Last synced at: 3 days ago - Pushed at: 2 months ago - Stars: 101 - Forks: 51

wurstbroteater/HomeTemp
Measure temperature and humidity of a room, retrieve online weather data, visualize it, analyse it and send it via email.
Language: Python - Size: 10.3 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

SuperKogito/CoinMarketCapScraper
A small Python scraper that collects historical data from the CoinMarketCap website and converts it to CSV files. This is an initial step in a data mining process to develop a predictive model of cryptocurrency prices.
Language: CSS - Size: 1.97 MB - Last synced at: 18 days ago - Pushed at: about 4 years ago - Stars: 20 - Forks: 5

palahsu/YouTubeScraper
Scrapes YouTube video descriptions, likes, comments, times, and replies. Automatically extracts data from a video.
Language: Python - Size: 49.8 KB - Last synced at: 19 days ago - Pushed at: about 4 years ago - Stars: 24 - Forks: 5

NomanSiddiqui0000/Rozee.pk-jobs-Scrapper
This scraper, built in Node.js using Puppeteer and Cheerio, is designed to extract job listings from the Rozee.pk website. It can scrape multiple pages and gather detailed information, including job titles, company names, skills, and more. The output is saved in structured CSV files, with sample datasets for cities like Lahore, Karachi, etc.
Language: JavaScript - Size: 6.75 MB - Last synced at: 8 days ago - Pushed at: 4 months ago - Stars: 3 - Forks: 0

ScottGuthart/movietable
Movie Table - All time best movies with custom filters
Language: JavaScript - Size: 1.27 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 2 - Forks: 0

recommend-games/board-game-scraper
Board game data scraper
Language: Python - Size: 1.63 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 25 - Forks: 5

nmaties/google-emails-scraper
Node.js Google email addresses scraper using puppeteer.
Language: JavaScript - Size: 15.6 KB - Last synced at: 2 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 1

benjaminvdb/DBRD
110k Dutch Book Reviews Dataset for Sentiment Analysis
Language: Python - Size: 34.2 KB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 30 - Forks: 3

SuvarneshKM/django-scraper
Language: Python - Size: 17.6 KB - Last synced at: 11 months ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 0

Vinay26k/python
Python Programming
Language: Python - Size: 272 KB - Last synced at: 11 months ago - Pushed at: almost 5 years ago - Stars: 3 - Forks: 4

dorzel/username-generator
Generate a username
Language: Python - Size: 31.3 KB - Last synced at: 10 days ago - Pushed at: over 7 years ago - Stars: 6 - Forks: 1

AnkushSinghGandhi/QuotesQuizGame
A simple web scraping project using Beautiful Soup
Language: Python - Size: 14.6 KB - Last synced at: 12 months ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 2

SunilBoopalan/Suicide-Depression-Classification-with-nlp
Deep Learning for Suicide and Depression Identification with Unsupervised Label Correction. 2021
Language: Python - Size: 1.29 MB - Last synced at: 12 months ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 1

phlo46/univ-financial-aid-crawler
A browser-automation web crawler written in JavaScript
Language: JavaScript - Size: 49.8 KB - Last synced at: 12 months ago - Pushed at: about 8 years ago - Stars: 0 - Forks: 0

deepakprasad0089/scrap-to-markdown
Scrapes GitBook contents and programs and generates a separate Markdown file
Language: Python - Size: 1.95 KB - Last synced at: 12 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

statarczuk16/Panacirce
Scrapes data from input Reddit subreddits and uses natural language processing and an LDA model to determine what topics those subreddits are concerned with
Language: Python - Size: 240 KB - Last synced at: 12 months ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

faheel/file-extensions
JSON collection of scraped file extensions, along with their description and type, from FileInfo.com
Language: Python - Size: 223 KB - Last synced at: 13 days ago - Pushed at: over 2 years ago - Stars: 18 - Forks: 6

erogluegemen/ResearchRover
The research paper scrape bot is designed to help researchers and students find academic papers by scraping websites. The bot uses web scraping techniques to extract relevant information from these websites and presents it to users in an organized format.
Language: Python - Size: 10.7 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 0

welike/wheaties
Eat your wheaties. A ruby gem for getting product information from merchant APIs (and scraping for additional data).
Language: Ruby - Size: 7.81 KB - Last synced at: about 1 year ago - Pushed at: over 8 years ago - Stars: 0 - Forks: 0

HarshCasper/Blind-App-Reviews
Scraped reviews of over 25 companies from the Blind App ⚡️
Size: 6.36 MB - Last synced at: 2 months ago - Pushed at: almost 3 years ago - Stars: 14 - Forks: 6

samirkt/raw_food_recognition
Food recognition system for raw cooking ingredients (i.e. fruits, vegetables, etc.)
Language: Python - Size: 22.5 KB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 4 - Forks: 0

frossm/quoter
Command line utility to display stock quotes and index data
Language: Java - Size: 5.52 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 19 - Forks: 4

arkon/uoft-enrolment-charts 📦
Script to generate charts of course enrolment numbers
Language: Python - Size: 266 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 11 - Forks: 2

sdl60660/cleveland_eviction_mapping
Mapping eviction filings in Cleveland by neighborhood using scraped data from the Cleveland Municipal Court website
Language: Python - Size: 95.3 MB - Last synced at: 11 months ago - Pushed at: almost 4 years ago - Stars: 2 - Forks: 1

deepavadakan/Pet-Shelter-Adoption-Website
Website that helps people find their perfect lovable dog or cat & actually browse current adoption listings to source where to get a desired breed. Adopt a dog or cat - or BOTH!
Language: Jupyter Notebook - Size: 33.5 MB - Last synced at: about 1 year ago - Pushed at: about 4 years ago - Stars: 3 - Forks: 0

ayaanzhaque/SDCNL
Deep Learning for Suicide and Depression Identification with Unsupervised Label Correction (ICANN 2021)
Language: Python - Size: 5.66 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 53 - Forks: 19

AtharvaTaras/India-CarMarket-Images
A dataset of images containing cars that are/were available for sale in the Indian market.
Size: 2.28 GB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

KenzoBH/Web-Scraping-and-EDA-iFood
Web Scraping and EDA from iFood website data.
Language: HTML - Size: 860 KB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 12 - Forks: 2

apang782/vroom1
Web Scraping and Visual Analysis of Used Car Prices
Language: HTML - Size: 880 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

Miranda-Bai/anz_twitter
Scraping #anz bank data from Twitter using the twscrape package.
Language: Jupyter Notebook - Size: 365 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

naqushab/SearchEngineScrapy
Scrape data from Google.com, Bing.com, Baidu.com, Ask.com, Yahoo.com, Yandex.com
Language: Python - Size: 99.6 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 54 - Forks: 18

shine-jayakumar/Web-Scraping-With-Python
Script to extract customer reviews from a webpage while bypassing bot challenge
Language: Python - Size: 112 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 5 - Forks: 0

Merterm/Etymon
Find the origin of words in every language using a Deep Neural Network trained to create an etymological map.
Language: JavaScript - Size: 426 MB - Last synced at: over 1 year ago - Pushed at: almost 7 years ago - Stars: 20 - Forks: 1

mwportfolio/ICT-Supplier-Analysis
Language: Jupyter Notebook - Size: 584 KB - Last synced at: almost 2 years ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 0

kztera/university-ranking 📦
Scrape, analyze and visualize data from timeshighereducation.com about World University Ranking with Python.
Language: Jupyter Notebook - Size: 20.9 MB - Last synced at: 12 months ago - Pushed at: almost 2 years ago - Stars: 4 - Forks: 0

infoaed/rk2019-data
Pirate open data of the 2019 Riigikogu elections
Language: Python - Size: 281 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 1

warung-hytam/WEEBREAD
Read and watch animanga
Language: Python - Size: 2.65 MB - Last synced at: 2 days ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 2

apple-fritter/cath0de
Scrape any YouTube user's page and build a comprehensive index of their videos, in TSV format. A Bash script.
Language: Shell - Size: 13.7 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

apple-fritter/url-scrape.files.sh
Scrape urls for file links.
Language: Shell - Size: 9.77 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

FoxJefisto/FootballTrackerWPF
Language: C# - Size: 113 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

fabio1623/mid-bootcamp-project
A data analysis project on the most popular podcasts on Spotify in Germany in December 2022, including scraped data, cleaned and enriched data, a Jupyter notebook, and images for a Tableau presentation.
Language: Jupyter Notebook - Size: 15.4 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 6 - Forks: 0

junguler/TPDNE_example_images
some hand-picked images from thispersondoesnotexist.com
Size: 252 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 0

derrmru/whats-in-the-news
Data Visualisation of News Content
Language: TypeScript - Size: 316 KB - Last synced at: 12 days ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

sumaiyakawsar/JobScraping
Scraping scripts for courses at a few Australian university websites
Language: Python - Size: 13.6 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

truongbo17/bookstory
Scrape Data PDF and Reup (Full UI/UX/Admin/Actions)
Language: JavaScript - Size: 37.5 MB - Last synced at: 26 days ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

dirediredock/wikigraph_infovis
This project is about a data-driven infographic of programming languages, a network graph visualization of how languages influence each other and develop over time. Project for 2022 CPSC547 (Information Visualization) at UBC.
Language: HTML - Size: 43.8 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

hwasiti/smart-image-scraper
Deep learning-based image dataset cleaning of Flickr. Scraped metadata saved in MongoDB. Web app designed & deployed to Heroku: https://smart-image-scraper.herokuapp.com/
Language: Python - Size: 277 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 5 - Forks: 0

superdukenet/superdukenet.github.io
superduke.net forum static archive generated using scraped data
Language: HTML - Size: 47.9 MB - Last synced at: 5 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

laurikjk/airports
🛬 All airports listed by IATA from English Wikipedia as CSV, and the script used to scrape it
Language: Python - Size: 170 KB - Last synced at: almost 2 years ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 1

dbritto-dev/udacity-cloud-devops-engineer-capstone
Capstone Project for Cloud DevOps Engineer on Udacity
Language: Python - Size: 1.45 MB - Last synced at: 19 days ago - Pushed at: over 4 years ago - Stars: 2 - Forks: 0

TrentBrunson/Mission-to-Mars
Web scraping data about the next Mission to Mars.
Language: Jupyter Notebook - Size: 1.52 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

saplanyanki/live-feed
Flexdashboard to predict geopolitical news and financial markets around the world
Language: R - Size: 3.65 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

malina/metascraper
Metascraper is a Crystal library for web scraping.
Language: Crystal - Size: 20.5 KB - Last synced at: about 1 month ago - Pushed at: about 7 years ago - Stars: 11 - Forks: 1

Swader/diffbot-php-client
[Deprecated - Maintenance mode - use APIs directly please!] The official Diffbot client library
Language: PHP - Size: 353 KB - Last synced at: 14 days ago - Pushed at: almost 7 years ago - Stars: 53 - Forks: 20

BhavyaC16/FlairifyMe
FlairifyMe is a Reddit Flair Detector for r/india subreddit, that takes a post's URL as user input and predicts the flair for the post using a model generated by Logistic Regression.
Language: Python - Size: 56.4 MB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 1

aotantawy/quran-by-subject
Web scraper app to get Quran verses from the web and format them by subject in JSON
Language: Java - Size: 350 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

futuresea713/Multiprocess-Scraping-APP
Scraping App Multiprocessing
Language: Python - Size: 2.93 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 3

patrick-5546/ubc_course_explorer_data
Raw data and their update scripts for the UBC Course Explorer application
Language: Python - Size: 7.32 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

bacree3/showstopper
A web application to aid users with their streaming services. This is your one-stop shop for your streaming services. Users can search across a variety of platforms to help decide what they can watch and where they can watch it.
Language: PHP - Size: 290 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 1

SidJain1412/WebscrapingCSGO
Script to find prices for weapon skins and sort them into a convenient CSV file. Change "weapon" value on line 3 for different weapons. Scrapes data from csgostash, which is updated very frequently, so prices are accurate (in INR).
Language: Python - Size: 13.7 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 1

iboraham/indeed-web-scraping
The project contains UK data science job analysis provided through scraped data from indeed.co.uk. (Scraping Date: 21.07.2020)
Language: Jupyter Notebook - Size: 29.1 MB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 1

ndanevski1/Similar-Posts-using-NLP
In this mini-project I scrape the official Atom editor forum and apply Natural Language Processing techniques to find the most similar post in the forum to each one.
Language: Jupyter Notebook - Size: 262 KB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0

FitzwilliamMuseum/thresholds
An archive repository for thresholds.org.uk
Language: HTML - Size: 234 MB - Last synced at: about 2 months ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

infoaed/ep2019-data
Pirate open data of the 2019 European Parliament elections
Language: Python - Size: 5.86 KB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 0 - Forks: 0

Dkreitzer/Mars_Weather_Dashboard
Mars Dashboard - Scrape numerous NASA websites to get the latest info on Mars!
Language: Jupyter Notebook - Size: 3.72 MB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 0

shubhamrajput/Scraping-Data
JavaScript (jQuery) script that can pull out the given data from any Skillshare profile teaching page.
Language: JavaScript - Size: 45.9 KB - Last synced at: almost 2 years ago - Pushed at: over 8 years ago - Stars: 2 - Forks: 0

SzymonLisowiec/scrapie
Simple and light framework for scraping data.
Language: JavaScript - Size: 13.7 KB - Last synced at: about 1 month ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

MrGeislinger/2018-olympics-pyeongchang-data
Scraped data of the 2018 Winter Olympics Games in Pyeongchang from www.olympic.org in an effort to make a tidy data set of all competitors (not just winners).
Language: Jupyter Notebook - Size: 202 MB - Last synced at: about 2 months ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

hesscl/cl-spatial-epi
Craigslist Spatial Epidemiology Project
Language: HTML - Size: 59.7 MB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 0 - Forks: 0

AdrienAudouard/FireEmblemHeroesScrappedDatas
Fire Emblem Heroes scraped data
Size: 398 KB - Last synced at: almost 2 years ago - Pushed at: about 7 years ago - Stars: 0 - Forks: 0

ndrberna/tvMarketMonitoring
Monitoring and visualization of television in B2C and C2C commercial portal
Language: CSS - Size: 2.27 MB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 0 - Forks: 0

eddowh/last-statements-nlp
Analyzing the last statements of executed offenders on Texas' Death Row.
Language: Jupyter Notebook - Size: 163 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

JamesSingleton/Scrape
Python Code to create a website scraper
Language: Python - Size: 4.88 KB - Last synced at: about 2 years ago - Pushed at: about 8 years ago - Stars: 1 - Forks: 0

kidwellj/ccf_wordcloud
Wordcloud generated using R based on data scrapes from Scottish Gvt. Climate Challenge Fund applications
Language: R - Size: 247 KB - Last synced at: about 2 years ago - Pushed at: almost 8 years ago - Stars: 0 - Forks: 0

nicholasbair/scrape_indeed
CLI gem for scraping job data from Indeed and writing it to an xlsx file
Language: Ruby - Size: 53.7 KB - Last synced at: about 2 years ago - Pushed at: almost 8 years ago - Stars: 0 - Forks: 1
