GitHub topics: crawled-data
tonywu7/feedly-link-aggregator
Scrapy project for collecting hyperlinks from RSS feeds using Feedly's Streams API
Language: Python - Size: 6.37 MB - Last synced at: 2 months ago - Pushed at: almost 3 years ago - Stars: 20 - Forks: 3

mmourafiq/philo2vec
An implementation of word2vec applied to the [Stanford Encyclopedia of Philosophy](http://plato.stanford.edu/)
Language: Python - Size: 34.4 MB - Last synced at: 2 months ago - Pushed at: almost 9 years ago - Stars: 35 - Forks: 7

thesagarsehgal/SwatchBharatUrbanCrawler
A crawler built with Scrapy for the https://sbmurban.org/ website. It crawls ASP.NET pages with Scrapy by using the __VIEWSTATE field.
Language: Python - Size: 1.24 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

huynhsamha/foody-crawler 📦
⚠️ [Unauthorized Data Collection] ⚠️ Crawl https://www.foody.vn/ using Node.js
Language: JavaScript - Size: 5.26 MB - Last synced at: almost 2 years ago - Pushed at: about 5 years ago - Stars: 3 - Forks: 2

heysupratim/android-app-categories
A JSON file with 19K Android package names and their Play Store categories. Useful for building app-category-based features, e.g. Smart Launcher.
Size: 446 KB - Last synced at: 2 months ago - Pushed at: about 8 years ago - Stars: 6 - Forks: 0

huntertran/seco-storms-maker
Software Ecosystems Word-storms Maker - Generate a storm of word clouds for projects in a software ecosystem
Language: Python - Size: 238 MB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

nasa-jpl-memex/wisdom
General anomaly detection system with interface
Language: HTML - Size: 25.2 MB - Last synced at: over 2 years ago - Pushed at: about 9 years ago - Stars: 8 - Forks: 7

nasa-jpl-memex/ukhack
Code / scripts associated with the MEMEX UK Hack.
Language: Shell - Size: 66.8 MB - Last synced at: over 2 years ago - Pushed at: over 9 years ago - Stars: 2 - Forks: 3

airdipu/GenderGap
A computational social science project on the music industry that examines the gender gap in income, releases, venue bookings, work history, and so on.
Language: Jupyter Notebook - Size: 58.8 MB - Last synced at: 5 days ago - Pushed at: almost 6 years ago - Stars: 0 - Forks: 0

ysh329/stock-newspaper-crawler
[UNMAINTAINED] Crawl four kinds of financial newspaper corpora (from CCSTOCK.CN).
Language: Python - Size: 1.55 MB - Last synced at: 8 months ago - Pushed at: about 8 years ago - Stars: 2 - Forks: 1
