Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: web-crawling

breck7/measurementscrawlers

Crawlers for extracting measurements from the web for Scroll datasets

Language: TypeScript - Size: 188 KB - Last synced: about 21 hours ago - Pushed: 2 days ago - Stars: 4 - Forks: 0

omkarcloud/botasaurus

The All in One Framework to build Awesome Scrapers.

Language: Python - Size: 35.2 MB - Last synced: 3 days ago - Pushed: 4 days ago - Stars: 962 - Forks: 85

omkarcloud/botasaurus-starter

🚀 OFFICIAL STARTER TEMPLATE FOR BOTASAURUS SCRAPING FRAMEWORK 🤖

Language: TypeScript - Size: 385 KB - Last synced: 5 days ago - Pushed: 6 days ago - Stars: 13 - Forks: 4

godkingjay/selenium-twitter-scraper

This is a Twitter Scraper which uses Selenium for scraping tweets. It is capable of scraping tweets from home, user profile, hashtag, query or search, and advanced searches.

Language: Jupyter Notebook - Size: 155 KB - Last synced: 6 days ago - Pushed: 6 days ago - Stars: 84 - Forks: 25

William-Fernandes252/astel

An asynchronous web crawling library for Python

Language: Python - Size: 1.02 MB - Last synced: 7 days ago - Pushed: 7 days ago - Stars: 0 - Forks: 0

spyboy-productions/omnisci3nt

Unveiling the Hidden Layers of the Web – A Comprehensive Web Reconnaissance Tool

Language: Jupyter Notebook - Size: 8.21 MB - Last synced: 7 days ago - Pushed: 8 days ago - Stars: 122 - Forks: 12

SarthakRana/Web-Scraping-in-Python3

This repo contains tutorials for scraping web pages - from simple html files to websites like Instagram, LinkedIn and Twitter.

Language: Python - Size: 6.84 KB - Last synced: 9 days ago - Pushed: about 4 years ago - Stars: 0 - Forks: 0

ScrapingAnt/amazon_scraper

Amazon product scraper using rotating proxies and headless Chrome from ScrapingAnt

Language: JavaScript - Size: 52.7 KB - Last synced: 11 days ago - Pushed: 2 months ago - Stars: 76 - Forks: 18

tddyer/mlb-statistics-web-crawler

A Scrapy web crawler that gathers lifetime batting statistics for all active players in Major League Baseball (MLB)

Language: Python - Size: 2.45 MB - Last synced: 12 days ago - Pushed: over 3 years ago - Stars: 2 - Forks: 0

crwlrsoft/crawler

Library for Rapid (Web) Crawler and Scraper Development

Language: PHP - Size: 845 KB - Last synced: about 14 hours ago - Pushed: about 2 months ago - Stars: 300 - Forks: 11

TurnerSoftware/InfinityCrawler

A simple but powerful web crawler library for .NET

Language: C# - Size: 326 KB - Last synced: 14 days ago - Pushed: 5 months ago - Stars: 239 - Forks: 35

omkarcloud/omkar-temp-mail

🚀 OMKAR TEMP MAIL HELPS YOU USE TEMPORARY EMAILS. 🤖

Language: Python - Size: 15.6 KB - Last synced: 14 days ago - Pushed: 3 months ago - Stars: 11 - Forks: 4

krisluczka/OSSE

Open Source Search Engine with built-in web/document crawler and an indexing method.

Language: C++ - Size: 58.6 KB - Last synced: 15 days ago - Pushed: 16 days ago - Stars: 1 - Forks: 0

spyboy-productions/PhantomCrawler

Boost website hits by generating requests from multiple proxy IPs.

Language: Python - Size: 1.48 MB - Last synced: 9 days ago - Pushed: 3 months ago - Stars: 43 - Forks: 9

SpeedyShot/capture

An easy-to-use library for the SpeedyShot Capture service.

Language: TypeScript - Size: 536 KB - Last synced: 18 days ago - Pushed: 20 days ago - Stars: 1 - Forks: 0

mike-gee/webtranspose

Web scraping API for building AI applications.

Language: Python - Size: 1.43 MB - Last synced: 6 days ago - Pushed: 4 months ago - Stars: 36 - Forks: 2

jgujerry/python-frameworks

Another curated list of Python frameworks

Language: Python - Size: 10.3 MB - Last synced: 21 days ago - Pushed: 21 days ago - Stars: 56 - Forks: 5

miroshnikov/scrapyteer

Web crawling & scraping framework for Node.js on top of headless Chrome browser

Language: TypeScript - Size: 384 KB - Last synced: 16 days ago - Pushed: 3 months ago - Stars: 18 - Forks: 0

gogoziyishi/Museum-Recommender-System

An advanced recommender system for U.S. museums, using English-language text analytics on TripAdvisor reviews to enhance the visitor experience.

Language: Jupyter Notebook - Size: 7.2 MB - Last synced: 21 days ago - Pushed: 21 days ago - Stars: 0 - Forks: 0

fintech-hub/bancocentralbrasil

💵 💰 :brazil: Information on official daily rates for inflation, Selic, savings (Poupança), dollar, dollar PTAX, euro, and euro PTAX from the Banco Central do Brasil website

Language: Python - Size: 182 KB - Last synced: 19 days ago - Pushed: over 2 years ago - Stars: 120 - Forks: 34

Destroyer-official/Network-Information-Toolkit

🌐 Network Information Toolkit: Your all-in-one Python solution for network analysis. Explore IP addresses, DNS records, SSL certificates, and BGP data with ease. Stay efficient and secure with features like port scanning, whois lookup, and web crawling. Uncover valuable insights effortlessly. 🛠️🔍

Language: Python - Size: 512 KB - Last synced: 23 days ago - Pushed: 23 days ago - Stars: 2 - Forks: 0

scrapehero-code/amazon-scraper

A simple web scraper to extract Product Data and Pricing from Amazon

Language: Python - Size: 16.6 KB - Last synced: 22 days ago - Pushed: 11 months ago - Stars: 295 - Forks: 154

serpapi/clauneck

A tool for scraping emails, social media accounts, and much more information from websites using Google Search Results.

Language: Ruby - Size: 34.2 KB - Last synced: 21 days ago - Pushed: 2 months ago - Stars: 141 - Forks: 12

harr1424/Go-Crawl

A utility to crawl specified domains and download .zip files

Language: Go - Size: 8.79 KB - Last synced: 30 days ago - Pushed: about 1 month ago - Stars: 0 - Forks: 0

ayakashi-io/ayakashi

:zap: Ayakashi.io - The next generation web scraping framework

Language: TypeScript - Size: 1.24 MB - Last synced: 27 days ago - Pushed: 11 months ago - Stars: 197 - Forks: 8

mgunn001/Information-Visualization-Project Fork of maheshreddykukunooru/Information-Visualization-Project

An interface to visualize the past 10 years of NFL, including player comparisons

Language: JavaScript - Size: 3.09 MB - Last synced: about 1 month ago - Pushed: over 6 years ago - Stars: 1 - Forks: 0

apify/crawlee

Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.

Language: TypeScript - Size: 117 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 11,973 - Forks: 501

Joeri-Abbo/python-credly-scraper

This project is a set of Python scripts designed to crawl and extract data from the Credly platform, focusing on skills, organizations, and badges. The scripts allow users to perform searches using command-line arguments, predefined search terms, or skills listed in a JSON file. The collected data is then saved to JSON files for further analysis an

Language: Python - Size: 63.6 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 0 - Forks: 1

joe-stifler/crawler

Crawler is a Python package that crawls web pages and converts their content into Markdown format, making it easy to create documentation, notes, or other text-based representations. It features domain restrictions, flexible output options, and graph visualization.

Language: Python - Size: 271 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 0 - Forks: 0

my8100/scrapyd-cluster-on-heroku

Set up free and scalable Scrapyd cluster for distributed web-crawling with just a few clicks. DEMO :point_right:

Language: Python - Size: 236 KB - Last synced: about 13 hours ago - Pushed: about 4 years ago - Stars: 122 - Forks: 94

MaxValue/Terpene-Profile-Parser-for-Cannabis-Strains

Parser and database to index the terpene profile of different strains of Cannabis from online databases

Language: Python - Size: 21.4 MB - Last synced: 19 days ago - Pushed: about 1 year ago - Stars: 107 - Forks: 20

Lawhy/Finance 📦

A collection of code and data used in Dr. Hang Zhou's project.

Language: Jupyter Notebook - Size: 153 MB - Last synced: about 2 months ago - Pushed: over 4 years ago - Stars: 1 - Forks: 0

helloMinji/WebCrawling-smartchoice

Web crawling of Smart Choice domestic quality evaluation results (main content: option value, fetching tables)

Language: Python - Size: 34.2 KB - Last synced: about 2 months ago - Pushed: over 5 years ago - Stars: 0 - Forks: 0

helloMinji/WebCrawling-sendEmail-AUTO

Compares a file you have with a website's content and notifies you by email if anything has changed (main content: crawling, sending mail with Outlook)

Language: Python - Size: 8.79 KB - Last synced: about 2 months ago - Pushed: over 5 years ago - Stars: 0 - Forks: 0

brianmadden/krawler

A web crawling framework written in Kotlin

Language: Kotlin - Size: 403 KB - Last synced: about 1 month ago - Pushed: almost 3 years ago - Stars: 130 - Forks: 16

scrapinghub/scrapy-training

Scrapy Training companion code

Language: Python - Size: 103 KB - Last synced: about 1 month ago - Pushed: over 5 years ago - Stars: 170 - Forks: 46

Kim-src/StockScraper

🚀 Stock information collection program (toy project)

Language: Python - Size: 47.9 KB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 1 - Forks: 0

Bazserpz/nstbrowser-automation-library

NSTBrowser is an advanced browser for web scraping and automation, offering proxy management and anti-detect features. Compatible with Puppeteer, Playwright, and Selenium, it excels in multi-accounting and bypassing web protections.

Size: 4.88 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 5 - Forks: 0

Bazserpz/Web-Scraping-Challenges

When you do web scraping, data scraping, or data collection, you will eventually need to solve a CAPTCHA or get past anti-bot protection. This browser helps you pass these checks and forget about this challenge.

Size: 2.93 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0

lekhmanrus/real-shot-pdf

RealShotPDF is a Chrome extension designed to simplify the process of creating PDF documents from web content. The extension allows users to navigate through selected webpages, parse and display links in a tree view, and generate PDFs for the chosen pages. It operates locally without sending any data to external servers.

Language: TypeScript - Size: 406 KB - Last synced: about 1 month ago - Pushed: 3 months ago - Stars: 0 - Forks: 0

ericdwkim/cash-depot-bot

An automation project that uses Selenium-Java to fetch CSVs from a site and dump them into a fileshare

Language: Java - Size: 502 KB - Last synced: 3 months ago - Pushed: 7 months ago - Stars: 0 - Forks: 0

anxxos/scraper-metro-stations

This exercise addresses the problem of building a scraper for the website of the Consorcio Regional de Transportes de Madrid (CRTM) to obtain information about Metro and Metro Ligero stations.

Language: Python - Size: 162 KB - Last synced: 3 months ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0

ElektroStudios/FHM-Crawler-freehardmusic.com

Crawls download urls of albums from freehardmusic.com website

Language: Visual Basic .NET - Size: 10.5 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 2 - Forks: 0

ScaleUnlimited/flink-crawler

Continuous scalable web crawler built on top of Flink and crawler-commons

Language: Java - Size: 1.38 MB - Last synced: about 2 months ago - Pushed: about 5 years ago - Stars: 52 - Forks: 18

jonasjacek/robots.txt

Simple robots.txt template. Keep unwanted robots out (disallow). White lists (allow) legitimate user-agents. Useful for all websites.

Size: 135 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 80 - Forks: 39

rabbittx/Digikala-Crawler

Digikala Crawler is a powerful web crawler for collecting and analyzing Digikala data. This tool helps merchants and market analysts gain accurate insights into market behavior, including extraction of seller and product data and price analysis. Suitable for strengthening marketing and sales strategies.

Language: Python - Size: 6.27 MB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0

arthur3486/born2crawl

A highly performant and versatile crawling engine, designed with scalability and extensibility in mind.

Language: Kotlin - Size: 624 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 8 - Forks: 0

prashantm1535/Complete-Web-Scraping

Web Scraping and Automation using Python and tools such as Selenium, BeautifulSoup, and Chromium

Language: Python - Size: 1000 Bytes - Last synced: 3 months ago - Pushed: 4 months ago - Stars: 0 - Forks: 0

dchrostowski/autoproxy

Public proxy farm that automatically records and queues suitable proxy servers for web crawling

Language: Python - Size: 401 KB - Last synced: about 1 month ago - Pushed: over 1 year ago - Stars: 16 - Forks: 5

KitsuneSemCalda/Info-Elixir

A web crawler built for recursive crawling

Language: Elixir - Size: 14.6 KB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 2 - Forks: 0

MrSeemsGood/Pandas-Selenium-web-crawling-app

PyQt5 app for Selenium-driven web-crawling, Pandas-driven data processing and interaction with Google Sheets and Drive

Language: Python - Size: 141 KB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 0 - Forks: 0

ZeroCool940711/new-frontera Fork of scrapinghub/frontera

A scalable frontier for web crawlers

Language: Python - Size: 7.04 MB - Last synced: 18 days ago - Pushed: 4 months ago - Stars: 0 - Forks: 0

iBug/douban-spider 📦

An alternative solution to « Web Info 2019 » experiment 3

Language: Python - Size: 55.7 KB - Last synced: about 1 month ago - Pushed: over 4 years ago - Stars: 1 - Forks: 0

forhadsidhu/Codeforces_Problemsets_description_Extraction

Web scraping and data collection (problem set descriptions) from Codeforces (a programming contest platform)

Language: Python - Size: 8.79 KB - Last synced: 4 months ago - Pushed: over 4 years ago - Stars: 0 - Forks: 0

alu0101056944/gcceproject

Business Intelligence school project. Web Scraper with an Apache Hop workflow.

Language: JavaScript - Size: 2.18 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 0 - Forks: 0

2uanDM/ChatGPT-Function-Calling-with-Laptop-Seller-Chatbot

This repository holds the code for my hackathon entry: a ChatGPT-integrated chatbot for selling laptops

Language: Python - Size: 7.25 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 0 - Forks: 1

code-lion-com/go-unhar

Zero dependency golang module and CLI to handle HTTP Archive (HAR) files.

Language: Go - Size: 11.7 KB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 0 - Forks: 0

alyakhtar/Katastrophe

Command Line Tool to download torrents

Language: Python - Size: 322 KB - Last synced: 5 days ago - Pushed: over 7 years ago - Stars: 86 - Forks: 15

KosarTalei/NoSQL

Introduction to Database course project

Language: Python - Size: 13.7 KB - Last synced: 5 months ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

david-adds/glassesshop-spider

Scrape product URL, image link, name, and price across multiple pages from the glassesshop website with Scrapy and store them in a SQLite database.

Language: Python - Size: 16.6 KB - Last synced: 5 months ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0

SoheilKhodayari/JAW

JAW: A Graph-based Security Analysis Framework for Client-side JavaScript

Language: JavaScript - Size: 43.1 MB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 47 - Forks: 7

berntpopp/screen-scout

Automate the process of capturing screenshots of web pages

Language: JavaScript - Size: 8.56 MB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 0 - Forks: 0

crwlrsoft/robots-txt

Robots Exclusion Standard/Protocol Parser for Web Crawling/Scraping

Language: PHP - Size: 28.3 KB - Last synced: 18 days ago - Pushed: 7 months ago - Stars: 8 - Forks: 2

excusezmoi/memorizingVocabularyUsingForgettingCurve

A Python program that helps you memorize words based on the psychologist Ebbinghaus's forgetting curve.

Language: Python - Size: 329 KB - Last synced: 4 months ago - Pushed: 6 months ago - Stars: 2 - Forks: 0

saifalimz/sudobotz.com

Transforming Ideas into Intelligent Automation

Language: SCSS - Size: 11.8 MB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 2 - Forks: 0

gayathri-pan/rateGain

Welcome to my solution for the web scraping hackathon! In this challenge, I developed a program using Python and the Scrapy library to extract specific information from the "https://rategain.com/blog" webpage.

Language: Python - Size: 21 MB - Last synced: 6 months ago - Pushed: 6 months ago - Stars: 0 - Forks: 0

omkarcloud/dentalkart-scraper

🚀 SCRAPE 1000'S OF PRODUCTS FROM DENTALKART 🤖

Language: Python - Size: 903 KB - Last synced: 6 months ago - Pushed: 6 months ago - Stars: 2 - Forks: 1

omkarcloud/multiple-account-generation-template

🚀 THIS WEB SCRAPING TEMPLATE PROVIDES YOU WITH A GREAT STARTING POINT WHEN CREATING MULTIPLE ACCOUNTS ON A WEBSITE. 🤖

Language: Python - Size: 2.04 MB - Last synced: 6 months ago - Pushed: 6 months ago - Stars: 2 - Forks: 1

adil6572/Web-scraping-projects

This GitHub repository hosts a collection of my web scraping projects, showcasing various techniques and tools used to extract data from websites. Explore these projects to learn about web scraping, data extraction, and data analysis

Size: 19.5 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0

yanbin43/Crawler-Selenium

Web crawler demo with python Selenium

Language: Jupyter Notebook - Size: 314 KB - Last synced: 7 months ago - Pushed: over 2 years ago - Stars: 3 - Forks: 0

jrbadiabo/Bet-on-Sibyl

Machine Learning Model for Sport Predictions (Football, Basketball, Baseball, Hockey, Soccer & Tennis)

Language: Jupyter Notebook - Size: 17 MB - Last synced: 6 months ago - Pushed: over 7 years ago - Stars: 233 - Forks: 91

shubhpawar/Web-Crawler-for-Drug-Interaction-Data

Crawling drug-drug interaction data from WebMD.com and Drugs.com.

Language: Python - Size: 4.88 KB - Last synced: 7 months ago - Pushed: about 6 years ago - Stars: 3 - Forks: 3

prakharchoudhary/fun_with_python

My adventures with python!!

Language: Jupyter Notebook - Size: 2.67 MB - Last synced: 7 months ago - Pushed: almost 6 years ago - Stars: 5 - Forks: 1

robertciotoiu/fundanl-home-alert

Sends notifications when new apartments are available on Funda.nl for a given search.

Language: Java - Size: 97.7 KB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 0 - Forks: 0

oxylabs/pricing-data-collection-from-ecommerce-stores

Apache Airflow DAGs for e-commerce pricing collection.

Language: Python - Size: 12.7 KB - Last synced: 29 days ago - Pushed: about 1 month ago - Stars: 1 - Forks: 0

Ankush-Chander/github-crawler

Crawl information from GitHub in a friendly manner.

Language: Python - Size: 17.6 KB - Last synced: 8 months ago - Pushed: 8 months ago - Stars: 1 - Forks: 0

shokoofa-ghods/Web-Crawling_Text-Proccessing

A simple text processing program that crawls IMDb and extracts keywords with the TextRank algorithm, crawls Digikala special offers and extracts some features, and shows them on the web using the Django framework

Language: Python - Size: 1.41 MB - Last synced: 8 months ago - Pushed: over 2 years ago - Stars: 1 - Forks: 0

jLevere/async_crawling

Learning to use Python asyncio to make web requests

Language: Python - Size: 25.4 KB - Last synced: 8 months ago - Pushed: almost 3 years ago - Stars: 0 - Forks: 0

AidaLog/Sitemap-Generator

CLI tool for sitemap generation

Language: Python - Size: 16.6 KB - Last synced: about 2 months ago - Pushed: 8 months ago - Stars: 2 - Forks: 1

dstark5/gnews-scraper

GNewsScraper is a TypeScript package that scrapes article data from Google News based on a keyword or phrase. It returns the results as an array of JSON objects, making it convenient to access and use the scraped information

Language: TypeScript - Size: 153 KB - Last synced: 28 days ago - Pushed: 9 months ago - Stars: 5 - Forks: 3

Manu-sh/http_normalizer_parts

http url normalization utilities for web crawlers

Language: C++ - Size: 51.8 KB - Last synced: 8 months ago - Pushed: 8 months ago - Stars: 0 - Forks: 0

AN0NCER/pwa-sitemap

A repository with a web scraper built on Node.js and Puppeteer. Create a sitemap.xml for website indexing.

Language: JavaScript - Size: 0 Bytes - Last synced: 8 months ago - Pushed: 8 months ago - Stars: 1 - Forks: 0

talaatmagdyx/socials_regex

🪡 Social account detection and extraction in ruby, e.g. for crawling/scraping.

Language: Ruby - Size: 45.9 KB - Last synced: 1 day ago - Pushed: 5 months ago - Stars: 8 - Forks: 0

m1/smap

smap is a site-mapping engine written in Go.

Language: Go - Size: 9.77 KB - Last synced: 26 days ago - Pushed: about 5 years ago - Stars: 0 - Forks: 0

zytedata/spidyquotes

Example site for web scraping tutorials

Language: Julia - Size: 223 KB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 30 - Forks: 15

tal95shah/OLX_Scraper

:radio: An OLX scraper using Scrapy + MongoDB. It scrapes recent ads posted for a requested product and dumps them to MongoDB (NoSQL).

Language: Python - Size: 127 KB - Last synced: 8 months ago - Pushed: about 3 years ago - Stars: 17 - Forks: 7

1989ONCE/Discount-Expert

1101 Course IM2028 Python Final Project - Web Crawling of Lativ Website and Simple Data Analysis

Language: Jupyter Notebook - Size: 906 KB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 0 - Forks: 0

will-huynh/linkedin_jobs_crawler

A script made to investigate crawling techniques using LinkedIn. In this case, collects data on a job search for entries containing a job poster.

Language: Python - Size: 44.9 KB - Last synced: 9 months ago - Pushed: over 5 years ago - Stars: 0 - Forks: 0

0xFORK/WebScraper Fork of Prempeh-Gyan/WebScraper

Jsoup: API for Web Scraping / Web Crawling / HTML Parsing

Language: Java - Size: 138 KB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 1 - Forks: 1

joshchang0111/Fake-EmoReact-2021-Dataset-Collection

Code for data collection of the FakeEmoReact-2021 Challenge.

Language: Python - Size: 78.1 KB - Last synced: 9 months ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0

oxgl/c4k

Kotlin Web Crawler library

Language: Kotlin - Size: 119 KB - Last synced: 9 months ago - Pushed: about 3 years ago - Stars: 0 - Forks: 0

alisoltanirad/web-scraping

Web Scraping Projects

Language: Python - Size: 28.3 KB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 0 - Forks: 0

ishmam-hossain/Website-Crawler

Web crawling with Python for websites with a sitemap and pagination

Language: Python - Size: 1.95 KB - Last synced: 9 months ago - Pushed: about 6 years ago - Stars: 1 - Forks: 0

ahujaya/Wrangle-and-Analyze-Twitter-Data-Python

The dataset that I will be wrangling, analyzing and visualizing is the tweet archive of Twitter user @dog_rates, also known as WeRateDogs. WeRateDogs is a Twitter account that rates people's dogs with a humorous comment about the dog. These ratings almost always have a denominator of 10. The numerators, though? Almost always greater than 10. 11/10, 12/10, 13/10, etc. Why? Because "they're good dogs Brent." WeRateDogs has over 4 million followers and has received international media coverage.

Language: Jupyter Notebook - Size: 25.2 MB - Last synced: 9 months ago - Pushed: almost 3 years ago - Stars: 1 - Forks: 0

ahujaya/Web-Logs-Exploratory-Data-Analysis-and-Web-Crawling-Python

Web Logs Exploratory Data Analysis & Web Crawling of citation information from Google Scholar

Language: Jupyter Notebook - Size: 3.02 MB - Last synced: 9 months ago - Pushed: almost 3 years ago - Stars: 0 - Forks: 0

ahujaya/Web-Logs-Unsupervised-and-Supervised-Machine-Learning-Association-Rule-Mining-ARIMA-Prediction

Web Logs Data Unsupervised, Supervised Learning, Association Rule Mining & ARIMA Prediction. Web Crawling of citation information from Google Scholar

Language: Jupyter Notebook - Size: 4.48 MB - Last synced: 9 months ago - Pushed: almost 3 years ago - Stars: 0 - Forks: 0

silvafj/domain-names-mining 📦

Web crawling for domain names data mining

Size: 454 MB - Last synced: 9 months ago - Pushed: over 4 years ago - Stars: 0 - Forks: 0

Code-Crusher-LLC/FeedFox

A Python tool that generates RSS/ATOM feeds from web pages with JS support using a headless browser. Allows customizable templates, bundling feeds, and is powered by GitHub Actions and GitHub Pages.

Language: Python - Size: 42 KB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 0 - Forks: 0

mirkomantovani/web-search-engine-UIC

CS 582 Information Retrieval at University of Illinois at Chicago. Multithreaded crawling of UIC domain, inverted index, page rank, SEO with Context Pseudo-Relevance Feedback

Language: Python - Size: 104 MB - Last synced: 10 months ago - Pushed: over 5 years ago - Stars: 14 - Forks: 4

kawsarlog/projectMapsData

🐍🗺️ This Python script empowers you to scrape data from Google Maps, enabling extraction of valuable information like addresses, reviews, and ratings. 📋🏢⭐

Language: Python - Size: 9.77 KB - Last synced: 10 months ago - Pushed: 10 months ago - Stars: 0 - Forks: 0