GitHub topics: web-crawling
Kkkkk8S/crawler-scripts
crawler-scripts is a collection of lightweight scripts designed to automate web data extraction. These scripts support various websites and allow users to gather information efficiently without manual effort.
Language: Python - Size: 1.76 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1 - Forks: 0

apify/crawlee-python
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.
Language: Python - Size: 28.8 MB - Last synced at: 1 day ago - Pushed at: 2 days ago - Stars: 5,749 - Forks: 393

brightdata/brightdata-mcp
A powerful Model Context Protocol (MCP) server that provides an all-in-one solution for public web access.
Language: JavaScript - Size: 63.5 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 707 - Forks: 99

Changwanseo/GenMine
GenBank Record downloader for taxonomists
Language: Python - Size: 439 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 7 - Forks: 0

serpapi/clauneck
A tool for scraping emails, social media accounts, and much more information from websites using Google Search Results.
Language: Ruby - Size: 34.2 KB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 181 - Forks: 12

lightfeed/browser-agent
Serverless AI browser agent
Language: TypeScript - Size: 5.67 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 2 - Forks: 0

platonai/PulsarRPA
PulsarRPA: An AI-Enabled, Super-Fast, Thread-Safe Browser Automation Solution! 💖
Language: Kotlin - Size: 30.4 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 883 - Forks: 128

apify/crawlee
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.
Language: TypeScript - Size: 140 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 17,966 - Forks: 839

valpere/DataScrapexter
Universal web scraper built with Go featuring advanced anti-detection, ethical compliance, and configuration-driven operation for any website.
Size: 13.7 KB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 0 - Forks: 0

omkarcloud/botasaurus
The All in One Framework to Build Undefeatable Scrapers
Language: Python - Size: 63 MB - Last synced at: 7 days ago - Pushed at: 15 days ago - Stars: 2,004 - Forks: 176

scrapeway/best-web-scraping-api-benchmarks
What is the best web scraping API service? Research through benchmarks.
Size: 143 KB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 1 - Forks: 0

TurnerSoftware/InfinityCrawler
A simple but powerful web crawler library for .NET
Language: C# - Size: 326 KB - Last synced at: 5 days ago - Pushed at: over 1 year ago - Stars: 253 - Forks: 37

crwlrsoft/crawler
Library for Rapid (Web) Crawler and Scraper Development
Language: PHP - Size: 1.02 MB - Last synced at: 15 days ago - Pushed at: 16 days ago - Stars: 364 - Forks: 13

ayakashi-io/ayakashi
⚡ Ayakashi.io - The next generation web scraping framework
Language: TypeScript - Size: 1.24 MB - Last synced at: 14 days ago - Pushed at: almost 2 years ago - Stars: 214 - Forks: 9

DevWaqarAhmad/Web-Crawling-RAG-AI-Assistant
A Web Crawling RAG AI Assistant that scrapes data from websites, stores it, and uses Retrieval-Augmented Generation (RAG) with LLMs to provide intelligent, context-aware answers. Built with Streamlit for the frontend.
Language: Python - Size: 169 MB - Last synced at: 22 days ago - Pushed at: 23 days ago - Stars: 0 - Forks: 0

atymri/WebCrawler
WebCrawler is a C# console application that recursively scans a website starting from a given URL, collects all discovered links, and saves them to a file. It’s useful for site mapping, link analysis, and content discovery.
Language: C# - Size: 6.84 KB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 0 - Forks: 0

lewisakura/spiderboi
A web crawling library written in TypeScript.
Language: TypeScript - Size: 376 KB - Last synced at: 10 days ago - Pushed at: over 2 years ago - Stars: 8 - Forks: 1

pinkpixel-dev/deep-research-mcp
A Model Context Protocol (MCP) compliant server designed for comprehensive web research. It uses Tavily's Search and Crawl APIs to gather detailed information on a given topic, then structures this data in a format perfect for LLMs to create high-quality markdown documents.
Language: JavaScript - Size: 201 KB - Last synced at: 28 days ago - Pushed at: about 2 months ago - Stars: 11 - Forks: 1

frederik-uni/docker-cloudflare-bypasser
A simple API that runs in a container and returns a user-agent and cookies
Language: Python - Size: 9.77 KB - Last synced at: 15 days ago - Pushed at: about 2 months ago - Stars: 3 - Forks: 0

leogregianin/bancocentralbrasil 📦
💵 💰 🇧🇷 Information on the official daily rates for inflation, Selic, savings (Poupança), dollar, PTAX dollar, euro, and PTAX euro from the Banco Central do Brasil website
Language: Python - Size: 182 KB - Last synced at: 19 days ago - Pushed at: over 3 years ago - Stars: 124 - Forks: 33

AntraTripathi74/Web-Crawler-Chatbot
A sample Q&A chatbot that uses web crawling to index the content of websites so that information can be retrieved later when needed.
Language: HTML - Size: 335 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

ScrapingAnt/amazon_scraper
Amazon product scraper using rotating proxies and headless Chrome from ScrapingAnt
Language: JavaScript - Size: 52.7 KB - Last synced at: 17 days ago - Pushed at: over 1 year ago - Stars: 82 - Forks: 19

HasData/find-urls-from-any-domain
This repository provides practical examples of website link scraping using Python and Node.js.
Language: JavaScript - Size: 328 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

maicolyoirs33/JPG-to-PDF
Simple JPG to PDF Conversion Using Pillow
Language: Python - Size: 870 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

dstark5/gnews-scraper
GNewsScraper is a TypeScript package that scrapes article data from Google News based on a keyword or phrase. It returns the results as an array of JSON objects, making it convenient to access and use the scraped information.
Language: TypeScript - Size: 153 KB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 12 - Forks: 3

spyboy-productions/omnisci3nt
Omnisci3nt: See What They’ve Tried to Hide. Extract deep intelligence from any domain. From subdomains to SSL certs, archived secrets to exposed ports, Omnisci3nt gives you the full picture in seconds.
Language: Python - Size: 8.27 MB - Last synced at: about 1 month ago - Pushed at: 2 months ago - Stars: 253 - Forks: 40

spyboy-productions/PhantomCrawler
Boost website hits by generating requests from multiple proxy IPs.
Language: Python - Size: 1.48 MB - Last synced at: 20 days ago - Pushed at: over 1 year ago - Stars: 67 - Forks: 10

sinanazem/web-crawling-jobinja
This project is a web scraper designed to extract information from Jobinja (https://jobinja.ir), a popular job listing website similar to Glassdoor and Indeed.
Language: Jupyter Notebook - Size: 27.3 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

crwlrsoft/robots-txt
Robots Exclusion Standard/Protocol Parser for Web Crawling/Scraping
Language: PHP - Size: 32.2 KB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 11 - Forks: 2

omkarcloud/botasaurus-starter
🚀 OFFICIAL STARTER TEMPLATE FOR BOTASAURUS SCRAPING FRAMEWORK 🤖
Language: TypeScript - Size: 397 KB - Last synced at: 7 days ago - Pushed at: about 2 months ago - Stars: 25 - Forks: 9

18520339/web-scraping-with-scrapy
Python web scraping with Scrapy
Language: Python - Size: 479 KB - Last synced at: 3 days ago - Pushed at: over 4 years ago - Stars: 3 - Forks: 0

alanindra/news-enricher
Extracts news article metadata (title, content, date, journalist, entities) from provided URLs.
Language: Python - Size: 31.3 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

heleusbrands/InSite
A lightning fast tool for crawling websites and compiling PDFs of their pages
Language: Python - Size: 24.4 KB - Last synced at: 25 days ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

ilarionkuleshov/fastcrawl
Fast and asynchronous web crawling and scraping library for Python.
Language: Python - Size: 235 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

ScrapingAnt/zoominfo_scraper
Zoominfo scraper using rotating proxies and headless Chrome from ScrapingAnt
Language: Python - Size: 7.81 KB - Last synced at: 14 days ago - Pushed at: about 4 years ago - Stars: 33 - Forks: 9

my8100/scrapyd-cluster-on-heroku
Set up a free and scalable Scrapyd cluster for distributed web crawling with just a few clicks. DEMO 👉
Language: Python - Size: 236 KB - Last synced at: 12 days ago - Pushed at: about 5 years ago - Stars: 123 - Forks: 87

jgujerry/python-frameworks
Another curated list of Python frameworks
Language: Python - Size: 13 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 61 - Forks: 4

omkarcloud/selenium-2captcha-recaptcha-solver-demo
🚀 FINAL CODE FOR TUTORIAL ON HOW TO SOLVE CAPTCHA IN SELENIUM USING 2CAPTCHA 🤖
Language: Python - Size: 5.86 KB - Last synced at: 7 days ago - Pushed at: almost 2 years ago - Stars: 6 - Forks: 2

maxmindlin/scout-lang
A web crawling programming language
Language: Rust - Size: 54.4 MB - Last synced at: 24 days ago - Pushed at: 10 months ago - Stars: 112 - Forks: 6

MaxValue/Terpene-Profile-Parser-for-Cannabis-Strains
Parser and database to index the terpene profile of different strains of Cannabis from online databases
Language: Python - Size: 21.4 MB - Last synced at: 2 months ago - Pushed at: about 2 years ago - Stars: 118 - Forks: 18

N4rr34n6/MetadataHarvester
MetadataHarvester is an advanced file metadata extraction tool designed for cybersecurity professionals, researchers, and analysts.
Language: Python - Size: 21.5 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 1

godkingjay/selenium-twitter-scraper
This is a Twitter Scraper which uses Selenium for scraping tweets. It is capable of scraping tweets from home, user profile, hashtag, query or search, and advanced searches.
Language: Jupyter Notebook - Size: 160 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 246 - Forks: 63

cxcscmu/Craw4LLM
Official repository for "Craw4LLM: Efficient Web Crawling for LLM Pretraining"
Language: Python - Size: 79.1 KB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 608 - Forks: 56

thesp0nge/nightcrawler-mitm
A Python program that crawls a website and tries to stress it by polluting forms with bogus data
Language: Python - Size: 247 KB - Last synced at: 24 days ago - Pushed at: 3 months ago - Stars: 13 - Forks: 1

HRN-Projects/amazon-captcha-solver
A TensorFlow (Deep Learning - CNN) based solution for tackling captcha when collecting data from Amazon.
Language: Python - Size: 35.9 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 29 - Forks: 13

Fern-Aerell/Web-Crawling-To-TXT
A simple web crawling application that can browse URLs, extract text content, and save the results in TXT format.
Language: Python - Size: 315 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 4 - Forks: 0

jrbadiabo/Bet-on-Sibyl
Machine Learning Model for Sport Predictions (Football, Basketball, Baseball, Hockey, Soccer & Tennis)
Language: Jupyter Notebook - Size: 17 MB - Last synced at: 2 months ago - Pushed at: over 8 years ago - Stars: 263 - Forks: 94

scrapinghub/scrapy-training
Scrapy Training companion code
Language: Python - Size: 103 KB - Last synced at: 2 months ago - Pushed at: over 6 years ago - Stars: 174 - Forks: 45

supergillis/crawler-ts
Crawler written in TypeScript using ES6 generators.
Language: TypeScript - Size: 60.5 KB - Last synced at: 10 days ago - Pushed at: about 4 years ago - Stars: 12 - Forks: 1

MapCon-RMC/MapCon
Conflict Mapping System (MapCon)
Language: JavaScript - Size: 1.43 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 0 - Forks: 1

omkarcloud/omkar-temp-mail
🚀 OMKAR TEMP MAIL HELPS YOU USE TEMPORARY EMAILS. 🤖
Language: Python - Size: 15.6 KB - Last synced at: 7 days ago - Pushed at: over 1 year ago - Stars: 13 - Forks: 4

lekhmanrus/real-shot-pdf
RealShotPDF is a Chrome extension designed to simplify the process of creating PDF documents from web content. The extension allows users to navigate through selected webpages, parse and display links in a tree view, and generate PDFs for the chosen pages. It operates locally without sending any data to external servers.
Language: TypeScript - Size: 406 KB - Last synced at: 9 days ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 1

Solrikk/DataDigger
DataDigger is a powerful and intuitive web application designed to extract and analyze data from web pages.
Language: Go - Size: 38.1 KB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 4 - Forks: 0

joe-stifler/crawler
Crawler is a Python package that crawls web pages and converts their content into Markdown format, making it easy to create documentation, notes, or other text-based representations. It features domain restrictions, flexible output options, and graph visualization.
Language: Python - Size: 283 KB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 7 - Forks: 1

Mirtia/Inappropriate-YouTube 📦
This repository contains the scripts used to obtain YouTube channel features and analyze potentially disturbing channels for the publication "YouTubers Not madeForKids: Detecting Channels Sharing Inappropriate Videos Targeting Children".
Language: Python - Size: 9.04 MB - Last synced at: 6 days ago - Pushed at: about 2 years ago - Stars: 3 - Forks: 0

jonasjacek/robots.txt
Simple robots.txt template. Keeps unwanted robots out (disallow). Whitelists (allow) legitimate user-agents. Useful for all websites.
Size: 135 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 87 - Forks: 38

oxylabs/pricing-data-collection-from-ecommerce-stores
Apache Airflow DAGs for e-commerce pricing collection.
Language: Python - Size: 16.6 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

osamikoyo/Geass
A web crawler with some API functions and configuration options
Language: Go - Size: 32.8 MB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 2 - Forks: 0

JeffBla/rent-house-crawler
This is a distributed web crawler using Scrapy, Redis, and Selenium. It is designed to handle various types of websites, including static, AJAX, and dynamic pages.
Language: Python - Size: 307 KB - Last synced at: 3 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

KadekM/scrawler
Scala web crawling and scraping using fs2 streams
Language: HTML - Size: 92.8 KB - Last synced at: 5 days ago - Pushed at: almost 8 years ago - Stars: 16 - Forks: 3

MohamedHmini/tweetsOLAPing
Implementing an end-to-end tweet ETL/analysis pipeline.
Language: Python - Size: 5.99 MB - Last synced at: 16 days ago - Pushed at: over 2 years ago - Stars: 57 - Forks: 7

Adityasinghvats/web-crawler
This is a project which mimics the web crawlers used in large browser engines like Chromium and Gecko.
Language: JavaScript - Size: 48.8 KB - Last synced at: 18 days ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

IAmFarrokhnejad/Murkmaw
A web crawler using Rust.
Language: Rust - Size: 67.7 MB - Last synced at: 3 months ago - Pushed at: 7 months ago - Stars: 1 - Forks: 0

MartinCastroAlvarez/unlam-android-app
Calendar Android native app.
Language: Java - Size: 3.75 MB - Last synced at: 22 days ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

AmirEspahbodi/google-maps-scraper
Google Maps scraper. Extracts the title, phone, address, latitude and longitude, category, website URL, rating, number of reviews, email, active_hours, reviews, and first picture of each listing.
Language: Python - Size: 243 KB - Last synced at: 2 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

shokoofa-ghods/Web-Crawling_Text-Proccessing
A simple text processing program that crawls IMDb and extracts keywords with the TextRank algorithm, and also crawls Digikala special offers, extracts some of their features, and displays them on the web using the Django framework.
Language: Python - Size: 1.42 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

crwlrsoft/laravel-crawler
Laravel adapter for the crwlr/crawler package.
Language: PHP - Size: 8.79 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 0

mike-gee/webtranspose
Web scraping API for building AI applications.
Language: Python - Size: 1.43 MB - Last synced at: 27 days ago - Pushed at: over 1 year ago - Stars: 41 - Forks: 2

miroshnikov/scrapyteer
Web crawling & scraping framework for Node.js on top of headless Chrome browser
Language: TypeScript - Size: 384 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 19 - Forks: 0

ruiiary/commercelaw-words
24-2 Fundamentals of Content Programming final project, individual assignment
Language: Jupyter Notebook - Size: 1010 KB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

msoleimani96/minecraft-skin-scraper
Scraping Minecraft skins from two sources.
Language: Python - Size: 43 KB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 1 - Forks: 0

ahmed-alnassif/net-spider
Net-Spider is a web scraping tool designed to retrieve the source code for a web page, including front-end elements such as JavaScript, CSS, images, and fonts. It allows you to crawl and download the source code from a target website.
Language: Python - Size: 2.65 MB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 5 - Forks: 1

9dl/RobotsSniffer
Tool to analyze and parse website robots.txt for crawler rules.
Language: C# - Size: 48.8 KB - Last synced at: 3 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

islamhafez0/web-crawler
Language: Python - Size: 1.11 MB - Last synced at: 2 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

simonpierreboucher/Crawler
A robust, modular web crawler built in Python for extracting and saving content from websites. This crawler is specifically designed to extract text content from both HTML and PDF files, saving them in a structured format with metadata.
Language: Python - Size: 87.9 KB - Last synced at: 3 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

GoTrained/Scrapy-Craigslist
Web Scraping Craigslist's Engineering Jobs in NY with Scrapy
Language: Python - Size: 195 KB - Last synced at: 6 months ago - Pushed at: almost 8 years ago - Stars: 66 - Forks: 37

kunalPisolkar24/IR_Lab
Collection of practical code for Savitribai Phule Pune University's Information Retrieval Lab (410247).
Language: Jupyter Notebook - Size: 125 KB - Last synced at: 4 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

zytedata/spidyquotes
Example site for web scraping tutorials
Language: Julia - Size: 229 KB - Last synced at: 4 months ago - Pushed at: 9 months ago - Stars: 31 - Forks: 18

SuperBruceJia/dynamic-web-crawlering-python
This repo is mainly for dynamic web (Ajax Tech) crawling using Python, taking China's NSTL websites as an example.
Language: Python - Size: 12.8 MB - Last synced at: 2 months ago - Pushed at: about 2 years ago - Stars: 16 - Forks: 3

SpeedyShot/capture
An easy-to-use library for the SpeedyShot Capture service.
Language: TypeScript - Size: 548 KB - Last synced at: about 6 hours ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

jonathandunn/common_crawl_corpus
Scripts for building a geo-located web corpus using Common Crawl data
Language: Python - Size: 323 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 11 - Forks: 0

breck7/crawlers
Crawlers for extracting measurements from the web for Scroll datasets
Language: JavaScript - Size: 140 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 4 - Forks: 0

chihiroanihr/COMP479-P4_F2022
Experiments with web crawling, scraping, and indexing a collection of web documents. The indexed data is clustered with the k-means algorithm, and each resulting cluster is assigned a sentiment score using AFINN - a sentiment analysis script.
Language: Python - Size: 17.6 MB - Last synced at: 7 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

Joeri-Abbo/python-credly-scraper
This project is a set of Python scripts designed to crawl and extract data from the Credly platform, focusing on skills, organizations, and badges. The scripts allow users to perform searches using command-line arguments, predefined search terms, or skills listed in a JSON file. The collected data is then saved to JSON files for further analysis an
Language: Python - Size: 63.7 MB - Last synced at: 6 months ago - Pushed at: 7 months ago - Stars: 1 - Forks: 2

kluhan/kraken
Kraken is a generic, mid-scale web crawler specifically built to crawl vertical data-sources, like Youtube or the Google Play Store.
Language: Python - Size: 92.8 KB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

0MeMo07/Web-Crawler
Web Crawler with Python
Language: Python - Size: 8.79 KB - Last synced at: 2 months ago - Pushed at: almost 2 years ago - Stars: 7 - Forks: 0

LikithMeruvu/Framework-Docs-AI
Framework Docs AI is a powerful SaaS solution for managing framework documentation. It automatically scrapes documentation, builds a comprehensive knowledge base, and uses advanced language models to provide accurate responses to user queries. Enhance productivity and streamline your documentation process with Framework Docs AI.
Language: Python - Size: 62 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

omkarcloud/puppeteer-captcha-solving-tutorial
🚀 LEARN HOW TO SOLVE CAPTCHA IN PUPPETEER USING CAPSOLVER 🤖
Language: Python - Size: 2.38 MB - Last synced at: 7 days ago - Pushed at: almost 2 years ago - Stars: 6 - Forks: 1

omkarcloud/dentalkart-scraper
🚀 SCRAPE 1000'S OF PRODUCTS FROM DENTALKART 🤖
Language: Python - Size: 908 KB - Last synced at: 7 days ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 2

omkarcloud/web-scraping-template
🚀 THIS WEB SCRAPING TEMPLATE PROVIDES YOU WITH A GREAT STARTING POINT WHEN CREATING WEB SCRAPING BOTS. 🤖
Language: Python - Size: 104 KB - Last synced at: 7 days ago - Pushed at: almost 2 years ago - Stars: 7 - Forks: 3

mustafadalga/website-crawler
A web crawler script that lists links by scanning the target website.
Language: Python - Size: 19.5 KB - Last synced at: 5 days ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 3

alyakhtar/Katastrophe
Command Line Tool to download torrents
Language: Python - Size: 322 KB - Last synced at: 7 months ago - Pushed at: over 8 years ago - Stars: 85 - Forks: 12

SavinRazvan/pagerank
This project implements the PageRank algorithm to rank web pages by importance using two approaches: a sampling method with the Markov Chain random surfer model and an iterative method with a recursive mathematical expression.
Language: Jupyter Notebook - Size: 1.11 MB - Last synced at: 4 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

PPS-22-Scooby/PPS-22-Scooby
Scala application that crawls and scrapes the web pages given as input, following rules passed to it through a DSL.
Language: Scala - Size: 4.3 MB - Last synced at: 2 months ago - Pushed at: 11 months ago - Stars: 7 - Forks: 1

ElektroStudios/FHM-Crawler-freehardmusic.com
Crawls download URLs of albums from the freehardmusic.com website
Language: Visual Basic .NET - Size: 10.5 MB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 2

krisluczka/OSSE
Open Source Search Engine with built-in web/document crawler and an indexing method.
Language: C++ - Size: 58.6 KB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Sushanta-Das/Domain_Specific_Search_Engine Fork of mallickboy/Domain_Specific_Search_Engine
A search engine for the Python language, developed using web crawling, socket programming, and a vector database. Users get the results of a search query based on a vector similarity search in the vector database.
Language: Jupyter Notebook - Size: 13.8 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

feluelle/pastebin-crawler-lib
A library for web crawling http://pastebin.com
Language: C# - Size: 11.7 KB - Last synced at: 14 days ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 1

AusBoone/Web-Vulnerability-Scanner
Aims to identify common security vulnerabilities in web applications.
Language: Python - Size: 9.77 KB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

ScrapingAnt/alibaba_scraper
Alibaba scraper using rotating proxies and headless Chrome from ScrapingAnt
Language: Python - Size: 152 KB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 16 - Forks: 3
