An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: website-crawler

martech-engineer/WebKnoGraph

WebKnoGraph is an open research project that uses data processing, vector embeddings, and graph algorithms to optimize internal linking at scale. Built for both academic and industry use, it offers the first fully transparent, AI-driven framework for improving SEO and site navigation through reproducible methods.

Language: Jupyter Notebook - Size: 367 MB - Last synced at: 3 days ago - Pushed at: 7 days ago - Stars: 11 - Forks: 3

zebbern/ReconX

🕷️ | ReconX is a Live-Website Crawler made to gather critical information with an option to take a picture of each site crawled!

Language: Python - Size: 57.6 KB - Last synced at: 8 days ago - Pushed at: 7 months ago - Stars: 5 - Forks: 0

sammwyy/SpearCopy

A universal and local phishing toolkit for audit purposes

Language: Python - Size: 6.84 KB - Last synced at: about 2 months ago - Pushed at: 10 months ago - Stars: 21 - Forks: 2

flulemon/sneakpeek

Sneakpeek is a framework that helps to quickly and conveniently develop scrapers. It’s the best choice for scrapers that have some specific complex scraping logic that needs to be run on a constant basis.

Language: Python - Size: 19.7 MB - Last synced at: 2 months ago - Pushed at: about 2 years ago - Stars: 37 - Forks: 0

X-SLAYER/Website-Cloner

It allows you to download a website from the Internet to a local directory, recursively building all directories and getting HTML, images, and other files from the server to your computer.

Language: Visual Basic .NET - Size: 1.11 MB - Last synced at: 4 months ago - Pushed at: over 2 years ago - Stars: 307 - Forks: 89

oxylabs/web-scraping-php

A tutorial and code samples of web scraping with PHP

Language: PHP - Size: 26.4 KB - Last synced at: 3 months ago - Pushed at: 7 months ago - Stars: 9 - Forks: 3

Deependra-Patel/websiteCrawler

Crawls a website to generate insights

Language: Go - Size: 11.7 KB - Last synced at: 6 months ago - Pushed at: over 6 years ago - Stars: 2 - Forks: 0

reineimi/va2crawl

Website crawler, validator and SEO optimizer

Language: Shell - Size: 16.6 KB - Last synced at: 2 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

1970Mr/link-crawler

Web Link Crawler: A Python script to crawl websites and collect links based on a regex pattern. Efficient and customizable.

Language: Python - Size: 32.2 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 2

ursnj/seo-master

SEO Master is a powerful all-in-one tool developed to boost your website's visibility and rankings. Its features include automatic sitemap generation, customizable robots.txt creation, SEO-optimized metadata, image asset generation, and seamless integration with major search engines.

Language: TypeScript - Size: 162 KB - Last synced at: 25 days ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

chandrasekharan98/Multisite-Python-Crawler

An almost generic web crawler built using Scrapy and Python 3.7 to recursively crawl entire websites.

Language: Python - Size: 15.6 KB - Last synced at: 11 months ago - Pushed at: over 3 years ago - Stars: 16 - Forks: 5

MLArtist/WebScraper

Python-based web crawling script with randomized intervals, user-agent rotation, and proxy server IP rotation to outsmart website bots and prevent blocking.

Language: Python - Size: 43.9 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 61 - Forks: 14

mishqatabid/Domain-Email-Harvesting-Tool

Email Harvesting Tool designed to efficiently gather and validate emails from specified websites

Language: Python - Size: 52.7 KB - Last synced at: 4 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

foomo/walker

Crawls a website and collects SEO-relevant data

Language: Go - Size: 188 KB - Last synced at: 6 months ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 0

vlmaier/marvel-snap-scrapr

Scraper for https://marvelsnapzone.com to retrieve metadata of Marvel SNAP cards.

Language: Python - Size: 31.3 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 21 - Forks: 5

sergeymusenko/simple-crawler

Simple website crawler in Python to get meta tags and <H1>

Language: Python - Size: 20.5 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

MattMoony/image-grabber

Grabs images off webpages.

Language: Python - Size: 1.95 KB - Last synced at: 6 months ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

JohnDiGriz/WebstoreParser

Parses data using a JSON file as instructions and writes it to a SQL Server database

Language: C# - Size: 16.6 KB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

spypunk/sponge

sponge is a website crawler and links downloader command-line tool

Language: Kotlin - Size: 267 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

vlOd2/LightshotScraper

The most advanced Lightshot (or prnt.sc) scraper ever!

Language: Java - Size: 3.35 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 3

AmaanHaider/News-crawler

Language: JavaScript - Size: 3.71 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

dinocajic/bash-crawler

Created a website-crawler in bash. Note, it's for a specific website and will not work unless you know the site.

Language: Shell - Size: 19.5 KB - Last synced at: about 2 years ago - Pushed at: over 8 years ago - Stars: 0 - Forks: 0

vlOd2/ImgurScraper

The most advanced Imgur scraper ever!

Language: Java - Size: 189 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

radityaharya/sitesweeper

Sitesweeper is a Python package to help you automate your web scraping process, outputting pages to a file

Language: Python - Size: 9.77 KB - Last synced at: 8 days ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

ZKAW/website-crawler

Recursive website crawler

Language: Python - Size: 2.93 KB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

Dyzio18/java-web-bot-library

Java website crawler - library for analyzing and testing websites

Language: Java - Size: 885 KB - Last synced at: 4 months ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

Mediashare/crawler

💫 Crawl URLs from a webpage and provide a DomCrawler with Scraper Library

Language: PHP - Size: 40 KB - Last synced at: 9 days ago - Pushed at: 10 months ago - Stars: 3 - Forks: 1

Hem1700/Website-crawler

Language: Python - Size: 6.07 MB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

shubham-gaur/Crawler

Crawler for "www.mydala.com"

Language: Python - Size: 37.1 KB - Last synced at: over 2 years ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0