An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: robots-txt

Zorger27/Reactorium-3D

🔮 Reactorium 3D is a “React laboratory in the space of three dimensions,” where React, Three.js, and React Three Fiber merge to create interactive worlds filled with shapes, motion, and light. 🧪 Each application here is a small experiment in a three-dimensional environment!

Language: JavaScript - Size: 23.8 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

rishijha/sitemap-harvester

🗺️ Harvest URLs and metadata from website sitemaps efficiently with this fast Python tool. Get organized insights for your digital projects.

Size: 32.2 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

meysam81/sitemap-harvester

Crawl sitemap of a given website and export metadata of its pages recursively into CSV format.

Language: Python - Size: 43.9 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 2 - Forks: 0

j-plugins/robots-txt-plugin

IntelliJ IDEA plugin for robots.txt

Language: Java - Size: 2.51 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1 - Forks: 1

ngUnick/theofanis_papadopoulos_plastic_surgeon

Informational, accessibility-first website for a plastic surgeon in Greece. Static HTML/CSS/JS with strong SEO (sitemap/robots/schema), Core Web Vitals optimizations, and content designed to comply with Greek medical advertising ethics.

Language: HTML - Size: 5.96 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

seantomburke/sitemapper

Parse through any sitemap in Node.js

Language: TypeScript - Size: 1.56 MB - Last synced at: 2 days ago - Pushed at: 3 months ago - Stars: 124 - Forks: 78

elchiconube/generate-robots-txt

The right robots.txt file for your project

Language: TypeScript - Size: 3.13 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 25 - Forks: 3

eliashaeussler/typo3-sitemap-robots

🤖 Extension for TYPO3 CMS to inject XML sitemaps into robots.txt

Language: PHP - Size: 1.42 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 1 - Forks: 0

thedaviddias/llms-txt-hub

🤖 The largest directory for AI-ready documentation and tools implementing the proposed llms.txt standard

Language: TypeScript - Size: 42.6 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 593 - Forks: 343

Zorger27/zorger27.github.io

✨ This modern and adaptive Landing Page succinctly and visually presents my professional information. 🌟 The site showcases my skills in creating modern, responsive and functional web applications, highlighting attention to detail and quality. 🚀

Language: HTML - Size: 23.4 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

Zorger27/Cryptocurrencies

💎 Cryptocurrencies in Real-Time — Interactive and Visual! 🚀 This web application transforms cryptocurrency rate monitoring into an engaging experience. 📈 Using the CoinGecko API, it displays up-to-date cryptocurrency rates in USD, offering users flexibility and interactivity.

Language: Vue - Size: 11.5 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

Zorger27/Vue-Threejs-2

🪐 CuboVerse (a combination of the Spanish word "cubo" (meaning "cube") and the English word "universe" (meaning "universe"), i.e., "the universe of cubes") is an interactive 3D platform that allows users to observe and control various cube models in a virtual space.

Language: Vue - Size: 15.9 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 1 - Forks: 0

Zorger27/Currencies

💹 This modern web application is designed to provide a convenient and visual display of current foreign exchange rates. The app retrieves real-time data from the National Bank of Ukraine API, displaying exchange rates relative to the Ukrainian hryvnia (UAH) in innovative and interactive formats.

Language: Vue - Size: 9.65 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

Zorger27/Weather

☀️ Custom-built weather forecasting web app that delivers real-time data from OpenWeather for any city worldwide. 🌈 Whether you're a tech enthusiast or just curious about the weather, this app has something for everyone! ⛄️

Language: Vue - Size: 8.6 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 1 - Forks: 0

Zorger27/3dConfigurator

👑 A 3D configurator is an innovative online technology that enables users to interact with 3D product models in real-time. 💎 It’s a powerful tool for businesses that allows your customers to customize products to their preferences. 🌈

Language: Vue - Size: 15.5 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 2 - Forks: 1

Zorger27/React-JS-Template

🌐 A convenient and flexible starter template for building modern single-page applications (SPA), powered by a cutting-edge tech stack (React 19). It comes with offline support (PWA) and fast loading, delivering excellent performance even on slow networks. 🌟

Language: JavaScript - Size: 10.2 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

Zorger27/Reactorium

⚛️ Reactorium (a blend of "React" (the UI library) and "laboratorium" (Latin for "laboratory, a place for experiments"), meaning "React laboratory") — a secret laboratory of React experiments. The project demonstrates how modern technologies can be used to build complete and diverse solutions. 🚀

Language: JavaScript - Size: 13.3 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

Zorger27/Couture-Metaverse

✨ A Couture Metaverse 3D is a unique platform for creating and customizing 3D models! 👗 Choose colors, add textures, apply branding, and create stylish looks! 🕶️ Experiment, mix and customize – your design, your rules! 🚀

Language: Vue - Size: 70.3 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

Zorger27/Vue-Threejs-Template

🌐 A convenient and powerful starter template for creating modern Single Page Applications (SPA) with 3D graphics integration. 🚀 Thanks to its flexibility and scalability, this solution is perfect for any web project that supports 3D graphics! 💻🌟

Language: Vue - Size: 9.75 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

Zorger27/Vue-JS-Template

🌐 A convenient and flexible starter template for building modern single-page applications (SPA). 🚀 With its intuitive structure and code flexibility, this template is perfect for quickly starting new projects or developing complex web applications! 🌟

Language: Vue - Size: 4.51 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

Zorger27/Vue-Threejs-1

🌐 A web application with a dynamic and interactive 3D model, built using the advanced Three.js library. 🎲 The colorful rotating cube allows users to take full control of the model.

Language: Vue - Size: 16.1 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

Zorger27/Vue-TS-Template

🌐 A powerful starter template for developing Single Page Applications (SPA), perfectly suited for modern web projects. 🚀 Easy customization and scalability make this template the perfect foundation for building complex and feature-rich applications.

Language: Vue - Size: 11.8 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

Zorger27/CV-Vue

👨‍💻 This is a modern single-page application (SPA) that presents detailed and visually appealing information about me. It covers everything that matters: a detailed resume and an Extra section (examples of completed works). 🚀

Language: Vue - Size: 172 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

temoto/robotstxt

The robots.txt exclusion protocol implementation for Go language

Language: Go - Size: 90.8 KB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 277 - Forks: 57

janhq/OpenCrawl

🌐 OpenCrawl: An ethical, high-performance web crawler built for scale. A powerful web crawling library that respects robots.txt and rate limits while leveraging Kafka for high-throughput data processing. Built with ethics and efficiency in mind.

Language: Python - Size: 171 KB - Last synced at: 4 days ago - Pushed at: 7 months ago - Stars: 11 - Forks: 1

commoncrawl/robotstxt-experiments

How is the Robots Exclusion Protocol (robots.txt) used on the WWW? This project tries to gain some insights by mining Common Crawl's robots.txt captures from the years 2016 to 2024.

Language: Jupyter Notebook - Size: 4.68 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 0

nuxt-modules/robots

Tame the robots crawling and indexing your Nuxt site.

Language: TypeScript - Size: 4.96 MB - Last synced at: 7 days ago - Pushed at: 8 days ago - Stars: 497 - Forks: 46

holysoles/bot-wrangler-traefik-plugin

A Traefik Middleware Plugin that helps you wrangle those pesky LLM data scraping bots.

Language: Go - Size: 176 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 5 - Forks: 0

WaluigiBSOD/waluigibsod.github.io

My (small and raw) personal website.

Language: HTML - Size: 51.8 KB - Last synced at: 15 days ago - Pushed at: 16 days ago - Stars: 1 - Forks: 0

AleksandrHovhannisyan/eleventy-plugin-robotstxt

Generate a robots.txt file for your Eleventy site

Language: JavaScript - Size: 72.3 KB - Last synced at: 15 days ago - Pushed at: 16 days ago - Stars: 9 - Forks: 0

tymrtn/ai-license-wp

Official WordPress plugin for licensing your site's content automatically via copyright.sh

Language: PHP - Size: 1000 KB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 2 - Forks: 0

alejo7774/seo-analyzer

SEO Analyzer is a Flask web app that analyzes websites from an SEO perspective. It generates visual reports on titles, descriptions, headings, links, images, and keywords.

Language: Python - Size: 133 KB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 0 - Forks: 0

hrbrmstr/spiderbar

Lightweight R wrapper around rep-cpp for robots.txt (Robots Exclusion Protocol) parsing and path testing in R

Language: C++ - Size: 88.9 KB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 9 - Forks: 2

EngincanV/SeoHelper

This package helps you to add meta-tags, sitemap.xml and robots.txt into your project easily.

Language: C# - Size: 35.2 KB - Last synced at: 9 days ago - Pushed at: over 2 years ago - Stars: 23 - Forks: 2

eliasdabbas/robotstxt_app

Visual App for Testing URLs and User-agents blocked by robots.txt Files

Language: Python - Size: 24.4 KB - Last synced at: 8 days ago - Pushed at: almost 3 years ago - Stars: 3 - Forks: 0

GateNLP/ultimate-sitemap-parser

Ultimate Website Sitemap Parser

Language: Python - Size: 525 KB - Last synced at: 13 days ago - Pushed at: about 2 months ago - Stars: 225 - Forks: 69

itgalaxy/robotstxt-webpack-plugin

A webpack plugin to generate a robots.txt file

Language: JavaScript - Size: 784 KB - Last synced at: 15 days ago - Pushed at: over 2 years ago - Stars: 33 - Forks: 7

samclarke/robots-parser

NodeJS robots.txt parser with support for wildcard (*) matching.

Language: JavaScript - Size: 506 KB - Last synced at: 18 days ago - Pushed at: about 1 year ago - Stars: 159 - Forks: 19

samber/the-great-gpt-firewall

🤖 A curated list of websites that restrict access to AI Agents, AI crawlers and GPTs

Language: Python - Size: 6.12 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 93 - Forks: 7

arvid-berndtsson/robots-txt-analyzer

Modern robots.txt analyzer with instant analysis, security recommendations, and export capabilities. Built with Qwik and deployed on Cloudflare Pages.

Language: TypeScript - Size: 727 KB - Last synced at: 20 days ago - Pushed at: 26 days ago - Stars: 1 - Forks: 0

joshmsimpson/robots-txt

An intuitive Rust library for parsing and querying robots.txt files.

Language: Rust - Size: 0 Bytes - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

alloy-lab/seo

A comprehensive, framework-agnostic SEO package for modern web applications with React hooks, Next.js integration, SvelteKit support, Payload CMS plugin, and interactive playground.

Language: TypeScript - Size: 314 KB - Last synced at: 24 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 1

mdreizin/gatsby-plugin-robots-txt

Gatsby plugin that automatically creates robots.txt for your site

Language: JavaScript - Size: 3.92 MB - Last synced at: 4 days ago - Pushed at: almost 2 years ago - Stars: 105 - Forks: 27

t1gor/Robots.txt-Parser-Class

PHP class for parsing robots.txt

Language: PHP - Size: 658 KB - Last synced at: 9 days ago - Pushed at: over 2 years ago - Stars: 85 - Forks: 28

mvp-kit/vite-sitemap-plugin

A Vite plugin for automatic sitemap generation from TanStack Router route tree.

Language: TypeScript - Size: 57.6 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

spatie/robots-txt

Determine if a page may be crawled from robots.txt, robots meta tags and robot headers

Language: PHP - Size: 161 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 249 - Forks: 45

GeekInTheNorth/Stott.Optimizely.RobotsHandler

An admin extension for Optimizely CMS 12+ for managing robots.txt on a per-site basis.

Language: C# - Size: 154 MB - Last synced at: about 18 hours ago - Pushed at: about 20 hours ago - Stars: 3 - Forks: 3

SCHEMATXT/SCHEMATXT

SCHEMA.TXT official repo.

Size: 16.6 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

OwenOrcan/YiraBot-Crawler

YiraBot: Simplifying Web Scraping for All. A user-friendly tool for developers and enthusiasts, offering command-line ease and Python integration. Ideal for research, SEO, and data collection.

Language: Python - Size: 221 KB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 20 - Forks: 0

s-thom/create-robots-txt-action

An action to create a robots.txt file from different sources

Language: TypeScript - Size: 2.64 MB - Last synced at: 18 days ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

daddiofaddio/sitemap-extract

Processes XML sitemaps and extracts URLs. Includes features such as support for both plain XML and compressed XML files, multiple input sources, protection against anti-bot measures, multi-threading, and automatic processing of nested sitemaps.

Language: Python - Size: 63.5 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 5 - Forks: 3

alextim/astro-lib

Makes it easy to add robots.txt, sitemap and web app manifest during build to your Astro app.

Language: TypeScript - Size: 1.34 MB - Last synced at: 18 days ago - Pushed at: almost 2 years ago - Stars: 125 - Forks: 7

dindicoelho/scraper-etico

Ethical web scraping library for public price monitoring with automatic robots.txt compliance

Language: Python - Size: 109 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

Cmastris/robotstxt-change-monitor

Monitor and report changes across one or more robots.txt files.

Language: Python - Size: 158 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 8 - Forks: 1

beb7/gflare-tk

Open-Source Python Based SEO Web Crawler

Language: Python - Size: 39 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 181 - Forks: 20

ihuzaifashoukat/llmoptimizer

Generate clean, AI-ready llms.txt files for your website or docs. Supports crawling, sitemaps, static builds, and framework-aware adapters (Next.js, Vite, Nuxt, Astro, Remix). Includes Markdown/MDX docs mode and robots.txt generator for LLM and search crawlers.

Language: TypeScript - Size: 107 KB - Last synced at: about 2 months ago - Pushed at: 2 months ago - Stars: 2 - Forks: 0

DilfaThayyil/Crawler

A lightweight web crawler that automates URL extraction and analyzes content with robots.txt and sitemap.xml support.

Language: JavaScript - Size: 330 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

Johan4954/SubProbe

⚡ JavaScript-aware crawler for security researchers and bug bounty hunters. Extract hidden endpoints and internal subdomains through static and semantic analysis of JS files. Lightweight. Fast. Sneaky.

Language: JavaScript - Size: 3.01 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 2 - Forks: 2

advanced-astro/rocketbase

🚀 This Astro template offers more than 'Just the Basics', providing a superior option for starting your next project with best practices and a set of essential integrations already built-in.

Language: Astro - Size: 722 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 4 - Forks: 0

samclarke/robotstxt

Go robots.txt parser

Language: Go - Size: 17.6 KB - Last synced at: 29 days ago - Pushed at: almost 8 years ago - Stars: 15 - Forks: 7

glyn/nginx_robot_access

NGINX robot access module

Language: Rust - Size: 109 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 6 - Forks: 0

crawler-commons/crawler-commons

A set of reusable Java components that implement functionality common to any web crawler

Language: Java - Size: 4.11 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 244 - Forks: 85

peaceiris/docker-images

A collection of Docker images: robotstxt, linuxbrew, gcloud, and psql

Language: Makefile - Size: 76.2 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

nasa-gcn/remix-seo (fork of balavishnuvj/remix-seo)

Collection of SEO utilities like sitemap, robots.txt, etc. for a Remix application. Forked from https://github.com/balavishnuvj/remix-seo

Language: TypeScript - Size: 72.3 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 95 - Forks: 11

k3ldar/.NetCorePluginManager

.NET Core Plugin Manager: extend web applications using plugin technology, enabling true SOLID and DRY principles when developing applications

Language: C# - Size: 15.9 MB - Last synced at: 15 days ago - Pushed at: 3 months ago - Stars: 31 - Forks: 19

codeAdrian/egghead-challenge-awesome-seo

Awesome SEO snippets - Head markup, robots.txt, sitemap.xml & Google Schema

Size: 9.77 KB - Last synced at: 16 days ago - Pushed at: about 5 years ago - Stars: 9 - Forks: 0

Innmind/Robots.txt

Robots.txt parser

Language: PHP - Size: 213 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

abdellahrk/SeoBundle

A complete SEO solution for Symfony projects. This bundle handles meta tags, Open Graph, Twitter Cards, canonical URLs, sitemaps, and more—helping your app stay search-engine friendly and socially shareable out of the box.

Language: PHP - Size: 4.67 MB - Last synced at: 1 day ago - Pushed at: 7 months ago - Stars: 24 - Forks: 1

eliasdabbas/advertools

advertools - online marketing productivity and analysis tools

Language: Python - Size: 23.1 MB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 1,255 - Forks: 232

cokeposada/astro-full-starter

A base Astro project to start building your website quickly and efficiently.

Language: Astro - Size: 931 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 11 - Forks: 2

scrapy/protego

A pure-Python robots.txt parser with support for modern conventions.

Language: DIGITAL Command Language - Size: 3.45 MB - Last synced at: 26 days ago - Pushed at: 3 months ago - Stars: 70 - Forks: 28

benwebber/texting-robots-py

Python binding for the Texting Robots robots.txt parser

Language: Python - Size: 22.5 KB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

bnomei/kirby3-robots-txt

Manage the robots.txt from the Kirby config file

Language: PHP - Size: 240 KB - Last synced at: 2 months ago - Pushed at: 6 months ago - Stars: 21 - Forks: 0

ACP-CODE/astro-robots

A reliable robots.txt generator for Astro projects, offering zero-config setup and Verified Bots support.

Language: TypeScript - Size: 240 KB - Last synced at: 3 months ago - Pushed at: 11 months ago - Stars: 16 - Forks: 3

toprak-coder/RoboHunt

RoboHunt is a high-performance, concurrent robots.txt scanner designed for security professionals and bug bounty hunters. It efficiently scans a list of subdomains to find accessible directories and paths listed in their robots.txt files, helping you uncover potential vulnerabilities and expand your attack surface.

Language: Go - Size: 4.88 KB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

chrisakroyd/robots-txt-parser

A lightweight robots.txt parser for Node.js with support for wildcards, caching and promises.

Language: JavaScript - Size: 71.3 KB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 14 - Forks: 9

healsdata/ai-training-opt-out

Known tags and settings suggested to opt out of having your content used for AI training.

Language: HTML - Size: 40 KB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 151 - Forks: 4

php-middleware/block-robots

Middleware to avoid search engine indexing with PSR-7 using robots.txt and X-Robots-Tag

Language: PHP - Size: 13.7 KB - Last synced at: 4 months ago - Pushed at: over 4 years ago - Stars: 6 - Forks: 0

kevinbenabdelhak/WP-Robots-txt-Editor

WP Robots txt Editor is a WordPress plugin ideal for editing the robots.txt file from a simple options page. Generate a default robots.txt and access many features such as selecting posts, categories, parent/child pages, and more.

Language: PHP - Size: 30.3 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

itgalaxy/generate-robotstxt

robots.txt generator for Node.js

Language: JavaScript - Size: 2.86 MB - Last synced at: 27 days ago - Pushed at: almost 3 years ago - Stars: 65 - Forks: 8

mdthansil/sharp-seo-tools

Sharp SEO Tools is a collection of free web tools written entirely in JavaScript (19 tools available). Feel free to use them.

Language: JavaScript - Size: 1.66 MB - Last synced at: 22 days ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

liameno/librengine

Privacy Web Search Engine (not meta, own crawler)

Language: C++ - Size: 21.6 MB - Last synced at: 11 days ago - Pushed at: over 2 years ago - Stars: 72 - Forks: 4

PuerkitoBio/fetchbot

A simple and flexible web crawler that follows the robots.txt policies and crawl delays.

Language: Go - Size: 2.02 MB - Last synced at: 4 months ago - Pushed at: over 4 years ago - Stars: 790 - Forks: 93

simonw/datasette-block-robots

Datasette plugin that blocks robots and crawlers using robots.txt

Language: Python - Size: 20.5 KB - Last synced at: about 1 month ago - Pushed at: about 3 years ago - Stars: 5 - Forks: 0

p0dalirius/RobotsValidator

A Python script to check if URLs are allowed or disallowed by a robots.txt file.

Language: Python - Size: 190 KB - Last synced at: about 2 months ago - Pushed at: 9 months ago - Stars: 22 - Forks: 2

d-prokhorenko/web-security-presentation

Web Security Presentation

Language: JavaScript - Size: 2.48 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

tractorcow/silverstripe-robots

Simple robots generation module for Silverstripe (SS 4 and above)

Language: PHP - Size: 27.3 KB - Last synced at: 27 days ago - Pushed at: 5 months ago - Stars: 13 - Forks: 6

TurnerSoftware/InfinityCrawler

A simple but powerful web crawler library for .NET

Language: C# - Size: 326 KB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 253 - Forks: 37

gigachad80/TxtRipper

Hunt robots.txt via CLI

Language: Ruby - Size: 60.5 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 2 - Forks: 0

ian-wt-blog-examples/django-robots-txt

A Django project that serves a robots.txt file.

Language: Python - Size: 10.7 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

Pytlicek/AI-Data-Guard

AI Data Guard

Language: Python - Size: 53.7 KB - Last synced at: 21 days ago - Pushed at: about 2 years ago - Stars: 4 - Forks: 0

ilgiz87/htaccess-robots.txt-joomla

Template .htaccess and robots.txt files for Joomla

Size: 18.6 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

thefrosty/wp-block-ai-scrapers

Block all known AI Data Scrapers.

Language: PHP - Size: 350 KB - Last synced at: 27 days ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

jwmorley73/jwm.robotstxt

Provides Python access to Google's parser for robots.txt files, as used by their Googlebot web scraper.

Language: Python - Size: 160 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

PuerkitoBio/gocrawl

Polite, slim and concurrent web crawler.

Language: Go - Size: 410 KB - Last synced at: 5 months ago - Pushed at: over 4 years ago - Stars: 2,048 - Forks: 193

sushantrahate/php-static-website-boilerplate

A lightweight, static website built with PHP, HTML, CSS, and JavaScript.

Language: PHP - Size: 8.79 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

alexrudy/roboto

Library for robots.txt files in Rust

Language: Rust - Size: 29.3 KB - Last synced at: 28 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

zvdy/parsero-go

Parsero is a free script written in Golang which reads the Robots.txt file of a web server and looks at the Disallow entries. The Disallow entries tell the search engines what directories or files hosted on a web server mustn't be indexed.

Language: Go - Size: 32.9 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

simplecto/sitemap_grabber

A Python library to recursively crawl every sitemap.xml for a website. Also handles robots.txt and other well-known files.

Language: Python - Size: 51.8 KB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

crwlrsoft/robots-txt

Robots Exclusion Standard/Protocol Parser for Web Crawling/Scraping

Language: PHP - Size: 32.2 KB - Last synced at: 6 months ago - Pushed at: 9 months ago - Stars: 11 - Forks: 2