Topic: "robots-txt"
PuerkitoBio/gocrawl
Polite, slim and concurrent web crawler.
Language: Go - Size: 410 KB - Last synced at: 11 days ago - Pushed at: about 4 years ago - Stars: 2,048 - Forks: 193

eliasdabbas/advertools
advertools - online marketing productivity and analysis tools
Language: Python - Size: 23.1 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 1,234 - Forks: 226

PuerkitoBio/fetchbot
A simple and flexible web crawler that follows the robots.txt policies and crawl delays.
Language: Go - Size: 2.02 MB - Last synced at: 3 days ago - Pushed at: about 4 years ago - Stars: 789 - Forks: 93

nuxt-modules/robots
Tame the robots crawling and indexing your Nuxt site.
Language: TypeScript - Size: 4.49 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 477 - Forks: 40

temoto/robotstxt
The robots.txt exclusion protocol implementation for Go language
Language: Go - Size: 94.7 KB - Last synced at: 17 days ago - Pushed at: over 2 years ago - Stars: 274 - Forks: 56

TurnerSoftware/InfinityCrawler
A simple but powerful web crawler library for .NET
Language: C# - Size: 326 KB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 252 - Forks: 37

crawler-commons/crawler-commons
A set of reusable Java components that implement functionality common to any web crawler
Language: Java - Size: 3.73 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 244 - Forks: 80

spatie/robots-txt
Determine if a page may be crawled from robots.txt, robots meta tags and robot headers
Language: PHP - Size: 134 KB - Last synced at: 19 days ago - Pushed at: 27 days ago - Stars: 236 - Forks: 42

GateNLP/ultimate-sitemap-parser
Ultimate Website Sitemap Parser
Language: Python - Size: 409 KB - Last synced at: 5 days ago - Pushed at: about 1 month ago - Stars: 217 - Forks: 69

thedaviddias/llms-txt-hub
🤖 The largest directory for AI-ready documentation and tools implementing the proposed llms.txt standard
Language: TypeScript - Size: 36.7 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 211 - Forks: 64

alexjc/weboptout
Opt-Out tool to check Copyright reservations in a way that even machines can understand.
Language: Python - Size: 75.2 KB - Last synced at: 20 days ago - Pushed at: over 1 year ago - Stars: 194 - Forks: 1

beb7/gflare-tk
Open-Source Python Based SEO Web Crawler
Language: Python - Size: 39 MB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 172 - Forks: 18

samclarke/robots-parser
NodeJS robots.txt parser with support for wildcard (*) matching.
Language: JavaScript - Size: 506 KB - Last synced at: 4 days ago - Pushed at: 7 months ago - Stars: 156 - Forks: 19

healsdata/ai-training-opt-out
Known tags and settings suggested to opt out of having your content used for AI training.
Language: HTML - Size: 40 KB - Last synced at: 7 months ago - Pushed at: 12 months ago - Stars: 130 - Forks: 3

seantomburke/sitemapper
Parse through any sitemap in Node.js
Language: TypeScript - Size: 1.75 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 119 - Forks: 75

alextim/astro-lib
Makes it easy to add robots.txt, sitemap and web app manifest during build to your Astro app.
Language: TypeScript - Size: 1.34 MB - Last synced at: 14 days ago - Pushed at: over 1 year ago - Stars: 118 - Forks: 6

jimsmart/grobotstxt
grobotstxt is a native Go port of Google's robots.txt parser and matcher library.
Language: Go - Size: 238 KB - Last synced at: about 2 months ago - Pushed at: about 3 years ago - Stars: 110 - Forks: 7

mdreizin/gatsby-plugin-robots-txt
Gatsby plugin that automatically creates robots.txt for your site
Language: JavaScript - Size: 3.92 MB - Last synced at: about 6 hours ago - Pushed at: over 1 year ago - Stars: 106 - Forks: 27

nasa-gcn/remix-seo Fork of balavishnuvj/remix-seo
Collection of SEO utilities like sitemap, robots.txt, etc. for a Remix application. Forked from https://github.com/balavishnuvj/remix-seo
Language: TypeScript - Size: 69.3 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 91 - Forks: 10

samber/the-great-gpt-firewall
🤖 A curated list of websites that restrict access to AI Agents, AI crawlers and GPTs
Language: Python - Size: 3.32 MB - Last synced at: 7 days ago - Pushed at: 8 days ago - Stars: 90 - Forks: 6

jonasjacek/robots.txt
Simple robots.txt template. Keep unwanted robots out (disallow). White lists (allow) legitimate user-agents. Useful for all websites.
Size: 135 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 87 - Forks: 38

t1gor/Robots.txt-Parser-Class
PHP class for parsing robots.txt
Language: PHP - Size: 658 KB - Last synced at: 15 days ago - Pushed at: over 2 years ago - Stars: 83 - Forks: 28

LexiestLeszek/scrapeGPT
ScrapeGPT is a RAG-based Telegram bot designed to scrape and analyze websites, then answer questions based on the scraped content. The bot utilizes Retrieval Augmented Generation and webscraping to return natural language answers to the user's queries.
Language: Python - Size: 62.5 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 82 - Forks: 12

liameno/librengine
Privacy Web Search Engine (not meta, own crawler)
Language: C++ - Size: 21.6 MB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 71 - Forks: 4

ekalinin/robots.js
Parser for robots.txt for node.js
Language: JavaScript - Size: 71.3 KB - Last synced at: 19 days ago - Pushed at: about 4 years ago - Stars: 67 - Forks: 21

itgalaxy/generate-robotstxt
robots.txt generator for Node.js
Language: JavaScript - Size: 2.86 MB - Last synced at: 6 days ago - Pushed at: over 2 years ago - Stars: 66 - Forks: 8

scrapy/protego
A pure-Python robots.txt parser with support for modern conventions.
Language: DIGITAL Command Language - Size: 3.42 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 65 - Forks: 28

MLArtist/WebScraper
Python-based web crawling script with randomized intervals, user-agent rotation, and proxy server IP rotation to outsmart website bots and prevent blocking.
Language: Python - Size: 43.9 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 61 - Forks: 14

kyr0/astro-launchpad
An Astro project template for decent projects: auth, i18next, Bootstrap, sitemap, webworker, robots.txt, preact, react, endpoints, endpoint clients, OAuth, various Astro features and data loading preconfigured
Language: CSS - Size: 11.6 MB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 53 - Forks: 13

mhmdiaa/waybackrobots
Enumerate old versions of robots.txt paths using Wayback Machine for content discovery
Language: Go - Size: 4.88 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 47 - Forks: 5

akashblackhat/dark_web.py
Dark Web Information Gathering, Footprinting, Scanner and Recon Tool release. Dark Web is an information gathering tool I made in Python 3. To run Dark Web, it only needs a domain or IP. Dark Web can work with any Linux distro that supports Python 3. Author: AKASHBLACKHAT (help for ethical hackers)
Language: Python - Size: 115 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 47 - Forks: 5

jirkapinkas/jsitemapgenerator
Java sitemap generator. This library generates a web sitemap, can ping Google, generate RSS feed, robots.txt and more with friendly, easy to use Java 8 functional style of programming
Language: Java - Size: 256 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 39 - Forks: 12

itgalaxy/robotstxt-webpack-plugin
A webpack plugin to generate a robots.txt file
Language: JavaScript - Size: 784 KB - Last synced at: 10 days ago - Pushed at: about 2 years ago - Stars: 35 - Forks: 7

k3ldar/.NetCorePluginManager
.Net Core Plugin Manager, extend web applications using plugin technology enabling true SOLID and DRY principles when developing applications
Language: C# - Size: 15.3 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 31 - Forks: 19

TurnerSoftware/RobotsExclusionTools
A "robots.txt" parsing and querying library for .NET
Language: C# - Size: 226 KB - Last synced at: 23 days ago - Pushed at: 9 months ago - Stars: 29 - Forks: 9

LuXDAmore/nuxt-humans-txt
🧑🏻👩🏻 "We are people, not machines" - An initiative to know the creators of a website. Contains the information about humans to the web building - A Nuxt Module to statically integrate and generate a humans.txt author file - Based on the HumansTxt Project.
Language: JavaScript - Size: 3.9 MB - Last synced at: 22 days ago - Pushed at: over 3 years ago - Stars: 29 - Forks: 1

VIPnytt/RobotsTxtParser
An extensible robots.txt parser and client library, with full support for every directive and specification.
Language: PHP - Size: 526 KB - Last synced at: about 1 month ago - Pushed at: about 4 years ago - Stars: 26 - Forks: 6

EngincanV/SeoHelper
This package helps you to add meta-tags, sitemap.xml and robots.txt into your project easily.
Language: C# - Size: 35.2 KB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 24 - Forks: 2

abdellahrk/SeoBundle
A complete SEO solution for Symfony projects. This bundle handles meta tags, Open Graph, Twitter Cards, canonical URLs, sitemaps, and more—helping your app stay search-engine friendly and socially shareable out of the box.
Language: PHP - Size: 4.67 MB - Last synced at: 21 days ago - Pushed at: about 2 months ago - Stars: 23 - Forks: 0

elchiconube/generate-robots-txt
The right robots.txt file for your project
Language: TypeScript - Size: 1.6 MB - Last synced at: 6 days ago - Pushed at: 7 days ago - Stars: 22 - Forks: 4

p0dalirius/RobotsValidator
A python script to check if URLs are allowed or disallowed by a robots.txt file.
Language: Python - Size: 190 KB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 21 - Forks: 2

bnomei/kirby3-robots-txt
Manage the robots.txt from the Kirby config file
Language: PHP - Size: 189 KB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 20 - Forks: 0

OwenOrcan/YiraBot-Crawler
YiraBot: Simplifying Web Scraping for All. A user-friendly tool for developers and enthusiasts, offering command-line ease and Python integration. Ideal for research, SEO, and data collection.
Language: Python - Size: 221 KB - Last synced at: 7 days ago - Pushed at: 7 months ago - Stars: 19 - Forks: 0

brutuscat/medusa Fork of chriskite/anemone 📦
- THIS IS AN OLD FORK - Check out the Medusa Crawler gem instead: "medusa-crawler"
Language: Ruby - Size: 319 KB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 18 - Forks: 9

mguinea/laravel-robots
Laravel package to manage robots
Language: PHP - Size: 524 KB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 16 - Forks: 4

stormid/robotify-netcore
Provides robots.txt middleware for .NET core
Language: C# - Size: 70.3 KB - Last synced at: 27 days ago - Pushed at: almost 3 years ago - Stars: 16 - Forks: 5

fooock/robots.txt
🤖 robots.txt as a service. Crawls robots.txt files, downloads and parses them to check rules through an API
Language: Java - Size: 1.92 MB - Last synced at: 2 months ago - Pushed at: over 4 years ago - Stars: 16 - Forks: 2

samclarke/robotstxt
Go robots.txt parser
Language: Go - Size: 17.6 KB - Last synced at: 12 months ago - Pushed at: over 7 years ago - Stars: 16 - Forks: 7

ACP-CODE/astro-robots
A reliable robots.txt generator for Astro projects, offering zero-config setup and Verified Bots support.
Language: TypeScript - Size: 240 KB - Last synced at: 8 days ago - Pushed at: 6 months ago - Stars: 14 - Forks: 3

ravern/gollum
Robots.txt parser and fetcher for Elixir
Language: Elixir - Size: 29.3 KB - Last synced at: 11 days ago - Pushed at: about 2 years ago - Stars: 14 - Forks: 11

momenbasel/pyrobots
A tool that gets all paths listed in robots.txt and opens them in the browser.
Language: Python - Size: 462 KB - Last synced at: 7 days ago - Pushed at: almost 6 years ago - Stars: 14 - Forks: 7

tractorcow/silverstripe-robots
Simple robots generation module for Silverstripe (SS 4 and above)
Language: PHP - Size: 24.4 KB - Last synced at: 3 days ago - Pushed at: almost 2 years ago - Stars: 13 - Forks: 6

chrisakroyd/robots-txt-parser
A lightweight robots.txt parser for Node.js with support for wildcards, caching and promises.
Language: JavaScript - Size: 71.3 KB - Last synced at: 2 days ago - Pushed at: almost 2 years ago - Stars: 13 - Forks: 9

marcortola/behat-seo-contexts
Behat extension for testing some On-Page SEO factors: meta title/description, canonical, hreflang, meta robots, robots.txt, redirects, sitemap validation, HTML validation, performance...
Language: PHP - Size: 150 KB - Last synced at: 8 days ago - Pushed at: over 2 years ago - Stars: 13 - Forks: 4

ProgressPlanner/eco-friendly-robots-txt
Optimizes your site's robots.txt to reduce server load and CO2 footprint by blocking unnecessary crawlers while allowing major search engines and specific tools.
Language: PHP - Size: 64.5 KB - Last synced at: 8 days ago - Pushed at: 7 months ago - Stars: 12 - Forks: 0

alexander-irbis/robots_txt
Lightweight robots.txt parser and generator written in Rust.
Language: Rust - Size: 42 KB - Last synced at: 10 months ago - Pushed at: over 4 years ago - Stars: 12 - Forks: 5

crwlrsoft/robots-txt
Robots Exclusion Standard/Protocol Parser for Web Crawling/Scraping
Language: PHP - Size: 32.2 KB - Last synced at: 27 days ago - Pushed at: 4 months ago - Stars: 11 - Forks: 2

ChrisWinters/multisite-robotstxt-manager
A Multisite Robots.txt Manager - Quickly and easily manage all robots.txt files on a WordPress Multisite Website Network.
Language: PHP - Size: 2.36 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 11 - Forks: 6

schliesser/sitecrawler
TYPO3 sitemap crawler
Language: PHP - Size: 80.1 KB - Last synced at: 12 days ago - Pushed at: about 1 year ago - Stars: 11 - Forks: 1

Lexxrt/Blue
🕵️ Information gathering tool
Language: Python - Size: 657 KB - Last synced at: 2 months ago - Pushed at: over 3 years ago - Stars: 10 - Forks: 3

cansin/next-with-sitemap 📦
Higher order Next.js config to generate sitemap.xml and robots.txt
Language: JavaScript - Size: 1.27 MB - Last synced at: 6 days ago - Pushed at: over 3 years ago - Stars: 10 - Forks: 3

AleksandrHovhannisyan/eleventy-plugin-robotstxt
Generate a robots.txt file for your Eleventy site
Language: JavaScript - Size: 69.3 KB - Last synced at: 5 days ago - Pushed at: 10 months ago - Stars: 9 - Forks: 0

hrbrmstr/spiderbar
Lightweight R wrapper around rep-cpp for robots.txt (Robots Exclusion Protocol) parsing and path testing in R
Language: C++ - Size: 88.9 KB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 9 - Forks: 2

Cmastris/robotstxt-change-monitor
Monitor and report changes across one or more robots.txt files.
Language: Python - Size: 76.2 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 9 - Forks: 1

Sobak/scrawler
Declarative, scriptable web robot (crawler) and scraper
Language: PHP - Size: 248 KB - Last synced at: 2 months ago - Pushed at: about 5 years ago - Stars: 9 - Forks: 1

ecnepsnai/Robots.txt-Block-AI
A robots.txt that asks AI crawlers not to steal your content
Size: 1000 Bytes - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 8 - Forks: 0

PhrozenByte/pico-robots
This is Pico's official robots plugin to add a robots.txt and sitemap.xml to your website. Pico is a stupidly simple, blazing fast, flat file CMS.
Language: PHP - Size: 18.6 KB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 8 - Forks: 1

folini/Page-Auditor
"Page Auditor for Technical SEO" is an open source Google Chrome Extension created by Franco Folini. Once you added Page Auditor to your browser it will let you explore and analyze Structured Data, JavaScript scripts, Meta-Tags, Robots.txt and Sitemap.xml files from any webpage. All these elements are critical to improve the on-page SEO.
Language: TypeScript - Size: 9.64 MB - Last synced at: 4 months ago - Pushed at: over 3 years ago - Stars: 8 - Forks: 3

adileo/MicroFrontier
A lightweight crawler frontier implementation in TypeScript using Redis.
Language: TypeScript - Size: 263 KB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 8 - Forks: 0

codeAdrian/egghead-challenge-awesome-seo
Awesome SEO snippets - Head markup, robots.txt, sitemap.xml & Google Schema
Size: 9.77 KB - Last synced at: 25 days ago - Pushed at: over 4 years ago - Stars: 8 - Forks: 0

bnomei/kirby-robots-writer 📦
Robots for Kirby CMS
Language: PHP - Size: 9.77 KB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 8 - Forks: 2

cokeposada/astro-full-starter
A base Astro project to start building your website quickly and efficiently.
Language: Astro - Size: 862 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 7 - Forks: 0

b4dnewz/robots-parse
A lightweight and simple robots.txt parser in node
Language: TypeScript - Size: 647 KB - Last synced at: 11 days ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 0

callumbwhyte/friendly-robots
A friendly tool for creating dynamic robots.txt files in Umbraco
Language: C# - Size: 109 KB - Last synced at: 27 days ago - Pushed at: over 3 years ago - Stars: 7 - Forks: 1

glyn/nginx_robot_access
NGINX robot access module
Language: Rust - Size: 109 KB - Last synced at: 5 days ago - Pushed at: 3 months ago - Stars: 6 - Forks: 0

alejsanc/nexttypes
NextTypes is a standards based information storage, processing and transmission system that integrates the characteristics of other systems such as databases, programming languages, communication protocols, file systems, document managers, operating systems, frameworks, file formats and hardware in a single tightly integrated system using a common data types system.
Language: Java - Size: 9.56 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 6 - Forks: 0

Cyb3r3x3r/Chanakya
Scan websites for multiple things like honeypots, whois, port scans, etc.
Language: Python - Size: 36.1 KB - Last synced at: 27 days ago - Pushed at: 8 months ago - Stars: 6 - Forks: 1

A3onn/mapptth
A simple to use multi-threaded web-crawler written in C with libcURL and Lexbor.
Language: C - Size: 205 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 0

serpwings/pyrobotstxt
pyrobotstxt: Python Package for robots.txt Files
Language: Python - Size: 2.75 MB - Last synced at: 10 months ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 0

stovv/next-strapi-sitemap
Generate sitemap and robots.txt for Next.js using a webhook from Strapi
Language: JavaScript - Size: 82 KB - Last synced at: 3 months ago - Pushed at: over 4 years ago - Stars: 6 - Forks: 2

php-middleware/block-robots
Middleware to avoid search engine indexing with PSR-7 using robots.txt and X-Robots-Tag
Language: PHP - Size: 13.7 KB - Last synced at: about 2 months ago - Pushed at: about 4 years ago - Stars: 5 - Forks: 0

herrbischoff/robots.txt
A sane, minimal robots.txt file (for the western world)
Size: 1000 Bytes - Last synced at: about 1 year ago - Pushed at: almost 5 years ago - Stars: 5 - Forks: 1

emacs-php/robots-txt-mode
Emacs major mode for editing robots.txt
Language: Emacs Lisp - Size: 11.7 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 4 - Forks: 2

aafeher/go-sitemap-parser
Go language library for parsing Sitemaps
Language: Go - Size: 103 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 4 - Forks: 1

Pytlicek/AI-Data-Guard
AI Data Guard
Language: Python - Size: 53.7 KB - Last synced at: 5 days ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 0

matheuscostadesign/guia-seo
Checklist gathering the main tags to add when creating HTML pages so that search engines index the site organically.
Size: 81.1 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 1

ali-habibzadeh/top10-seo-list-for-developers
The top 10 things developers need to know about SEO
Language: HTML - Size: 182 KB - Last synced at: 6 months ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 3

eliasdabbas/robotstxt_app
Visual App for Testing URLs and User-agents blocked by robots.txt Files
Language: Python - Size: 24.4 KB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 0

simonw/datasette-block-robots
Datasette plugin that blocks robots and crawlers using robots.txt
Language: Python - Size: 20.5 KB - Last synced at: about 2 months ago - Pushed at: almost 3 years ago - Stars: 4 - Forks: 0

lucas-bogos/virtual-store
E-commerce site built for a college integrative project
Language: SCSS - Size: 1.49 MB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 4 - Forks: 0

hufslion9th/MissingSemester_Crawling
2021 HUFS Missing Semester : Crawling
Language: Python - Size: 6.35 MB - Last synced at: over 2 years ago - Pushed at: almost 4 years ago - Stars: 4 - Forks: 0

austinsonger/sitemapsandrobotsaroundtheweb
Sitemaps and Robots.txt for websites around the world.
Size: 137 KB - Last synced at: 3 months ago - Pushed at: over 4 years ago - Stars: 4 - Forks: 0

larevanchedessites/google-robotstxt-ruby
🤖 Ruby gem wrapper around Google Robotstxt Parser C++ library
Language: Ruby - Size: 18.6 KB - Last synced at: 4 days ago - Pushed at: over 5 years ago - Stars: 4 - Forks: 3

amandeepmittal/robotize
Generates a robots.txt
Language: JavaScript - Size: 13.7 KB - Last synced at: about 2 months ago - Pushed at: over 5 years ago - Stars: 4 - Forks: 0

advanced-astro/rocketbase
🚀 This Astro template offers more than 'Just the Basics', providing a superior option for starting your next project with best practices and a set of essential integrations already built-in.
Language: Astro - Size: 628 KB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 3 - Forks: 0

holysoles/bot-wrangler-traefik-plugin
A Traefik Middleware Plugin that helps you wrangle those pesky LLM data scrapers.
Language: Go - Size: 156 KB - Last synced at: about 16 hours ago - Pushed at: about 2 months ago - Stars: 3 - Forks: 0

middlewares/robots
PSR-15 middleware to enable/disable the robots of the search engines
Language: PHP - Size: 36.1 KB - Last synced at: 3 days ago - Pushed at: 2 months ago - Stars: 3 - Forks: 1

GeekInTheNorth/Stott.Optimizely.RobotsHandler
An admin extension for Optimizely CMS 12+ for managing robots.txt on a per site basis.
Language: C# - Size: 118 MB - Last synced at: 26 days ago - Pushed at: 8 months ago - Stars: 3 - Forks: 3

muratgozel/robotstxt-util
RFC 9309 spec compliant robots.txt builder and parser. 🦾 No dependencies, fully typed.
Language: TypeScript - Size: 136 KB - Last synced at: 29 days ago - Pushed at: 9 months ago - Stars: 3 - Forks: 1

AntoineGagne/robots
A parser for robots.txt with support for wildcards. See also RFC 9309.
Language: Erlang - Size: 30.3 KB - Last synced at: 27 days ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 2
